Validate a HUGE XML file on the browser side using JS
I need to validate a really huge XML file (2 GB) in the browser (in a PWA using Web Workers). The file must first be validated against an XSD schema and then transformed into JSON objects.
I tried reading the file with FileReader and slices. I can read the file, but I can't validate it.
I first tried something like the following, which works with small files (a few KB up to some MB):
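const parser = new DOMParser();
const xmlDoc = parser.parseFromString(content, "text/xml");
// A <parsererror> element means the XML is not well-formed (this does not check the XSD).
if (xmlDoc.getElementsByTagName("parsererror").length > 0) {
  isValid = false;
}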
Then I used FileReader with slices, which works fine:
const r = new FileReader();
// Read only the bytes in [_offset, _offset + length)
const blob = _file.slice(_offset, length + _offset);
r.onload = readEventHandler;
r.readAsText(blob);
But now I don't know how to validate the big XML using these slices.
Is there any library or built-in JS function to achieve this (preferably vanilla JS)? Any other ideas?
Thanks in advance.
Solution 1:[1]
I don't think you can validate an XML document that has been sliced by length, because a slice won't end at a logical boundary or connect cleanly to the next one.
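To illustrate the point, any check that works on slices has to carry state from one slice to the next. Below is a minimal sketch of that idea, not a real validator: `TagBalanceChecker` is a hypothetical name, it only verifies that element tags are balanced, and it does no XSD validation at all. It assumes the slices have already been decoded to text.

```javascript
// Hypothetical helper, not a library API. It only checks that element tags are
// balanced across chunks; it is not a full well-formedness check and it does
// no XSD validation. DOCTYPE internal subsets ("[...]") are not handled.
class TagBalanceChecker {
  constructor() {
    this.buffer = ""; // text carried over because a tag was cut by a chunk boundary
    this.stack = [];  // names of the elements that are currently open
    this.error = null;
  }

  // Feed the next chunk of decoded text. Chunks may end anywhere, even mid-tag.
  push(text) {
    if (this.error) return;
    this.buffer += text;
    let pos = 0;
    while (true) {
      const lt = this.buffer.indexOf("<", pos);
      if (lt === -1) { pos = this.buffer.length; break; } // only character data left
      const end = this.findMarkupEnd(lt);
      if (end === -1) { pos = lt; break; } // incomplete markup: wait for more input
      this.handleMarkup(this.buffer.slice(lt, end));
      if (this.error) return;
      pos = end;
    }
    this.buffer = this.buffer.slice(pos); // keep only the unfinished tail
  }

  // Index just past the markup that starts at `lt`, or -1 if it is not complete yet.
  findMarkupEnd(lt) {
    const b = this.buffer;
    const closers = [["<!--", "-->"], ["<![CDATA[", "]]>"], ["<?", "?>"]];
    for (const [open, close] of closers) {
      if (b.startsWith(open, lt)) {
        const i = b.indexOf(close, lt + open.length);
        return i === -1 ? -1 : i + close.length;
      }
    }
    let quote = null; // ">" inside a quoted attribute value does not end the tag
    for (let i = lt + 1; i < b.length; i++) {
      const ch = b[i];
      if (quote) { if (ch === quote) quote = null; }
      else if (ch === '"' || ch === "'") quote = ch;
      else if (ch === ">") return i + 1;
    }
    return -1;
  }

  handleMarkup(markup) {
    // Comments, CDATA sections, DOCTYPE and processing instructions open nothing.
    if (markup.startsWith("<!") || markup.startsWith("<?")) return;
    if (markup.startsWith("</")) {
      const name = markup.slice(2, -1).trim();
      const open = this.stack.pop();
      if (open !== name) this.error = `</${name}> does not match <${open}>`;
      return;
    }
    const match = /^<([^\s/>]+)/.exec(markup);
    if (!match) { this.error = `malformed tag: ${markup}`; return; }
    if (!markup.endsWith("/>")) this.stack.push(match[1]); // self-closing tags open nothing
  }

  // Call once after the last chunk.
  finish() {
    if (!this.error && this.buffer.includes("<")) this.error = "input ended inside a tag";
    if (!this.error && this.stack.length > 0) this.error = `unclosed <${this.stack.pop()}>`;
    return { ok: this.error === null, error: this.error };
  }
}
```

You would call `push()` once per slice and `finish()` after the last one. The only memory that grows with the input is the stack of open elements, which is what makes this shape workable for very large files, but a schema check needs far more state than this.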
You can try validating the whole XML in a Web Worker to see if that helps. A worker runs on a separate thread from the main one, so it will not block user interaction or other regular operations.
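If you do read the file inside the worker, one detail to watch is that byte-based slices can cut a multi-byte character in half. Here is a minimal sketch of reading slice by slice with a streaming `TextDecoder`, assuming the file is UTF-8. `readTextChunks` and `readSlice` are hypothetical names, and the commented usage assumes a dedicated worker, where `FileReaderSync` is available.

```javascript
// Hypothetical helper: yields decoded text for consecutive byte ranges.
// `readSlice(start, end)` must return the bytes in [start, end) as an
// ArrayBuffer or typed array. Assumes the data is UTF-8 encoded.
function* readTextChunks(size, readSlice, chunkSize = 8 * 1024 * 1024) {
  const decoder = new TextDecoder("utf-8");
  for (let offset = 0; offset < size; offset += chunkSize) {
    const end = Math.min(offset + chunkSize, size);
    // stream: true keeps an incomplete multi-byte character for the next call
    yield decoder.decode(readSlice(offset, end), { stream: true });
  }
  const tail = decoder.decode(); // flush whatever is still pending
  if (tail) yield tail;
}

// Possible use inside a dedicated worker that received a File object:
//   const reader = new FileReaderSync();
//   const readSlice = (s, e) => reader.readAsArrayBuffer(file.slice(s, e));
//   for (const text of readTextChunks(file.size, readSlice)) { /* process text */ }
```

Decoding each slice with its own `readAsText` call would instead turn a character that straddles a boundary into replacement characters on both sides.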
I'm not sure about your use case, but relying on the client for such a heavy operation is not always reliable, especially when the client device has low-end hardware or an outdated, slow browser. Validating on the server would be the more reliable solution, although it understandably comes at the added cost of bandwidth and server load.
Solution 2:[2]
We had a very similar requirement in our project and there was no suitable solution, so we came up with xmlvalidate.js, a WASM-compiled version of the schema validation functionality from libxml2.
This currently uses a string to pass the XML to the web worker, but it could easily be adapted to use a buffer instead if you load the XML directly in the worker (WASM uses a buffer internally anyway, so this just means one less conversion). Strings are currently limited to 1 GiB in Chrome and 512 MiB in Firefox.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Anand |
| Solution 2 | |
