Check/Resolve cross-references in separate XML files
Starting point
Let's say we have a book in XML format. The book consists of many assets, and these can reference each other through a ref-asset element with a path attribute. [Path mask: {id}|{version} of the target asset].
Important: Each asset is a separate file; there is no merged, complete file.
Example XML (merged here for readability)
<book>
  <!-- file a.xml -->
  <asset id="1" version="1.0">
    <name>Prolog</name>
  </asset>
  <!-- file b.xml -->
  <asset id="2" version="2">
    <name>Table of content</name>
    <list>
      <item><ref-asset path="1|1.0">Prolog</ref-asset></item>
      <item><ref-asset path="2|2.0">Table of content</ref-asset></item>
      <item><ref-asset path="3|1.1">FooBar</ref-asset></item>
    </list>
  </asset>
  <!-- file c.xml -->
  <asset id="3" version="1.1">
    <name>FooBar</name>
  </asset>
</book>
Request
- Check for every ref-asset whether the linked target is in book.
- Create a report about the results [exists, not exists, asset exists but wrong version, ...]
- [in addition: Replace the reference with the content of the target.]
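The report categories in the list above can be sketched as a small classifier. This is only an illustration, not part of the original setup: the class name AssetRefCheck, the Status enum, and the id-to-version map are hypothetical, and the sketch assumes one version per asset id and plain string comparison of versions (so "2" and "2.0" count as different).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AssetRefCheck {
    // Hypothetical report categories, mirroring the list above.
    enum Status { EXISTS, WRONG_VERSION, NOT_EXISTS, MALFORMED }

    /** Classifies one path value ("{id}|{version}") against a map of asset id -> version. */
    static Status check(String path, Map<String, String> versionById) {
        int sep = path.indexOf('|');
        if (sep <= 0 || sep == path.length() - 1) {
            return Status.MALFORMED; // no separator, empty id, or empty version
        }
        String id = path.substring(0, sep);
        String version = path.substring(sep + 1);
        String actual = versionById.get(id);
        if (actual == null) {
            return Status.NOT_EXISTS;
        }
        // Assumption: versions are compared as plain strings.
        return actual.equals(version) ? Status.EXISTS : Status.WRONG_VERSION;
    }

    public static void main(String[] args) {
        // Index taken from the example book above.
        Map<String, String> index = new LinkedHashMap<>();
        index.put("1", "1.0");
        index.put("2", "2");
        index.put("3", "1.1");
        for (String path : new String[] {"1|1.0", "2|2.0", "3|1.1", "9|1.0"}) {
            System.out.println(path + " -> " + check(path, index));
        }
        // prints:
        // 1|1.0 -> EXISTS
        // 2|2.0 -> WRONG_VERSION
        // 3|1.1 -> EXISTS
        // 9|1.0 -> NOT_EXISTS
    }
}
```

Whether "2" and "2.0" should be treated as the same version is a decision for the real data; the string comparison here is just the simplest choice.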
Settings
- Saxon 9.6.x EE XSLT 2.0
- Java
- 100 up to x thousand single documents (combined file size: upper three-digit MB range)
How to solve
First attempt: functions collection() + document()
Search for all single asset files on the filesystem via collection(), load them into the process via document(), and search for matching hits.
Second attempt: merged, complete file
Merge all single assets into book and match via xsl:key or similar techniques.
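Whichever attempt is chosen, the lookup side needs the path value of every ref-asset in a document. A minimal Java sketch of that step follows; it uses the JDK's built-in DOM parser rather than Saxon, and the names RefAssetScanner and paths are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RefAssetScanner {
    /** Returns the path attribute of every ref-asset element, in document order. */
    static List<String> paths(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList refs = doc.getElementsByTagName("ref-asset");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < refs.getLength(); i++) {
            result.add(((Element) refs.item(i)).getAttribute("path"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Content of b.xml from the example above.
        String xml = "<asset id=\"2\" version=\"2\"><name>Table of content</name><list>"
                + "<item><ref-asset path=\"1|1.0\">Prolog</ref-asset></item>"
                + "<item><ref-asset path=\"2|2.0\">Table of content</ref-asset></item>"
                + "<item><ref-asset path=\"3|1.1\">FooBar</ref-asset></item>"
                + "</list></asset>";
        System.out.println(paths(xml)); // prints [1|1.0, 2|2.0, 3|1.1]
    }
}
```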
Question(s)
- Is collection() capable of loading thousands of documents and still performing well with a subsequent document() call to process each asset?
- How to "index" documents loaded at run time [still via xsl:key?] to search efficiently?
Further hints are highly appreciated. No specific stylesheet is needed [I will write it on my own as soon as I know which way to go].
EDIT: collection() already returns a sequence of document nodes, so document() might be unnecessary.
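On the indexing question, one possible Java-side stand-in for xsl:key is a hash map built once over all asset files. The sketch below is an assumption-laden illustration, not Saxon's collection() or key mechanism: AssetIndex and build are hypothetical names, every file is assumed to have an asset root element with id and version attributes as in the example, and only the *.xml files directly inside one directory are scanned.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

class AssetIndex {
    /** Maps asset id -> version for every *.xml file directly inside dir. */
    static Map<String, String> build(Path dir) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Map<String, String> versionById = new HashMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.xml")) {
            for (Path file : files) {
                Element root = builder.parse(file.toFile()).getDocumentElement();
                // Assumption: the root element of each file is the asset itself.
                if ("asset".equals(root.getTagName())) {
                    versionById.put(root.getAttribute("id"), root.getAttribute("version"));
                }
            }
        }
        return versionById;
    }

    public static void main(String[] args) throws Exception {
        // Demo data written to a temporary directory.
        Path dir = Files.createTempDirectory("assets");
        Files.write(dir.resolve("a.xml"),
                "<asset id=\"1\" version=\"1.0\"><name>Prolog</name></asset>"
                        .getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("c.xml"),
                "<asset id=\"3\" version=\"1.1\"><name>FooBar</name></asset>"
                        .getBytes(StandardCharsets.UTF_8));
        Map<String, String> index = build(dir);
        System.out.println("asset 1 has version " + index.get("1"));
        System.out.println("asset 2 indexed: " + index.containsKey("2"));
    }
}
```

The map keeps only the id and version of each asset, not the parsed trees, so its size grows with the number of assets rather than with the combined file size. Each file is still parsed in full once.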
Solution 1:[1]
I have written an npm package to resolve references in XML; I hope it serves your purpose: https://www.npmjs.com/package/xml-path-resolver. The package takes the XML and returns JSON with the resolved paths.
CODE USAGE
const xmlPathResolver = require("xml-path-resolver");
const xmlString = `
<?xml version="1.0" encoding="utf-8"?>
<note id="1212" importance="high" logged="true" x_note="23">
<title>Happy</title>
<todo>Work</todo>
<todo>Play</todo>
</note>
<note id="23" importance="high" logged="true">
</note>
<note importance="high" logged="true">
</note>
<person x_note="1212">
</person>
`;
const resolvedJSON = xmlPathResolver(xmlString,{ crossReference: /x_(.*)/ });
Example:
<?xml version="1.0" encoding="utf-8"?>
<note id="1212" importance="high" logged="true" x_note="23">
  <title>Happy</title>
  <todo>Work</todo>
  <todo>Play</todo>
</note>
<note id="23" importance="high" logged="true">
</note>
<note importance="high" logged="true">
</note>
<person x_note="1212">
</person>
The above XML contains cross-reference paths. The resolved JSON output is:
{
  "_declaration": {
    "_attributes": {
      "version": "1.0",
      "encoding": "utf-8"
    }
  },
  "note": [
    {
      "_attributes": {
        "id": "1212",
        "importance": "high",
        "logged": "true",
        "x_note": {
          "_attributes": {
            "id": "23",
            "importance": "high",
            "logged": "true"
          }
        }
      },
      "title": {
        "_text": "Happy"
      },
      "todo": [
        {
          "_text": "Work"
        },
        {
          "_text": "Play"
        }
      ]
    },
    {
      "_attributes": {
        "id": "23",
        "importance": "high",
        "logged": "true"
      }
    },
    {
      "_attributes": {
        "importance": "high",
        "logged": "true"
      }
    }
  ],
  "person": {
    "_attributes": {
      "x_note": {
        "_attributes": {
          "id": "1212",
          "importance": "high",
          "logged": "true",
          "x_note": {
            "_attributes": {
              "id": "23",
              "importance": "high",
              "logged": "true"
            }
          }
        },
        "title": {
          "_text": "Happy"
        },
        "todo": [
          {
            "_text": "Work"
          },
          {
            "_text": "Play"
          }
        ]
      }
    }
  }
}
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
