How to identify common values in two collections in MarkLogic
I have two kinds of XML documents, stored in two different collections. The first looks like this:
    <Person>
      <Name>A</Name>
      <Age>23</Age>
    </Person>
The other looks like this:
    <Employee>
      <EName>A</EName>
      <ECompany>abc</ECompany>
    </Employee>
These are sample documents. The two kinds of documents are in two different collections named person and employee. I want to identify the Name elements from documents in the person collection whose values match an EName element from documents in the employee collection.
Solution 1:[1]
One way to do this would be to search for the values from the person collection, and use them as criteria to search the employee collection:
    let $person-names := cts:search(/Person/Name, cts:collection-query("person"))
    return
      cts:search(/Employee/EName,
        cts:and-query((
          cts:collection-query("employee"),
          cts:element-value-query(xs:QName("EName"), $person-names)
        ))
      )/text()
However, that may not scale if the collections are large, because every Name value from the person collection is passed into a single query.
You could use CoRB and break the work into two phases. Run one job that returns all of the Name values from the docs in the person collection and writes them out to a file. Then use that file as the input for a second job that searches the employee collection for each EName value and returns the matching values, which are written out to the final report. It is the same concept as above, just broken out into lots of little queries.
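As a rough sketch of the second phase, the process module might look something like the following. It assumes the job is configured so that each line of the phase-one file (one Name value per line) is handed to the module as the external variable $URI, which is how CoRB normally passes each input item. The module itself is hypothetical, and the CoRB job options that wire up the input file and the export file are not shown.

```xquery
xquery version "1.0-ml";

(: Hypothetical phase-two process module; names and layout are assumptions.
   CoRB binds each input item to $URI. In this job the "URI" is really
   one Name value read from the file produced by the first job. :)
declare variable $URI as xs:string external;

let $query :=
  cts:and-query((
    cts:collection-query("employee"),
    cts:element-value-query(xs:QName("EName"), $URI)
  ))
return
  (: Emit the name only if at least one employee document matches it. :)
  if (xdmp:exists(cts:search(fn:doc(), $query)))
  then $URI
  else ()
```

Each invocation handles a single value, so no one query has to carry the full list of person names.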
If you had a range index on both the Name and EName elements, then you could use cts:element-values(), return the values as maps, and find the intersection of the two maps:
    let $person-names := cts:element-values(xs:QName("Name"), "", "map", cts:collection-query("person"))
    let $employee-names := cts:element-values(xs:QName("EName"), "", "map", cts:collection-query("employee"))
    return $person-names * $employee-names
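The intersection is itself a map, so to get a plain list of the common names you would still need to pull out its keys. A minimal sketch, assuming string range indexes exist on both Name and EName and that they use the same collation (the variable $common is just an illustrative name):

```xquery
xquery version "1.0-ml";

(: Assumes element range indexes on Name and EName with matching collations. :)
let $person-names :=
  cts:element-values(xs:QName("Name"), "", "map", cts:collection-query("person"))
let $employee-names :=
  cts:element-values(xs:QName("EName"), "", "map", cts:collection-query("employee"))

(: The * operator on two maps keeps only the entries present in both. :)
let $common := $person-names * $employee-names

(: Return the shared names as a sorted sequence of strings. :)
for $name in map:keys($common)
order by $name
return $name
```

Because both lookups are answered from the range indexes, this avoids fetching the documents themselves.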
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
