'Group Child Nodes and Merge Child Data

I’m trying to transform following XML using XSLT 3.0. Big Thanks to Martin Honnen for getting so far.

Can you please help me fix it please? Thank you.

Note: I can use XSLT 2.0 as well.

Input XML

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <Row>
        <EmployeeID>88888</EmployeeID>
        <Remote/>
        <StartDate>2014-10-19</StartDate>
        <EndDate>2021-10-08</EndDate>
        <Amount>100</Amount>
        <BasePay>760</BasePay>
        <Bonus>340</Bonus>
    </Row>
    <Row>
        <EmployeeID>12345</EmployeeID>
        <Remote/>
        <StartDate>2017-10-10</StartDate>
        <EndDate>2018-10-10</EndDate>
        <Amount>250</Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>
    <Row>
        <EmployeeID>23456</EmployeeID>
        <Remote>N</Remote>
        <StartDate/>
        <EndDate></EndDate>
        <Amount></Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>
    <Row>
        <EmployeeID>23456</EmployeeID>
        <Remote>N</Remote>
        <StartDate/>
        <EndDate>2020-10-10</EndDate>
        <Amount>2400</Amount>
        <BasePay>1500</BasePay>
        <Bonus/>
    </Row>
    <Row>
        <EmployeeID>23456</EmployeeID>
        <Remote>N</Remote>
        <StartDate>2018-10-24</StartDate>
        <EndDate>2020-10-10</EndDate>
        <Amount>2400</Amount>
        <BasePay>1500</BasePay>
        <Bonus/>
    </Row>
    <Row>
        <EmployeeID>12345</EmployeeID>
        <Remote/>
        <StartDate>2017-10-10</StartDate>
        <EndDate>2018-10-10</EndDate>
        <Amount>350</Amount>
        <BasePay>1600</BasePay>
        <Bonus>170</Bonus>
    </Row>
    <Row>
        <EmployeeID>65432</EmployeeID>
        <Remote>Y</Remote>
        <StartDate/>
        <EndDate></EndDate>
        <Amount></Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>
    <Row>
        <EmployeeID>65432</EmployeeID>
        <Remote>Y</Remote>
        <StartDate>2021-10-25</StartDate>
        <EndDate>2020-10-10</EndDate>
        <Amount>8400</Amount>
        <BasePay>1500</BasePay>
        <Bonus/>
    </Row>
    <Row>
        <EmployeeID>23456</EmployeeID>
        <Remote>N</Remote>
        <StartDate/>
        <EndDate></EndDate>
        <Amount></Amount>
        <BasePay>9200</BasePay>
        <Bonus>460</Bonus>
    </Row>
    <Row>
        <EmployeeID>23456</EmployeeID>
        <Remote>N</Remote>
        <StartDate/>
        <EndDate>2022-01-01</EndDate>
        <Amount>7300</Amount>
        <BasePay>9200</BasePay>
        <Bonus/>
    </Row>
    <Row>
        <EmployeeID>23456</EmployeeID>
        <Remote>N</Remote>
        <StartDate>2019-09-22</StartDate>
        <EndDate>2022-01-01</EndDate>
        <Amount>7300</Amount>
        <BasePay>9200</BasePay>
        <Bonus/>
    </Row>
</Root>

There are two types of records in this input XML

  1. element with a value set to Y or N
  2. element is empty

<EmployeeID> with same value can appear multiple times in the input file.

If element <Remote> with a value set to Y or N for an <EmployeeID>, it’s possible that same <EmployeeID> appears multiple times in input file.

Value of <Remote> will either be Y or N for one <EmployeeID>. However, element <Amount> can be different for one <EmployeeID> who appears multiple times on the record e.g. 23456

Result XML will need to be grouped to have single <Row> for one <EmployeeID> and merge the child data of same <EmployeeID> who has same value for element <Amount>

For e.g. 23456

Output XML should have two <EmployeeID> 23456 under <Row> because value of <Amount> is different even though <EmployeeID> is same. First occurrence of <EmployeeID> should have value of <Amount> 2400 and second occurrence of <EmployeeID> 23456 should have value of <Amount> 7300 – This part is partially working because result XML returns only one occurrence of <EmployeeID> 23456

For e.g. 12345

Output XML should have two <EmployeeID> 12345 under <Row> because value of <Amount> is different even though <EmployeeID> is same. First occurrence of <EmployeeID> 12345 should have value of 250 for <Amount> and second occurrence of <EmployeeID> 12345 should have value of 350 for – This part is already working with the attempted XSLT given below. No change is needed.

Expected Output

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <Row>
        <EmployeeID>88888</EmployeeID>
        <Remote/>
        <StartDate>2014-10-19</StartDate>
        <EndDate>2021-10-08</EndDate>
        <Amount>100</Amount>
        <BasePay>760</BasePay>
        <Bonus>340</Bonus>
    </Row>
    <Row>
        <EmployeeID>12345</EmployeeID>
        <Remote/>
        <StartDate>2017-10-10</StartDate>
        <EndDate>2018-10-10</EndDate>
        <Amount>250</Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>
    <Row>
        <EmployeeID>23456</EmployeeID>
        <Remote>N</Remote>
        <StartDate>2018-10-24</StartDate>
        <EndDate>2020-10-10</EndDate>
        <Amount>2400</Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>
    <Row>
        <EmployeeID>12345</EmployeeID>
        <Remote/>
        <StartDate>2017-10-10</StartDate>
        <EndDate>2018-10-10</EndDate>
        <Amount>350</Amount>
        <BasePay>1600</BasePay>
        <Bonus>170</Bonus>
    </Row>
    <Row>
        <EmployeeID>65432</EmployeeID>
        <Remote>Y</Remote>
        <StartDate>2021-10-25</StartDate>
        <EndDate>2020-10-10</EndDate>
        <Amount>8400</Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>
    <Row>
        <EmployeeID>23456</EmployeeID>
        <Remote>N</Remote>
        <StartDate>2019-09-22</StartDate>
        <EndDate>2022-01-01</EndDate>
        <Amount>7300</Amount>
        <BasePay>9200</BasePay>
        <Bonus>460</Bonus>
    </Row>
</Root>

Current XSLT

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all"
    expand-text="yes">
    
    <xsl:output method="xml" indent="yes"/>
    
    <xsl:mode on-no-match="shallow-copy"/>
    
    <xsl:key name="group" match="Root/Row" composite="yes" use="EmployeeID, Remote = ('Y', 'N')"/>
    
    <xsl:template match="Root/Row[. intersect key('group', (EmployeeID, true()))[position() > 1]]"/>
    
    <xsl:template match="Root/Row[. intersect key('group', (EmployeeID, true()))[1]]">
        <xsl:copy>
            <xsl:for-each-group select="key('group', (EmployeeID, true()))/*[node()]" group-by="node-name()">
                <xsl:apply-templates select="."/>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
    
</xsl:stylesheet>

Current Output - Second Occurrence of <EmployeeID>23456 is missing.

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <Row>
        <EmployeeID>88888</EmployeeID>
        <Remote/>
        <StartDate>2014-10-19</StartDate>
        <EndDate>2021-10-08</EndDate>
        <Amount>100</Amount>
        <BasePay>760</BasePay>
        <Bonus>340</Bonus>
    </Row>
    <Row>
        <EmployeeID>12345</EmployeeID>
        <Remote/>
        <StartDate>2017-10-10</StartDate>
        <EndDate>2018-10-10</EndDate>
        <Amount>250</Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>
    <Row>
        <EmployeeID>23456</EmployeeID>
        <Remote>N</Remote>
        <StartDate>2018-10-24</StartDate>
        <EndDate>2020-10-10</EndDate>
        <Amount>2400</Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>
    <Row>
        <EmployeeID>12345</EmployeeID>
        <Remote/>
        <StartDate>2017-10-10</StartDate>
        <EndDate>2018-10-10</EndDate>
        <Amount>350</Amount>
        <BasePay>1600</BasePay>
        <Bonus>170</Bonus>
    </Row>
    <Row>
        <EmployeeID>65432</EmployeeID>
        <Remote>Y</Remote>
        <StartDate>2021-10-25</StartDate>
        <EndDate>2020-10-10</EndDate>
        <Amount>8400</Amount>
        <BasePay>1500</BasePay>
        <Bonus>150</Bonus>
    </Row>
</Root>


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source