How can I convert this deeply nested XML file to CSV?

I have a large dictionary XML file (105 MB). Below is a sample showing how it is nested:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE lexdataset
      SYSTEM "CollexML.dtd">
    <superentry id="u583c10bfdbd326ba.31865a51.12110e76de1.-326"><entry publevel="2" id="u583c10bfdbd326ba.31865a51.12110e76de1.-325"><hwblk><hwgrp><hwunit><hw>aah</hw></hwunit></hwgrp></hwblk><datablk><gramcat publevel="2"><pospgrp><pospunit><posp value="verb"/></pospunit></pospgrp><sensecat publevel="2"><defgrp><defunit><def>exclaim in pleasure</def></defunit></defgrp></sensecat></gramcat></datablk></entry></superentry>
    <superentry><entry publevel="2"><hwblk><hwgrp><hwunit form="inflected"><hw>aahed</hw></hwunit></hwgrp></hwblk><datablk><xrefgrp><xrefunit publevel="2"><xref superentryid="u583c10bfdbd326ba.31865a51.12110e76de1.-326" xrefid="u583c10bfdbd326ba.31865a51.12110e76de1.-325"><xrhw publevel="2">aah</xrhw></xref></xrefunit></xrefgrp></datablk></entry></superentry>
    <superentry><entry publevel="2"><hwblk><hwgrp><hwunit form="inflected"><hw>aahing</hw></hwunit></hwgrp></hwblk><datablk><xrefgrp><xrefunit publevel="2"><xref superentryid="u583c10bfdbd326ba.31865a51.12110e76de1.-326" xrefid="u583c10bfdbd326ba.31865a51.12110e76de1.-325"><xrhw publevel="2">aah</xrhw></xref></xrefunit></xrefgrp></datablk></entry></superentry>
    <superentry><entry publevel="2"><hwblk><hwgrp><hwunit form="inflected"><hw>aahs</hw></hwunit></hwgrp></hwblk><datablk><xrefgrp><xrefunit publevel="2"><xref superentryid="u583c10bfdbd326ba.31865a51.12110e76de1.-326" xrefid="u583c10bfdbd326ba.31865a51.12110e76de1.-325"><xrhw publevel="2">aah</xrhw></xref></xrefunit></xrefgrp></datablk></entry></superentry>
    </lexdataset>

I find it very difficult to read, and I'm unsure how to go about extracting the data. Does anyone have any ideas?

All I'd like to extract is the word itself, its definition, and whether the word is inflected or a derivation.



Solution 1:[1]

I would recommend trying the CSV module of BaseX. For more details, see http://docs.basex.org/wiki/CSV_Module
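
If BaseX is not an option, the same extraction can be sketched in Python with only the standard library. This is a swapped-in technique, not the BaseX CSV module. It streams the file with `iterparse` so the 105 MB document is never loaded whole. The function names `entry_rows` and `write_csv` are made up for illustration. The sketch assumes that every `<entry>` holds its headword in `hwblk/hwgrp/hwunit/hw`, that definitions sit in `<def>` elements, and that the `form` attribute on `<hwunit>` is what marks a word as inflected or derived. Only the value `inflected` appears in the sample, so whatever value the attribute carries is written out unchanged.

```python
import csv
import xml.etree.ElementTree as ET


def entry_rows(source):
    """Yield (word, definition, form) for each <entry> in the XML source.

    `source` is a file name or a binary file object. Element and attribute
    names are taken from the sample in the question (an assumption about
    the rest of the file).
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "entry":
            hwunit = elem.find("./hwblk/hwgrp/hwunit")
            word = elem.findtext("./hwblk/hwgrp/hwunit/hw", default="")
            # An entry may carry several definitions; cross-reference
            # entries (aahed, aahing, ...) carry none.
            definition = "; ".join(
                "".join(d.itertext()).strip() for d in elem.iter("def")
            )
            # Empty string when the attribute is absent (a base form).
            form = hwunit.get("form", "") if hwunit is not None else ""
            yield word, definition, form
        elif elem.tag == "superentry":
            # Drop the finished subtree so memory use stays low.
            elem.clear()


def write_csv(source, out):
    """Write a header row plus one CSV row per dictionary entry to `out`."""
    writer = csv.writer(out)
    writer.writerow(["word", "definition", "form"])
    writer.writerows(entry_rows(source))
```

To use it, open the XML file in binary mode and the CSV file in text mode with `newline=""` and `encoding="utf-8"`, then pass both to `write_csv`. The parser needs a well-formed file with a single root element, so the `<lexdataset>` opening tag that the sample omits must be present in the real file.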

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Jagrut