Parsing XML with recursive function. How to return all results to calling function?

I'm trying to read out all ContentControl values from Word DOCX files. The packages I'm aware of don't support filtering ContentControls from Word files (w:sdt nodes in the XML). I can filter ContentControls myself and print the results to standard output:

# WordprocessingML namespace prefix as used in ElementTree tag names
W = '{http://schemas.openxmlformats.org/wordprocessingml/2006/main}'

def iterate_tree(element):
    # Collect the text and the tag value found below this element.
    my_text = ''
    my_tag = ''
    for child in element:
        child_text, child_tag = iterate_tree(child)
        my_text += child_text
        my_tag += child_tag
    if element.text is not None:
        my_text += element.text
    # A w:sdt node is a ContentControl; report it if a tag was found inside.
    if element.tag == W + 'sdt' and my_tag != '':
        print(my_text, my_tag)
    # A w:tag node carries the ContentControl's tag name in its w:val attribute.
    if element.tag == W + 'tag':
        my_tag = element.attrib[W + 'val']
    return my_text, my_tag

This may print something like:

2022-04-19 15:49:14 StartTimeStamp
Welch Allyn Solna 001 - 9023778 Testutrustning
Welch Allyn Solna 002 - 2620-005 Testutrustning
Welch Allyn Solna 003 - 2620-006 Testutrustning
916-052 InvNr
Click or tap here to enter text. InvNr
Click or tap here to enter text. InvNr
Click or tap here to enter text. InvNr
Click or tap here to enter text. InvNr
Click or tap here to enter text. InvNr
Click or tap here to enter text. InvNr
2022-04-19 Utförddatum
John Doe(xy12) Username
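For context, a minimal sketch of how the root element passed to iterate_tree might be obtained. It assumes the DOCX is an ordinary zip archive with the main document part at word/document.xml; the helper name load_document_root and the file name in the usage comment are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

def load_document_root(docx_file):
    """Return the root element of word/document.xml inside a DOCX file.

    docx_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        with archive.open('word/document.xml') as xml_file:
            return ET.parse(xml_file).getroot()

# Usage (hypothetical file name):
# root = load_document_root('report.docx')
# iterate_tree(root)
```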

Now I'd really like to return the result to the calling function, so that I can work with the result. I'd like to build a function "for each ContentControl with tag = InvNr, do ...", or "get value of ContentControl with tag == {something}".
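To make the goal concrete, here is a rough sketch of the two lookups described above, assuming the results are available as a list of (tag, text) pairs. The helper names values_for_tag and first_value_for_tag are hypothetical, and the sample data is taken from the printed output above.

```python
def values_for_tag(controls, wanted_tag):
    """Return the text of every ContentControl whose tag equals wanted_tag."""
    return [text for tag, text in controls if tag == wanted_tag]

def first_value_for_tag(controls, wanted_tag, default=None):
    """Return the text of the first matching ContentControl, or default."""
    return next((text for tag, text in controls if tag == wanted_tag), default)

# Sample (tag, text) pairs, copied from the printed output
controls = [
    ('StartTimeStamp', '2022-04-19 15:49:14'),
    ('InvNr', '916-052'),
    ('InvNr', 'Click or tap here to enter text.'),
    ('Username', 'John Doe(xy12)'),
]

# "for each ContentControl with tag = InvNr, do ..."
for value in values_for_tag(controls, 'InvNr'):
    print(value)

# "get value of ContentControl with tag == {something}"
print(first_value_for_tag(controls, 'Username'))
```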

I've tried replacing the print statement with yield, but that produces errors in this recursive function...

I can't think of a good way to get the results out of the recursion, probably because I lack practice with this kind of algorithm. I'd be happy to get some search terms or ideas that help me find the solution, but right now I'm desperate :)

Edit: Hi Paul, thanks for your guidance. Unfortunately, it's a recursive function that calls itself for every new level of XML it descends into, and that doesn't seem to play well with yield.

At the moment I've modified the code to hand a list up and down through the recursive calls, but it doesn't feel elegant:

# WordprocessingML namespace prefix as used in ElementTree tag names
W = '{http://schemas.openxmlformats.org/wordprocessingml/2006/main}'

def iterate_tree(element, liste):
    # Collect the text and the tag value found below this element.
    my_text = ''
    my_tag = ''
    for child in element:
        child_text, child_tag, liste = iterate_tree(child, liste)
        my_text += child_text
        my_tag += child_tag
    if element.text is not None:
        my_text += element.text
    # A w:sdt node is a ContentControl; record it if a tag was found inside.
    if element.tag == W + 'sdt' and my_tag != '':
        liste.append([my_tag, my_text])
    # A w:tag node carries the ContentControl's tag name in its w:val attribute.
    if element.tag == W + 'tag':
        my_tag = element.attrib[W + 'val']
    return my_text, my_tag, liste
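For comparison with the list-passing version, one possible way to use yield in a recursive function is to delegate to each recursive call with yield from (Python 3.3+), which forwards the child's yielded items to the caller and evaluates to the child's return value. The following is only a sketch under that assumption; the function name iter_controls and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

W = '{http://schemas.openxmlformats.org/wordprocessingml/2006/main}'

def iter_controls(element):
    """Yield (tag, text) for each w:sdt below element; return (text, tag)."""
    my_text = ''
    my_tag = ''
    for child in element:
        # Pass the child's yielded pairs upward and capture its return value.
        child_text, child_tag = yield from iter_controls(child)
        my_text += child_text
        my_tag += child_tag
    if element.text is not None:
        my_text += element.text
    if element.tag == W + 'sdt' and my_tag != '':
        yield my_tag, my_text
    if element.tag == W + 'tag':
        my_tag = element.attrib[W + 'val']
    return my_text, my_tag

# Made-up sample with two ContentControls
SAMPLE = '''<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:body>
<w:sdt><w:sdtPr><w:tag w:val="InvNr"/></w:sdtPr><w:sdtContent><w:r><w:t>916-052</w:t></w:r></w:sdtContent></w:sdt>
<w:sdt><w:sdtPr><w:tag w:val="Username"/></w:sdtPr><w:sdtContent><w:r><w:t>John Doe(xy12)</w:t></w:r></w:sdtContent></w:sdt>
</w:body></w:document>'''

for tag, text in iter_controls(ET.fromstring(SAMPLE)):
    print(text, tag)
```

The caller can loop over the generator directly or collect everything with list(iter_controls(root)); the final (text, tag) return value of the outermost call is simply discarded.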


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
