How can I remove all comments from a large XML file using Python?

I am working on a small project for my office. The purpose of this script is to take the inputs (siteid and sectorid) and open the matching file. Once the file has been opened, the script should do the following:

  1. Remove ALL comments: the opening marker (<!--), the closing marker (-->), and the text in between, as they are not necessary. There are many comment sections similar to the one below, and all of them need to be removed. For example:
<!--
*# R500
*# Create RiLink
*# RiLink=6
*# Radio Number = 6
-->
  2. Make changes to the XML file depending on the input provided (sectorid).
<Equipment xmlns="cim:ReqEquipment">
   <RiLink xmlns="cim:ReqRiLink">
        <riPortRef1>ManagedElement=LNYBA2,Equipment=1,FieldReplaceableUnit=1,RiPort=F</riPortRef1>
        <riPortRef2>ManagedElement=LNYBA2,Equipment=1,FieldReplaceableUnit=RRU-12,RiPort=DATA_1</riPortRef2>
   </RiLink>        
</Equipment>

<ENodeBFunction>
        <SectorCarrier xc:operation="create">
            <sectorCarrierId>12</sectorCarrierId>   
        </SectorCarrier>        
</ENodeBFunction>
<ENodeBFunction>
        <EUtranCellFDD xc:operation="create">
                <sectorCarrierRef>ManagedElement=LNYBA2,ENodeBFunction=1,SectorCarrier=12</sectorCarrierRef>
        </EUtranCellFDD>    
</ENodeBFunction>

Using the above XML data as an example, I first ask the user for input to ensure the proper file gets opened. If the siteid input contains 'BA2', the script expects the sectorid to be 41, 51, or 61; otherwise it expects 11, 21, or 31. I used a nested if statement to validate the inputs.
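The validation rule described above can be sketched as a small pair of functions. This is only an illustration: the helper names `expected_sectors` and `is_valid_sector` are hypothetical, and the site ID "LNYBA2" is assumed from the ManagedElement value in the sample XML.

```python
# Hypothetical helpers; the names are illustrative and not part of the original script.
FIRST_BB_SECTORS = ("11", "21", "31")
SECOND_BB_SECTORS = ("41", "51", "61")


def expected_sectors(siteid):
    """Return the sector IDs allowed for a site; 'BA2' marks the second baseband."""
    return SECOND_BB_SECTORS if "BA2" in siteid else FIRST_BB_SECTORS


def is_valid_sector(siteid, sectorid):
    """True when sectorid belongs to the baseband implied by siteid."""
    return sectorid in expected_sectors(siteid)


# "LNYBA2" is an assumed site ID taken from the sample XML.
print(is_valid_sector("LNYBA2", "61"))
print(is_valid_sector("LNYBA2", "11"))
```

Keeping the rule in one function means the rest of the script can stop early on an invalid sector instead of only printing a message.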

The other changes I wish to make to the XML are more involved. If the user enters '61' for the sector ID, I want the script to change the following areas:

BEFORE

<riPortRef1>ManagedElement=LNYBA2,Equipment=1,FieldReplaceableUnit=1,RiPort=F</riPortRef1>
<riPortRef1>ManagedElement=LNYBA2,Equipment=1,FieldReplaceableUnit=1,RiPort=J</riPortRef1>
<riPortRef1>ManagedElement=LNYBA2,Equipment=1,FieldReplaceableUnit=1,RiPort=L</riPortRef1>

<sectorCarrierId>6</sectorCarrierId>
<sectorCarrierId>12</sectorCarrierId>
<sectorCarrierId>18</sectorCarrierId>
<sectorCarrierId>24</sectorCarrierId>
<sectorCarrierId>30</sectorCarrierId>

<sectorCarrierRef>ManagedElement=LNYBA2,ENodeBFunction=1,SectorCarrier=6</sectorCarrierRef>
<sectorCarrierRef>ManagedElement=LNYBA2,ENodeBFunction=1,SectorCarrier=12</sectorCarrierRef>
<sectorCarrierRef>ManagedElement=LNYBA2,ENodeBFunction=1,SectorCarrier=18</sectorCarrierRef>
<sectorCarrierRef>ManagedElement=LNYBA2,ENodeBFunction=1,SectorCarrier=24</sectorCarrierRef>
<sectorCarrierRef>ManagedElement=LNYBA2,ENodeBFunction=1,SectorCarrier=30</sectorCarrierRef>

AFTER

<riPortRef1>ManagedElement=LNYBA2,Equipment=1,FieldReplaceableUnit=1,RiPort=F</riPortRef1>
<riPortRef1>ManagedElement=LNYBA2,Equipment=1,FieldReplaceableUnit=RRU-6,RiPort=DATA_2</riPortRef1>
<riPortRef1>ManagedElement=LNYBA2,Equipment=1,FieldReplaceableUnit=RRU-12,RiPort=DATA_2</riPortRef1>

<sectorCarrierId>6</sectorCarrierId>
<sectorCarrierId>16</sectorCarrierId>
<sectorCarrierId>56</sectorCarrierId>
<sectorCarrierId>156</sectorCarrierId>
<sectorCarrierId>76</sectorCarrierId>

<sectorCarrierRef>ManagedElement=LNYBA2,ENodeBFunction=1,SectorCarrier=6</sectorCarrierRef>
<sectorCarrierRef>ManagedElement=LNYBA2,ENodeBFunction=1,SectorCarrier=16</sectorCarrierRef>
<sectorCarrierRef>ManagedElement=LNYBA2,ENodeBFunction=1,SectorCarrier=56</sectorCarrierRef>
<sectorCarrierRef>ManagedElement=LNYBA2,ENodeBFunction=1,SectorCarrier=156</sectorCarrierRef>
<sectorCarrierRef>ManagedElement=LNYBA2,ENodeBFunction=1,SectorCarrier=76</sectorCarrierRef>
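One possible way to apply the sectorCarrierId and sectorCarrierRef changes shown above is a lookup table from old ID to new ID. The sketch below assumes the old IDs map to the new ones by position, which is only inferred from the BEFORE/AFTER lists. It does not attempt the riPortRef1 changes. The function name `renumber_carriers` is hypothetical, and the sample omits namespaces; in a namespaced file the tag names would need the `{namespace}tag` form. The standard library's xml.etree.ElementTree is used only so the snippet runs without extra installs; lxml elements offer the same `iter()` and `.text` interface.

```python
import xml.etree.ElementTree as ET

# Assumption: old IDs map to new IDs by position, as the BEFORE/AFTER lists suggest.
OLD_IDS = ["6", "12", "18", "24", "30"]
NEW_IDS = ["6", "16", "56", "156", "76"]  # values shown for sector "61"
ID_MAP = dict(zip(OLD_IDS, NEW_IDS))


def renumber_carriers(root):
    """Rewrite sectorCarrierId text and the SectorCarrier= suffix of sectorCarrierRef."""
    for elem in root.iter("sectorCarrierId"):
        text = (elem.text or "").strip()
        elem.text = ID_MAP.get(text, text)  # unknown IDs are left unchanged
    for elem in root.iter("sectorCarrierRef"):
        prefix, sep, old = (elem.text or "").rpartition("SectorCarrier=")
        if sep:  # only touch references that actually contain the key
            elem.text = prefix + sep + ID_MAP.get(old.strip(), old)


sample = ET.fromstring(
    "<ENodeBFunction>"
    "<sectorCarrierId>12</sectorCarrierId>"
    "<sectorCarrierRef>ManagedElement=LNYBA2,ENodeBFunction=1,SectorCarrier=12</sectorCarrierRef>"
    "</ENodeBFunction>"
)
renumber_carriers(sample)
print(ET.tostring(sample, encoding="unicode"))
```

Editing the parsed tree rather than the raw text keeps the change limited to the two element types, so a "12" that appears elsewhere in the file is not touched.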

Here is the script I have written so far. Please help if you have an idea of what I can do.

from lxml import etree

#Ask for inputs
siteid = input("Please enter the site ID: ")
sectorid = input("Please enter the sector: ")

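#opens the file using the site ID
#"rb" reads the file; "w" would truncate it to empty before parsing
n04file = open("N04_" + str(siteid) + "_RadioScript.xml", "rb")

#parsing for comment sections
#etree.parse() accepts an open file; etree.fromstring() only accepts a string or bytes
tree = etree.parse(n04file)
comments = tree.xpath('//comment()')

for c in comments:
    p = c.getparent()
    #comments outside the root element have no parent, so skip them
    if p is not None:
        p.remove(c)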

#print results
#print(etree.tostring(tree))

def main() :

    #CREATES/OPENS FILE
    #outputlog = open('output.txt', 'a')
    #w for WRITE
    #a for APPEND
    #r for READ

    #SETTING PARAMETERS
    secondbb = "BA2"
    firstbbflag = 0
    secondbbflag = 0
    firstalphasector = "11"
    firstbetasector = "21"
    firstgammasector = "31"
    secondalphasector = "41"
    secondbetasector = "51"
    secondgammasector = "61"
    firstbbsector = [firstalphasector, firstbetasector, firstgammasector]
    secondbbsector = [secondalphasector,secondbetasector,secondgammasector]
    rilink1 = "1,RiPort=A"
    rilink2 = "1,RiPort=B"
    rilink3 = "1,RiPort=C"
    rilink4 = "1,RiPort=D"
    rilink5 = "1,RiPort=E"
    rilink6 = "1,RiPort=F"
    rilink7 = "RRU-1,RiPort=DATA_2"
    rilink8 = "RRU-2,RiPort=DATA_2"
    rilink9 = "RRU-3,RiPort=DATA_2"
    rilink10 = "RRU-4,RiPort=DATA_2"
    rilink11 = "RRU-5,RiPort=DATA_2"
    rilink12 = "RRU-6,RiPort=DATA_2"
    rilink511 = "RRU-7,RiPort=DATA_2"
    rilink521 = "RRU-8,RiPort=DATA_2"
    rilink531 = "RRU-9,RiPort=DATA_2"
    rilink541 = "RRU-10,RiPort=DATA_2"
    rilink551 = "RRU-11,RiPort=DATA_2"
    rilink561 = "RRU-12,RiPort=DATA_2"
    firstalphacell = ["1", "11", "51", "151", "71"]
    firstbetacell = ["2", "12", "52", "152", "72"]
    firstgammacell = ["3", "13", "53", "153", "73"]
    secondalphacell = ["4", "14", "54", "154", "74"]
    secondbetacell = ["5", "15", "55", "155", "75"]
    secondgammacell = ["6", "16", "56", "156", "76"]

    #CONFIRM SITEID AND SECTORID INPUTS BELONG
    if secondbb in siteid:
        secondbbflag += 1
        if sectorid in secondbbsector:
            print("Sector " + sectorid + " is in the second baseband.\n")
        else:
            print("Sector " + sectorid + " is not a valid selection.\n")
    else:
        firstbbflag += 1
        if sectorid in firstbbsector:
            print("Sector " + sectorid + " is in the first baseband.\n")
        else:
            print("Sector " + sectorid + " is not a valid selection.\n")

    print("Flags" + "\n" + "FirstBB: " + str(firstbbflag) + "\n" + "SecondBB: " + str(secondbbflag))
n04file.close()
main()
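As an alternative to removing comment nodes one by one, lxml's XMLParser accepts a remove_comments option that discards comments while parsing. The following is a minimal sketch, assuming lxml is installed; the function name `strip_comments` and the sample document are made up for illustration.

```python
from lxml import etree


def strip_comments(xml_bytes):
    """Parse XML with comments discarded and return the serialized result as bytes."""
    parser = etree.XMLParser(remove_comments=True)
    root = etree.fromstring(xml_bytes, parser)
    return etree.tostring(root)


# Made-up sample modeled on the comment block in the question.
sample = b"""<ENodeBFunction>
<!--
*# R500
*# Create RiLink
-->
<sectorCarrierId>12</sectorCarrierId>
</ENodeBFunction>"""

cleaned = strip_comments(sample)
print(cleaned.decode())
```

For a file on disk, the same parser can be passed as the second argument to etree.parse(), and the resulting tree can be saved with its write() method.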


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
