Python BS4 append extraSoup after a TEXT

I have a file that I need to modify with Python. The file contains text that is not enclosed in any tag. It looks like this:

===TEXT===

I need to append data after this text. I have a Python script that uses BS4:

for index, row in df.iterrows():
        with open("template.html", 'r') as inf:
            html = inf.read()
            soup = bs4.BeautifulSoup(html, features="html.parser")

            extraSoup = bs4.BeautifulSoup("""
            {{transclude name="link" }}
            """, "html.parser")

            for container in soup.find(text="===TEXT==="):
                container.append(extraSoup)    
        prettyHTML = soup.prettify()  
        with open("template.html", 'w') as outf:
            outf.write(str(prettyHTML))

I am trying to find the text and append extraSoup after it, but I get this error:

AttributeError: 'NoneType' object has no attribute

Is it even possible to do this with BS4? I think BS4 only works with tags? Is there any way to achieve this?



Solution 1:[1]

I think BS4 only works with tags?

It's not clear where you got this idea. BS4, like any spec-compliant HTML parser, parses text that is not enclosed in a tag into a text node. Use find_all() to collect the text nodes, check each one for your templating constant (===TEXT===), and then call insert_after() on the matching node to insert your content into the DOM right after it.
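As a quick check of that claim, here is a minimal sketch that parses a small fragment and lists the node types at the top level. The fragment and the variable names are made up for illustration, and it assumes a reasonably recent bs4 (the string= argument needs 4.4.0 or later). It also shows why an exact-match search can come back empty: the text node includes the surrounding newlines.

```python
import bs4

# Hypothetical fragment: the marker sits between two tags, inside no tag.
html = '<span id="a">x</span>\n===TEXT===\n<span id="b">y</span>'
soup = bs4.BeautifulSoup(html, "html.parser")

# The bare text shows up as its own node between the two tags.
kinds = [type(node).__name__ for node in soup.children]
print(kinds)

# An exact string match compares against the whole node, newlines included,
# so it finds nothing here.
exact = soup.find(string="===TEXT===")
print(exact)

# A callable matcher can strip whitespace before comparing.
loose = soup.find(string=lambda s: s.strip() == "===TEXT===")
print(repr(loose))
```

The callable form is one option; comparing the stripped text inside a loop over find_all(string=True) does the same job.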

Test HTML:

<span id="test1">
 test
</span>
===TEXT===
<span id="test2">
 test
</span>

Updated Python script:

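import bs4

with open("template.html", "r") as inf:
  soup = bs4.BeautifulSoup(inf.read(), features="html.parser")

# Walk every text node; text outside any tag is a NavigableString.
for container in soup.find_all(string=True):
  # Strip first: the node usually carries surrounding whitespace.
  if container.strip() == "===TEXT===":
    # Insert the new content as a sibling right after the marker.
    container.insert_after('{{transclude name="link" }}')

# Write to a separate file so the template itself is not overwritten.
with open("template2.html", "w") as outf:
  outf.write(soup.prettify())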

Output:

<span id="test1">
 test
</span>
===TEXT===
{{transclude name="link" }}
<span id="test2">
 test
</span>
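
If you would rather not have prettify() re-indent the whole document, the same idea can be sketched as an in-memory helper that returns str(soup) instead. The function name insert_after_marker and its parameters are hypothetical, not part of BS4; only find_all() and insert_after() are library calls.

```python
import bs4

def insert_after_marker(html, marker, payload):
    """Return html with payload inserted after each bare-text marker."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    # find_all returns a list, so inserting nodes while looping is safe.
    for node in soup.find_all(string=True):
        if node.strip() == marker:
            node.insert_after(payload)
    # str(soup) serializes without the re-indentation prettify() applies.
    return str(soup)

result = insert_after_marker(
    '<span id="test1">test</span>\n===TEXT===\n<span id="test2">test</span>',
    "===TEXT===",
    '{{transclude name="link" }}',
)
print(result)
```

Working on a string rather than a file also makes it easy to call the helper once per DataFrame row without re-reading and re-writing the template each time.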


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 esqew