Python BS4 append extraSoup after a TEXT
I have a file that I need to modify with Python. The file contains some text that is not enclosed in any tag. It looks like this:
===TEXT===
I need to append data after this text. I have a Python script that uses BS4:
for index, row in df.iterrows():
    with open("template.html", 'r') as inf:
        html = inf.read()
    soup = bs4.BeautifulSoup(html, features="html.parser")
    extraSoup = bs4.BeautifulSoup("""
{{transclude name="link" }}
""", "html.parser")
    for container in soup.find(text="===TEXT==="):
        container.append(extraSoup)
    prettyHTML = soup.prettify()
    with open("template.html", 'w') as outf:
        outf.write(str(prettyHTML))
I am trying to find the text and append extraSoup after it, but I get this error:
AttributeError: 'NoneType' object has no attribute
Is it even possible to do this with BS4? I think BS4 only works with tags. Is there any way to achieve this?
Solution 1:[1]
I think BS4 only works with tags?
It's not clear where you got this idea. BS4, like any spec-compliant HTML parser, will parse otherwise unbound text into a text node. Use find_all() to iterate over the text nodes, check each node's contents for your templating constant (===TEXT===), and then use insert_after() to insert the content you're after into the DOM.
Test HTML:
<span id="test1">
test
</span>
===TEXT===
<span id="test2">
test
</span>
Updated Python script:
import bs4

# Read the template from disk.
with open("template.html", 'r') as inf:
    html = inf.read()
soup = bs4.BeautifulSoup(html, features="html.parser")

# Walk every text node, including text that sits outside any tag.
for container in soup.find_all(string=True):
    print(container.text)
    # Strip surrounding whitespace before comparing against the marker.
    if container.text.strip() == "===TEXT===":
        container.insert_after('{{transclude name="link" }}')

prettyHTML = soup.prettify()
with open("template2.html", 'w') as outf:
    outf.write(str(prettyHTML))
Output:
<span id="test1">
test
</span>
===TEXT===
{{transclude name="link" }}
<span id="test2">
test
</span>
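As a side note, here is a rough sketch of why the lookup in the question probably came back as None, and one alternative way to locate the marker. It assumes a BS4 version recent enough to accept the string= argument, and it uses an inline HTML string (a stand-in for template.html, not the asker's real file). Passing a plain string asks for an exact match against the whole text node, and the node here also contains the surrounding newlines. A compiled regex is applied with search(), so it tolerates them.

```python
import re

import bs4

# Inline stand-in for template.html (an assumption for this sketch).
html = '<span id="test1">test</span>\n===TEXT===\n<span id="test2">test</span>'
soup = bs4.BeautifulSoup(html, "html.parser")

# Exact match: the text node is "\n===TEXT===\n", so this finds nothing.
exact = soup.find(string="===TEXT===")

# Regex match: search() ignores the surrounding whitespace.
node = soup.find(string=re.compile(r"===TEXT==="))

if node is not None:
    # Insert a new text node right after the marker.
    node.insert_after('{{transclude name="link" }}')

result = str(soup)
print(result)
```

Checking for None before calling anything on the result avoids the AttributeError, since find() returns None when nothing matches.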
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | esqew |