Ignoring data using the ttp module in Python
I will explain the problem I faced using the following sample. I can parse the data below with the template below. The {{ignore}} command lets a line match the template while discarding the part of the line that I don't want to keep.
from ttp import ttp
import json
data_to_parse = """
1.peace in the world
2.peace in the world world
3.peace in the world world world
"""
To parse this data I can use the following template.
ttp_template = """
<group name="Quote">
{{peace}} in the {{world}}
</group>
<group name="Quote">
{{peace}} in the {{world}} {{ignore}}
</group>
<group name="Quote">
{{peace}} in the {{world}} {{ignore}} {{ignore}}
</group>
"""
With the following code, I get the parsed data I want:
def parser(data_to_parse):
    parser = ttp(data=data_to_parse, template=ttp_template)
    parser.parse()
    # get the result in JSON format
    results = parser.result(format='json')[0]
    # print(results)
    # convert the JSON string to Python objects
    result = json.loads(results)
    print(result)

parser(data_to_parse)
See the output I have:
The problem is that I cannot predict how many "world" words appear at the end of each line, and I don't want to keep adding {{ignore}} commands to match the line and discard the words I don't need. For example, if I add the following line to my data, the template above will not catch it; I would need to add one more {{ignore}} to capture it.
4.peace in the world world world world
As far as I understand, the reason is that ttp splits the line into words at each space. For example, if the words were joined by _ instead of a space, as in 3.peace in the world_world_world, I could capture the line with a single template line. However, my data contains lines with spaces, and I need to capture those lines as well.
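The whitespace behaviour described above can be illustrated with plain regular expressions. This is only a sketch using Python's standard `re` module, not ttp itself. It assumes that each template variable behaves like `\S+` (one run of non-space characters). The names `STRICT`, `TOLERANT` and `parse_lines` are made up for this illustration.

```python
import re

data_to_parse = """
1.peace in the world
2.peace in the world world
3.peace in the world world world
4.peace in the world world world world
"""

# Assumption: every variable matches \S+ (a single space-free word), so the
# line must contain exactly as many words as the pattern spells out.
STRICT = re.compile(r"^(?P<peace>\S+) in the (?P<world>\S+)$")

# Same pattern plus an optional, uncaptured tail: any number of extra
# space-separated words may follow and are simply discarded.
TOLERANT = re.compile(r"^(?P<peace>\S+) in the (?P<world>\S+)(?: \S+)*$")


def parse_lines(pattern, text):
    """Return one dict of named groups for every line the pattern matches."""
    matches = (pattern.match(line.strip()) for line in text.splitlines())
    return [m.groupdict() for m in matches if m]


strict_result = parse_lines(STRICT, data_to_parse)
tolerant_result = parse_lines(TOLERANT, data_to_parse)

print(len(strict_result))    # only the line with a single "world" matches
print(len(tolerant_result))  # every line matches, however many words trail
print(tolerant_result)
```

The point of the sketch is that a single pattern with a repeatable trailing part removes the need for one template per word count. Within ttp, the documented `ORPHRASE` pattern, or `ignore` given a regex argument, may play a similar role, but that is untested here.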
So the question is: is there a way to simplify this? As you can see, I have a workaround, but I would like a simpler way to resolve the issue. I would highly appreciate any advice.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow

