How to identify and print a pattern inside an ASCII file in Python 2?
I am trying to develop a program that can read patterns from a txt file using Python 2.x. This pattern is supposed to be a bug:
| |
###O
| |
The pattern does not include the surrounding whitespace.
So far I have found a way to open the txt file, read it, and process its contents, but I can't think of a way to make Python treat this pattern as a single unit instead of counting each character. I tried regular expressions, but the output just echoed the matches, like this:
| |
###O
| |
| |
###O
| |
| |
###O
| |
What I want instead is just the number of times the pattern was detected in the file, for example:
There were 3 occurrences.
Update: So far I have this:
file = open('bug.txt', 'r')
data = file.read() #read content from file to a string
occurrences = data.count('| |\n\'###O\'\n| |\n')
print('Number of occurrences of the pattern:', occurrences)
But this is not working. The file contains the pattern 3 times with whitespace in between, but the whitespace is not part of the pattern. When I paste the pattern from the file it breaks across lines, and if I flatten the pattern to | | ###O | | it shows 0 occurrences, because that is not really the pattern.
Solution 1:[1]
It depends on how you store your ASCII data, but if you convert it to a string you can use Python's str.count() method.
For example:
# define string
ascii_string = "| | ###O | | | | ###O | | | | ###O | |"
pattern = "| | ###O | |"
count = ascii_string.count(pattern)
# print count in python 3+
print("Occurrences:", count)
# print count in python 2.7
print "Occurrences:", count
This will result in:
Occurrences: 3
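The example above works because the string is already on a single line with single spaces. A minimal sketch of how the same idea might be applied to multi-line file content, assuming every run of whitespace (spaces, tabs, newlines) can be treated as one space; the function name count_normalized and the sample string are illustrative only, not part of the original answer:

```python
def count_normalized(text, pattern):
    # Collapse every run of whitespace (spaces, tabs, newlines) into a
    # single space so that line breaks no longer affect the comparison.
    normalized_text = ' '.join(text.split())
    normalized_pattern = ' '.join(pattern.split())
    # str.count() returns the number of non-overlapping occurrences.
    return normalized_text.count(normalized_pattern)


# Hypothetical file content: three bugs separated by blank lines.
sample = "| |\n###O\n| |\n\n| |\n###O\n| |\n\n| |\n###O\n| |\n"
print(count_normalized(sample, "| |\n###O\n| |"))  # prints 3
```

To use it on a real file, the text argument would be the result of file.read(). Note that this treats the file as one long line, so it cannot tell whether the three rows of the pattern were actually stacked on top of each other.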
Solution 2:[2]
>>> import re
>>> data = '''| |
... ###O
... | |
... | |
... ###O
... | |
... | |
... ###O
... | |'''
>>> result = re.findall(r'[ ]*\| \|\n[ ]*###O\n[ ]*\| \|', data)
>>> len(result)
3
>>>
The length of result is the number of occurrences.
How to do it from a file:
import re

with open('some file.txt') as fd:
    data = fd.read()

# Raw string so the backslashes reach the regex engine unchanged.
result = re.findall(r'[ ]*\| \|\n[ ]*###O\n[ ]*\| \|', data)
print(len(result))
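The regular expression above is written out by hand for one specific pattern. As a rough sketch, the same expression could be generated from the pattern's lines, tolerating spaces or tabs around each row and either Unix or Windows line endings. The names build_pattern_regex, count_pattern and BUG are hypothetical helpers introduced here, and the sketch assumes each row of the pattern sits alone on its own line in the file:

```python
import re


def build_pattern_regex(pattern_lines):
    # Escape each row so characters such as '|' are matched literally,
    # and allow optional spaces or tabs before and after the row.
    parts = [r'[ \t]*' + re.escape(line.strip()) + r'[ \t]*'
             for line in pattern_lines]
    # Consecutive rows must be separated by exactly one line break.
    return re.compile(r'\r?\n'.join(parts))


def count_pattern(text, pattern_lines):
    # findall() returns every non-overlapping match; we only need how many.
    return len(build_pattern_regex(pattern_lines).findall(text))


BUG = ['| |', '###O', '| |']
sample = "| |\n###O\n| |\n\n   | |\n   ###O\n   | |\n| |\n###O\n| |\n"
print(count_pattern(sample, BUG))  # prints 3
```

With a file, the text argument would be fd.read() as in the snippet above.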
An alternative that accommodates the edit to the question:
>>> data = '''| |
... ###O
... | |
... | |
... ###O
... | |
... | |
... ###O
... | |
... | | ###O | |'''
>>> data.replace('\n', '').replace(' ', '').count('||###O||')
4
>>>
Solution 3:[3]
I solved the problem this way.
def somefuntion(file_name):
    ascii_str = ''
    with open(file_name, 'r') as reader:
        for line in reader.readlines():
            for character in line.replace('\n', '').replace(' ', ''):
                ascii_str += str(ord(character))
    return ascii_str


if __name__ == "__main__":
    bug = somefuntion('bug.txt')
    landscape = somefuntion('landscape.txt')
    print(landscape.count(bug))
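One caveat with joining the character codes without a separator is that the boundaries between characters are lost, so digits from two neighbouring characters can form the code of a third. The sketch below illustrates that with a small example and shows a possible variant that skips the ord() step and compares the whitespace-stripped text directly. The names encode_unseparated, strip_whitespace and count_occurrences are hypothetical, and unlike the code above, str.split() also removes tabs and other whitespace, which is an assumption about what should be ignored:

```python
def encode_unseparated(text):
    # Mirrors the approach above: concatenate decimal character codes.
    return ''.join(str(ord(character)) for character in text)


def strip_whitespace(text):
    # split() with no arguments splits on any whitespace run, so joining
    # the pieces removes spaces, tabs and line breaks in one step.
    return ''.join(text.split())


def count_occurrences(pattern_text, landscape_text):
    # Compare the characters themselves, so boundaries stay unambiguous.
    return strip_whitespace(landscape_text).count(strip_whitespace(pattern_text))


# 'S' is 83 and '#' is 35, giving '8335', which contains '33', the code of '!'.
print(encode_unseparated('S#').count(encode_unseparated('!')))  # prints 1
print(count_occurrences('!', 'S#'))                             # prints 0

bug = "| |\n###O\n| |\n"
landscape = "| |\n###O\n| |\n\n| |\n###O\n| |\n\n| |\n###O\n| |\n"
print(count_occurrences(bug, landscape))                        # prints 3
```

For files, pattern_text and landscape_text would be the contents of bug.txt and landscape.txt read with reader.read().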
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | |
