Find IP address string in multiple text files and add it to the relevant rows in Python
I have multiple txt files with the details below:
Text File1:
Id = 0005
Cause = ERROR
Code = 307
Event Time = 2020-11-09 10:16:48
Severity = WARNING
Severity Code = 5
Id = 0006
Cause = FAILURE
Code = 517
Event Time = 2020-11-09 10:19:47
Severity = MINOR
Severity Code = 4 ip[10.1.1.1
Text File2:
Id = 0007
Cause = ERROR
Code = 307
Event Time = 2020-11-09 10:16:48
Severity = WARNING
Severity Code = 5
Id = 0008
Cause = FAILURE
Code = 517
Event Time = 2020-11-09 10:19:47
Severity = MINOR
Severity Code = 4
ip[10.1.1.3
I would like to get the result below, if possible:
Id    Cause    Code  Event Time           Severity  Severity Code  ip
0005  ERROR    307   2020-11-09 10:16:48  WARNING   5              10.1.1.1
0006  FAILURE  517   2020-11-09 10:19:47  MINOR     4              10.1.1.1
0007  ERROR    307   2020-11-09 10:16:48  WARNING   5              10.1.1.3
0008  FAILURE  517   2020-11-09 10:19:47  MINOR     4              10.1.1.3
At the moment I get the result below, and I don't know how to add the IP as another column.
Id    Cause    Code  Event Time           Severity  Severity Code
0005  ERROR    307   2020-11-09 10:16:48  WARNING   5
0006  FAILURE  517   2020-11-09 10:19:47  MINOR     4
0007  ERROR    307   2020-11-09 10:16:48  WARNING   5
0008  FAILURE  517   2020-11-09 10:19:47  MINOR     4
Code:
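import re
import pandas as pd

# Match "key = value" pairs; a value ends at a run of two or more whitespace characters.
pattern = re.compile(r"(.+?)=(.+?)\s{2,}")
data = []
item = {}
with open("data.txt") as fp:
    for line in fp:
        for m in pattern.finditer(line):
            key, value = [m.group(i).strip() for i in [1, 2]]
            if key == "Id":
                # A new "Id" starts a new record; store the previous one first.
                if item:
                    data.append(item)
                item = {"Id": value}
            else:
                item[key] = value
data.append(item)
df = pd.DataFrame(data)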
Solution 1:[1]
What is unclear to me is how you would assign the same IP to both IDs (0005 and 0006): the IP is simply missing for some of the events. Other than that, something as simple as the check below, added to your for loop, should do the job.
with open("data.txt") as fp:
    for line in fp:
        for m in pattern.finditer(line):
            key, value = [m.group(i).strip() for i in [1, 2]]
            if key == "Id":
                if item:
                    data.append(item)
                item = {"Id": value}
            else:
                item[key] = value
        # Attach the IP to the record being built when the line carries the "ip[" marker.
        if 'ip[' in line:
            # strip() drops the trailing newline that would otherwise end up in the value.
            item['ip'] = line.split('ip[')[1].strip()
data.append(item)
df = pd.DataFrame(data)
>>> df
Id    Cause    Code  Event Time           Severity  Severity Code  ip
0005  ERROR    307   2020-11-09 10:16:48  WARNING   5              NaN
0006  FAILURE  517   2020-11-09 10:19:47  MINOR     4              10.1.1.1
0007  ERROR    307   2020-11-09 10:16:48  WARNING   5              NaN
0008  FAILURE  517   2020-11-09 10:19:47  MINOR     4              10.1.1.3
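If each file carries a single IP that applies to every record in it, which is what the desired output in the question suggests, one way to fill the NaN cells is to parse one file at a time and tag all of its records with the IP found in that file. The following is only a minimal sketch under assumptions that are not confirmed by the question: each "key = value" pair sits on its own line (as the files are displayed above), the "ip[" marker appears once per file, either on its own line or at the end of a line, and the names parse_records and collect_rows are made up for illustration.

```python
import re

# Assumed marker format: "ip[" followed by a dotted IPv4-style address.
IP_MARKER = re.compile(r"ip\[(\d{1,3}(?:\.\d{1,3}){3})")


def parse_records(text):
    """Parse one file's text into a list of dicts, each tagged with the file's IP."""
    records = []
    item = {}
    ip = None
    for line in text.splitlines():
        found = IP_MARKER.search(line)
        if found:
            ip = found.group(1)
            # Cut the marker off so it does not leak into a value.
            line = line[:found.start()]
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "Id":
            # A new "Id" starts a new record; store the previous one first.
            if item:
                records.append(item)
            item = {"Id": value}
        else:
            item[key] = value
    if item:
        records.append(item)
    # Every record of this file gets the one IP found in it (None if there was none).
    for record in records:
        record["ip"] = ip
    return records


def collect_rows(paths):
    """Read several files and return the combined list of records."""
    rows = []
    for path in paths:
        with open(path) as fp:
            rows.extend(parse_records(fp.read()))
    return rows
```

Passing the combined list to pd.DataFrame, as in the code above, for example pd.DataFrame(collect_rows(["file1.txt", "file2.txt"])) with placeholder file names, should then give one row per Id with the ip column filled for every row of a file. If a file can contain more than one IP, this approach no longer applies and the rule for matching IPs to events has to be defined first.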
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
