How to split octets of IP addresses from CSV
I'm trying to take an IP log list (in CSV format), remove all duplicate addresses, then split each address into separate octets and store them in a list. Eventually I'm going to add a check for subnet duplicates (same IP in the same subnet), but I'm stuck here. There is extra information in the CSV file, but it's been a long time since I've coded, so I'm mostly just focused on the IPs right now. I don't care if I lose the rest of the info.
I originally tried fixed character blocks (e.g. take characters 0-2 as the first octet, 4-6 as the second, etc.), but that obviously doesn't work if an octet has fewer than 3 digits. Then I tried using re.split to split up the octets, but it said it wouldn't accept strings (for some reason). I then tried what I currently have in my code, but with blah=int(fin.append(lst[each].re.split(r".",int))), but that wouldn't work because of the periods in the integer.
import pandas
import re
#set column names and input the data
colnames = ['zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven']
data = pandas.read_csv(r'C:\Users\xxx\Desktop\Projects\Find the IPs\tiny version.csv', names=colnames)
#tell which column to read from then put in list called "lst"
ip = data.five.tolist()
lst = list(dict.fromkeys(ip))
#create final list
fin = []
#for every entry in lst, split it up at the periods, then add it to fin
for each in lst:
    print("I got here")
    blah = lst([each]).re.split(r".",int)
    fin.append(blah)
Right now, I'm hoping for a list in which each entry contains all four octets as integers.
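A note on the re.split attempt described above: in a regular expression an unescaped "." matches any character, so r"." treats every character as a delimiter, and passing the type int as the second argument is a likely cause of the "wouldn't accept strings" error, since re.split expects the text to split there. A minimal sketch of the difference, using a made-up address:

```python
import re

ip = "192.168.1.10"  # hypothetical example address

# Unescaped "." matches every character, so only empty strings are left over
print(re.split(r".", ip))

# Escaping the dot splits on literal periods only
octets = [int(part) for part in re.split(r"\.", ip)]

# Plain str.split does the same job without a regex
octets_plain = [int(part) for part in ip.split(".")]

print(octets)  # [192, 168, 1, 10]
```

For a fixed delimiter like this, str.split is usually the simpler choice.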
Solution 1:[1]
Old thread, but posting for anyone else who runs into this problem in the future. This is not the most elegant solution by far, but here is one that worked for me:
import pandas
import re
#start of OP's code (adapted so that duplicates are dropped from the DataFrame itself)
data = pandas.read_csv(r'INSERT_FILEPATH_HERE') #reads in your csv file, please put the file path for your CSV where it says INSERT_FILEPATH_HERE, keep quotes
data = data.drop_duplicates(subset="IP") #removes rows with a duplicate IP so the octet lists built below have one entry per remaining row, please replace IP with your IP column name
ip_list = data["IP"].tolist() #converts the column containing the IP into a list
#end of OP's code
#creating lists to be used for ip octet columns
octet_1 = list()
octet_2 = list()
octet_3 = list()
octet_4 = list()
#for loop to iterate through each IP in the column
for ip_entry in ip_list:
    split_ip = [int(octet) for octet in ip_entry.split('.')] #solution shared by earlier comment, splits up IP into octets by splitting string at periods
    octet_1.append(split_ip[0]) #appending octets into appropriate columns
    octet_2.append(split_ip[1])
    octet_3.append(split_ip[2])
    octet_4.append(split_ip[3])
#this solves the problem of generating octet lists. If you wish to append these lists as columns, please see the code below:
#adding columns to new, edited csv
data["Octet 1"] = octet_1
data["Octet 2"] = octet_2
data["Octet 3"] = octet_3
data["Octet 4"] = octet_4
data.to_csv('INSERT_FILEPATH_TO_NEW_CSV') #creates csv with new columns. If you want to update the original file, please insert the filepath to the original file.
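If what you want is the shape asked for in the question (one list per address, each holding four integers), here is a rough sketch that skips pandas and uses only the standard library. The sample data, the column name "IP" and the helper name unique_octets are all made up for illustration; ipaddress.IPv4Address is used so that blank or malformed entries are skipped instead of crashing int().

```python
import csv
import io
import ipaddress

# Hypothetical sample standing in for the real CSV file; the "IP" column name is an assumption
sample_csv = (
    "IP,user\n"
    "192.168.1.10,alice\n"
    "10.0.0.5,bob\n"
    "192.168.1.10,carol\n"
    "not-an-ip,dave\n"
)

def unique_octets(csv_file, column="IP"):
    """Return one [a, b, c, d] list per unique, valid IPv4 address, in first-seen order."""
    seen = set()
    result = []
    for row in csv.DictReader(csv_file):
        text = (row.get(column) or "").strip()
        try:
            address = ipaddress.IPv4Address(text)
        except ipaddress.AddressValueError:
            continue  # skip blanks and malformed entries
        if address in seen:
            continue  # skip duplicates, keeping the first occurrence
        seen.add(address)
        result.append(list(address.packed))  # the 4 packed bytes become 4 ints
    return result

octet_rows = unique_octets(io.StringIO(sample_csv))
print(octet_rows)  # [[192, 168, 1, 10], [10, 0, 0, 5]]
```

With a real file you would pass an open file object instead of the StringIO, for example one opened with open(path, newline="").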
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Dharman |
