Generating nested dictionary from a text file
I have a sample.txt file with the following text:
abcd 10
abcd 1.1.1.1
abcd 2.2.2.2
abcd 3.3.3.3
wxyz 20
wxyz 1.1.1.1
wxyz 2.2.2.2
wxyz 4.4.4.4
I want to store the values from each line in a dictionary under specific keys.
The desired dictionary is:
details = {
    "customer_names": ["abcd", "wxyz"],
    "site_ids": [10, 20],
    "neighbors": [["1.1.1.1", "2.2.2.2", "3.3.3.3"], ["1.1.1.1", "2.2.2.2", "4.4.4.4"]]
}
This would let me configure each customer independently, since each one has its own site ID and its own neighbors.
I have tried several approaches, but each ended up loading all the neighbors into one single list, which I cannot process correctly. How can I build the dictionary so that the neighbors key holds a separate nested list for each customer?
Solution 1:[1]
You can use a for loop to go through the file line by line. Whenever a line starts with an ID that is not yet in the customer_names list, add that ID to the list, store its site ID, and start a new empty neighbors list. Every following line with the same ID then appends its value to that last neighbors list:
details = {
    "customer_names": [],
    "site_ids": [],
    "neighbors": []
}
with open("sample.txt") as f:
    for line in f:
        name, value = line.split()
        if name in details["customer_names"]:
            # Known customer: add the neighbor to the last list in neighbors
            details["neighbors"][-1].append(value)
        else:
            # New customer: record the name, start a neighbor list, store the site ID
            details["customer_names"].append(name)
            details["neighbors"].append([])
            details["site_ids"].append(int(value))
sample.txt contains:
abcd 10
abcd 1.1.1.1
abcd 2.2.2.2
abcd 3.3.3.3
wxyz 20
wxyz 1.1.1.1
wxyz 2.2.2.2
wxyz 4.4.4.4
Resulting details contains:
{'customer_names': ['abcd', 'wxyz'],
'site_ids': [10, 20],
'neighbors': [['1.1.1.1', '2.2.2.2', '3.3.3.3'], ['1.1.1.1', '2.2.2.2', '4.4.4.4']]}
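The loop above appends every neighbor to the last list, so it relies on all lines for one customer being contiguous, as they are in the sample. If the file could interleave customers, a variant that remembers each customer's position might look like the sketch below. The function name `parse_details` and the rule that the first line seen for a customer carries its site ID are assumptions made for illustration, not part of the original answer.

```python
def parse_details(lines):
    """Build the details dict from an iterable of 'name value' lines.

    Assumption: the first line seen for each customer holds its site ID;
    every later line for that customer is a neighbor address.
    """
    details = {"customer_names": [], "site_ids": [], "neighbors": []}
    position = {}  # customer name -> index into the three lists

    for line in lines:
        parts = line.split()
        if len(parts) != 2:  # skip blank or malformed lines
            continue
        name, value = parts
        if name in position:
            # Known customer: append to that customer's own neighbor list
            details["neighbors"][position[name]].append(value)
        else:
            # New customer: remember its index and start its entries
            position[name] = len(details["customer_names"])
            details["customer_names"].append(name)
            details["site_ids"].append(int(value))
            details["neighbors"].append([])
    return details


# Interleaved input still groups neighbors per customer.
sample = [
    "abcd 10",
    "wxyz 20",
    "abcd 1.1.1.1",
    "wxyz 1.1.1.1",
    "abcd 2.2.2.2",
    "wxyz 4.4.4.4",
]
result = parse_details(sample)
```

Since a file object is itself an iterable of lines, the same function should also accept the handle returned by `open("sample.txt")`, ideally inside a `with` block.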
Solution 2:[2]
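I read your file into a pandas DataFrame. First, I get the row indexes where the second column holds an integer (10 and 20); these rows supply the customer names and the site IDs. The remaining rows for each name are then collected into the lists aaa and bbb. Note that this code is written for exactly two customers.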
import pandas as pd

# sep=r"\s+" splits on whitespace; it replaces delim_whitespace=True, which is deprecated in recent pandas
df = pd.read_csv('ttt.txt', header=None, sep=r"\s+")
# Row indexes where the second column is an integer (the site-ID rows)
index = [i for i in range(len(df[1])) if df[1][i].isdigit()]
name = df.iloc[index, 0].to_list()
# Neighbor rows per customer: every row for that name except the site-ID rows
aaa = df.loc[(df.index != index[0]) & (df.index != index[1]) & (df[0] == name[0])][1].to_list()
bbb = df.loc[(df.index != index[0]) & (df.index != index[1]) & (df[0] == name[1])][1].to_list()
details = {
    "customer_names": name,
    "site_ids": [int(df.iloc[index[0], 1]), int(df.iloc[index[1], 1])],
    "neighbors": [aaa, bbb]
}
Output of details:
{'customer_names': ['abcd', 'wxyz'], 'site_ids': [10, 20], 'neighbors': [['1.1.1.1', '2.2.2.2', '3.3.3.3'], ['1.1.1.1', '2.2.2.2', '4.4.4.4']]}
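Because the three lists are index-aligned, the resulting structure can be walked one customer at a time with `zip`, which is what the question needs for per-customer configuration. A minimal sketch follows; the `details` literal is copied from the output above, and the generated line format is purely illustrative rather than any real device syntax.

```python
details = {
    "customer_names": ["abcd", "wxyz"],
    "site_ids": [10, 20],
    "neighbors": [
        ["1.1.1.1", "2.2.2.2", "3.3.3.3"],
        ["1.1.1.1", "2.2.2.2", "4.4.4.4"],
    ],
}

# zip pairs up the i-th name, site ID, and neighbor list.
config_lines = []
for name, site_id, neighbors in zip(
    details["customer_names"], details["site_ids"], details["neighbors"]
):
    for neighbor in neighbors:
        # Hypothetical output format; substitute the real configuration command.
        config_lines.append(f"customer {name} site {site_id} neighbor {neighbor}")

print("\n".join(config_lines))
```

Each customer's neighbors stay separate here because the inner loop only ever sees the list that belongs to the current name.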
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
