Convert a list of dictionaries (with a nested list as values) into a data frame
I would really appreciate some help with this. I am extracting headings, and the list of words under each heading, from a website. I have ended up with a list of dictionaries, where each dictionary has a single key (the heading) whose value is the list of words:
[{'You Led a Project': "['Chaired', 'Controlled', 'Coordinated', 'Executed', 'Headed', 'Operated', 'Orchestrated', 'Organized', 'Oversaw', 'Planned', 'Produced', 'Programmed']"}, {'You Envisioned and Brought a Project to Life': "['Administered', 'Built', 'Charted', 'Created', 'Designed', 'Developed', 'Devised', 'Founded', 'Engineered', 'Established', 'Formalized', 'Formed', 'Formulated', 'Implemented', 'Incorporated', 'Initiated', 'Instituted', 'Introduced', 'Launched', 'Pioneered', 'Spearheaded']"}, {'You Saved the Company Time or Money': "['Conserved', 'Consolidated', 'Decreased', 'Deducted', 'Diagnosed', 'Lessened', 'Reconciled', 'Reduced', 'Yielded']"}, {'You Increased Efficiency, Sales, Revenue, or Customer Satisfaction': "['Accelerated', 'Achieved', 'Advanced', 'Amplified', 'Boosted', 'Capitalized', 'Delivered', 'Enhanced', 'Expanded', 'Expedited', 'Furthered', 'Gained', 'Generated', 'Improved', 'Lifted', 'Maximized', 'Outpaced', 'Stimulated', 'Sustained']"}, {'You Changed or Improved Something': "['Centralized', 'Clarified', 'Converted', 'Customized', 'Influenced', 'Integrated', 'Merged', 'Modified', 'Overhauled', 'Redesigned', 'Refined', 'Refocused', 'Rehabilitated', 'Remodeled', 'Reorganized', 'Replaced', 'Restructured', 'Revamped', 'Revitalized', 'Simplified', 'Standardized', 'Streamlined', 'Strengthened', 'Updated', 'Upgraded', 'Transformed']"}
I now want to convert the list into a data frame, such that each item in a value list is paired with its key on a separate row. For example:
Column 1 Column 2
You Led a Project Chaired
You Led a Project Controlled
....
Please find my code so far below:
import requests
from bs4 import BeautifulSoup


def extract_verbs():
    # The header name is "User-Agent" (hyphenated), and it has to be passed as
    # headers=, because the second positional argument of requests.get() is params.
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) '
                             'AppleWebKit/537.36 (KHTML, like Gecko) '
                             'Chrome/100.0.4896.127 Safari/537.36'}
    url = "https://www.themuse.com/advice/185-powerful-verbs-that-will-make-your-resume-awesome"
    r2 = requests.get(url, headers=headers)
    verb_soup = BeautifulSoup(r2.text, "html.parser")
    return verb_soup


def transform_verbs(verb_soup):
    start_numbers = [1, 13, 34, 43, 62, 88, 108, 114, 123, 143, 162, 174]
    section_headings = verb_soup.select("h2")
    for n in range(0, len(start_numbers) - 1):
        heading = section_headings[n].getText().split()[3:]
        heading = ' '.join(heading)
        verbs = [item.text for item in verb_soup.find("ol", {"start": start_numbers[n]}).find_all("li")]
        # f"{verbs}" stores the string representation of the list, not the list itself
        all_verbs = {f"{heading}": f"{verbs}"}
        all_verbs_list.append(all_verbs)
    return


all_verbs_list = []
extract_verb_ouput = extract_verbs()
transform_verbs(extract_verb_ouput)
Thank you in advance!
Solution 1:[1]
You can transfer the data into a list of records (rows), then construct a dataframe with .from_records():
records = list()
for elem in data:
    for key, value in elem.items():
        for v in value:
            records.append([key, v])
>>> records
[['You Led a Project', 'Chaired'],
['You Led a Project', 'Controlled'],
['You Led a Project', 'Coordinated'],
['You Led a Project', 'Executed'],
...
>>> pd.DataFrame.from_records(records, columns=['Column1', 'Column2'])
                              Column1       Column2
0                   You Led a Project       Chaired
1                   You Led a Project    Controlled
2                   You Led a Project   Coordinated
3                   You Led a Project      Executed
4                   You Led a Project        Headed
..                                ...           ...
82  You Changed or Improved Something   Streamlined
83  You Changed or Improved Something  Strengthened
84  You Changed or Improved Something       Updated
85  You Changed or Improved Something      Upgraded
86  You Changed or Improved Something   Transformed
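One caveat: in the data as printed in the question, each value looks like a string holding the text of a list (the result of f"{verbs}") rather than an actual list, and iterating over a string yields single characters. The sketch below is one possible way to handle that, assuming the strings are valid Python list literals. The ast.literal_eval step, the helper name as_list, and the shortened sample data are illustrative additions, not part of the original answer. It also shows DataFrame.explode as an alternative to building the records by hand.

```python
import ast

import pandas as pd

# Shortened sample in the same shape as the question's data:
# each value is the *string form* of a list, not a list.
data = [
    {"You Led a Project": "['Chaired', 'Controlled', 'Coordinated']"},
    {"You Saved the Company Time or Money": "['Conserved', 'Consolidated']"},
]


def as_list(value):
    """Return value as a list, parsing it first if it is a string.

    Assumption: string values contain a valid Python list literal.
    """
    return ast.literal_eval(value) if isinstance(value, str) else value


# Same idea as the answer: one [key, verb] record per verb.
records = [
    [key, verb]
    for elem in data
    for key, value in elem.items()
    for verb in as_list(value)
]
df = pd.DataFrame.from_records(records, columns=["Column1", "Column2"])

# Alternative: keep one row per heading, then let pandas expand the lists.
wide = pd.DataFrame(
    [(key, as_list(value)) for elem in data for key, value in elem.items()],
    columns=["Column1", "Column2"],
)
df_exploded = wide.explode("Column2").reset_index(drop=True)

print(df)
```

If the scraping code is changed to store the list itself ({heading: verbs} instead of {f"{heading}": f"{verbs}"}), the parsing step is unnecessary and the loop from the answer works as written.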
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | fsimonjetz |
