How to extract info, sort and slice in a dictionary (python3) where values are lists
Project is to identify, rank and classify dividend paying stocks on multiple exchanges.
In a nutshell, program flow (using python 3.8) is as follows:
Stage 1 - read the Master List of stocks on an exchange from a spreadsheet.
Stage 2 - download info and dividends from Yahoo Finance and run a series of tests to identify the stocks that meet the criteria and are classified as targets now, or that could become targets in the near/far future if one of the criteria changes.
Stage 3 - save the Master List and Targets in a spreadsheet, and graph the targets to give visual clues to dividend trends and annual growth.
Previous versions of this program were done in 3 different scripts, and yahoo finance was called to download stock info for each stage. These worked just fine, but I am trying to consolidate all of this in 1 script.
As seen in the full program code provided below, I am going through my list of stocks and saving the information in a dictionary:
graph_dict = {'Name':[],'Symbol':[],'Avg Div Growth':[],'Dividends':[],'Annual Dividends':[]}
Name and Symbol are strings, Avg Div Growth is a float, and Dividends / Annual Dividends are pandas DataFrames.
As the list of names is looped through and Yahoo Finance returns the dividends, each list in the dictionary is appended to. Effectively, each 'row' (the same index across all the lists) contains all the information I require to graph the stock.
I have no problems populating the dictionary, but I have 3 problems when I try to access the dictionary.
- How do I extract a 'row' from the dictionary? I can access individual values from a specific 'row' using:
graph_dict['Name'][row_num]
so, graph_dict['Name'][0] is 'Keppel REIT' in the code below.
I cannot figure out how to extract all items in the row. I am looking for the equivalent of:
graph_dict.items()[row_num]
which would give me the name, symbol, avg div growth, all dividends and annual dividends for the stock in that row number.
- How can I sort the 'rows' by values in the key [Avg Div Growth]?
I want to graph the securities in order from highest average dividend growth to the lowest. This is how I am ranking the securities on the list.
Using sorted(): sorted_dict = sorted(graph_dict, key = lambda item: item['Avg Div Growth'])
and itemgetter(): sorted_dict = sorted(graph_dict, key = itemgetter('Avg Div Growth'))
does not work with my dictionary format.
Based on how I declared the dictionary, how can I sort the 'rows' within the dictionary? Or do I need to declare my dictionary differently?
- How to slice the dictionary so I can send a group of 'rows' to my graph function?
I am graphing the list of securities with a number of securities per page. So, I need to group or slice my dictionary so that I can send the group of securities and their dividend data to the graphing function and graph one page at a time (collection of pages is saved as a .pdf doc for future reference)
One answer I saw suggested the following:
group_to_graph = dict(list(sorted_graph_dict.items())[group_start:group_end])
where group_start and group_end are the limits for the slice and reference the 'row' number in the dictionary.
Doing that just gave me the values of the keys at group_start, as follows:
Stock Group on page # 0 - start row (group_start) = 0 , (end row (group_end) = 1
{'Name': ['Keppel REIT', 'UMS', 'SPH REIT', 'Frasers Hospitality Trust']}
Stock Group on page # 1 - start row (group_start) = 2 , (end row (group_end) = 3
{'Avg Div Growth': [6.77852254732552, 25.0433491073197, 32.833907784854, -20.4956238784202]}
I suspect that once I understand how to access a complete 'row' in the dictionary vs. individual keys as in Q1 above that this may answer itself, but please feel free to peruse the code below and suggest options for how to declare my dictionary and access all items in a 'row'.
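One possible way to approach the first two questions, sketched under assumptions: the dictionary as declared is column-oriented (each key holds a whole column), so a 'row' has to be assembled by taking the same index from every list. The helper names `get_row` and `to_rows` are hypothetical, and short strings stand in for the pandas DataFrames so the sketch runs without downloading anything.

```python
from operator import itemgetter

# Toy stand-in for graph_dict: short strings replace the pandas DataFrames
# (an assumption, so the sketch runs without calling Yahoo Finance).
graph_dict = {
    'Name': ['Keppel REIT', 'UMS', 'SPH REIT', 'Frasers Hospitality Trust'],
    'Symbol': ['K71U.SI', '558.SI', 'SK6U.SI', 'ACV.SI'],
    'Avg Div Growth': [6.77852254732552, 25.0433491073197,
                       32.833907784854, -20.4956238784202],
    'Dividends': ['div_0', 'div_1', 'div_2', 'div_3'],
    'Annual Dividends': ['annual_0', 'annual_1', 'annual_2', 'annual_3'],
}


def get_row(columns, row_num):
    """Return one 'row' as a dict of key -> value at row_num (hypothetical helper)."""
    return {key: values[row_num] for key, values in columns.items()}


def to_rows(columns):
    """Transpose a dict of equal-length lists into a list of row dicts (hypothetical helper)."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


first_row = get_row(graph_dict, 0)
rows = to_rows(graph_dict)

# Once each row is its own dict, itemgetter works as a sort key.
# reverse=True puts the highest average dividend growth first.
sorted_rows = sorted(rows, key=itemgetter('Avg Div Growth'), reverse=True)

print(first_row['Name'])
print([row['Name'] for row in sorted_rows])
```

Because the sort key is only the float, the DataFrames stored in each row are never compared with each other. Declaring the data as a list of row dicts from the start would make the `to_rows` step unnecessary; that is a design choice, not a requirement.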
import pandas as pd
import yfinance as yf
## basic information on each stock
name = ['Keppel REIT','UMS','SPH REIT','Frasers Hospitality Trust']
symbol = ['K71U.SI','558.SI','SK6U.SI','ACV.SI']
avg_div_growth = [6.77852254732552,25.0433491073197,32.833907784854,-20.4956238784202]
## create dataframe to hold stock info (downloaded from yahoo finance)
df = pd.DataFrame (columns = ['Name','Symbol','Avg Div Growth (%)'])
df = df.astype( dtype = {'Name': str, 'Symbol': str, 'Avg Div Growth (%)': float})
df['Name'] = name
df['Symbol'] = symbol
df['Avg Div Growth (%)'] = avg_div_growth
## create dictionary to hold all relevant info for graphing. Each stock has its own 'row' in the lists within the dictionary
graph_dict = {'Name':[],
'Symbol':[],
'Avg Div Growth':[],
'Dividends':[],
'Annual Dividends':[]}
## download dividend info from Yahoo Finance using yfinance
start = 0
end = len(df)
for count in range(start, end): # loops through list of securities contained in the dataframe
    stock = yf.Ticker(df.loc[count,'Symbol'])
    div = stock.dividends # is type Series by default
    div_df = pd.DataFrame(div) # change to Dataframe
    ## make annual dividends dataframe
    annual_divs_df = div_df.resample('Y').sum()
    annual_divs_df.rename(columns = {'Dividends':'Annual Dividends'}, inplace = True)
    ## populate each 'row' of the dictionary with the relevant data
    graph_dict['Name'].append(df.loc[count,'Name'])
    graph_dict['Symbol'].append(df.loc[count,'Symbol'])
    graph_dict['Avg Div Growth'].append(df.loc[count,'Avg Div Growth (%)'])
    graph_dict['Dividends'].append(div_df)
    graph_dict['Annual Dividends'].append(annual_divs_df)
print ('\nNumber of Names in dictionary is:',len(graph_dict['Name'])) # testing
## loop through dictionary to print each 'row' for testing purposes
for num in range(len(graph_dict['Name'])):
    print('\nStock Number:',num+1)
    print('Name is:', graph_dict['Name'][num])
    print('Symbol is:',graph_dict['Symbol'][num])
    print('Avg Annual Div Growth is:',graph_dict['Avg Div Growth'][num],'%')
    print('\nAll Dividends declared:\n',graph_dict['Dividends'][num])
    print('\nAnnual Dividends:\n',graph_dict['Annual Dividends'][num])
## sort the dictionary by Avg Div Growth from highest to lowest
''' How to sort the dictionary lists such that they are in order from highest Avg Div Growth to lowest ??? '''
sorted_graph_dict = graph_dict # change once how to sort is figured out
## get first row in Dictionary
''' How to extract all items in a single row of the list within the complete dictionary ??? '''
## group the entries in the dictionary into groups (pages in graph) of set size
graph_rows = 2
number_of_securities = len(sorted_graph_dict['Name'])
page_pad = number_of_securities % graph_rows
if page_pad == 0: # all pages have full number of securities
    number_of_pages = number_of_securities // graph_rows
else: # last page needs to be padded out to send proper number of securities to graph function
    number_of_pages = number_of_securities // graph_rows + 1
print ('\n\npage_pad = ',page_pad, 'number of pages is: ',number_of_pages) # testing
start_page = 0
end_page = number_of_pages
group_to_graph = {} # empty dictionary to hold group of stocks to send to graph function
for page_number in range(start_page, end_page):
    group_start = (page_number + 1) * graph_rows - graph_rows
    group_end = group_start + graph_rows -1
    ''' how to slice dictionary so 'rows' within group start and end are isolated and sent to graphing_function??? '''
    group_to_graph = dict(list(sorted_graph_dict.items())[group_start:group_end]) # DOES NOT WORK AS INTENDED
    print ('\nStock Group on page #',page_number + 1,' - start row (group_start) = ',group_start, ', (end row (group_end) =',group_end, '\n',group_to_graph)
'''
Should print out the items in each row; instead it prints:
Names (key = 0, which is the same as group_start) on the 1st time through the loop, and
Avg Div Growth (key = 2, which is the same as group_start) on the 2nd time through the loop
'''
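A note on why the slice in the loop above returns columns: `sorted_graph_dict.items()` yields one (key, list) pair per column, so slicing that list with `[group_start:group_end]` picks keys such as 'Name' or 'Avg Div Growth', not stocks. Python slices also exclude the end index, so `group_start + graph_rows - 1` drops the last row of every page. The following is a minimal sketch of paging over a list of row dicts instead; `paginate` is a hypothetical helper and the rows are toy data without the DataFrames.

```python
def paginate(rows, page_size):
    """Yield (page_number, group) pairs; each group holds at most page_size rows.

    Hypothetical helper, not part of the original script.
    """
    for page_number, start in enumerate(range(0, len(rows), page_size), start=1):
        # Slices exclude the end index, so the limit is start + page_size.
        yield page_number, rows[start:start + page_size]


# Toy rows, assumed to be sorted by growth already; the real rows would
# also carry the 'Dividends' and 'Annual Dividends' DataFrames.
sorted_rows = [
    {'Name': 'SPH REIT', 'Avg Div Growth': 32.833907784854},
    {'Name': 'UMS', 'Avg Div Growth': 25.0433491073197},
    {'Name': 'Keppel REIT', 'Avg Div Growth': 6.77852254732552},
]
graph_rows = 2  # securities per page, as in the question

pages = list(paginate(sorted_rows, graph_rows))
for page_number, group in pages:
    print(page_number, [row['Name'] for row in group])
```

With three rows and two rows per page, the last page holds a single row. Whether a short last page needs padding depends on what the graphing function expects, which the question does not show.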
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
