Generate single dataframe based on a dynamic number of dataframes

I'm having a little bit of a problem generating dataframes in Python.

For example:

df_btc = web.DataReader('BTC-USD', 'yahoo', start, end)
df_eth = web.DataReader('ETH-USD', 'yahoo', start, end)
crypto_data = pd.DataFrame({'BTC': df_btc['Adj Close'], 'ETH': df_eth['Adj Close']})

This works fine, but I want to generate crypto_data from a list that can hold any number of tickers.

I'd like to create the following function:

import pandas as pd
import pandas_datareader as web
import datetime as dt

def generate_df(cryptolist, start, end):
    # Here I create a dictionary with all the dataframes based on a list
    crypto_dict = {}

    for ele in cryptolist:
        crypto_dict[ele] = web.DataReader(ele, 'yahoo', start, end)

    # missing here a way to generate crypto_data (see above) based on the dictionary


if __name__ == '__main__':
    # I want this list (from which I create the dictionary) to have any number of variables
    list = ['BTC-USD', 'ETH-USD']

    start = dt.datetime(2009, 1, 1)
    end = dt.datetime.now()

    generate_df(list, start, end)

So basically I'm missing a way to convert the dictionary to the following form:

pd.DataFrame({list[0]: first_df_in_dic['Adj Close'], list[1]: second_df_in_dic['Adj Close'],..., list[-1]: last_df_in_dic['Adj Close']})

The result should be, for two variables (from first code):

[Image: Result]

Any idea how to do this?



Solution 1:[1]

You have two separate problems here: how to build the dataframe and how to derive the column names.

Since you only gave two trivial examples, I am not sure about the second part, so I will assume that you provide a transformation function that converts each name in the list into the expected column name.

def generate_df(cryptolist, start, end, colname):

    #Here I create a dictionary with all the dataframes based on a list
    crypto_dict = {ele: web.DataReader(ele,'yahoo',start,end)
                   for ele in cryptolist}
    
    # generate crypto_data (see above) based on the dictionary 
    crypt_data = pd.DataFrame({colname(ele): crypto_dict[ele]['Adj Close']
                               for ele in cryptolist})
    return crypt_data

In your example, you could call it this way:

...
df = generate_df(list,start,end, lambda s: s[:3])
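
To try the dictionary-to-dataframe step without a network call, here is a minimal sketch that replaces the web.DataReader results with small synthetic frames. The helper name combine_adj_close, the default colname function, and all values are assumptions made up for illustration; only the 'Adj Close' column name comes from the question.

```python
import pandas as pd


def combine_adj_close(frames, colname=lambda s: s.split("-")[0]):
    """Build one DataFrame with one 'Adj Close' column per key in `frames`.

    `frames` maps a ticker name to a DataFrame that has an 'Adj Close' column.
    `colname` (hypothetical helper) turns a ticker such as 'BTC-USD' into 'BTC'.
    """
    return pd.DataFrame(
        {colname(key): df["Adj Close"] for key, df in frames.items()}
    )


# Synthetic stand-ins for what web.DataReader would return (made-up values).
dates = pd.date_range("2021-01-01", periods=3, freq="D")
frames = {
    "BTC-USD": pd.DataFrame({"Adj Close": [1.0, 2.0, 3.0]}, index=dates),
    # Deliberately one day shorter to show how the indexes are aligned.
    "ETH-USD": pd.DataFrame({"Adj Close": [10.0, 20.0]}, index=dates[:2]),
}

crypto_data = combine_adj_close(frames)
print(crypto_data)
```

Because the columns are Series, pandas aligns them on the union of their indexes, so a ticker with a shorter history gets NaN for the dates it does not cover.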

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Serge Ballesta