Generate single dataframe based on a dynamic number of dataframes
I'm having a little bit of a problem generating dataframes in Python.
For example:
df_btc = web.DataReader('BTC-USD', 'yahoo', start, end)
df_eth = web.DataReader('ETH-USD', 'yahoo', start, end)
crypto_data = pd.DataFrame({'BTC': df_btc['Adj Close'], 'ETH': df_eth['Adj Close']})
This works fine, but I want to generate crypto_data from a list that can hold any number of tickers.
I'd like to create the following function:
import pandas as pd
import pandas_datareader as web
import datetime as dt

def generate_df(cryptolist, start, end):
    # Here I create a dictionary with all the dataframes based on a list
    crypto_dict = {}
    for ele in cryptolist:
        crypto_dict[ele] = web.DataReader(ele, 'yahoo', start, end)
    # missing here a way to generate crypto_data (see above) based on the dictionary

if __name__ == '__main__':
    # I want this list (from which I create the dictionary) to have any number of variables
    list = ['BTC-USD', 'ETH-USD']
    start = dt.datetime(2009, 1, 1)
    end = dt.datetime.now()
    generate_df(list, start, end)
So basically I'm missing a way to convert the dictionary to the following form:
pd.DataFrame({list[0]: first_df_in_dic['Adj Close'], list[1]: second_df_in_dic['Adj Close'],..., list[-1]: last_df_in_dic['Adj Close']})
For two variables, the result should be the same as crypto_data in the first snippet.
Any idea how to do this?
Solution 1:[1]
You have 2 different problems here: how to build the dataframe and how to find the column names.
As you only gave 2 trivial examples, I am not sure about the second part, so I will assume that you will provide a transformation function that converts each name in the list into the expected column name.
def generate_df(cryptolist, start, end, colname):
    # Build a dictionary with one dataframe per ticker in the list
    crypto_dict = {ele: web.DataReader(ele, 'yahoo', start, end)
                   for ele in cryptolist}
    # Generate crypto_data (see above) from the dictionary,
    # with one 'Adj Close' column per ticker
    crypt_data = pd.DataFrame({colname(ele): crypto_dict[ele]['Adj Close']
                               for ele in cryptolist})
    return crypt_data
In your example, you could call it this way:
...
df = generate_df(list,start,end, lambda s: s[:3])
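The dictionary-to-dataframe step can be tried without any download. The following is only a minimal sketch: the helper name combine_adj_close and the small hand-made frames are assumptions for illustration, standing in for whatever web.DataReader would return. It shows that a dict comprehension handles any number of tickers and that pandas aligns the columns on the union of the date indexes.

```python
import pandas as pd

def combine_adj_close(frames, colname=lambda s: s[:3]):
    """Hypothetical helper: one column per ticker, taken from 'Adj Close'."""
    return pd.DataFrame({colname(ticker): df['Adj Close']
                         for ticker, df in frames.items()})

# Hand-made stand-ins for downloaded price data (not real quotes)
dates = pd.date_range('2021-01-01', periods=3, freq='D')
frames = {
    'BTC-USD': pd.DataFrame({'Adj Close': [1.0, 2.0, 3.0],
                             'Volume': [10, 20, 30]}, index=dates),
    # Shorter history on purpose, to show index alignment
    'ETH-USD': pd.DataFrame({'Adj Close': [4.0, 5.0]}, index=dates[:2]),
}

crypto_data = combine_adj_close(frames)
print(crypto_data)
```

Dates missing for one ticker show up as NaN in that ticker's column, so series with different start dates can still be combined.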
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Serge Ballesta |

