'Add name to dataframe pandas
I am having data frame as follows:
Date Description Amount
12/01/2019 ABC 51,026
13/01/2019 XYZ 12,111.56
I want to add a single name field to the whole data frame having 3 columns "Date Description and Amount" and want to get the output as follows:
Name: Shivam
Date Description Amount
12/01/2019 ABC 51,026
13/01/2019 XYZ 12,111.56
I have tried adding a dictionary of name to one data frame and "Date Description and Amount" to another Data Frame and merged both. But it is creating another column of name along with "Date Description and Amount". But I want name to be only one time and not the whole column.
Please suggest some ways to do this.
Solution 1:[1]
You can use the metadata to add raw attributes.
import pandas as pd
class metadatadf(pd.DataFrame):
_metadata = ['metadata']
@property
def _constructor(self):
return metadatadf
data = {"Date": ['12/01/2019', " 13/01/2019"], "Description": ['ABC', 'XYZ'] ,"Amount":['51,026','12,111.56']}
df = metadatadf(data)
df.metadata = " Name:Shivam"
print(df.head().metadata)
print(df)
The output:
Name:Shivam
Data Description Amount
0 12/01/2019 ABC 51,026
1 13/01/2019 XYZ 12,111.56
Also, you can use the MultiIndex, but if you need to save it to a file, it will show duplicated columns
col=pd.MultiIndex.from_product([["Shivam"], ['Date', 'Description', 'Amount']], names=['Name:',''])
df = pd.DataFrame(data=[['12/01/2019', 'ABC', "51,026"], ['13/01/2019', 'XYZ', "12,111.56"]],columns=col)
print(df)
The output:
Name:Shivam
Data Description Amount
0 12/01/2019 ABC 51,026
1 13/01/2019 XYZ 12,111.56
So, according to this answer, you need to use the HDF5 to save the metadata
import numpy as np
import pandas as pd
def h5store(filename, df, **kwargs):
store = pd.HDFStore(filename)
store.put('mydata', df)
store.get_storer('mydata').attrs.metadata = kwargs
store.close()
def h5load(store):
data = store['mydata']
metadata = store.get_storer('mydata').attrs.metadata
return data, metadata
a = pd.DataFrame(
data=[['12/01/2019', 'ABC', "51,026"], ['13/01/2019', 'XYZ', "12,111.56"]], columns=['Date', 'Description', 'Amount'])
filename = 'fs.h5'
metadata = dict(Name='Shivam')
h5store(filename, a, **metadata)
with pd.HDFStore(filename) as store:
data, metadata = h5load(store)
print(data)
print(metadata)
Solution 2:[2]
I believe this is one way to do what your question describes:
import pandas as pd
mi = pd.MultiIndex.from_product([['Name: Shivam'], 'Date,Description,Amount'.split(',')])
print(mi)
df = pd.DataFrame([['12/01/2019', 'ABC', 51026], ['13/01/2019', 'XYZ', 12111.56]], columns=mi)
print(df)
Output:
MultiIndex([('Name: Shivam', 'Date'),
('Name: Shivam', 'Description'),
('Name: Shivam', 'Amount')],
)
Name: Shivam
Date Description Amount
0 12/01/2019 ABC 51026.00
1 13/01/2019 XYZ 12111.56
UPDATE:
You can also use a regular index for the columns (not multiindex) with a name stored in the index. It should display the name one time only:
import pandas as pd
idx = pd.Index('Date,Description,Amount'.split(','), name='Name: Shivam')
df = pd.DataFrame([['12/01/2019', 'ABC', 51026], ['13/01/2019', 'XYZ', 12111.56]], columns=idx)
print(df)
Output:
Name: Shivam Date Description Amount
0 12/01/2019 ABC 51026.00
1 13/01/2019 XYZ 12111.56
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 |
