'Add name to dataframe pandas

I am having data frame as follows:

Date        Description Amount
12/01/2019  ABC         51,026
13/01/2019  XYZ         12,111.56

I want to add a single name field to the whole data frame having 3 columns "Date Description and Amount" and want to get the output as follows:

 Name: Shivam
 Date        Description Amount
 12/01/2019  ABC         51,026
 13/01/2019  XYZ         12,111.56

I have tried adding a dictionary of name to one data frame and "Date Description and Amount" to another Data Frame and merged both. But it is creating another column of name along with "Date Description and Amount". But I want name to be only one time and not the whole column.

Please suggest some ways to do this.



Solution 1:[1]

You can use the metadata to add raw attributes.

import pandas as pd

class metadatadf(pd.DataFrame):

    _metadata = ['metadata']

    @property
    def _constructor(self):
        return metadatadf

data = {"Date": ['12/01/2019', " 13/01/2019"], "Description": ['ABC', 'XYZ'] ,"Amount":['51,026','12,111.56']}
df = metadatadf(data)
df.metadata = " Name:Shivam"
print(df.head().metadata)
print(df)

The output:

 Name:Shivam
          Data Description     Amount
0   12/01/2019         ABC     51,026
1   13/01/2019         XYZ  12,111.56

Also, you can use the MultiIndex, but if you need to save it to a file, it will show duplicated columns

col=pd.MultiIndex.from_product([["Shivam"], ['Date', 'Description', 'Amount']], names=['Name:',''])
df = pd.DataFrame(data=[['12/01/2019', 'ABC', "51,026"], ['13/01/2019', 'XYZ', "12,111.56"]],columns=col)
print(df)

The output:

 Name:Shivam
          Data Description     Amount
0   12/01/2019         ABC     51,026
1   13/01/2019         XYZ  12,111.56

So, according to this answer, you need to use the HDF5 to save the metadata

import numpy as np
import pandas as pd

def h5store(filename, df, **kwargs):
    store = pd.HDFStore(filename)
    store.put('mydata', df)
    store.get_storer('mydata').attrs.metadata = kwargs
    store.close()

def h5load(store):
    data = store['mydata']
    metadata = store.get_storer('mydata').attrs.metadata
    return data, metadata

a = pd.DataFrame(
    data=[['12/01/2019', 'ABC', "51,026"], ['13/01/2019', 'XYZ', "12,111.56"]], columns=['Date', 'Description', 'Amount'])

filename = 'fs.h5'
metadata = dict(Name='Shivam')
h5store(filename, a, **metadata)
with pd.HDFStore(filename) as store:
    data, metadata = h5load(store)
print(data)
print(metadata)

Solution 2:[2]

I believe this is one way to do what your question describes:

import pandas as pd
mi = pd.MultiIndex.from_product([['Name: Shivam'], 'Date,Description,Amount'.split(',')])
print(mi)
df = pd.DataFrame([['12/01/2019', 'ABC', 51026], ['13/01/2019', 'XYZ', 12111.56]], columns=mi)
print(df)

Output:

MultiIndex([('Name: Shivam',        'Date'),
            ('Name: Shivam', 'Description'),
            ('Name: Shivam',      'Amount')],
           )
  Name: Shivam
          Date Description    Amount
0   12/01/2019         ABC  51026.00
1   13/01/2019         XYZ  12111.56

UPDATE:

You can also use a regular index for the columns (not multiindex) with a name stored in the index. It should display the name one time only:

import pandas as pd
idx = pd.Index('Date,Description,Amount'.split(','), name='Name: Shivam')
df = pd.DataFrame([['12/01/2019', 'ABC', 51026], ['13/01/2019', 'XYZ', 12111.56]], columns=idx)
print(df)

Output:

Name: Shivam        Date Description    Amount
0             12/01/2019         ABC  51026.00
1             13/01/2019         XYZ  12111.56

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2