Correct way of adding new columns/headers to a dataframe

I need to add new columns to a dataframe. Each new column has a header and a single value repeated across all the rows (the value is the same for all the new columns).

Right now I'm doing something like this:

array_of_new_headers = [...]
for column in array_of_new_headers:
   df[column] = 0

As a result I'm getting this message:

PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling frame.insert many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use newframe = frame.copy()

It tells me to use concat, but I don't really need to concatenate two dataframes. Should I use concat anyway for better performance and better code? To me it doesn't really make sense, unless I think of the arrays as dataframes too.

python pandas

Solution 1:^[1]

You can pass an unpacked dictionary, with the column names as keys and the column values as values, to pandas.DataFrame.assign:

>>> array_of_new_headers = [...]
>>> df.assign(**{c:0 for c in array_of_new_headers})

But assign does not modify the dataframe in place; it returns a new dataframe, so make sure to assign the result back to the required variable.
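As a minimal sketch of the above, assuming a small made-up frame and hypothetical header names (the question elides the real ones), the result of assign can be bound back to df like this:

```python
import pandas as pd

# Hypothetical sample data; the question does not show the real frame or headers.
df = pd.DataFrame({"a": [1, 2, 3]})
array_of_new_headers = ["x", "y", "z"]

# assign returns a new DataFrame, so rebind the name to keep the new columns.
df = df.assign(**{c: 0 for c in array_of_new_headers})

print(df.columns.tolist())
```

Because the dictionary is unpacked into keyword arguments, this only works when every header is a string.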

Solution 2:^[2]

should I use concat for better performance

Beware of so-called premature optimization: if your code already runs fast enough for your needs, you might simply end up wasting your time trying to make it faster.
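If the warning does turn out to matter, the pd.concat(axis=1) approach it suggests could look roughly like the sketch below. The sample frame and header names are hypothetical, and building the new columns as a separate frame is just one way to do it:

```python
import pandas as pd

# Hypothetical sample data; the question does not show the real frame or headers.
df = pd.DataFrame({"a": [1, 2, 3]})
array_of_new_headers = ["x", "y", "z"]

# Build all new columns as one frame that shares df's index, filled with 0.
new_columns = pd.DataFrame(0, index=df.index, columns=array_of_new_headers)

# Join the two frames column-wise in a single operation.
result = pd.concat([df, new_columns], axis=1)

print(result.shape)
```

Here the new columns are added in one step instead of one insertion per column, which is what the warning is asking for.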

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	ThePyGuy
Solution 2	Daweo
