Populate empty pandas dataframe with specific conditions

I want to create a pandas DataFrame with 5000 columns (n=5000) and one row (row G). Each value in row G should be 1 (in 10% of samples) or 0 (in 90% of samples).

import numpy as np
import pandas as pd
df = pd.DataFrame({"G": np.random.choice([1,0], p=[0.1, 0.9], size=5000)}).T

I also want the column names to be "Cell" followed by a number from 1 to 5000:

	Cell1	Cell2	Cell3	Cell5000
G	0	0	1	0

pandas

Solution 1:^[1]

The columns will default to a RangeIndex from 0-4999. You can add 1 to the column values, and then use DataFrame.add_prefix to add the string "Cell" before all of the column names.

df.columns += 1
df = df.add_prefix("Cell")

print(df)
   Cell1  Cell2  Cell3 ...   Cell5000
G      0      0      0 ...          0

As a one-liner, you can add 1 and prefix with "Cell" by converting the column index to strings first.

df.columns = "Cell" + (df.columns + 1).astype(str)
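The one-liner above can be checked on a toy frame. This is a minimal sketch, assuming a hypothetical three-column frame named small (not part of the original answer) so that the resulting labels are easy to inspect:

```python
import pandas as pd

# Toy frame with three columns; the column labels default to a RangeIndex (0, 1, 2).
small = pd.DataFrame([[0, 1, 0]], index=["G"])

# Shift the labels to start at 1, cast them to strings, then prepend the prefix.
small.columns = "Cell" + (small.columns + 1).astype(str)

print(list(small.columns))  # ['Cell1', 'Cell2', 'Cell3']
```

The cast to str comes before the concatenation because the prefix is joined to the labels as text rather than added numerically.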

To make a single-row DataFrame, I would construct the data with numpy in the correct shape instead of transposing a DataFrame. You can also pass in the columns as you want them numbered and the index label you want.

import numpy as np
import pandas as pd

size = 5000  # number of columns

df = pd.DataFrame(
    np.random.choice([1,0], p=[.1, .9], size=(1, size)),
    columns=np.arange(1, size+1),
    index=["G"]
).add_prefix("Cell")

print(df)
   Cell1  Cell2  Cell3 ... Cell4999  Cell5000
G      0      0      0 ...        0         0
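Because the values are drawn at random, the share of 1s in any single draw is only approximately 10%. The following is a rough sketch of how one might make the draw repeatable and check that share. It assumes NumPy's Generator API (np.random.default_rng) in place of np.random.choice, and the seed 42 and the name fraction_of_ones are arbitrary choices for illustration, not part of the original answer:

```python
import numpy as np
import pandas as pd

size = 5000  # number of columns, as in the question

# Seeded generator so the draw is repeatable; the seed value is an arbitrary assumption.
rng = np.random.default_rng(42)

df = pd.DataFrame(
    rng.choice([1, 0], p=[0.1, 0.9], size=(1, size)),
    columns=np.arange(1, size + 1),
    index=["G"],
).add_prefix("Cell")

# Fraction of 1s in row G; expected to be close to 0.1, not exactly 0.1.
fraction_of_ones = df.loc["G"].mean()
print(df.shape, fraction_of_ones)
```

If exactly 10% of the cells must be 1, random sampling with probabilities will not guarantee it; shuffling a fixed array of 500 ones and 4500 zeros would be one alternative.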

Solution 2:^[2]

Another method could be:

import numpy as np
import pandas as pd

size = 5000  # number of columns

pd.DataFrame.from_dict(
    {"G": np.random.choice([1,0], p=[0.1, 0.9], size=size)},
    columns=(f'Cell{x}' for x in range(1, size+1)),
    orient='index'
)

Output:

   Cell1  Cell2  Cell3  Cell4  Cell5  Cell6  Cell7  Cell8  Cell9  ...  Cell4992  Cell4993  Cell4994  Cell4995  Cell4996  Cell4997  Cell4998  Cell4999  Cell5000
G      0      0      0      0      0      1      0      1      0  ...         0         0         0         0         0         0         0         0         0

[1 rows x 5000 columns]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1
Solution 2	BeRT2me
