How to organize a data frame with several variables in Python?
When I build a data frame with one variable, it works well.
import numpy as np
import pandas as pd
a = np.random.normal(45, 9, 10000)
source = {"Genotype": ["CV1"]*10000, "AGW": a}
df = pd.DataFrame(source)
df
However, when I add more variables, it does not work.
import numpy as np
import pandas as pd
a = np.random.normal(45, 9, 10000)
b = np.random.normal(35, 10, 10000)
source = {"Genotype": ["CV1"]*10000 + ["CV2"]*10000,
          "AGW": a + b}
df = pd.DataFrame(source)
df
It raises "ValueError: All arrays must be of the same length".
I think the AGW column computes the actual sum a + b element by element, which gives 10,000 values, instead of stacking the two arrays vertically. I want a data frame with two columns and 20,000 rows.
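The length mismatch can be checked directly. The sketch below is only an illustration (the names summed and stacked are not from the original code): it compares the length of a + b with the length of the two arrays joined end to end.

```python
import numpy as np

a = np.random.normal(45, 9, 10000)
b = np.random.normal(35, 10, 10000)

summed = a + b                     # NumPy adds arrays element by element
stacked = np.concatenate([a, b])   # a followed by b, end to end
labels = ["CV1"] * 10000 + ["CV2"] * 10000  # for lists, + joins them

print(len(summed))   # 10000
print(len(stacked))  # 20000
print(len(labels))   # 20000
```

So the "Genotype" list has 20,000 entries while a + b has only 10,000, which is why pandas reports arrays of different lengths.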
Could you let me know how to do it?
Thanks!!
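One possible way to get the 20,000-row layout, assuming the goal is simply to place the CV2 values below the CV1 values, is to join the arrays with np.concatenate instead of adding them. This is a minimal sketch, not the only approach; the variable n is introduced here for illustration.

```python
import numpy as np
import pandas as pd

n = 10000  # rows per genotype (illustrative helper variable)
a = np.random.normal(45, 9, n)
b = np.random.normal(35, 10, n)

source = {
    "Genotype": ["CV1"] * n + ["CV2"] * n,
    # concatenate stacks a and b end to end, giving 2 * n values
    "AGW": np.concatenate([a, b]),
}
df = pd.DataFrame(source)
print(df.shape)  # (20000, 2)
```

The first 10,000 rows then hold the CV1 values and the last 10,000 hold the CV2 values, matching the order of the "Genotype" labels.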
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
