How to choose columns when calculating mean
Hi, I'm a student learning Python.
What's the difference between
df.1.mean()
df[1].mean()
?
The full code is
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(10, 4))
df[1].mean()
I'm confused because I used the first method to choose a column in a different data frame before.
Solution 1:[1]
If the column name is a string such as "one", then df.one works, because pandas exposes the column as the attribute "one" of df. Attribute syntax does not work for pure integers (numbers), however: df.1 is not valid Python, so such columns can only be selected with square brackets, as in df[1], where they are handled correctly.
import pandas as pd
df = pd.DataFrame({1: [2, 3], 'one': [3, 5]})
df.one  # works
df[1]  # works
# df.1  # SyntaxError
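As a minimal sketch of the rule above: bracket access works for any column label, while attribute access only works when the label is a valid Python identifier and does not clash with an existing DataFrame attribute. The extra column named "mean" is a hypothetical addition for illustration; it is not part of the original example.

```python
import pandas as pd

# Columns 1 and "one" mirror the example above; "mean" is a hypothetical
# extra column whose name clashes with the DataFrame.mean method.
df = pd.DataFrame({1: [2, 3], "one": [3, 5], "mean": [7, 9]})

# Bracket access works for any column label.
mean_int_col = df[1].mean()
mean_str_col = df["one"].mean()

# Attribute access needs a label that is a valid Python identifier.
one_is_identifier = "one".isidentifier()
digit_is_identifier = "1".isidentifier()
same_series = df.one.equals(df["one"])

# A label that clashes with a method gives back the method, not the column,
# so brackets are still needed to reach the column itself.
attr_is_method = callable(df.mean)
clash_col_mean = df["mean"].mean()
```

Because of that last case, square brackets are the safer default even for string column names.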
Solution 2:[2]
For columns whose names are numbers, calling df.1 raises a SyntaxError.
Attribute access does work, however, for column names that are strings.
# create a new column with a string name (one random value per row)
df['new_col'] = np.random.randn(len(df))
# get the mean via attribute access
df.new_col.mean()
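Applied to the DataFrame from the question, whose default column labels are the integers 0 to 3, the options might look like the sketch below. The "col" prefix used when renaming is an arbitrary choice for illustration, not something from the question.

```python
import numpy as np
import pandas as pd

# Same shape as in the question; default column labels are the integers 0..3.
df = pd.DataFrame(np.random.randn(10, 4))

# Mean of the single column labelled 1.
col_mean = df[1].mean()

# Means of several columns at once: pass a list of labels.
multi_mean = df[[1, 2]].mean()

# If attribute access is preferred, rename the columns to strings first.
# The "col" prefix is a hypothetical naming choice.
renamed = df.rename(columns=lambda c: f"col{c}")
attr_mean = renamed.col1.mean()
```

Here col_mean and attr_mean are the same number, since renaming only changes the labels and not the data.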
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Ziur Olpa |
| Solution 2 | greco |
