How to choose columns when calculating the mean

Hi, I'm a student learning Python.

What's the difference between

df.1.mean()
df[1].mean()

?

The full code is:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(10, 4))
df[1].mean()

I'm confused because I used the first method to choose a column in a different data frame before.



Solution 1:[1]

If the column name is a string that is a valid Python identifier, such as "one", then attribute access works: df.one looks up the column "one" as an attribute of df. Attribute syntax does not work for integer column names, because df.1 is not valid Python syntax. Such columns can only be selected with square brackets, as in df[1], which handles them correctly.

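df = pd.DataFrame({1: [2, 3], 'one': [3, 5]})
df.one    # works: 'one' is a string and a valid identifier
df[1]     # works: square brackets accept any column label
# df.1    # SyntaxError: an attribute name cannot start with a digit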
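
To make the rule above concrete, here is a minimal sketch comparing the two access styles. The column labels ("two words" and "mean") are made up for illustration and are not from the question. Besides integer labels, attribute access also cannot reach labels that contain spaces, and a label that matches an existing DataFrame method name still resolves to the method rather than the column.

```python
import pandas as pd

# Hypothetical frame: an integer label, a plain string label,
# a label containing a space, and a label that shadows a DataFrame method.
df = pd.DataFrame({1: [2, 3], "one": [3, 5], "two words": [1, 1], "mean": [7, 9]})

int_col_mean = df[1].mean()              # brackets work for the integer label
attr_col_mean = df.one.mean()            # attribute access works for "one"
same_column = df.one.equals(df["one"])   # both forms return the same Series

spaced_col_mean = df["two words"].mean() # only brackets can reach this label

# df.mean is still the DataFrame method here, not the "mean" column,
# so the column has to be selected with brackets.
mean_is_method = callable(df.mean)
mean_col_mean = df["mean"].mean()
```

Square brackets work in every one of these cases, so they are the safer default when column names are not known in advance.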

Solution 2:[2]

For columns whose names are numbers, calling df.1 results in a SyntaxError.

You can, however, use attribute access for column names that are strings.

# create a new column with a string name (the single random value is repeated in every row)
df['new_col'] = np.random.randn()
# get the mean of that column via attribute access
df.new_col.mean()
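
This probably also explains why attribute access worked on the other data frame mentioned in the question: its columns presumably had string names. The sketch below assumes the frame from the question, where pandas assigns the default integer labels 0 to 3, and renames the columns to hypothetical string names ("a" to "d", chosen only for illustration) so that both access styles can be compared.

```python
import numpy as np
import pandas as pd

# Without explicit column names, the labels default to the integers 0..3.
df = pd.DataFrame(np.random.randn(10, 4))
column_labels = list(df.columns)

# Integer labels can only be selected with square brackets.
bracket_mean = df[1].mean()

# Hypothetical string names; after renaming, attribute access becomes possible.
named = df.rename(columns={0: "a", 1: "b", 2: "c", 3: "d"})
attr_mean = named.b.mean()
```

Both expressions compute the mean of the same underlying column; only the way the column is selected differs.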

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Ziur Olpa
Solution 2 greco