How to remove the additional index when using .mean(), .median(), .mode() in Python on a pandas DataFrame
I am calculating the mode/median/mean of pandas df columns using .mean(), .median(), .mode() but when doing so an index appears in some of the results:
def largeStats(dataframe):
    dataframe.drop(dataframe.index[dataframe['large_airport'] != 'Y'], inplace=True)
    mean = dataframe['frequency_mhz'].mean()
    mode = dataframe['frequency_mhz'].mode()
    median = dataframe['frequency_mhz'].median()
    print("The mean freq of large airports is", mean)
    print("The most common freq of large airports is", mode)
    print("The middle freq of large airports is", median)
print(largeStats(df))
returns:
The mean freq of large airports is 120.00752293577986
The most common freq of large airports is 0    121.75
1    122.10
dtype: float64
The middle freq of large airports is 121.85
None
I want it to simply return the number for each:
The mean freq of large airports is 120.00752293577986
The most common freq of large airports is 121.75 & 122.10
The middle freq of large airports is 121.85
I know the indexing is in place due to 2 mode values but how would I remove that indexing?
Solution 1:[1]
This would fix it:
mode = dataframe['frequency_mhz'].mode().values[0]
The mode() function returns a pandas Series, so .values[0] picks out the first item in that Series. If there are several modes, this returns only the first one.
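As a minimal sketch of the difference, assuming a small made-up Series in place of the asker's 'frequency_mhz' column (the values below are hypothetical sample data), mode() yields a Series while .values[0] yields a bare number:

```python
import pandas as pd

# Hypothetical sample data; the real frequencies come from the asker's dataframe.
freqs = pd.Series([121.75, 122.10, 121.75, 122.10, 118.30], name="frequency_mhz")

modes = freqs.mode()          # a Series with one entry per mode, sorted ascending
first_mode = modes.values[0]  # a single float, no index attached

print(type(modes).__name__)   # Series
print(first_mode)             # 121.75
```

With two modes in the data, the second one (122.10) is dropped here, which may or may not be what you want.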
Solution 2:[2]
You can turn a pandas Series into a NumPy array using the .values property:
mode = dataframe['frequency_mhz'].mode().values
This returns all the modes as an array, without the index or the dtype line.
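To get the exact "121.75 & 122.10" wording from the question, the array can be joined into a string. This is only a sketch on hypothetical sample data; the two-decimal format and the " & " separator are assumptions taken from the desired output shown in the question:

```python
import pandas as pd

# Hypothetical sample data standing in for dataframe['frequency_mhz'].
freqs = pd.Series([121.75, 122.10, 121.75, 122.10, 118.30], name="frequency_mhz")

mode_values = freqs.mode().values  # NumPy array holding every mode

# Format each mode with two decimals and join them with " & ".
mode_text = " & ".join(f"{v:.2f}" for v in mode_values)

print("The most common freq of large airports is", mode_text)
```

This works the same way whether there is one mode or several, since joining a single value just returns that value.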
Solution 3:[3]
Because Series.mode can return one or more values, you need to select the first value to get a scalar. From the documentation:
The mode is the value that appears most often. There can be multiple modes.
print("The most common freq of large airports is", mode.iat[0])
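A possible way to use .iat[0] inside the function is sketched below. The name large_stats and the sample dataframe are hypothetical, and returning the values instead of printing them is a design choice of this sketch, not part of the answer; it also avoids the trailing None that print(largeStats(df)) produces in the question.

```python
import pandas as pd

def large_stats(dataframe):
    # Select the rows for large airports into a new Series instead of
    # dropping rows in place, so the caller's dataframe is left untouched.
    large = dataframe.loc[dataframe["large_airport"] == "Y", "frequency_mhz"]
    # .iat[0] takes the first mode as a scalar.
    return large.mean(), large.mode().iat[0], large.median()

# Hypothetical sample data standing in for the asker's dataframe.
df = pd.DataFrame({
    "large_airport": ["Y", "Y", "Y", "N"],
    "frequency_mhz": [121.75, 121.75, 122.10, 118.30],
})

mean, first_mode, median = large_stats(df)
print("The most common freq of large airports is", first_mode)
print("The middle freq of large airports is", median)
```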
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Zero |
| Solution 2 | LukasNeugebauer |
| Solution 3 | jezrael |
