How to compress a dataframe to report temperature measurements on a yearly basis rather than a monthly basis

I want to report the temperature change column on a yearly basis instead of a monthly basis, by taking the yearly mean of the monthly values and then discarding the month columns. The rest of my datasets quote values on a yearly basis, so the monthly values are not useful here. I tried a couple of approaches but did not get the required result. Is there a way to compress the dataset so that it reports values per year? Thanks.



Solution 1:[1]

It sounds like you need to use groupby(), though how it's implemented depends on how the date data is stored.

df = df.groupby(by=['Area', 'Year', 'Temperature Unit', 'Temperature Flag']).mean(numeric_only=True)

Because the month code and area code are stored as integers, groupby() averages them like any other numeric column instead of removing them automatically.

They can be removed with a simple drop() statement:

df = df.drop(columns=['Months Code', 'Area Code'])
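As a minimal end-to-end sketch of the same idea: the sample rows below are made up, and the 'Months' and 'Temperature Change' column names are assumptions about the dataframe in the question. Selecting the value column before calling mean() leaves only the grouping columns and the yearly mean, so no separate drop() is needed.

```python
import pandas as pd

# Made-up sample rows; the column names are assumed to match the question's dataframe.
df = pd.DataFrame({
    "Area Code": [2, 2, 2, 2],
    "Area": ["Afghanistan"] * 4,
    "Months Code": [7001, 7002, 7001, 7002],
    "Months": ["January", "February", "January", "February"],
    "Year": [1961, 1961, 1962, 1962],
    "Temperature Unit": ["°C"] * 4,
    "Temperature Change": [0.5, 1.5, -1.0, 2.0],
    "Temperature Flag": ["Fc"] * 4,
})

group_cols = ["Area", "Year", "Temperature Unit", "Temperature Flag"]

# Average only the temperature column within each (area, year) group.
# as_index=False keeps the grouping keys as ordinary columns.
yearly = df.groupby(group_cols, as_index=False)["Temperature Change"].mean()

print(yearly)
```

Each area and year now has a single row, and the month columns are gone because they were never selected.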

If you want to know more, check out the pandas groupby() docs.

Edited to reflect the actual dataframe columns

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
