'Getting alternating results with pandas melt

I was trying to convert the first image in this album into the second with pandas but all I got was the third one...

  1. Original
Year    Jan     Feb     Mar     Apr     May     Jun     Jul     Aug     Sep     Oct     Nov     Dec
1981    453.1   126.3   5.8     47.1    25.3    16.8    0       1.1     4.4     17.8    52.5    72.4
1982    211.4   23.1    231.2   0.8     0.2     0       0       0       15.3    0.9     8.6     59.9
1983    45.2    22.1    537.7   22.8    29.9    0       0       0.1     0.7     1.2     47      20.9
1984    390.2   514.2   140.3   7.3     0       0       2.8     0.1     0       18.3    23.2    91.7
  1. Wanted result:
Year    Month   Value
1981    Jan     453.1
1981    Feb     126.3
1981    Mar     5.8 
1981    Apr     47.1
...
  1. What I have:
Year    Month   Value
1981    Jan     453.1
1982    Jan     211.4
1983    Jan     45.2
1984    Jan     390.2
...

My code is just this:

data = pd.read_csv("Burketown.csv")
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
fixed_data = pd.melt(data, id_vars=['Year'], value_vars=months)
print(fixed_data)

How can I alternate the months for each year instead of what I got? I want to have the first year with every month, then the second year with every month... Not the first month with every year.



Solution 1:[1]

Use ordered catagorical for columns names, so possible DataFrame.sort_values by both columns for correct ordering:

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
data = pd.read_csv("Burketown.csv", index_col=['Year'])[months]

data.columns = pd.CategoricalIndex(data.columns, ordered=True, categories=months)
df = data.reset_index()[months]
fixed_data = (pd.melt(data, id_vars=['Year'], value_vars=months)
                .sort_values(['Year', 'variable']))

Or convert Series to ordered categorical:

data = pd.read_csv("Burketown.csv")
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
fixed_data = pd.melt(data, id_vars=['Year'], value_vars=months)
fixed_data['variable'] = (pd.CategoricalIndex(fixed_data['variable'],
                                              ordered=True, 
                                              categories=months))

fixed_data = fixed_data.sort_values(['Year', 'variable'])

Solution 2:[2]

One option where you do not need to convert to categorical dtype, and maintain the order is with pivot_longer from pyjanitor

# pip install pyjanitor
import pandas as pd
import janitor

result = df.pivot_longer('Year', sort_by_appearance=True)

result.head(10)

   Year variable  value
0  1981      Jan  453.1
1  1981      Feb  126.3
2  1981      Mar    5.8
3  1981      Apr   47.1
4  1981      May   25.3
5  1981      Jun   16.8
6  1981      Jul    0.0
7  1981      Aug    1.1
8  1981      Sep    4.4
9  1981      Oct   17.8

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 sammywemmy