matplotlib multiple lines for multiple years on same chart

I am analyzing consumption data and want to build one chart per category (e.g. gas) that looks like this:

  • On the y axis is the consumption in kWh
  • On the x axis are the months (Jan-Dec)
  • Each line represents a year

The data that I have looks like this (it is quite fragmented: for some years I have a lot of data points, for others only one or two):

Datum Art Amount
06.03.2022 Wasser 1195
06.03.2022 Strom 8056
06.03.2022 Gas 27079,019
09.02.2022 Wasser 1187
09.02.2022 Strom 7641
09.02.2022 Gas 26845,138
10.01.2022 Strom 6897
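
The sample rows use German conventions: day-first dates separated by dots and a decimal comma. A minimal sketch of parsing them, assuming the data were available as whitespace-separated text rather than the Excel file (the `raw` string and the name `sample` are only stand-ins built from the rows shown):

```python
import io
import pandas as pd

# Stand-in for the real file: the sample rows from the question.
raw = """Datum Art Amount
06.03.2022 Wasser 1195
06.03.2022 Strom 8056
06.03.2022 Gas 27079,019
09.02.2022 Wasser 1187
09.02.2022 Strom 7641
09.02.2022 Gas 26845,138
10.01.2022 Strom 6897
"""

# decimal="," makes pandas read "27079,019" as the float 27079.019.
sample = pd.read_csv(io.StringIO(raw), sep=r"\s+", decimal=",")

# The dates are day.month.year, so the format string has to say so.
sample["Datum"] = pd.to_datetime(sample["Datum"], format="%d.%m.%Y")

print(sample.dtypes)
```

When reading from Excel the cells may already arrive as numbers and datetimes, in which case neither step is needed.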

You can download from here

My code looks like this:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from pathlib import Path

file_path_out_2021_2022 = "../data/raw_Consumption_data.xlsx"
df = pd.read_excel(file_path_out_2021_2022)

Then some calculations:

#rename columns
df = df.rename(columns={'Datum':'DATE', 'Art':'CATEGORY', 'Amount':'AMOUNT'})
#convert date (day.month.year, as in the sample data)
df['DATE'] = pd.to_datetime(df['DATE'], format="%d.%m.%Y")

df['MONTH'] =  df['DATE'].dt.month
df['YEAR'] =  df['DATE'].dt.year

df_gas = df[(df['CATEGORY'] == "Gas")]
df_gas = df_gas.fillna(0)
df_gas['PREV_DATE'] = df_gas['DATE'].shift(-1)
df_gas['PREV_AMOUNT'] = df_gas['AMOUNT'].shift(-1)

df_gas['DIFF_AMOUNT'] = df_gas['AMOUNT'] - df_gas['PREV_AMOUNT'] 
df_gas['DIFF_DATE'] = df_gas['DATE'] - df_gas['PREV_DATE']

df_gas['DIFF_DAY'] = df_gas['DIFF_DATE'].dt.days

df_gas['CONSUM_PER_DAY'] = df_gas['DIFF_AMOUNT'] / df_gas['DIFF_DAY']

df_gas['CONSUM_PER_DAY_KWH'] = df_gas['CONSUM_PER_DAY'] * 10.3
df_gas['CONSUM_PER_MONTH_KWH'] = df_gas['CONSUM_PER_DAY_KWH'] * 30
df_gas['CONSUM_PER_YEAR_KWH'] = df_gas['CONSUM_PER_MONTH_KWH'] * 12
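
The shift-based calculation above gives the right sign only if the rows are ordered newest first. A sketch of the same idea that sorts explicitly and uses `diff()`, using just the two Gas readings from the sample data; the name `gas` is illustrative, and the factor 10.3 is taken from the code above:

```python
import pandas as pd

# The two Gas readings from the sample data shown earlier.
gas = pd.DataFrame({
    "DATE": pd.to_datetime(["2022-03-06", "2022-02-09"]),
    "AMOUNT": [27079.019, 26845.138],
})

# Sort oldest first so diff() means "this reading minus the previous one".
gas = gas.sort_values("DATE").reset_index(drop=True)
gas["DIFF_AMOUNT"] = gas["AMOUNT"].diff()
gas["DIFF_DAY"] = gas["DATE"].diff().dt.days

# Average use per day over each interval, converted with the factor 10.3.
gas["CONSUM_PER_DAY_KWH"] = gas["DIFF_AMOUNT"] / gas["DIFF_DAY"] * 10.3

print(gas)
```

The first row has no predecessor, so its difference columns are NaN rather than 0.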

And then the chart:

import pandas as pd
from datetime import datetime, timedelta
from matplotlib import pyplot as plt
from matplotlib import dates as mpl_dates

# Note: on matplotlib 3.6+ this style is named 'seaborn-v0_8'
plt.style.use('seaborn')

#drop all rows which have 0 in all columns
df_gas = df_gas.loc[(df_gas!=0).any(axis=1)]

df_gas.sort_values(by='DATE', ascending=False, inplace=True)

#print(df_gas)

dates = df_gas['DATE']
y = df_gas['AMOUNT']

plt.plot_date(dates, y, linestyle='solid')

plt.tight_layout()

plt.show()
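
The code above draws a single line across all dates. One possible way to get the layout described at the top (months on the x axis, one line per year) is to pivot the data so that each year becomes its own column. This is only a sketch: `plot_lines_per_year` and `value_col` are hypothetical names, and averaging several readings within one month is an assumption about how duplicates should be handled.

```python
import pandas as pd
import matplotlib.pyplot as plt


def plot_lines_per_year(df, value_col="AMOUNT"):
    """Plot one line per year with the months Jan-Dec on the x axis.

    Assumes df has a datetime column named DATE. The function name and
    value_col are illustrative, not part of the question's code.
    """
    data = df.assign(YEAR=df["DATE"].dt.year, MONTH=df["DATE"].dt.month)
    # Rows become months, columns become years; several readings in the
    # same month are averaged (an assumption, see the text above).
    table = data.pivot_table(index="MONTH", columns="YEAR",
                             values=value_col, aggfunc="mean")
    # Keep all twelve months, even those without any data.
    table = table.reindex(range(1, 13))

    fig, ax = plt.subplots()
    for year in table.columns:
        # dropna() lets the line connect the months that do have data.
        series = table[year].dropna()
        ax.plot(series.index, series.values, marker="o", label=str(year))
    ax.set_xticks(range(1, 13))
    ax.set_xticklabels(["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"])
    ax.set_xlim(0.5, 12.5)
    ax.set_ylabel(value_col)
    ax.legend(title="Year")
    return ax
```

With the frame built above this could be called as `plot_lines_per_year(df_gas, "CONSUM_PER_MONTH_KWH")` followed by `plt.show()`. Years with only one or two readings then show up as isolated markers or short segments.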


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
