'How can I plot a pandas dataframe where x = month and y = frequency of text?

I have the following dataset:

Date	ID	Fruit
2021-2-2	1	Apple
2021-2-2	1	Pear
2021-2-2	1	Apple
2021-2-2	2	Pear
2021-2-2	2	Pear
2021-2-2	2	Apple
2021-3-2	3	Apple
2021-3-2	3	Apple

I have removed duplicate "Fruit" based on ID (There can only be 1 apple per ID number but multiple apples per a single month). And now I would like to generate multiple scatter/line plots (one per "Fruit" type) with the x-axis as month (i.e. Jan. 2021, Feb. 2021, Mar. 2021, etc) and the y-axis as frequency or counts of "Fruit" that occur in that month.

If I could generate new columns in a new sheet in Excel that I could then plot as x and y that would be great too. Something like this for Apples specifically:

Month	Number of Apples
Jan 2021	0
Feb 2021	2
Mar 2021	1

I've tried the following which let me remove duplicates but I can't figure out how to count the number of Apples in the Fruit column that occur within a given timeframe (month is what I'm looking for now) and set that to the y-axis.

import numpy as np
import pandas as pd
import re
import datetime as dt
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_excel('FruitExample.xlsx',
                    usecols=("A:E"), sheet_name=('Data'))


df_Example = df.drop_duplicates(subset=["ID Number", "Fruit"], keep="first")

df_Example.plot(x="Date", y=count("Fruit"), style="o")
plt.show()

I've tried to use groupby and categorical but can't seem to count this up properly and plot it. Here is an example of a plot that would be great.

[ Example Plot ]

Solution 1:^[1]

Make sure the dates are in datetime format

df['Date']=pd.to_datetime(df['Date'])

Then create a column for month-year,

df['Month-Year']=df['Date'].dt.to_period('M') #M for month
new_df=pd.DataFrame(df.groupby(['Month-Year','Fruit'])['ID'].count())
new_df.reset_index(inplace=True)

Make sure to change back to datetime as seaborn can't handle 'period' type

new_df['Month-Year']=new_df['Month-Year'].apply(lambda x: x.to_timestamp())

Then plot,

import seaborn as sns
sns.lineplot(x='Month-Year',y='ID',data=new_df,hue='Fruit')

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Reza Rahemtola

'How can I plot a pandas dataframe where x = month and y = frequency of text?

Solution 1:[1]

Sources

Related Questions

Solution 1:^[1]