How to plot data in a pandas DataFrame as a histogram?

I have a dataset containing various user fields, such as dates and like counts. I am trying to plot a histogram that shows the like count by date. How should I do that?

The dataset: [image: screenshot of the dataset]



Solution 1:[1]

Assuming you want to plot the number of public likes by date, you could do something like this:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('analysis.csv')

# convert text column to date time and keep only the date part  
df['created_at'] = pd.to_datetime(df['created_at'])
df['created_at'] = df['created_at'].dt.date

# group by date, taking the sum of public_metrics.like_count
# (the double brackets keep the result a DataFrame indexed by created_at)
df1 = df.groupby('created_at')[['public_metrics.like_count']].sum()

# plot and show
df1.plot()
plt.show()

And this is the output you will get: [image: Plotting likes by date]
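If a bar chart is closer to what you have in mind than a line, the same grouped result can be drawn with kind="bar". The following is only a minimal sketch: the column names are taken from the answer above, but the four sample rows are made up so the snippet runs without the CSV file.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up sample rows standing in for the real CSV (assumption: same column names)
df = pd.DataFrame({
    "created_at": [
        "2018-01-01T10:00:00.000Z",
        "2018-01-01T18:30:00.000Z",
        "2018-01-02T09:15:00.000Z",
        "2018-01-03T12:00:00.000Z",
    ],
    "public_metrics.like_count": [3, 5, 2, 7],
})

# Parse the timestamps and keep only the calendar date
df["created_at"] = pd.to_datetime(df["created_at"]).dt.date

# Total likes per day
daily_likes = df.groupby("created_at")["public_metrics.like_count"].sum()

# One bar per day instead of a line
ax = daily_likes.plot(kind="bar", color="purple")
ax.set_xlabel("Date")
ax.set_ylabel("Likes")
plt.tight_layout()
# plt.show()  # uncomment to display the figure interactively
```

With real data you would replace the sample DataFrame with pd.read_csv(...) as in the answer above.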

Solution 2:[2]

To add to the first answer: you could visualize the like counts for a single month as a bar plot. That may give you a plot that is closer to the histogram you have in mind. For example, here it is for January 2018:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Read and clean data
df = pd.read_csv('tweets_data.txt')
# regex=False makes the "." a literal dot rather than a regex wildcard
df['created_at'] = df['created_at'].str.replace(".000Z", "", regex=False)

# Create a new dataframe with only two columns: date and number of likes
histogram_data = df[['created_at', 'public_metrics.like_count']]
# Keep only the rows whose timestamp starts with 2018-01 (January 2018)
January_values = histogram_data[histogram_data['created_at'].astype(str).str.startswith('2018-01')]
print(January_values.shape)


dictionary = {}
for date, n_likes in January_values.itertuples(index=False):
    dictionary[date] = n_likes
print(dictionary)


# Create figure and plot space
fig, ax = plt.subplots(figsize=(12, 12))

# Draw one bar per timestamp: x values are the dates, heights are the like counts
ax.bar(list(dictionary.keys()),
       list(dictionary.values()),
       color='purple')

# Set title and labels for axes
ax.set_xlabel('Date', fontsize = 20)
ax.set_ylabel('Counts', fontsize = 20)
ax.set_title('Tweets likes counts in January 2018', fontsize = 15, weight = "bold")

# Ensure a major tick for each week using (interval=1) 
ax.xaxis.set_major_locator(mdates.WeekdayLocator(interval=1))
ax.tick_params(axis='x', which='major', labelsize=15, width=2)
plt.setp( ax.xaxis.get_majorticklabels(), rotation=-45, ha="left", weight="bold")

plt.show()

The output is: [image: bar chart titled "Tweets likes counts in January 2018"]

Of course, if you plot all of your data (more than 3000 dates), the bars will become very thin.
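One possible way around very thin bars is to aggregate the likes into coarser bins before plotting, for example one bar per month. This is only a sketch under assumptions: the column names match the ones used above, and the data is made up (one row per day in 2018, each with a single like) so that it runs without the original file.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up data: one row per day of 2018, one like each (assumption, not the real dataset)
dates = pd.date_range("2018-01-01", "2018-12-31", freq="D")
df = pd.DataFrame({"created_at": dates, "public_metrics.like_count": 1})

# Sum the likes per calendar month ("MS" = bins labelled by month start)
monthly_likes = (
    df.set_index("created_at")["public_metrics.like_count"]
      .resample("MS")
      .sum()
)

# Use "YYYY-MM" strings as labels so each month gets one evenly spaced bar
labels = monthly_likes.index.strftime("%Y-%m").tolist()

fig, ax = plt.subplots(figsize=(12, 6))
ax.bar(labels, monthly_likes.values, color="purple")
ax.set_xlabel("Month")
ax.set_ylabel("Likes")
plt.setp(ax.get_xticklabels(), rotation=-45, ha="left")
# plt.show()  # uncomment to display the figure interactively
```

Swapping "MS" for "W" would give weekly bins instead. With real data, created_at has to be converted with pd.to_datetime first, since resample needs a datetime index.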

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Pankaj Saini
Solution 2