How to plot data in a pandas DataFrame as a histogram?
Solution 1:[1]
Assuming you want to plot the number of public likes by date, you could do something like this:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.read_csv('analysis.csv')
# convert text column to date time and keep only the date part
df['created_at'] = pd.to_datetime(df['created_at'])
df['created_at'] = df['created_at'].dt.date
# group by date taking the sum of public_metrics.like_count
df1 = df.groupby(['created_at'])['public_metrics.like_count'].sum().reset_index()
df1 = df1.set_index('created_at')
# plot and show
df1.plot()
plt.show()
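To see what the grouping step produces without the CSV file, here is a minimal sketch on a few made-up rows. The sample timestamps and like counts are assumptions for illustration only; the column names are the ones used above.

```python
import pandas as pd

# Hypothetical stand-in for analysis.csv; column names follow the answer above
df = pd.DataFrame({
    "created_at": [
        "2018-01-01T08:00:00.000Z",
        "2018-01-01T17:30:00.000Z",
        "2018-01-02T09:15:00.000Z",
    ],
    "public_metrics.like_count": [3, 5, 2],
})

# Parse the timestamps and keep only the calendar date
df["created_at"] = pd.to_datetime(df["created_at"]).dt.date

# Sum the likes per date; the result is a Series indexed by date
likes_per_day = df.groupby("created_at")["public_metrics.like_count"].sum()
print(likes_per_day)
```

Calling likes_per_day.plot(kind="bar") instead of the default line plot should give one bar per date, which may be closer to what the question asks for.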
Solution 2:[2]
To add to the first answer: you could show the like counts for a single month as a bar plot, which may be closer to the histogram you have in mind. For example, for January 2018:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# Read and clean data
df = pd.read_csv('tweets_data.txt')
# Strip the literal ".000Z" suffix (regex=False so "." is not treated as a wildcard)
df['created_at'] = df['created_at'].str.replace(".000Z", "", regex=False)
# Create a new dataframe with only two columns: date and number of likes
histogram_data = df[['created_at', 'public_metrics.like_count']]
# Keep only the rows whose date starts with 2018-01
January_values = histogram_data[histogram_data['created_at'].astype(str).str.startswith('2018-01')]
print(January_values.shape)
# Map each date to its like count
dictionary = {}
for date, n_likes in January_values.itertuples(index=False):
    dictionary[date] = n_likes
print(dictionary)
# Create figure and plot space
fig, ax = plt.subplots(figsize=(12, 12))
# Draw one bar per date
ax.bar(list(dictionary.keys()),
       list(dictionary.values()),
       color='purple')
# Set title and labels for axes
ax.set_xlabel('Date', fontsize = 20)
ax.set_ylabel('Counts', fontsize = 20)
ax.set_title('Tweet like counts in January 2018', fontsize = 15, weight = "bold")
# Place a major tick once per week (interval=1)
ax.xaxis.set_major_locator(mdates.WeekdayLocator(interval=1))
ax.tick_params(axis='x', which='major', labelsize=15, width=2)
plt.setp( ax.xaxis.get_majorticklabels(), rotation=-45, ha="left", weight="bold")
plt.show()
Note that if you plot all of your data (more than 3000 dates), the bars will be very thin.
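One possible way around very thin bars is to aggregate to a coarser period before plotting. The sketch below groups daily values by calendar month; the synthetic series and the name like_count are assumptions for illustration, not the original tweet data.

```python
import numpy as np
import pandas as pd

# Hypothetical daily like counts for 2018, standing in for the real tweet data
dates = pd.date_range("2018-01-01", "2018-12-31", freq="D")
likes = pd.Series(np.arange(len(dates)) % 7, index=dates, name="like_count")

# Aggregate to one value per calendar month: 12 bars instead of 365
monthly_likes = likes.groupby(likes.index.to_period("M")).sum()
print(monthly_likes)
```

From there, monthly_likes.plot(kind="bar") should draw one bar per month. Grouping by week instead would give a finer view at the cost of more bars.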
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Pankaj Saini |
| Solution 2 | |



