Bug when indexing date column in Pandas

I'm trying to make pandas recognise the first column as a date.

import pandas as pd
import plotly.express as px
cl = pd.read_csv('CL.csv', parse_dates=['Date'], index_col=['Date'])
cl.info()

Then to visualise the price:

fig = px.line(cl, y="Adj Close", title='Crude Oil Price', labels = {'Adj Close':'Crude Oil Price(in USD)'})

But it gives back a ruined chart:

[Image: Date indexed chart]

If I remove parse_dates=['Date'], index_col=['Date'] and leave just cl = pd.read_csv('CL.csv'), the chart looks fine.

[Image: Chart without date]

What am I doing wrong here?



Solution 1:[1]

If you print cl and the dates look fine, the likely cause is that cl isn't sorted by Date. A line chart connects the points in row order, so unsorted dates make the line jump back and forth. Sort it before visualising:

cl = cl.sort_values('Date')
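
To check whether ordering really is the cause, it may help to test the index before plotting. The sketch below is only an illustration: the inline CSV is made-up data standing in for CL.csv, and it sorts with sort_index() on the assumption that 'Date' has already been made the index, as in the question.

```python
import io
import pandas as pd

# Hypothetical sample standing in for CL.csv; rows are deliberately out of order.
csv_text = """Date,Adj Close
2020-01-03,63.05
2020-01-01,61.06
2020-01-02,61.18
"""

cl = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'], index_col='Date')

# A line chart connects points in row order, so an unsorted index draws a zigzag.
was_sorted = cl.index.is_monotonic_increasing

# 'Date' is the index here, so sort on the index itself.
cl_sorted = cl.sort_index()
```

If was_sorted comes back False on the real file, passing cl_sorted to px.line instead of cl should give a clean line.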

Solution 2:[2]

I think the problem may be caused by the date format in the 'Date' column. The pandas documentation says:

"For non-standard datetime parsing, use pd.to_datetime after pd.read_csv. To parse an index or column with a mixture of timezones, specify date_parser to be a partially-applied pandas.to_datetime() with utc=True. See Parsing a CSV with mixed timezones for more."

So you could replace

cl = pd.read_csv('CL.csv', parse_dates=['Date'], index_col=['Date'])

with

cl = pd.read_csv('CL.csv', parse_dates=['Date'], index_col=['Date'], date_parser=lambda col: pd.to_datetime(col, utc=True))

Note that date_parser is deprecated as of pandas 2.0, so on recent versions calling pd.to_datetime after pd.read_csv is the safer route.
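
The "use pd.to_datetime after pd.read_csv" route from the quoted documentation could look roughly like this. It is a sketch under assumptions: the inline CSV and the day-first %d/%m/%Y format are invented for illustration, since the actual format in CL.csv isn't shown in the question.

```python
import io
import pandas as pd

# Hypothetical sample standing in for CL.csv, with day/month/year dates.
csv_text = """Date,Adj Close
03/01/2020,63.05
01/01/2020,61.06
02/01/2020,61.18
"""

# Read the dates as plain strings first, without parse_dates.
cl = pd.read_csv(io.StringIO(csv_text))

# Parse with an explicit format (assumed here) so pandas doesn't have to guess.
cl['Date'] = pd.to_datetime(cl['Date'], format='%d/%m/%Y')

# Make the parsed dates the index and put them in chronological order.
cl = cl.set_index('Date').sort_index()

is_datetime = pd.api.types.is_datetime64_any_dtype(cl.index)
```

Giving the format explicitly avoids ambiguous strings such as 03/01/2020 being read month-first, and a string that doesn't match the format raises an error instead of being silently misparsed.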

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Raymond Kwok
Solution 2 ellhe-blaster