Series not showing up on plots

I've been working through the code in this function and cannot get my series to show up on my plots. There may be an easier way to do this. In each plot I want to display each of the 7 entities as a time series for 1 indicator.

I'm struggling with how to group values by both year and country. I am new to Python and data science, so I appreciate any help.

Here is a link to the CSV data from the World Bank: https://datacatalog.worldbank.org/search/dataset/0037712

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns


%matplotlib inline
plt.style.use('fivethirtyeight')
plt.rcParams['figure.figsize'] = (14, 7)

raw = pd.read_csv('WDIData.csv')

countries = ['BIH', 'HRV', 'MKD', 'MNE', 'SRB', 'SVN', 'EUU']

colors = {
    'Bosnia and Herzegovina': "#66C2A5",
    'Croatia': "#FA8D62",
    'North Macedonia': "#F7BA20",
    'Montenegro': "#E68AC3",
    'Serbia': "#8D9FCA",
    'Slovenia': "#A6D853",
    'avg. EU': "#CCCCCC"
}

i = 0

df = raw[raw['Country Code'].isin(countries)].copy()
pre_1990 = [str(x) for x in range(1960, 1990)]
df.drop(pre_1990, axis=1, inplace=True)

df = df.rename(columns={'Country Name': 'CountryName', 'Country Code': 'CountryCode', 'Indicator Name': 'IndicatorName', 'Indicator Code': 'IndicatorCode'})
columns = ['CountryName', 'CountryCode', 'IndicatorName', 'IndicatorCode']
df = pd.melt(df, id_vars=columns, var_name='Year', value_name='Value')

df.dropna(inplace=True)

def plot_indicator(indicators, title=None, 
                   xlim=None, ylim=None, xspace=None,
                   loc=0, loc2=0,
                   drop_eu=False, filename=None):
    
    lines = ['-', '--']
    line_styles = []
    fig, ax = plt.subplots()
    
    indicators = indicators if isinstance(indicators, list) else [indicators]
    
    for line, (name, indicator) in zip(lines, indicators):
        ls, = plt.plot(np.nan, linestyle=line, color='#999999')
        line_styles.append([ls, name])

        df_ind = df[(df.IndicatorCode == indicator)]
        group = df_ind.groupby(['CountryName'])
        
        for country, values in group:
            country_values = values.groupby('Year').mean()
            
            if country == 'European Union':
                if drop_eu:
                    continue
                ax.plot(country_values, label=country, 
                        linestyle='--', color='#666666', linewidth=1, zorder=1)
            elif country_values.shape[0] > 1:
                ax.plot(country_values, label=country, linestyle=line,
                        color=colors[country], linewidth=2.5)
        
        if line == lines[0]:
            legend = plt.legend(loc=loc)

    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
    if xlim and xspace:
        ax.set_xticks(np.arange(xlim[0], xlim[1]+1, xspace))
    
    plt.tight_layout()
    fig.subplots_adjust(top=0.94)
    
    
    if title:
        ax.set_title(title)
    else:
        ax.set_title(df_ind.IndicatorName.values[0])
    
    if len(indicators) > 1:
        plt.legend(*zip(*line_styles), loc=loc2)
        ax.add_artist(legend)

population = [
    ('pop_dens', 'EN.POP.DNST'),     # Population density 
    ('rural', 'SP.RUR.TOTL.ZS'),     # Rural population 
    ('under14', 'SP.POP.0014.TO.ZS'),# Population, ages 0-14 
    ('above65', 'SP.POP.65UP.TO.ZS'),# Population ages 65 and above 
]

for indicator in population:
    plot_indicator(indicator, loc=0, xlim=(1990, 2020))

[Image: head of the data set]

Solution 1:[1]

EDIT: I have rewritten this answer to be clearer and more concise.

This is a clever bit of code! I found the problem: it is with xlim. Because the years are strings, not integers, matplotlib treats the x-axis as categorical, so each year is placed at its position (0, 1, 2, ...) rather than at its numeric value. This means that when you pass xlim=(1990, 2020), you are asking for positions 1990 to 2020. There are only about 30 years of data from 1990 onward, so the positions run from 0 to roughly 30 and nothing falls within the requested range, hence the blank plot.
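A minimal sketch of that behaviour, using made-up values rather than the World Bank data (the variable names here are only for illustration). It plots string years, reads back the positions matplotlib actually uses, and then applies the same xlim as in the question:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# String year labels, like the ones pd.melt produces from the column names
years = [str(y) for y in range(1990, 2021)]
values = list(range(len(years)))  # made-up data

fig, ax = plt.subplots()
(line,) = ax.plot(years, values)

# orig=False returns the x data after unit conversion,
# i.e. the categorical positions 0, 1, 2, ... instead of the labels
positions = line.get_xdata(orig=False)
print(positions[:3], positions[-1])

# The same limits as in the question
ax.set_xlim(1990, 2020)
lo, hi = ax.get_xlim()
visible = [p for p in positions if lo <= p <= hi]
print(len(visible))  # 0: no point falls inside the requested window
```

The last line prints 0 because every point sits at a position between 0 and 30, well outside the window 1990 to 2020.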

If you change the call inside the function to ax.set_xlim(xlim[0]-int(df_ind['Year'].min()), xlim[1]-int(df_ind['Year'].min())), you can keep passing years: subtracting the minimum year converts them to the appropriate index values. I would also add plt.xticks(rotation=45) underneath to stop the tick labels overlapping.
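The arithmetic behind that offset, as a small sketch. The helper name year_limits_to_positions is hypothetical and not part of the original code. It also assumes the earliest year is the first label matplotlib sees: categorical positions are assigned in order of first appearance, so if the first country plotted has no data for the earliest years, the positions will not line up with the years this way.

```python
def year_limits_to_positions(xlim, first_year):
    """Translate (start_year, end_year) into categorical axis positions.

    Hypothetical helper: assumes position 0 corresponds to first_year
    and that the years are contiguous.
    """
    start, end = xlim
    return start - first_year, end - first_year


# Years are strings in the melted frame; min() on four-digit year strings
# gives the earliest year, which is then converted to an integer
year_labels = ["1990", "1991", "2005", "2020"]
first_year = int(min(year_labels))

print(year_limits_to_positions((1990, 2020), first_year))  # (0, 30)
```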

ALTERNATIVELY!! (this is the option I would choose):

You can simply convert the Year column to integers, and everything else you have remains unchanged. Directly underneath df.dropna(inplace=True) (just before the function), add df['Year'] = df['Year'].astype(int), which fixes the non-integer x-axis described above.
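A small sketch of the conversion on a toy frame (the numbers are invented, and only the column names follow the question). It also shows one possible way to get a year-by-country table with pivot, which may be a simpler route to "group by year and country" than nested groupby calls; that part is a suggestion, not something the original code does.

```python
import pandas as pd

# Toy wide-format frame standing in for the World Bank CSV (values are made up)
wide = pd.DataFrame({
    "CountryName": ["Croatia", "Serbia"],
    "1990": [1.0, 2.0],
    "1991": [1.5, 2.5],
})

# Melting turns the year column names into values, so Year holds strings
long = pd.melt(wide, id_vars=["CountryName"], var_name="Year", value_name="Value")
year_was_integer = pd.api.types.is_integer_dtype(long["Year"])  # False at this point

# The fix: make Year numeric so xlim=(1990, 2020) refers to real years
long["Year"] = long["Year"].astype(int)

# Optional: one row per year, one column per country
table = long.pivot(index="Year", columns="CountryName", values="Value")
print(table)
```

With integer years as the index, table.plot() or ax.plot(table) draws one line per country against a numeric x-axis.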

Once you have made either change, you should be able to see the lines on the plots.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1