Python: Fill NaN values in a column with the previous string, which changes every few rows

I can't find this in the Q&A, although I feel it has probably been asked before, so please direct me there if that's the case.

I have a df with around 10 columns and many rows. One of these columns is an identifier, let's say "Site". The data isn't well labelled: only the first row of each site has the site identifier, and the following rows in the "Site" column are NaN until the next site is reached, where again only the first row is filled. E.g.:

   Site  data1  data2
    aa      2      3
    NaN     5      6
    NaN     2      3
    bb      5      6
    NaN     2      3
    cc      5      6
    NaN     2      3
    NaN     2      3
    NaN     2      3

I would like to fill the NaN values with the previous string. The df is very large, so I can't do this manually for each one. The rest of the data in the df should remain untouched. So the expected output is:

   Site  data1  data2
    aa      2      3
    aa      5      6
    aa      2      3
    bb      5      6
    bb      2      3
    cc      5      6
    cc      2      3
    cc      2      3
    cc      2      3



Solution 1:[1]

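import numpy as np
import pandas as pd

df = pd.DataFrame({'Site': ['aa', np.nan, np.nan, 'bb', np.nan, 'cc', np.nan, np.nan, np.nan],
                   'data1': [2, 5, 2, 5, 2, 5, 2, 2, 2], 'data2': [3, 6, 3, 6, 3, 6, 3, 3, 3]})

# ffill() is the current spelling; fillna(method='ffill') is deprecated in pandas 2.x
print(df.ffill())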

ffill means filling in the "forward" direction: each NaN takes the last valid value that appears above it in the same column.
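
Note that the call above forward-fills every column of the frame, not just "Site". Since the question asks that the rest of the data remain untouched, a minimal sketch that restricts the fill to the identifier column might look like the following. The NaN placed in data1 is a hypothetical value, not part of the original data, added only to show that other columns are left alone.

```python
import numpy as np
import pandas as pd

# Same frame as in the question, except for one hypothetical NaN in data1
# (an assumption, added to show that non-identifier columns are not filled).
df = pd.DataFrame({
    "Site": ["aa", np.nan, np.nan, "bb", np.nan, "cc", np.nan, np.nan, np.nan],
    "data1": [2, 5, np.nan, 5, 2, 5, 2, 2, 2],
    "data2": [3, 6, 3, 6, 3, 6, 3, 3, 3],
})

# Forward-fill only the identifier column; gaps elsewhere stay NaN.
df["Site"] = df["Site"].ffill()

print(df)
```

If the frame has no gaps outside "Site", as in the example data, both approaches give the same result; the column-wise version only matters when other columns contain missing values that should stay missing.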

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1