Pandas: create a new column containing an index based on the values of another one

I have a dataframe like this:

a


4.0
5.5
5.5
6.7
7.9
7.9
9.4

I want to add a new column named b that 'indexes' the values in the first one, so that equal values get the same index. The new dataframe would look like:

a   b

4.0 1
5.5 2
5.5 2
6.7 3
7.9 4
7.9 4
9.4 5

Thank you.



Solution 1:[1]

You can use pd.factorize, which assigns each distinct value an integer code in order of first appearance:

codes, uniques = pd.factorize(df['a'])

df['b'] = codes

(The codes start at 0, so use df['b'] = codes + 1 if you want the indexes to start at 1, as in the desired output.)
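As a minimal sketch of the above, the following rebuilds the example dataframe from the question and applies pd.factorize with the + 1 offset. The second part is an assumption beyond the original answer: if column a were not already sorted, passing sort=True should number the values by sorted order rather than by first appearance. The names shuffled, sorted_codes and sorted_uniques are made up for illustration.

```python
import pandas as pd

# Example data taken from the question.
df = pd.DataFrame({"a": [4.0, 5.5, 5.5, 6.7, 7.9, 7.9, 9.4]})

# factorize returns one integer code per row (starting at 0)
# and the distinct values in order of first appearance.
codes, uniques = pd.factorize(df["a"])

# Shift by 1 so the index starts at 1, as in the desired output.
df["b"] = codes + 1
print(df)

# Assumption (not in the original answer): for unsorted data,
# sort=True numbers the values by their sorted order instead.
shuffled = pd.Series([7.9, 4.0, 5.5, 4.0])  # hypothetical unsorted input
sorted_codes, sorted_uniques = pd.factorize(shuffled, sort=True)
print(sorted_codes)
```

Because the values in the question are already in ascending order, both variants give the same numbering there; the difference only shows up on unsorted input.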

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution     Source
Solution 1   Carlos Bergillos