Create dataframe with new columns derived from unique values in a single column

I have a dataframe formatted like this:

id  fieldname  fieldvalue
1   PC         Dell
1   Phone      Pixel 6
2   PC         Lenovo
3   Phone      Samsung

I would like to transform it to:

id  PC      Phone
1   Dell    Pixel 6
2   Lenovo
3           Samsung

In other words, create one column per distinct value in the fieldname column and fill it with the corresponding value from fieldvalue.

How would I do that in PySpark?



Solution 1:[1]

This is a rows-to-columns problem, so use pivot: group by id, pivot on fieldname, and take the first fieldvalue in each group.

from pyspark.sql import functions as F

df = df.groupBy('id').pivot('fieldname').agg(F.first('fieldvalue'))

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 过过招