How to merge all row values into a single column value based on a column partition in a PySpark DataFrame

Say I have a DataFrame DF as below:

**Name | Fruits**
Jon | Apple
Joseph | Orange
Mark | Apple
Jon | Orange
Jim | Apple

Expected Output :

**Name | Fruits**
Jon,Mark,Jim | Apple
Jon,Joseph | Orange

How can I merge all the Name values into a single comma-separated value per fruit, i.e. grouped (partitioned) by the Fruits column?

Can anyone help with how this can be achieved?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source