How to merge all row values into a single column value based on column partition in PySpark DataFrame
Say I have a DataFrame `DF` as below:

| Name | Fruits |
|---|---|
| Jon | Apple |
| Joseph | Orange |
| Mark | Apple |
| Jon | Orange |
| Jim | Apple |

Expected output:

| Name | Fruits |
|---|---|
| Jon,Mark,Jim | Apple |
| Jon, Joseph | Orange |

How can I merge all the names into a single value for each distinct value of Fruits?
Can anyone help with how this can be achieved?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
