How to group a PySpark dataframe and use a shift operation as the aggregation method?
I have the following dataframe:
| ride | window_left | window_right | time |
|---|---|---|---|
| 1 | No | No | 1 |
| 1 | No | Yes | 2 |
| 1 | Yes | Yes | 3 |
| 1 | Yes | Yes | 4 |
| 2 | No | No | 1 |
| 2 | Yes | No | 2 |
| 2 | Yes | Yes | 3 |
| 2 | Yes | Yes | 4 |
| 2 | Yes | Yes | 5 |
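For reference, a small snippet that reconstructs this input DataFrame. It is only a sketch: the schema (string values for the two window columns, integers for ride and time) and the variable names are assumptions, not part of the original question.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Sample input copied from the table above (schema is an assumption)
df = spark.createDataFrame(
    [
        (1, "No", "No", 1), (1, "No", "Yes", 2), (1, "Yes", "Yes", 3), (1, "Yes", "Yes", 4),
        (2, "No", "No", 1), (2, "Yes", "No", 2), (2, "Yes", "Yes", 3),
        (2, "Yes", "Yes", 4), (2, "Yes", "Yes", 5),
    ],
    ["ride", "window_left", "window_right", "time"],
)
```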
I would like to group this PySpark dataframe by the column ride and shift either window_left or window_right, depending on which one takes the value Yes first, like this:
| ride | window_left | window_right | time |
|---|---|---|---|
| 1 | No | No | 1 |
| 1 | Yes | Yes | 2 |
| 1 | Yes | Yes | 3 |
| 1 | None | Yes | 4 |
| 2 | No | No | 1 |
| 2 | Yes | Yes | 2 |
| 2 | Yes | Yes | 3 |
| 2 | Yes | Yes | 4 |
| 2 | Yes | None | 5 |
I would like to perform this transformation in PySpark only; I cannot use pandas, which is why I am having difficulties. The tricky part for me is shifting after grouping, given that the shift depends on the group.
Any help would be appreciated, thanks!
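One way this could be approached, sketched under assumptions rather than given as a definitive answer: instead of a groupBy aggregation, use window functions. Per ride, find the first time at which each column becomes Yes, then apply lead (a shift up) to whichever column turned Yes later. The helper columns first_left and first_right are hypothetical names, and the sketch assumes a fixed shift of one row, as in the example above.

```python
from pyspark.sql import functions as F
from pyspark.sql.window import Window

# df is the sample DataFrame built in the snippet above

ride_w = Window.partitionBy("ride")
ordered_w = Window.partitionBy("ride").orderBy("time")

result = (
    df
    # first time each column turns "Yes" within the ride (min ignores the nulls
    # produced by the when() for non-"Yes" rows)
    .withColumn(
        "first_left",
        F.min(F.when(F.col("window_left") == "Yes", F.col("time"))).over(ride_w),
    )
    .withColumn(
        "first_right",
        F.min(F.when(F.col("window_right") == "Yes", F.col("time"))).over(ride_w),
    )
    # shift (lead by 1) whichever column became "Yes" later; its last row becomes null
    .withColumn(
        "window_left",
        F.when(
            F.col("first_right") < F.col("first_left"),
            F.lead("window_left", 1).over(ordered_w),
        ).otherwise(F.col("window_left")),
    )
    .withColumn(
        "window_right",
        F.when(
            F.col("first_left") < F.col("first_right"),
            F.lead("window_right", 1).over(ordered_w),
        ).otherwise(F.col("window_right")),
    )
    .drop("first_left", "first_right")
)

result.orderBy("ride", "time").show()
```

On the sample data above, this reproduces the desired output, with null in place of None for the rows that fall off the end of the shifted column.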
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow