How do I create an array from a grouping of row_number()?
I have PySpark code that computes row_number() over a window partitioned by date. I would like to create an array that contains the data grouped by that row number.
My code is something like this:
from pyspark.sql import Window
from pyspark.sql.functions import col, row_number
w = Window.partitionBy('part_id', 'part_date').orderBy(col('timestamp').desc())
df2 = df.withColumn('row_num', row_number().over(w))
The code above works for assigning the row numbers within each partition. What I am not sure about is how to create an array, grouped by date, that holds the part_num values.
I thought maybe something like this (this code does not work; it is just an example):
.withColumn('array_prt_num' , array('part_num')).groupBy('row_num')
Any thoughts?
[Image: desired DataFrame output]
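Since the desired output is only linked as an image, the following is a sketch of one possible reading of the question: number the rows within each (part_id, part_date) window, then collect every part_num that shares a row number into one array. The first function models that logic in plain Python so the behaviour can be checked without a cluster. The second shows what is assumed to be the PySpark equivalent, using `groupBy` with `collect_list`. The function names, the output column name `array_prt_num`, and the sample rows are hypothetical and not taken from the original post.

```python
from collections import defaultdict


def arrays_by_row_num(rows):
    """Plain-Python model of: row_number() over the window, then
    groupBy('row_num') with collect_list('part_num')."""
    # Split rows into window partitions keyed by (part_id, part_date).
    partitions = defaultdict(list)
    for r in rows:
        partitions[(r["part_id"], r["part_date"])].append(r)

    grouped = defaultdict(list)
    for key in sorted(partitions):
        # Newest timestamp first, matching orderBy(col('timestamp').desc()).
        ordered = sorted(partitions[key], key=lambda r: r["timestamp"], reverse=True)
        for row_num, r in enumerate(ordered, start=1):
            grouped[row_num].append(r["part_num"])
    return dict(grouped)


def arrays_by_row_num_spark(df):
    """Assumed PySpark equivalent; needs pyspark and is not run below."""
    from pyspark.sql import Window
    from pyspark.sql.functions import col, collect_list, row_number

    w = Window.partitionBy("part_id", "part_date").orderBy(col("timestamp").desc())
    return (
        df.withColumn("row_num", row_number().over(w))
        .groupBy("row_num")
        .agg(collect_list("part_num").alias("array_prt_num"))
    )


# Hypothetical sample data; the real columns and values may differ.
sample_rows = [
    {"part_id": 1, "part_date": "2021-01-01", "timestamp": 10, "part_num": "A"},
    {"part_id": 1, "part_date": "2021-01-01", "timestamp": 20, "part_num": "B"},
    {"part_id": 1, "part_date": "2021-01-02", "timestamp": 5, "part_num": "C"},
    {"part_id": 1, "part_date": "2021-01-02", "timestamp": 7, "part_num": "D"},
    {"part_id": 1, "part_date": "2021-01-02", "timestamp": 9, "part_num": "E"},
]

result = arrays_by_row_num(sample_rows)
print(result)  # {1: ['B', 'E'], 2: ['A', 'D'], 3: ['C']}
```

If one array per date and row number is wanted instead, adding 'part_date' to the `groupBy` keys should give that. Note also that in Spark the order of elements inside a `collect_list` result is not guaranteed after a shuffle, so the arrays may need an explicit sort if order matters.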

Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
