Scala: Window.partitionBy() and lag function to determine largest % increment from previous day

I have data that is read in from two separate CSV files with the following headers:

df1:

ID1 (integer), ID2 (integer), person_count (integer), distance (float), amount (float), payment_type (integer), datetime_origin (string), datetime_dest (string)

df2:

ID1 (integer), city (string), zone (string), zone_service (string)

I need to find the 3 days in April, for city = "LA", that saw the largest percentage increase in pickups compared to the previous day.

Any ideas on where to begin solving this? I think I might need a lag function over a Window.partitionBy() for the day of the month, but I'm not sure how to implement it.
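
For reference, here is a minimal sketch of the kind of pipeline I have in mind, not a confirmed solution. It rests on several assumptions that are not stated above: that ID1 in df1 is the pickup location and joins to ID1 in df2, that datetime_origin is formatted like "2020-04-15 08:30:00", and that every calendar day has at least one pickup (otherwise the previous row is not the previous day). The object name TopPickupIncreases and the column names pickup_date, pickups, prev_pickups, and pct_increase are made up for illustration.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, count, lag, lit, month, to_date}

object TopPickupIncreases {
  // Returns the n April days with the largest day-over-day % increase in LA pickups.
  def topIncreases(df1: DataFrame, df2: DataFrame, n: Int = 3): DataFrame = {
    // Assumption: df1.ID1 is the pickup location and matches df2.ID1.
    val laZones = df2.filter(col("city") === "LA").select("ID1")

    // Assumption: datetime_origin is a string like "2020-04-15 08:30:00".
    val daily = df1
      .join(laZones, Seq("ID1"))
      .withColumn("pickup_date", to_date(col("datetime_origin"), "yyyy-MM-dd HH:mm:ss"))
      .groupBy("pickup_date")
      .agg(count(lit(1)).as("pickups"))

    // Only one city remains after the filter, so the window is ordered by date
    // without partitionBy. Spark warns that this moves all rows to a single
    // partition, which is acceptable for one row per day.
    val byDate = Window.orderBy("pickup_date")

    daily
      // lag is computed before the April filter so that April 1 is compared to March 31.
      .withColumn("prev_pickups", lag(col("pickups"), 1).over(byDate))
      .withColumn(
        "pct_increase",
        (col("pickups") - col("prev_pickups")) / col("prev_pickups") * 100)
      .filter(month(col("pickup_date")) === 4 && col("pct_increase").isNotNull)
      .orderBy(col("pct_increase").desc)
      .limit(n)
  }
}
```

If several cities had to be ranked at once, I assume the window would become Window.partitionBy("city").orderBy("pickup_date"), with city carried through the groupBy.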

Thanks!

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
