Inserting intermediate values between min value and max value without using loops

I have a very large dataset, so I need to process it with PySpark. The table looks like this (small sample):

minutes | hour 
718       11    
719       11
721       12 
722       12 
723       12 
779       12
781       13
782       13

I need a table that gives the minute interval covered within each hour, with the hours as columns, like this:

11 | 12 | 13 
2    60   2

The first problem is the missing boundary values. I need to add the rows (720, 11), (720, 12), (780, 12), and (780, 13) in order to calculate the minute interval for each hour. Once those rows are added, I can group by hour and take the difference between the maximum and minimum minutes, which gives 720 - 718 = 2 for hour 11, 780 - 720 = 60 for hour 12, and 782 - 780 = 2 for hour 13. I can then pivot by hour.

Any ideas for appending those values without using loops or hardcoding? I just need the following as the output (added rows are marked with *):

minutes | hour 
718       11    
719       11
720*      11
720*      12
721       12 
722       12 
723       12 
779       12
780*      12
780*      13
781       13
782       13
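
The boundary rule described above can be checked on the sample in plain Python before porting it to Spark. This is only a sketch of the rule, not a PySpark implementation, and it rests on two assumptions that the question does not state outright: that `hour` equals `minutes // 60`, and that a boundary row is added only between hours that are present (so nothing is added below the first hour or above the last one). The names `rows`, `lower`, `upper`, `filled`, and `intervals` are made up for illustration.

```python
# Sample rows from the question: (minutes, hour).
rows = [(718, 11), (719, 11), (721, 12), (722, 12), (723, 12),
        (779, 12), (781, 13), (782, 13)]

# Distinct hours, plus the first and last hour present in the data.
hours = sorted({hour for _, hour in rows})
first_hour, last_hour = hours[0], hours[-1]

# Assumption: every hour except the first gets its lower boundary, hour * 60.
lower = [(hour * 60, hour) for hour in hours if hour > first_hour]

# Assumption: every hour except the last gets its upper boundary, (hour + 1) * 60.
upper = [((hour + 1) * 60, hour) for hour in hours if hour < last_hour]

# Union of the original rows and the boundary rows, ordered by minutes then hour.
filled = sorted(rows + lower + upper)

# Interval per hour: max(minutes) - min(minutes) within that hour.
intervals = {
    hour: max(m for m, h in filled if h == hour)
    - min(m for m, h in filled if h == hour)
    for hour in hours
}

print(filled)
print(intervals)
```

Each step here is a set-style operation (distinct, filter, union, group and aggregate), so it should translate to DataFrame operations such as `distinct`, `filter`, `union`, and `groupBy` without row-by-row loops. If the data can skip an hour entirely, the rule would need to be revisited.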

apache-spark pyspark

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
