Getting another column from same row which has first non-null value in column
I have a SQL table, and I want to find the average adjusted amount (adj_amt) for products, partitioned by store_id.
Here, I need to compute adj_amt, which is the product of the previous two columns. To do that, I need to fill the nulls in av_quantity with the first non-null value in the partition. The query I use is below.
select
    CASE WHEN av_quantity is null then
        -- the boolean True tells first_value to skip null values
        first_value(av_quantity, True) over (
            partition by store_no order by product_id
            range between current row and unbounded following
        )
    else av_quantity
    end as adj_av_quantity
I'm having trouble with the SQL required to get the adjusted cost. The factor should not be its own first non-null value in the partition; it should be fetched from the same row that supplied adj_av_quantity. Any thoughts on how I could do this?
FYI, I've simplified the data here. The actual dataset is huge (> 125 million rows with 800+ columns), so I can't use joins and have to do this with window functions. I'm using Spark SQL.
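One possible approach, sketched under assumptions rather than offered as a definitive answer: pack av_quantity and factor into a struct on the rows where av_quantity is non-null, then take the first non-null struct over the same window. Both fields then come from the same source row, even when factor itself is null on that row. The table name `sales` and the output names `src`, `adj_factor`, and the `adj_amt` formula are hypothetical; only store_no, product_id, av_quantity, and factor come from the question.

```sql
-- Sketch only: `sales` is a hypothetical table name.
SELECT
    store_no,
    product_id,
    src.av_quantity              AS adj_av_quantity,
    src.factor                   AS adj_factor,
    -- Assumption: adj_amt is the product of the two adjusted columns.
    src.av_quantity * src.factor AS adj_amt
FROM (
    SELECT
        store_no,
        product_id,
        -- The CASE yields NULL when av_quantity is NULL, so first_value(..., true)
        -- skips those rows and returns the struct from the first row (current row
        -- onward) whose av_quantity is non-null.
        first_value(
            CASE WHEN av_quantity IS NOT NULL THEN struct(av_quantity, factor) END,
            true
        ) OVER (
            PARTITION BY store_no ORDER BY product_id
            RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
        ) AS src
    FROM sales
) t
```

Because the frame starts at the current row, a row whose av_quantity is already non-null picks up its own values, so the outer CASE from the original query is not needed. This uses a single window pass and no join; if no later row in the partition has a non-null av_quantity, `src` is NULL and all three adjusted columns come out NULL.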
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow

