'Checking previous rows to see if value exists
The dataset has several rows for each customer ID, along with TS.
For each Customer ID, I want to check the Status column to check if it ever held Cancelled value in any of that Customer ID's previous rows. (ordered by TS)
| CustomerID | Status | TS |
|---|---|---|
| Vimes | CANCELLED | Jan 1 |
| Vimes | ACTIVE | Jan 2 |
| Vimes | CANCELLED | Jan 3 |
| Sybill | ACTIVE | Jan 2 |
| Sybill | ACTIVE | Jan 5 |
| Sybill | ACTIVE | Jan 6 |
The result set should look like this, an added column with the flag Rejoiner which checks for previous values of the Status column -
| CustomerID | Status | TS | Rejoiner |
|---|---|---|---|
| Vimes | CANCELLED | Jan 1 | No |
| Vimes | ACTIVE | Jan 2 | Yes |
| Vimes | CANCELLED | Jan 3 | Yes |
| Sybill | ACTIVE | Jan 2 | No |
| Sybill | ACTIVE | Jan 5 | No |
| Sybill | CANCELLED | Jan 6 | No |
Solution 1:[1]
Use below
select *,
if(countif(status = 'CANCELLED') over win > 0, 'Yes', 'No') as Rejoiner
from your_table
window win as (partition by customerid order by unix_date(date(ts)) range between unbounded preceding and 1 preceding)
if applied to sample data in your question - output is
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Mikhail Berlyant |

