'Creating conditional fields based upon previous month values
I'm currently doing some work on a dataset that contains monthly data about the subscription status of users, where a given row shows whether a user is subscribed on a given month(shown as truncated date) in the following format:
+------------+------+---------------------+
| month | user | subscription_status |
+------------+------+---------------------+
| 2022-02-01 | 2345 | true |
| 2022-01-01 | 2345 | false |
| 2021-12-01 | 2345 | false |
| 2022-02-01 | 4578 | true |
| 2022-02-01 | 9548 | false |
| 2022-02-01 | 1212 | true |
|...... |
+------------+------+---------------------+
What I would like to do is query this data, but also generate a number of extra boolean fields whose values are conditional on the previous monthly subscription data for a given user, e.g:
never_subscribed_ever(subscription_status is false for each month for a user)never_subscribed_last_6_months(subscription_status is false for the last 6 months only)first_subscribed(subscription_status is true for the first time)resubscribed(subscription_status is true after being false last month)
to give a table such as :
+------------+------+---------------------+-----------------------+--------------------------------+------------------+--------------+
| month | user | subscription_status | never_subscribed_ever | never_subscribed_last_6_months | first_subscribed | resubscribed |
+------------+------+---------------------+-----------------------+--------------------------------+------------------+--------------+
However I'm afraid that I am quite lost on how best to implement the logic which I plan to do using the lag function along with conditionals such as if/case when needed.
Can anyone help me get started on the best way to implement lag logic on a user level looking at various past timeframes per field?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
