How to cluster/group customers given certain rules (such as location and % composition of a specific field), in Python if possible
I want to find a way to cluster/group customers (preferably using Python) for the following case:
Given X customers, each located at one of Y specific locations and related to one of Z divisions, create clusters composed of customers at the same location, where all clusters have the same percentage composition of divisions (let's say 60% division 1, 30% division 2, and 10% division 3).
Below is a link to some dummy data. If somebody could help me with some code, or point me to some references I could take a look at, I would really appreciate it.
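One possible reading of the requirement above can be sketched in plain Python. This is only an illustration under stated assumptions, not a definitive method: the function name `build_clusters`, the tuple layout `(customer_id, location, division)`, the cluster size of 10 (which turns 60/30/10 into quotas of 6, 3 and 1), and the made-up sample rows are all hypothetical and do not come from the linked dummy data. Customers that do not fit into a full cluster are returned separately rather than forced into a cluster with the wrong mix.

```python
from collections import defaultdict


def build_clusters(customers, quotas):
    """Group customers into same-location clusters with fixed division counts.

    customers: iterable of (customer_id, location, division) tuples (assumed layout).
    quotas: dict mapping division -> customers of that division per cluster,
            e.g. {1: 6, 2: 3, 3: 1} for a 60/30/10 split in clusters of 10.
    Returns (clusters, leftovers).
    """
    # Bucket customer ids by location, then by division.
    by_location = defaultdict(lambda: defaultdict(list))
    for customer_id, location, division in customers:
        by_location[location][division].append(customer_id)

    clusters = []
    leftovers = []
    for location, by_division in by_location.items():
        # The scarcest division limits how many full clusters fit at this location.
        n_clusters = min(
            len(by_division.get(d, [])) // q for d, q in quotas.items()
        )
        for k in range(n_clusters):
            members = []
            for d, q in quotas.items():
                # Take the k-th slice of q customers from each division.
                members.extend(by_division[d][k * q:(k + 1) * q])
            clusters.append({"location": location, "members": members})
        # Customers beyond the last full cluster stay unassigned.
        for d, ids in by_division.items():
            used = n_clusters * quotas.get(d, 0)
            leftovers.extend(ids[used:])
    return clusters, leftovers


# Made-up sample: "north" can form two full clusters, "south" has no division 3.
customers = (
    [(f"A{i}", "north", 1) for i in range(13)]
    + [(f"B{i}", "north", 2) for i in range(6)]
    + [(f"C{i}", "north", 3) for i in range(2)]
    + [(f"D{i}", "south", 1) for i in range(6)]
    + [(f"E{i}", "south", 2) for i in range(3)]
)
clusters, leftovers = build_clusters(customers, {1: 6, 2: 3, 3: 1})
print(len(clusters), len(leftovers))  # prints "2 10"
```

Using integer quotas per cluster keeps every cluster's composition exactly equal. If an approximate mix is acceptable, or cluster sizes may vary, the same bucketing step could feed an optimization routine instead of the simple slicing used here.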
Solution 1:[1]
Consider the simple approach below:
select as value array_agg(t order by max_completed_on_timestamp desc)[offset(0)]
from your_table t
group by selected_equivalent_parent_id
If applied to the sample data in your question, the output is
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Mikhail Berlyant |


