Clustering a dataset of clients to obtain time series in Python (or R)

I have data on client consumption in bytes (how much data each client consumes when using Wi-Fi on a device such as a phone, computer, or TV). Every 6 minutes, we record how many bytes have been consumed by each device in each client's home.

The data covers one month, and I want to extract time series of consumption for the clients. But first I need to classify clients as active or non-active (those who use certain devices irregularly or rarely), because the non-active ones will add noise. I also want to aggregate the datetime variable to hourly resolution by summing rxbytes within each hour.

| client MAC address (Wi-Fi box) | device MAC address | Type | rxbytes | date |
| -------- | -------------- | ------ | ------ | ----- |
| 10:06:45:BC:46:D0 | 58:00:E3:94:8B:FD | Mobile | 2104383 | 2022-02-27 21:52:22 |
| 10:06:45:BC:46:D0 | 5C:3C:27:49:3C:42 | Mobile | 2703456 | 2022-02-27 21:58:22 |
| 10:06:45:BC:46:D0 | 5C:3C:27:49:3C:42 | Mobile | 28848 | 2022-02-27 22:04:22 |

. . .

| client MAC address | device MAC address | Type | rxbytes | date |
| -------- | -------------- | ------ | ------ | ----- |
| 35:06:45:BC:46:D0 | E2:4B:93:C8:64:C4 | Computer | 10415587 | 2022-02-12 13:15:03 |
| 35:06:45:BC:46:D0 | E2:4B:93:C8:64:C4 | Computer | 11523912 | 2022-02-12 13:21:04 |
| 35:06:45:BC:46:D0 | E2:4B:93:C8:64:C4 | Computer | 160192 | 2022-02-12 13:27:03 |

. . . etc.

All clients and their devices are in this table.
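To make the two steps concrete, here is a minimal pandas sketch of what I have in mind, not a definitive solution. The column names (`client_mac`, `device_mac`, `type`, `rxbytes`, `date`) are hypothetical short names for the columns in the table above. The activity rule (a device counts as active if it has traffic on at least a given share of the days present in the data) and its threshold of 0.5 are assumptions chosen only for illustration; a proper clustering method could replace that rule.

```python
import pandas as pd

# Sample rows copied from the table above; column names are hypothetical.
df = pd.DataFrame(
    {
        "client_mac": ["10:06:45:BC:46:D0"] * 3 + ["35:06:45:BC:46:D0"] * 3,
        "device_mac": [
            "58:00:E3:94:8B:FD", "5C:3C:27:49:3C:42", "5C:3C:27:49:3C:42",
            "E2:4B:93:C8:64:C4", "E2:4B:93:C8:64:C4", "E2:4B:93:C8:64:C4",
        ],
        "type": ["Mobile"] * 3 + ["Computer"] * 3,
        "rxbytes": [2104383, 2703456, 28848, 10415587, 11523912, 160192],
        "date": [
            "2022-02-27 21:52:22", "2022-02-27 21:58:22", "2022-02-27 22:04:22",
            "2022-02-12 13:15:03", "2022-02-12 13:21:04", "2022-02-12 13:27:03",
        ],
    }
)


def hourly_consumption(raw):
    """Sum rxbytes per client, device and clock hour."""
    out = raw.copy()
    out["date"] = pd.to_datetime(out["date"])
    out["hour"] = out["date"].dt.floor("h")  # truncate timestamps to the hour
    keys = ["client_mac", "device_mac", "hour"]
    return out.groupby(keys, as_index=False)["rxbytes"].sum()


def flag_active(hourly, min_share_of_days=0.5):
    """Assumed rule: active = traffic on at least this share of observed days."""
    h = hourly.copy()
    # Keep the calendar day only where traffic was non-zero (NaT otherwise).
    h["day"] = h["hour"].dt.normalize().where(h["rxbytes"] > 0)
    total_days = hourly["hour"].dt.normalize().nunique()
    days_seen = h.groupby(["client_mac", "device_mac"])["day"].nunique()
    result = (days_seen / total_days).rename("share_of_days").reset_index()
    result["active"] = result["share_of_days"] >= min_share_of_days
    return result


hourly = hourly_consumption(df)
activity = flag_active(hourly)

# Keep only active devices, then build one hourly series per (client, device).
active_keys = activity.loc[activity["active"], ["client_mac", "device_mac"]]
active_hourly = hourly.merge(active_keys, on=["client_mac", "device_mac"])
series = active_hourly.pivot_table(
    index="hour",
    columns=["client_mac", "device_mac"],
    values="rxbytes",
    aggfunc="sum",
    fill_value=0,
).asfreq("h", fill_value=0)  # insert missing hours as zero consumption
```

In this sketch `series` has one row per hour and one column per (client, device) pair, with hours that have no records filled with 0. Whether missing hours should really count as zero consumption, and whether activity should be judged per device or per client, depends on the data and is left open here.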



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source