'How to conduct logical tests or mathematical operations in parallel cells across multiple pandas dataframes
I have a bunch of separate excel sheets with counts of the number of people attending events in a set of rooms during time increments for each day of the week. Kind of like:
ROOM 101:
| TIME WINDOW | MONDAY | TUESDAY | WEDNESDAY | THURSDAY | FRIDAY |
|---|---|---|---|---|---|
| 8:00 AM - 12:00 PM | 0 | 0 | 0 | 5 | 0 |
| 12:00 PM - 4:00 PM | 5 | 5 | 20 | 5 | 0 |
| 4:00 PM - 8:00 PM | 5 | 0 | 20 | 5 | 0 |
ROOM 102:
| TIME WINDOW | MONDAY | TUESDAY | WEDNESDAY | THURSDAY | FRIDAY |
|---|---|---|---|---|---|
| 8:00 AM - 12:00 PM | 0 | 0 | 10 | 5 | 0 |
| 12:00 PM - 4:00 PM | 8 | 0 | 10 | 5 | 0 |
| 4:00 PM - 8:00 PM | 8 | 5 | 10 | 5 | 0 |
I wish to produce a new dataframe that reports for each day and time window the number of instances where the count was 0. Which I would then use to create a seaborn heatmap. In this example:
| TIME WINDOW | MONDAY | TUESDAY | WEDNESDAY | THURSDAY | FRIDAY |
|---|---|---|---|---|---|
| 8:00 AM - 12:00 PM | 2 | 2 | 1 | 0 | 2 |
| 12:00 PM - 4:00 PM | 0 | 1 | 0 | 0 | 2 |
| 4:00 PM - 8:00 PM | 0 | 1 | 0 | 0 | 2 |
After looping through the files and reading them in as dataframes, how could I do this?
Also, is there a way to populate lists of the zero rooms for each cell (day and time window). E.g. Friday 8:00 AM = [101,102]
Solution 1:[1]
You could use eq + astype(int); then add the DataFrames:
out = df1.set_index('TIME WINDOW').eq(0).astype(int) + df2.set_index('TIME WINDOW').eq(0).astype(int)
Output:
MONDAY TUESDAY WEDNESDAY THURSDAY FRIDAY
TIME WINDOW
8:00 AM - 12:00 PM 2 2 1 0 2
12:00 PM - 4:00 PM 0 1 0 0 2
4:00 PM - 8:00 PM 0 1 0 0 2
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
