How to retrieve pandas dataframe rows surrounding rows with a True boolean?
Suppose I have a df of the following format:
Assuming row 198 is True for rot_mismatch, what is the best way to retrieve the True row (easy) along with the rows directly above and below it (unsolved)?
I have multiple rows with a True value and would like to automatically build a dataframe for closer inspection that always includes each True row and its surrounding rows.
Thanks!
Edit for clarification:
Example input:
| id | name | Bool |
|---|---|---|
| 1 | Sta | False |
| 2 | Danny | True |
| 3 | Elle | False |
| 4 | Rob | False |
| 5 | Dan | False |
| 6 | Holger | True |
| 7 | Mat | True |
| 8 | Derrick | False |
| 9 | Lisa | False |
Desired output:
| id | name | Bool |
|---|---|---|
| 1 | Sta | False |
| 2 | Danny | True |
| 3 | Elle | False |
| 5 | Dan | False |
| 6 | Holger | True |
| 7 | Mat | True |
| 8 | Derrick | False |
Solution 1:[1]
Assuming this input:
col1 rot_mismatch
0 A False
1 B True
2 C False
3 D False
4 E False
5 F False
6 G True
7 H True
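To get the N rows before and after any True, you can use a rolling operation to compute a mask for boolean indexing:
N = 1  # number of rows to keep on each side of a True row
# A centered window of 2*N+1 rows has a max of 1 if any row within N positions is True
mask = (df['rot_mismatch']
        .rolling(2*N+1, center=True, min_periods=1)
        .max().astype(bool)
       )
df2 = df.loc[mask]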
Output:
# N = 1
col1 rot_mismatch
0 A False
1 B True
2 C False
5 F False
6 G True
7 H True
# N = 0
col1 rot_mismatch
1 B True
6 G True
7 H True
# N = 2
col1 rot_mismatch
0 A False
1 B True
2 C False
3 D False
4 E False
5 F False
6 G True
7 H True
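For reference, the rolling approach above can be wrapped in a small helper and checked against the sample frame. This is only a minimal sketch: the helper name `rows_around_true` is hypothetical, the frame is rebuilt from the sample input shown above, and the column is cast to `int` before rolling purely as a precaution.

```python
import pandas as pd

# Sample frame rebuilt from the input shown above.
df = pd.DataFrame({
    "col1": list("ABCDEFGH"),
    "rot_mismatch": [False, True, False, False, False, False, True, True],
})


def rows_around_true(frame, column, n=1):
    """Return rows where `column` is True plus the n rows before and after each.

    Hypothetical helper; assumes `column` holds real booleans.
    """
    mask = (
        frame[column]
        .astype(int)  # cast so the rolling max runs on numeric values
        .rolling(2 * n + 1, center=True, min_periods=1)
        .max()
        .astype(bool)
    )
    return frame.loc[mask]


result = rows_around_true(df, "rot_mismatch", n=1)
print(result)
```

Calling the helper with `n=0` should keep only the True rows, and `n=2` should keep every row of this sample, in line with the outputs listed above.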
Solution 2:[2]
Try with shift:
>>> df[df["rot_mismatch"]|df["rot_mismatch"].shift()|df["rot_mismatch"].shift(-1)]
dep_ap_sched arr_ap_sched rot_mismatch
120 East Carmen South Nathaniel False
198 South Nathaniel East Carmen True
289 East Carmen Joneshaven False
Output for amended example:
>>> df[df["Bool"]|df["Bool"].shift()|df["Bool"].shift(-1)]
id name Bool
0 1 Sta False
1 2 Danny True
2 3 Elle False
4 5 Dan False
5 6 Holger True
6 7 Mat True
7 8 Derrick False
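The shift idea can be extended to N neighbouring rows by OR-ing the column with copies of itself shifted by 1 to N positions in both directions. The following is a sketch under stated assumptions: the helper name `neighbours_mask` is hypothetical, the frame is rebuilt from the amended example, and `fill_value=False` is passed to `shift` so the shifted copies stay boolean rather than picking up `NaN` at the edges.

```python
import pandas as pd

# Frame rebuilt from the amended example in the question.
df = pd.DataFrame({
    "id": range(1, 10),
    "name": ["Sta", "Danny", "Elle", "Rob", "Dan",
             "Holger", "Mat", "Derrick", "Lisa"],
    "Bool": [False, True, False, False, False, True, True, False, False],
})


def neighbours_mask(flags, n=1):
    """Boolean mask: True where `flags` is True or within n rows of a True.

    Hypothetical helper; assumes `flags` is a boolean Series.
    """
    mask = flags.copy()
    for k in range(1, n + 1):
        # Shifting by k marks rows k positions below a True; -k marks rows above.
        mask = mask | flags.shift(k, fill_value=False)
        mask = mask | flags.shift(-k, fill_value=False)
    return mask


subset = df[neighbours_mask(df["Bool"], n=1)]
print(subset)
```

With `n=1` this should select the same rows as the one-liner above; larger values of `n` widen the context around each True row.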
Solution 3:[3]
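Is this what you want?
# Assumes rot_mismatch holds real booleans; compare against the strings
# 'True' / 'False' only if the column actually stores text.
df_true = df.loc[df['rot_mismatch'], :]
df_false = df.loc[~df['rot_mismatch'], :]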
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | DataSciRookie |

