Label row when value changes pandas
I need a solution for the following problem. I have a timestamp and a value. From one row to the next, the value can increase, decrease, or stay the same. As soon as it increases or stays steady, I want to add a label in a new column. While the value keeps increasing, the same label should be added to each row. As soon as the value decreases, a zero should be entered as the label. Each new run of increases gets the next label number, as shown in the desired output. Can anyone help me?
Input Data
import pandas as pd

df_raw = pd.DataFrame(
    {
        "timestamp": [
            "2017-06-16 05:19:18.993",
            "2017-06-16 05:19:28.993",
            "2017-06-16 05:19:38.993",
            "2017-06-16 05:19:48.993",
            "2017-06-16 05:19:58.993",
            "2017-06-16 05:25:08.993",
            "2017-06-16 05:25:18.993",
            "2017-06-16 07:44:28.993",
            "2017-06-16 07:45:38.993",
        ],
        "signalvalue": [0.0, 12.0, 22.0, 13.0, 0.0, 30.0, 0.0, 3.0, 6.0],
    }
)
                 timestamp  signalvalue
0  2017-06-16 05:19:18.993          0.0
1  2017-06-16 05:19:28.993         12.0
2  2017-06-16 05:19:38.993         22.0
3  2017-06-16 05:19:48.993         13.0
4  2017-06-16 05:19:58.993          0.0
5  2017-06-16 05:25:08.993         30.0
6  2017-06-16 05:25:18.993          0.0
7  2017-06-16 07:44:28.993          3.0
8  2017-06-16 07:45:38.993          6.0
Desired Output
                 timestamp  signalvalue  label
0  2017-06-16 05:19:18.993          0.0      0
1  2017-06-16 05:19:28.993         12.0      1
2  2017-06-16 05:19:38.993         22.0      1
3  2017-06-16 05:19:48.993         13.0      0
4  2017-06-16 05:19:58.993          0.0      0
5  2017-06-16 05:25:08.993         30.0      2
6  2017-06-16 05:25:18.993          0.0      0
7  2017-06-16 07:44:28.993          3.0      3
8  2017-06-16 07:45:38.993          6.0      3
Solution 1:[1]
You can do it with the following function:
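def increment_method_1(df, name):
    results = []
    last_result = 0  # most recently issued label
    prev_val = 0     # value of the previous row (0 before the first row)
    for val in df[name].values:
        if val == 0 or (val > 0 and prev_val >= val):
            # Back at zero, or a positive value that did not rise: no label.
            results.append(0)
        elif prev_val < val and prev_val != 0:
            # Rising from a non-zero value: reuse the latest label.
            # Note: this also applies after a drop that did not reach zero.
            results.append(last_result)
        elif prev_val < val and prev_val == 0:
            # Rising from zero: start a new label.
            last_result += 1
            results.append(last_result)
        else:
            # Reached for negative values that did not rise, or for NaN.
            print("Unexpected condition:", prev_val, val, last_result)
            results.append(0)  # keep exactly one label per row
        prev_val = val
    return results


# Usage with the frame from the question:
df_raw["label"] = increment_method_1(df_raw, "signalvalue")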
Solution 2:[2]
I am assuming you expect the output produced by the following code snippet.
import pandas as pd
import numpy as np

df_raw = pd.DataFrame(
    {
        "timestamp": [
            "2017-06-16 05:19:18.993",
            "2017-06-16 05:19:28.993",
            "2017-06-16 05:19:38.993",
            "2017-06-16 05:19:48.993",
            "2017-06-16 05:19:58.993",
            "2017-06-16 05:25:08.993",
            "2017-06-16 05:25:18.993",
            "2017-06-16 07:44:28.993",
            "2017-06-16 07:45:38.993",
        ],
        "signalvalue": [0.0, 12.0, 22.0, 13.0, 0.0, 30.0, 0.0, 3.0, 6.0],
    }
)

values = df_raw["signalvalue"].to_numpy()    # positional access, independent of the index
modified = np.zeros(len(df_raw), dtype=int)  # one label per row, 0 by default
positive = 0                                 # number of rising runs seen so far
for i in range(1, len(df_raw)):
    if values[i] > values[i - 1]:
        if modified[i - 1] == 0:
            # The previous row is unlabeled, so a new rising run starts here.
            positive += 1
        modified[i] = positive
df_raw["label"] = modified
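The same rule as the loop above can also be written without an explicit loop. The following is only a sketch and is not taken from either answer. It assumes that "rising" means strictly greater than the previous row, as in Solution 2; the helper names `rising` and `run_start` are made up for illustration.

```python
import pandas as pd

# Same signal values as in the question; timestamps are omitted for brevity.
df = pd.DataFrame({"signalvalue": [0.0, 12.0, 22.0, 13.0, 0.0, 30.0, 0.0, 3.0, 6.0]})

# True where the value is strictly greater than in the previous row.
# The first row has no predecessor (diff is NaN), so it compares as False.
rising = df["signalvalue"].diff() > 0

# True only on the first row of each rising run.
run_start = rising & ~rising.shift(1, fill_value=False)

# Number the runs with a cumulative sum, then zero out the non-rising rows.
df["label"] = run_start.cumsum().where(rising, 0).astype(int)

print(df["label"].tolist())  # [0, 1, 1, 0, 0, 2, 0, 3, 3]
```

Replacing `> 0` with `>= 0` would also treat steady rows as rising, which is closer to the wording of the question, but it would then label repeated zeros as well.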
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Hakan Akgün |
| Solution 2 | |

