Z-score calculation on rolling window
How can I calculate a z-score over a 2-day rolling window for the data below, which is in a pandas DataFrame?
I also want to group by class name.
| Date | Class | Marks |
|---|---|---|
| 01-01-2022 | A | 1700 |
| 02-01-2022 | A | 3000 |
| 03-01-2022 | A | 2624 |
| 04-01-2022 | A | 1745 |
| 05-01-2022 | A | 1789 |
| 06-01-2022 | A | 1874 |
| 01-01-2022 | B | 1965 |
| 02-01-2022 | B | 1847 |
| 03-01-2022 | B | 1849 |
| 04-01-2022 | B | 1754 |
| 05-01-2022 | B | 1598 |
| 06-01-2022 | B | 1515 |
| 01-01-2022 | C | 433 |
| 02-01-2022 | C | 350 |
| 03-01-2022 | C | 268 |
| 04-01-2022 | C | 433 |
| 05-01-2022 | C | 350 |
| 06-01-2022 | C | 268 |
I tried this.
my_data['zscore'] = (my_data['Marks']-(my_data['Marks'].rolling(2).mean()))/(my_data['Marks'].rolling(2).std())
Solution 1:[1]
Use:
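import numpy as np
def func(x):
    x = np.asarray(x)  # newest value in the window is x[-1]
    return (x[-1] - x.mean()) / x.std()  # ndarray.std() uses ddof=0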
then:
my_data['Marks'].rolling(2).apply(func)
Or:
my_data['Marks'].rolling(2).apply(lambda x: (x[-1] - x.mean()) / x.std(), raw=True)
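To make the formula explicit: a rolling z-score is the newest value in each window minus the window mean, divided by the window standard deviation. The sketch below is one way to write that without `apply`, using pandas' built-in rolling aggregations. The helper name `rolling_zscore` is hypothetical, and it assumes the sample standard deviation (`ddof=1`, the pandas default) rather than NumPy's population default.

```python
import pandas as pd


def rolling_zscore(s: pd.Series, window: int) -> pd.Series:
    """Hypothetical helper: z-score of each value against its trailing window."""
    r = s.rolling(window)
    # Rolling.std() defaults to ddof=1 (sample standard deviation).
    return (s - r.mean()) / r.std()


# Class A marks from the question, used here as sample input.
marks = pd.Series([1700, 3000, 2624, 1745, 1789, 1874])

z2 = rolling_zscore(marks, window=2)  # first value is NaN (incomplete window)
z3 = rolling_zscore(marks, window=3)  # first two values are NaN
```

With a window of 2 the result carries little information: the two points are always symmetric about their mean, so every defined value is +1/√2 or -1/√2 (about ±0.707) with `ddof=1`, or ±1 with `ddof=0`. A wider window is usually needed for the score to be meaningful.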
Based on your comment, use:
import numpy as np
import pandas as pd

string = """Date Class Marks
01-01-2022 A 1700
02-01-2022 A 3000
03-01-2022 A 2624
04-01-2022 A 1745
05-01-2022 A 1789
06-01-2022 A 1874
01-01-2022 B 1965
02-01-2022 B 1847
03-01-2022 B 1849
04-01-2022 B 1754
05-01-2022 B 1598
06-01-2022 B 1515
01-01-2022 C 433
02-01-2022 C 350
03-01-2022 C 268
04-01-2022 C 433
05-01-2022 C 350
06-01-2022 C 268"""
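temp = [x.split(' ') for x in string.split('\n')]
df = pd.DataFrame(temp[1:], columns=temp[0])
df['Marks'] = df['Marks'].astype(int)  # split() leaves every field as a string
# rolling z-score within each class; the first row of each class is NaN
df['zscore'] = df.groupby('Class')['Marks'].transform(
    lambda s: (s - s.rolling(2).mean()) / s.rolling(2).std())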
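Since the question asks for the score per class, here is a self-contained sketch that applies the calculation inside each `Class` group so that a window never spans two classes. The function name `grouped_rolling_zscore` is hypothetical, and the dates are assumed to be day-first (1 to 6 January 2022) with exactly one row per class per day.

```python
import pandas as pd

# Marks from the question, keyed by class.
marks = {
    "A": [1700, 3000, 2624, 1745, 1789, 1874],
    "B": [1965, 1847, 1849, 1754, 1598, 1515],
    "C": [433, 350, 268, 433, 350, 268],
}
# Assumption: the dates are day-first, i.e. 1 to 6 January 2022.
dates = pd.date_range("2022-01-01", periods=6, freq="D")
df = pd.DataFrame(
    [(d, cls, m) for cls, values in marks.items() for d, m in zip(dates, values)],
    columns=["Date", "Class", "Marks"],
)


def grouped_rolling_zscore(frame: pd.DataFrame, window: int) -> pd.Series:
    """Hypothetical helper: rolling z-score of Marks computed within each Class."""
    # Sort so each window covers consecutive dates of a single class.
    ordered = frame.sort_values(["Class", "Date"])
    z = ordered.groupby("Class")["Marks"].transform(
        lambda s: (s - s.rolling(window).mean()) / s.rolling(window).std()
    )
    # Return the scores in the caller's original row order.
    return z.reindex(frame.index)


df["zscore"] = grouped_rolling_zscore(df, window=2)
```

The first row of each class is `NaN` because its window is incomplete. The window here counts rows, so it corresponds to "2 days" only when each class has one row per day.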
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |

