'How to calculate interdaily stability for signal in Python?

I'm trying to calculate interdaily stability as a feature for machine learning classification in Python. My data is for multiple days - I'm using this dataset (sample CSV). This data is sampled with minute frequency, i.e. we have 60 measurements per hour. The formula is: enter image description here

So my approach is:

def interdaily_stability(df: pd.DataFrame) -> float:
    X_mean = df["activity"].mean()

    hourly_means = df.resample("H", on="timestamp").mean()["activity"].values
    p = len(hourly_means)

    numerator = (1/p) * np.sum(np.square((hourly_means - X_mean)))
    denominator = df["activity"].var()

    return numerator / denominator
  1. Is this formula right? In particular, is p right?
  2. In particular, am I calculating this right for multiple days? If not, how can I correct this?

I know that pyActigraphy exists, but the implementation there seems incorrect for my case, e.g. there data is first resampled with .resample("1H").sum(), and then grouped by by hour, minute and second (I don't even have such resolution).

I also tried translating the code from nparACT library in R, but I don't know R good enough: nparACT implementation.



Solution 1:[1]

Looks like your implementation is correct, as i am also trying to quantify the rest-activity rhythm and using calculated score, how feasible is Modelling that time series or not.

After, exploration, could simplify Interdaily stability quantifies how consistent the activity patterns are, given over a period of time.

def interdaily_stability(df: pd.DataFrame,activity_col) -> float:

'''
Interdaily stability was centered and normalized by 
subtracting the overall mean and dividing by the overall 
standard deviation.

ref: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4542004/
'''

# Calculating Mean of the OP Signal
X_mean = df[activity_col].mean()
# Converting to hourly format and computing mean per hour
hourly_means = df.resample("H", on="timestamp").mean()[activity_col].values
# Capturing number of hours we have in our dataset
p = len(hourly_means)
# Centering and normalizing by subtracting the overall mean
numerator = (1/p) * np.sum(np.square((hourly_means - X_mean)))
# Overall standard deviation
denominator = df[activity_col].var()

return numerator / denominator

The article mentioned in the docstring is the one that simplifies the equation a but.

Actual equation looks like this:

This is a work in progress.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Atul Mishra