Sanity check: mixed model in lme4 with "complex" interactions

This is a little fiddly, but I'll do my best to explain.

Background. I have repeated-measures data for 10,000 human couples across 14 measurements. I'm modelling the within-pair association for sex/age-standardised traits (e.g., BMI z-scores). I've fit the following model with random intercepts and slopes using lme4:

lmer(bmi ~ BMI + average_age + age_diff + (BMI|hhid), data = d5)

I find (as expected) that there's a lot of variance in slopes between couples. I hypothesised that absolute BMI difference might affect the within-pair association.

To investigate this, I created a new variable measuring the absolute BMI difference within each couple at baseline (i.e., their first measurement), calculated as follows: group_by(CoupleID) %>% mutate(baseline_diff = abs(first(bmi) - first(BMI)))
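For concreteness, here is a minimal sketch of that baseline calculation on a toy data frame. The values are made up, and it assumes that bmi and BMI hold the two partners' z-scores and that an observations column gives the measurement order; the name toy is purely illustrative.

```r
library(dplyr)

# Toy data: two couples, three measurements each (made-up values).
# Assumption: bmi and BMI are the two partners' z-scores.
toy <- tibble(
  CoupleID     = c(1, 1, 1, 2, 2, 2),
  observations = c(1, 2, 3, 1, 2, 3),
  bmi          = c(-0.65, -0.49, -0.62, 0.30, 0.10, 0.25),
  BMI          = c(-0.08, -1.04,  0.47, 1.10, 0.90, 0.80)
)

toy_baseline <- toy %>%
  arrange(CoupleID, observations) %>%  # first() depends on row order
  group_by(CoupleID) %>%
  # Absolute difference between partners at each couple's first measurement,
  # repeated on every row of that couple.
  mutate(baseline_diff = abs(first(bmi) - first(BMI))) %>%
  ungroup()

print(toy_baseline)
```

Sorting by measurement order before grouping matters here, because first() simply takes whichever row comes first within each couple.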

I then dropped the first observation for each couple from the dataset and re-fit the model, adding baseline_diff as an interaction effect:

lmer(bmi ~ BMI*baseline_diff + (BMI|CoupleID), data = d5)
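A rough sketch of this step on simulated data, in case it helps to reproduce the setup. Everything about the simulation (sample sizes, effect sizes, the name sim) is a hypothetical stand-in for d5, and the covariates from the first model are left out for brevity.

```r
library(dplyr)
library(lme4)

set.seed(42)
n_couples <- 100  # hypothetical; far smaller than the real data
n_obs     <- 6

# Simulate couples with their own intercept and slope (illustrative values only).
sim <- tibble(
  CoupleID     = rep(seq_len(n_couples), each = n_obs),
  observations = rep(seq_len(n_obs), times = n_couples),
  BMI          = rnorm(n_couples * n_obs)
) %>%
  group_by(CoupleID) %>%
  mutate(
    intercept = rnorm(1, mean = 0, sd = 0.3),
    slope     = rnorm(1, mean = 0.3, sd = 0.2)
  ) %>%
  ungroup() %>%
  mutate(bmi = intercept + slope * BMI + rnorm(n(), sd = 0.5))

followup <- sim %>%
  arrange(CoupleID, observations) %>%
  group_by(CoupleID) %>%
  mutate(baseline_diff = abs(first(bmi) - first(BMI))) %>%
  slice(-1) %>%  # drop each couple's baseline row
  ungroup()

# Same model structure as above: baseline difference moderating the slope.
fit <- lmer(bmi ~ BMI * baseline_diff + (BMI | CoupleID), data = followup)
print(fixef(fit))
```

The simulated data contain no real moderation effect, so the interaction estimate here is only useful for checking that the pipeline runs, not for interpreting results.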

I found that the interaction term significantly moderated the within-pair association (the more similar couples are at baseline, the stronger their association across subsequent measurements). You can see this on the plot below:

[Figure: interaction plot]

However, there are two problems: 1) the baseline difference is likely to have a stronger effect on early measurements than on later ones; 2) the baseline is an arbitrary point and, in reality, trait difference at all stages (e.g., diff_time1, diff_time2, diff_time3) will continuously affect the within-pair association.

Key issue. Therefore, what I want to do is measure the ongoing effect of continuous absolute BMI difference on the within-pair association across all measurements.

I was wondering whether I could do this by creating a lagged BMI difference variable which, for each observation, measures the absolute difference at the previous measurement, as follows: group_by(hhid) %>% mutate(lag_diff = abs(lag(bmi) - lag(BMI))). My idea was then to fit this variable as an interaction effect, like this:

lmer(bmi ~ BMI*lag_diff + (BMI|CoupleID), data = d5)

In this model, the lagged difference significantly moderates the association (once again, larger differences = weaker association).
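To make the lagged variable concrete, here is a minimal sketch on toy data. The values are made up, and it assumes that bmi and BMI are the two partners' z-scores and that observations gives the measurement order; toy and toy_lagged are illustrative names.

```r
library(dplyr)

# Toy data: two couples, three measurements each (made-up values).
toy <- tibble(
  CoupleID     = c(1, 1, 1, 2, 2, 2),
  observations = c(1, 2, 3, 1, 2, 3),
  bmi          = c(-0.65, -0.49, -0.62, 0.30, 0.10, 0.25),
  BMI          = c(-0.08, -1.04,  0.47, 1.10, 0.90, 0.80)
)

toy_lagged <- toy %>%
  arrange(CoupleID, observations) %>%  # lag() depends on row order
  group_by(CoupleID) %>%               # so lags never cross couples
  # Absolute partner difference at the previous measurement;
  # the first row of each couple has no previous value and becomes NA.
  mutate(lag_diff = lag(abs(bmi - BMI))) %>%
  ungroup()

print(toy_lagged)
```

Each couple's first measurement ends up with lag_diff = NA, and with R's default na.action those rows are dropped when the model is fit, so the lagged model uses one fewer observation per couple than the original.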

So, 1) does this model make sense? 2) Is there a better way to achieve what I want to achieve?

I've provided an example of the data structure, including all the new variables I've created, below (sorry it's a bit brief):

CoupleID    bmi    BMI  baseline_diff  lag_diff  observations
1         -0.65  -0.08           0.47      0.47             1
1         -0.49  -1.04           0.47      0.56             2
1         -0.62   0.47           0.47      0.54             3
1         -0.45   0.42           0.47      1.09             4
1         -0.48  -0.49           0.47      0.87             5

Thank you in advance for any help!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow