Using longitudinal studies for modelling change over time: an analyst's perspective

In this blog, we reflect on useful elements of longitudinal cohort studies and potential risks to analyses.

Analysing longitudinal cohort datasets is a promising way to gain insights into society. At the National Centre for Social Research, we have been leading a Wellcome Mental Health Data Prize-funded project entitled ‘Connect’, in which we use longitudinal cohort study data to study the role of social connection in the development of depression and anxiety in young people.

A key feature of a longitudinal cohort study is time: we can follow the same group of people over a long period, which gives insight into how individuals and groups change. In Connect, we use growth curve modelling to understand whether the risk of developing depression or anxiety changes over time depending on the social connections a young person has or experiences. With growth curve modelling, one can identify not only who is most at risk, but also when that risk is increased. This can inform the timing of interventions or strategies that aim to prevent a mental health condition.
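
To make the idea of a growth trajectory concrete, here is a minimal sketch in Python. It is not the Connect model: a full growth curve model is usually fitted as a multilevel (mixed-effects) or structural equation model, whereas this two-stage simplification fits a straight line to each person's scores and then averages the slopes within groups. The data, the ages, the `support` grouping and all function names are made up for illustration.

```python
from statistics import mean

def fit_line(ages, scores):
    """Ordinary least squares for one person: returns (intercept, slope)."""
    age_mean, score_mean = mean(ages), mean(scores)
    sxx = sum((a - age_mean) ** 2 for a in ages)
    sxy = sum((a - age_mean) * (s - score_mean) for a, s in zip(ages, scores))
    slope = sxy / sxx
    return score_mean - slope * age_mean, slope

# Hypothetical symptom scores measured at three ages (illustrative only).
ages = [11, 14, 17]
cohort = {
    "p1": {"support": "high", "scores": [2.0, 3.0, 4.0]},
    "p2": {"support": "high", "scores": [1.0, 1.5, 2.0]},
    "p3": {"support": "low", "scores": [2.0, 4.0, 6.0]},
    "p4": {"support": "low", "scores": [3.0, 4.5, 6.0]},
}

def mean_slope_by_group(cohort, ages):
    """Average the person-level slopes within each support group."""
    slopes = {}
    for person in cohort.values():
        _, slope = fit_line(ages, person["scores"])
        slopes.setdefault(person["support"], []).append(slope)
    return {group: mean(values) for group, values in slopes.items()}

group_slopes = mean_slope_by_group(cohort, ages)
```

In this toy data the "low" support group has the steeper average slope, which is the kind of difference in trajectories a growth curve model is designed to estimate, with proper uncertainty and with predictors that can themselves change over time.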

Connect aims to visualise the different trajectories for individuals’ social connections and their circumstances, allowing people to create scenarios to help understand when someone’s risk may be increased, or when it might be lowered. For example, someone might experience a negative life event at age four but have strong social support from a family member at the same timepoint. In this case, we want to study whether that social support at age four could help change the trajectory of depression.

While growth curve modelling is a relevant addition to other analytical methods to study risk and protective factors for depression and anxiety, there are elements that need consideration when modelling the data. Below we explore two of the key considerations we had when studying social connection for depression and anxiety.

The first key consideration is longitudinal attrition: people leaving the study over time because they cannot or do not want to take part anymore. We know that attrition is not the same for all sample members; for example, men and people of lower socio-economic status are more likely to withdraw from a study. This means that the participants you lose are different from the participants you retain, and that this difference increases over time.

Survey weights can help to address some of the bias arising from data with high levels of attrition, making the sample used in the analysis more similar to the original study sample. However, weights do not solve the issue entirely: they are based on what we know of the people who have left the study, but there is much about those people that remains unknown.
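
As a rough illustration of how such weights work, the sketch below uses a simple weighting-class adjustment: each retained participant is weighted by the inverse of the retention rate in their group. This is an assumption made for illustration, not a description of how any particular cohort study derives its weights, and the groups and counts are invented.

```python
from collections import Counter

# Hypothetical baseline sample: (group, still_in_study_at_follow_up).
baseline = (
    [("higher_ses", True)] * 80 + [("higher_ses", False)] * 20
    + [("lower_ses", True)] * 50 + [("lower_ses", False)] * 50
)

def attrition_weights(sample):
    """Weight per group = enrolled / retained (inverse retention rate).

    Assumes every group has at least one retained participant.
    """
    enrolled = Counter(group for group, _ in sample)
    retained = Counter(group for group, stayed in sample if stayed)
    return {group: enrolled[group] / retained[group] for group in enrolled}

weights = attrition_weights(baseline)

# Applying the weights to retained participants restores each group's
# baseline size, so the weighted sample resembles the original one.
weighted_sizes = {
    group: weight * sum(1 for g, stayed in baseline if g == group and stayed)
    for group, weight in weights.items()
}
```

Here the group with more drop-out receives the larger weight. The weights can only correct for characteristics that were measured, such as the grouping variable used here, which is why they do not remove attrition bias entirely.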

Knowing that there is longitudinal attrition, it becomes even more important to consider missingness in predictor variables and whether it reduces the usable sample further. Missingness in predictor data often happens when participants were expected to answer a question but did not. It can also happen when a question was not asked of a participant, either by design or by mistake. In Connect, we carefully reviewed nearly 8000 variables that were of interest as affecting or measuring social connection. We examined their missingness and their conceptual contribution to arrive at a reduced set of nearly 500 variables that are important for analysing social connection and mental health.
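
The missingness part of such a review can be sketched in a few lines of Python. This is only an illustration under assumed names: the variables, the records and the 50% cut-off are hypothetical, and in Connect the decision also rested on each variable's conceptual contribution, which no threshold can capture.

```python
def missing_share(records, variable):
    """Proportion of records where `variable` is absent or None."""
    missing = sum(1 for row in records if row.get(variable) is None)
    return missing / len(records)

def screen_variables(records, variables, max_missing=0.5):
    """Keep variables whose missing share is at most `max_missing`."""
    return [v for v in variables if missing_share(records, v) <= max_missing]

# Hypothetical survey records; None marks a missing answer.
records = [
    {"friends_count": 3, "school_belonging": None, "club_member": 1},
    {"friends_count": None, "school_belonging": None, "club_member": 0},
    {"friends_count": 5, "school_belonging": 4, "club_member": 1},
    {"friends_count": 2, "school_belonging": None, "club_member": None},
]
variables = ["friends_count", "school_belonging", "club_member"]

shares = {v: missing_share(records, v) for v in variables}
kept = screen_variables(records, variables, max_missing=0.5)
```

A screen like this does not distinguish between a question that was skipped and one that was not asked by design, so a variable flagged here would still need a manual look before being dropped.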

The second key consideration is the changing nature of the concept we’re studying. In our case, this is social connection: social relationships change over time, and so do the areas of life in which young people have social connections. For example, only when a child goes to school does it become relevant to ask about school connection. To get an overview of all the different ways in which social connection was measured, and when, we created a social connection taxonomy. Working with subject matter experts and people with lived experience, and drawing on a careful review of missing data, we then arrived at a shorter list of variables we wanted to model.

To conclude, longitudinal data analysis, and growth curve modelling in particular, can be a good way to gain insight into how risk trajectories change over time. There are, though, key points to keep in mind when conducting and interpreting the analyses, including missing data, the selection of predictors and the need for a theoretical understanding of the concept being studied. We hope that our Connect project helps to address some of the challenges that researchers face when modelling social connection and mental health over time.

Connect is a Wellcome-funded Mental Health Data Prize project that allows researchers to get an insight into the role that different social connection exposures can have on the development of mental health over time. Visit the Connect website to learn more.