A review on longitudinal data analysis with random forest

Jianchang Hu, Silke Szymczak


In longitudinal studies, variables are measured repeatedly over time, leading to clustered and correlated observations. If the goal of the study is to develop prediction models, machine learning approaches such as the powerful random forest (RF) are often promising alternatives to standard statistical methods, especially in the context of high-dimensional data. In this paper, we review extensions of the standard RF method for the purpose of longitudinal data analysis. Extension methods are categorized according to the data structures for which they are designed. We consider both univariate and multivariate response longitudinal data and further categorize the repeated measurements according to whether the time effect is relevant. Even though most extensions are proposed for low-dimensional data, some can be applied to high-dimensional data. Information on available software implementations of the reviewed extensions is also given. We conclude with a discussion of the limitations of our review and some future research directions.

Original language: English
Journal: Briefings in Bioinformatics
Issue number: 2
Publication status: Published - 19.03.2023

