Abstract
Medical research often studies composite endpoints, which combine multiple clinical events to assess the efficacy of treatments. A common practice is to analyze only the time to the first event. However, this approach ignores outcomes that occur after the first event, resulting in a loss of information. Furthermore, the terminal event is not only of interest in its own right but also acts as a competing risk for the other types of outcomes. Although existing semi-parametric regression models can analyze both fatal (terminal) and non-fatal composite events, they do not account for potential nonlinear covariate effects on the logarithm of the rate function. To address this gap, we introduce random forests for composite endpoints (Rforce), where the composite endpoint consists of non-fatal composite events and terminal events. Rforce uses generalized estimating equations to build trees and handles the dependent censoring induced by terminal events through the concept of pseudo-at-risk duration. Simulation studies and a real data analysis demonstrate the performance of Rforce.