Abstract
Distribution-as-response regression problems are gaining wider attention, especially within biomedical settings where observation-rich patient specific data sets are available, such as feature densities in CT scans (Petersen et al., 2021), actigraphy (Ghosal et al., 2023), and continuous glucose monitoring (Coulter et al., 2024; Matabuena et al., 2021). To accommodate the complex structure of such problems, Petersen & Müller (2019) proposed a regression framework called Fréchet regression which allows non-Euclidean responses, including distributional responses. This regression framework was further extended for variable selection by Tucker et al. (2023), and Coulter et al. (2024) developed a fast variable selection algorithm for the specific setting of univariate distributional responses equipped with the 2-Wasserstein metric (2-Wasserstein space). We present fastfrechet, an R package providing fast implementation of these Fréchet regression and variable selection methods in 2-Wasserstein space, with resampling tools for automatic variable selection. fastfrechet makes distribution-based Fréchet regression with resampling-supplemented variable selection readily available and highly scalable to large data sets, such as the UK Biobank (Doherty et al., 2017).