Abstract
Screening tests are crucial for detecting diseases at preclinical stages, when timely intervention can prevent progression to more severe conditions. Advances in technology have facilitated the development of screening tests based on novel markers, but evaluating their performance using biospecimens from large cohorts remains logistically complex and financially demanding. Two-phase designs offer a cost-effective solution by allowing inference when expensive marker measurements are collected on only a carefully selected subsample. While traditional two-phase designs have primarily targeted estimation of marker-outcome associations, they can be extended to evaluate the clinical performance of a test, including estimation of the positive predictive value (the risk in test positives) and the complement of the negative predictive value (the risk in test negatives). We propose a novel two-phase design for efficiently evaluating the risk-stratification utility of screening tests in distinguishing between high- and low-risk individuals for both current and future disease. Designed for use with biospecimens collected from screening studies, our methodology accommodates cohorts that include both pre-existing cases at an initial screening visit and new cases identified during follow-up. We demonstrate the efficiency gains of our proposed design compared to other subsampling schemes through simulation and illustrate its application in a motivating study evaluating the p16/Ki-67 dual-stain test for managing human papillomavirus-positive women in cervical cancer screening. The analysis uses data and stored biospecimens from Kaiser Permanente Northern California.