Abstract
BACKGROUND: Diffusing capacity for carbon monoxide (Dlco) is a critical measurement for diagnosing and monitoring cardiorespiratory diseases, but its clinical utility is limited by measurement variability. Clinical guidelines lack evidence-based thresholds for distinguishing significant changes from normal variability.
RESEARCH QUESTION: What are the magnitude and determinants of Dlco intersession variability in routine clinical practice, and what evidence-based thresholds can guide interpretation of serial measurements?
STUDY DESIGN AND METHODS: This cohort study analyzed data from 5,069 patients with stable spirometry (defined as < 5% change in both FEV₁ and FVC) and at least 2 hemoglobin-adjusted Dlco measurements. Three variability metrics were assessed: absolute difference (mL/min/mm Hg), absolute difference in percent predicted, and relative percentage difference.
RESULTS: Dlco demonstrated substantial intersession variability that exceeded conventional thresholds, with 90th percentile values of 3.5 mL/min/mm Hg (absolute), 15% (percent predicted), and 21% (relative percentage). Higher baseline Dlco, lower hemoglobin, male sex, and restrictive patterns were associated with greater absolute variability, but these factors explained only a small fraction of total observed variability (R² = 0.114). Test-retest reliability was excellent, but significant heteroscedasticity (Spearman ρ = 0.241, P < .001) demonstrated that measurement variability increases systematically with Dlco magnitude, which explains why no single threshold performs optimally across all baseline Dlco levels. A 3-tier grading system (stable, possible change, and definite change) was developed based on the 85th and 95th percentile thresholds. A novel hybrid approach using baseline Dlco-specific absolute thresholds (≤ 2.5 mL/min/mm Hg for < 50% predicted; ≤ 3.5 mL/min/mm Hg for ≥ 50% predicted) demonstrated more consistent performance across Dlco categories.
INTERPRETATION: In this study, Dlco exhibited substantial intersession variability in clinical settings, exceeding commonly recognized levels. The choice of variability metric influenced how changes were interpreted, especially across different baseline Dlco values. The evidence-based thresholds and 3-tier classification system established in this study provide practical tools for distinguishing clinically significant Dlco changes from normal measurement variability.
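
As an illustration only, the three variability metrics and the hybrid baseline-specific threshold described above might be computed as in the following Python sketch. It is not the study's analysis code. The names `DlcoPair`, `variability_metrics`, `hybrid_threshold`, and `exceeds_hybrid_threshold` are hypothetical, and two points are assumptions: that the relative percentage difference is taken relative to the baseline measurement, and that a difference above the hybrid threshold is read as exceeding expected variability. The abstract does not report the 85th and 95th percentile cut points, so the 3-tier grading is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class DlcoPair:
    """Two hemoglobin-adjusted Dlco measurements from one patient (hypothetical container)."""
    baseline: float            # mL/min/mm Hg
    follow_up: float           # mL/min/mm Hg
    baseline_pct_pred: float   # percent predicted
    follow_up_pct_pred: float  # percent predicted


def variability_metrics(pair: DlcoPair) -> dict:
    """Return the three intersession variability metrics named in the abstract."""
    abs_diff = abs(pair.follow_up - pair.baseline)
    pct_pred_diff = abs(pair.follow_up_pct_pred - pair.baseline_pct_pred)
    # Assumption: the relative difference uses the baseline value as denominator.
    rel_diff = 100.0 * abs_diff / pair.baseline
    return {
        "absolute": abs_diff,                # mL/min/mm Hg
        "percent_predicted": pct_pred_diff,  # percentage points
        "relative": rel_diff,                # percent of baseline
    }


def hybrid_threshold(baseline_pct_pred: float) -> float:
    """Absolute threshold (mL/min/mm Hg) chosen by baseline Dlco percent predicted."""
    # Values from the abstract: 2.5 for < 50% predicted, 3.5 for >= 50% predicted.
    return 2.5 if baseline_pct_pred < 50.0 else 3.5


def exceeds_hybrid_threshold(pair: DlcoPair) -> bool:
    """True when the absolute change is larger than the baseline-specific threshold."""
    # Assumption: changes at or below the threshold count as within expected variability.
    return variability_metrics(pair)["absolute"] > hybrid_threshold(pair.baseline_pct_pred)


if __name__ == "__main__":
    preserved = DlcoPair(20.0, 17.0, 80.0, 68.0)  # baseline >= 50% predicted
    reduced = DlcoPair(10.0, 7.0, 40.0, 28.0)     # baseline < 50% predicted
    for pair in (preserved, reduced):
        print(variability_metrics(pair), exceeds_hybrid_threshold(pair))
```

In this made-up example, both patients show the same 3.0 mL/min/mm Hg absolute change, yet the change falls within the threshold for the patient with a baseline of 80% predicted and above it for the patient with a baseline of 40% predicted. The relative difference also doubles (15% versus 30%) for the same absolute change, which mirrors the abstract's point that the choice of metric affects interpretation across baseline Dlco levels.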