Abstract
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) provides quantitative metrics (e.g. K(trans), v(e)) via pharmacokinetic models. We tested inter-algorithm variability in these quantitative metrics with 11 published DCE-MRI algorithms, all implementing Tofts-Kermode or extended Tofts pharmacokinetic models. Digital reference objects (DROs) with known K(trans) and v(e) values were used to assess performance at varying noise levels. Additionally, DCE-MRI data from 15 head and neck squamous cell carcinoma patients over 3 time-points during chemoradiotherapy were used to ascertain K(trans) and v(e) kinetic trends across algorithms. Algorithms performed well (less than 3% average error) when no noise was present in the DRO. With noise, 87% of K(trans) and 84% of v(e) algorithm-DRO combinations were generally in the correct order. Low Krippendorff's alpha values showed that algorithms could not consistently classify patients as above or below the median for a given algorithm at each time point or for differences in values between time points. A majority of the algorithms produced a significant Spearman correlation in v(e) of the primary gross tumor volume with time. Algorithmic differences in K(trans) and v(e) values over time indicate limitations in combining/comparing data from distinct DCE-MRI model implementations. Careful cross-algorithm quality-assurance must be utilized as DCE-MRI results may not be interpretable using differing software.