Abstract
Metaproteomics enables the functional characterization of microbiomes and host-microbe interactions by detecting and quantifying thousands of proteins. In data-dependent acquisition metaproteomics, protein quantification is commonly performed using either MS1-based area under the curve (AUC) or MS2-based peptide spectral counts (SpC). In AUC quantification, match between runs (MBR) is frequently employed to reduce data sparsity, yet its impact on metaproteomic data remains unclear. Understanding this impact is especially important for two reasons: MS1 mass spectra of metaproteomic samples have a high peak density, and not only individual proteins but even entire organisms can be present in one sample and absent from another, both of which complicate accurate feature mapping and transfer. Although accurate quantification is essential for deriving meaningful biological inferences from metaproteomic analyses, systematic evaluations of AUC and SpC quantification in metaproteomics remain scarce. In this study, we used defined complex metaproteomic samples to perform a ground truth-based evaluation of AUC and SpC quantification and to determine the impact of MBR on AUC quantification. We found that MBR led to a substantial number of falsely identified proteins in complex samples: when MBR was used, protein identifications from an organism not present in a sample were wrongly transferred from other samples. We also found that MBR-free AUC data had a wider dynamic range, higher quantitative accuracy, and more sensitive detection of abundance differences.