Abstract
When analyzing proteins in complex samples using tandem mass spectrometry of peptides generated by proteolysis, the inference of proteins can be ambiguous, even with well-validated peptides. Unresolved questions include whether to show all possible proteins vs a minimal list, what to do when proteins are inferred ambiguously, and how to quantify peptides that bridge multiple proteins, each with distinguishing evidence. Here we describe IsoformResolver, a peptide-centric protein inference algorithm that clusters proteins in two ways, one based on peptides experimentally identified from MS/MS spectra, and the other based on peptides derived from an in silico digest of the protein database. MS/MS-derived protein groups report minimal list proteins in the context of all possible proteins, without redundantly listing peptides. In silico-derived protein groups pull together functionally related proteins, providing stable identifiers. The peptide-centric grouping strategy used by IsoformResolver allows proteins to be displayed together when they share peptides in common, providing a comprehensive yet concise way to organize protein profiles. It also summarizes information on spectral counts and is especially useful for comparing results from multiple LC-MS/MS experiments. Finally, we examine the relatedness of proteins within IsoformResolver groups and compare its performance to other protein inference software.
