Enhanced protein isoform characterization through long-read proteogenomics

Authors: Rachel M Miller, Ben T Jordan, Madison M Mehlferber, Erin D Jeffery, Christina Chatzipantsiou, Simi Kaur, Robert J Millikin, Yunxiang Dai, Simone Tiberi, Peter J Castaldi, Michael R Shortreed, Chance John Luckey, Ana Conesa, Lloyd M Smith, Anne Deslattes Mays, Gloria M Sheynkman

Journal: Genome Biology

Impact factor: 10.100

Year: 2022

Citation: 2022 Mar 3;23(1):69.

DOI: 10.1186/s13059-022-02624-y

Research area: Signal transduction

Background

The detection of physiologically relevant protein isoforms encoded by the human genome is critical to biomedicine. Mass spectrometry (MS)-based proteomics is the preeminent method for protein detection, but isoform-resolved proteomic analysis relies on accurate reference databases that match the sample; neither a subset nor a superset database is ideal. Long-read RNA sequencing (e.g., PacBio or Oxford Nanopore) provides full-length transcripts which can be used to predict full-length protein isoforms.
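To make the last point concrete, the sketch below shows one simple way a full-length transcript sequence could be turned into a predicted protein sequence: scan the three forward reading frames, keep the longest complete open reading frame (ATG to stop codon), and translate it with the standard genetic code. This is only an illustrative heuristic under stated assumptions; the function name `longest_orf_protein` and the example sequence are hypothetical, and nothing here is taken from the authors' pipeline, whose ORF-calling method is not described in this abstract.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def longest_orf_protein(transcript: str) -> str:
    """Return the protein encoded by the longest complete forward-strand ORF.

    Hypothetical helper for illustration: an ORF must start with ATG and end
    with an in-frame stop codon. Returns "" if no complete ORF is found.
    """
    seq = transcript.upper().replace("U", "T")  # accept RNA or cDNA input
    best = ""
    for frame in range(3):
        protein = []
        in_orf = False
        for i in range(frame, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[i:i + 3], "X")  # "X" for ambiguous codons
            if not in_orf:
                if aa == "M":  # start codon opens a new ORF
                    in_orf = True
                    protein = ["M"]
            elif aa == "*":  # stop codon closes the ORF
                candidate = "".join(protein)
                if len(candidate) > len(best):
                    best = candidate
                in_orf = False
            else:
                protein.append(aa)
    return best


if __name__ == "__main__":
    # Toy transcript (made up for this example), ORF in the third frame.
    print(longest_orf_protein("GGATGGCCAAATAGCC"))  # MAK
```

Applying such a function to every transcript in a sample would give a sample-matched set of candidate protein sequences, which is the kind of database the Background contrasts with subset and superset references.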

Results

We describe here a long-read proteogenomics approach for integrating sample-matched long-read RNA-seq and MS-based proteomics data to enhance isoform characterization. We introduce a classification scheme for protein isoforms, discover novel protein isoforms, and present the first protein inference algorithm for the direct incorporation of long-read transcriptome data to enable detection of protein isoforms previously intractable to MS-based detection. We have released an open-source Nextflow pipeline that integrates long-read sequencing in a proteomic workflow for isoform-resolved analysis.

Conclusions

Our work suggests that the incorporation of long-read sequencing and proteomic data can facilitate improved characterization of human protein isoform diversity. Our first-generation pipeline provides a strong foundation for future development of long-read proteogenomics and its adoption for both basic and translational research.
