Identification of cell-type-specific, transcriptionally active transposable elements using long-read RNA-sequencing data-based comprehensive annotation

Abstract

BACKGROUND: The biological functions of transposable element (TE)-derived transcripts during physiological development and during disease development and progression have been previously reported. However, research on locus-specific TE-derived transcript expression in various human cell types remains limited.

METHODS: We processed 2596 publicly available human long-read RNA-sequencing (LR RNA-seq) datasets covering 21 organs and 71 cell lines, from both healthy individuals and patients with various conditions, to compile this TE-derived transcript annotation. We established a pipeline for assembling transcripts containing TE sequences to measure transcriptionally active TE-derived transcripts in diverse tissues and cell types. Next, we applied our TE annotation to the Genotype-Tissue Expression (GTEx) single-cell RNA-sequencing (scRNA-seq) data from eight tissues.

RESULTS: We constructed the first transcriptome-based TE annotation using massive amounts of human LR RNA-seq data for use as a comprehensive reference to detect locus-specific TE-derived transcripts. Our annotation showed better detection accuracy for TE-derived transcripts than the RepeatMasker and GENCODE non-TE gene annotations. This annotation enabled the identification of novel TE-derived transcripts and their isoforms. We also identified alternative transcription end sites for long noncoding genes and confirmed previously annotated TE-non-TE gene fusion transcripts. Next, we applied our TE-derived transcript annotation to public scRNA-seq data from various human tissues and identified several cell-type-specific TE-derived transcripts in a locus-specific manner.

CONCLUSIONS: We generated a comprehensive TE-derived transcript annotation using large-scale LR RNA-seq data. Researchers can use our TE reference annotation to analyze active TE transcripts and their splicing isoforms in specific transcriptome datasets and to detect de novo TE transcripts. The discovery of cell-type-specific TE-derived transcripts may help explain mechanisms underlying the maintenance of cellular identity and provide new insights into the pathological mechanisms of various diseases.
