Despite providing highly accurate results, the short reads generated by second-generation sequencing have major limitations in mapping complex genomic regions. Longer reads can resolve these issues and additionally phase distant variants. The third-generation sequencing platform from Oxford Nanopore Technologies (ONT) currently achieves the longest sequencing reads but falls short in sequencing accuracy. Additionally, deriving phased haplotypes from amplicon-based NGS data remains a complex and time-consuming task that requires extensive bioinformatic expertise. We constructed an integrative, open-access, modular data-analysis framework that automates the processing of high-throughput sequencing data from both second-generation (Illumina) and third-generation (ONT) sequencing platforms, combining the strengths of both technologies. Variant information is automatically evaluated, and discrepancies are color-coded. Haplotypes are listed by frequency. All parts of the framework can be used independently. The framework's performance was validated using synthetic data and tested with real-life data by analyzing partly homologous FUT1/2/3 sequencing data from 400 blood donors.
Merging High-Throughput, Amplicon-Based Second and Third Generation Sequencing Data: An Integrative and Modular Data Analysis Framework for Haplotype Prediction and Output Evaluation.
Authors: Mink Sylvia, Attenberger Christian, Busch Yannik, Kiefer Johanna, Peter Wolfgang, Cadamuro Janne, Steiert Tim A, Franke Andre, Gassner Christoph
| Journal: | International Journal of Molecular Sciences | Impact factor: | 4.900 |
| Year: | 2025 | Citation: | 2025 Apr 7; 26(7):3443 |
| doi: | 10.3390/ijms26073443 | Research area: | Other |
