Making the Most of Its Short Reads: A Bioinformatics Workflow for Analysing the Short-Read-Only Data of Leishmania orientalis (Formerly Named Leishmania siamensis) Isolate PCM2 in Thailand

充分利用短读序列:用于分析泰国东方利什曼原虫(原名暹罗利什曼原虫)PCM2分离株短读序列数据的生物信息学工作流程

阅读:1

Abstract

BACKGROUND: Leishmania orientalis (formerly named Leishmania siamensis) has been neglected for years in Thailand. The genomic study of L. orientalis has gained much attention recently after the release of the first high-quality reference genome of the isolate LSCM4. The integrative approach of multiple sequencing platforms for whole-genome sequencing has proven effective at the expense of considerably expensive costs. This study presents a preliminary bioinformatic workflow including the use of multi-step de novo assembly coupled with the reference-based assembly method to produce high-quality genomic drafts from the short-read Illumina sequence data of L. orientalis isolate PCM2. RESULTS: The integrating multi-step de novo assembly by MEGAHIT and SPAdes with the reference-based method using the L. enriettii genome and salvaging the unmapped reads resulted in the 30.27 Mb genomic draft of L. orientalis isolate PCM2 with 3367 contigs and 8887 predicted genes. The results from the integrated approach showed the best integrity, coverage, and contig alignment when compared to the genome of L. orientalis isolate LSCM4 collected from the northern province of Thailand. Similar patterns of gene ratios and frequency were observed from the GO biological process annotation. Fifty GO terms were assigned to the assembled genomes, and 23 of these (accounting for 61.6% of the annotated genes) showed higher gene counts and ratios when results from our workflow were compared to those of the LSCM4 isolate. CONCLUSIONS: These results indicated that our proposed bioinformatic workflow produced an acceptable-quality genome of L. orientalis strain PCM2 for functional genomic analysis, maximising the usage of the short-read data. This workflow would give extensive information required for identifying strain-specific markers and virulence-associated genes useful for drug and vaccine development before a more exhaustive and expensive investigation.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。