imputeqc: an R package for assessing imputation quality of genotypes and optimizing imputation parameters

imputeqc:一个用于评估基因型插补质量和优化插补参数的 R 包

阅读:1

Abstract

BACKGROUND: The imputation of genotypes increases the power of genome-wide association studies. However, the imputation quality should be assessed in each particular case. Nevertheless, not all imputation softwares control the error of output, e.g., the last release of fastPHASE program (1.4.8) lacks such an option. In this particular software there is also an uncertainty in choosing the model parameters. fastPHASE is based on haplotype clusters, which size should be set a priori. The parameter influences the results of imputation and downstream analysis. RESULTS: We present a software toolkit imputeqc to assess the imputation quality and/or to choose the model parameters for imputation. We demonstrate the efficacy of toolkit for evaluation of imputations made with both fastPHASE and BEAGLE software for HapMap and 1000 Genomes data. The discordance of genotypes received correlated well in both methods. Using imputeqc, we also shown how to choose the optimal number of haplotype clusters and expectation-maximization cycles for fastPHASE program. The found number of haplotype clusters of 25 was further applied for hapFLK testing that revealed signatures of selection at LCT region on chromosome 2. We also demonstrated how to decrease the computational time in the case of hapFLK testing from 3 days to 20 h. CONCLUSIONS: The toolkit is implemented as an R package imputeqc and command line scripts. The code is freely available at https://github.com/inzilico/imputeqc under the MIT license.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。