Genome-wide association studies from spoken phenotypic descriptions: a proof of concept from maize field studies

Abstract

We present a novel approach to genome-wide association studies (GWAS) by leveraging unstructured, spoken phenotypic descriptions to identify genomic regions associated with maize traits. Utilizing the Wisconsin Diversity panel, we collected spoken descriptions of Zea mays ssp. mays traits, converting these qualitative observations into quantitative data amenable to GWAS analysis. First, we determined that visually striking phenotypes could be detected from unstructured spoken phenotypic descriptions. Next, we developed two methods to process the same descriptions to derive the trait plant height, a well-characterized phenotypic feature in maize: (1) a semantic similarity metric that assigns a score based on the resemblance of each observation to the concept of 'tallness' and (2) a manual scoring system that categorizes and assigns values to phrases related to plant height. Our analysis successfully corroborated known genomic associations and uncovered novel candidate genes potentially linked to plant height. Some of these genes are associated with gene ontology terms that suggest a plausible involvement in determining plant stature. This proof-of-concept demonstrates the viability of spoken phenotypic descriptions in GWAS and introduces a scalable framework for incorporating unstructured language data into genetic association studies. This methodology has the potential not only to enrich the phenotypic data used in GWAS and to enhance the discovery of genetic elements linked to complex traits but also to expand the repertoire of phenotype data collection methods available for use in the field environment.
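The two scoring strategies named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the phrase table `HEIGHT_PHRASES`, the reference text `TALL_CONCEPT`, and the function names are hypothetical, and a plain bag-of-words cosine similarity stands in for whatever semantic similarity metric the study actually used. The point is only to show how a free-text description might become one number per plant that a GWAS pipeline could take as a phenotype.

```python
import math
import re
from collections import Counter

# Hypothetical phrase-to-score table (assumption: the paper's real
# categories and values are not given in the abstract).
HEIGHT_PHRASES = {
    "very tall": 2.0,
    "tall": 1.0,
    "average height": 0.0,
    "short": -1.0,
    "very short": -2.0,
}

# Hypothetical reference text standing in for the concept of 'tallness'.
TALL_CONCEPT = "tall plant high stalk"


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def manual_height_score(description):
    """Manual-style scoring: return the value of the longest height phrase
    found in the description, or None if no height phrase is present."""
    text = " ".join(tokenize(description))
    # Try longer phrases first so "very tall" wins over "tall".
    for phrase in sorted(HEIGHT_PHRASES, key=len, reverse=True):
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            return HEIGHT_PHRASES[phrase]
    return None


def cosine_similarity(a, b):
    """Cosine similarity between the word-count vectors of two texts.
    This is a lexical stand-in for a true semantic similarity metric."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def tallness_score(description):
    """Similarity-style scoring: resemblance of a description to TALL_CONCEPT."""
    return cosine_similarity(description, TALL_CONCEPT)


if __name__ == "__main__":
    # Made-up example descriptions, not data from the study.
    for d in ["This one is very tall with a thick stalk",
              "Short plant, pale leaves",
              "Purple leaves and tassels"]:
        print(d, "|", manual_height_score(d), "|", round(tallness_score(d), 3))
```

In this sketch the manual score is missing for a description that never mentions height, while the similarity score is defined for every description. That difference is one reason the two approaches could yield different phenotype vectors from the same recordings.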
