Use of a draft genome of coffee (Coffea arabica) to identify SNPs associated with caffeine content

Abstract

Arabica coffee (Coffea arabica) has a small gene pool limiting genetic improvement. Selection for caffeine content within this gene pool would be assisted by identification of the genes controlling this important trait. Sequencing of DNA bulks from 18 genotypes with extreme high- or low-caffeine content from a population of 232 genotypes was used to identify linked polymorphisms. To obtain a reference genome, a whole genome assembly of arabica coffee (variety K7) was achieved by sequencing using short read (Illumina) and long-read (PacBio) technology. Assembly was performed using a range of assembly tools resulting in 76 409 scaffolds with a scaffold N50 of 54 544 bp and a total scaffold length of 1448 Mb. Validation of the genome assembly using different tools showed high completeness of the genome. More than 99% of transcriptome sequences mapped to the C. arabica draft genome, and 89% of BUSCOs were present. The assembled genome annotated using AUGUSTUS yielded 99 829 gene models. Using the draft arabica genome as reference in mapping and variant calling allowed the detection of 1444 nonsynonymous single nucleotide polymorphisms (SNPs) associated with caffeine content. Based on Kyoto Encyclopaedia of Genes and Genomes pathway-based analysis, 65 caffeine-associated SNPs were discovered, among which 11 SNPs were associated with genes encoding enzymes involved in the conversion of substrates, which participate in the caffeine biosynthesis pathways. This analysis demonstrated the complex genetic control of this key trait in coffee.

期刊：	Plant Biotechnology Journal	影响因子：	10.100
时间：	2018	起止号：	2018 Oct;16(10):1756-1766.
doi：	10.1111/pbi.12912	研究方向：	信号转导

Use of a draft genome of coffee (Coffea arabica) to identify SNPs associated with caffeine content

使用咖啡（阿拉比卡咖啡）基因组草图来识别与咖啡因含量相关的 SNP

Abstract

特别声明