Abstract
The application of in-field and aerial spectroscopy to assess functional and phylogenetic variation in plants has led to novel ecological insights and supports global assessments of plant biodiversity. Understanding how plant genetic variation influences reflectance spectra will help harness this potential for biodiversity monitoring and improve understanding of why plants differ in functional responses to environmental change. Here, we use a well-resolved genetic mapping population derived from Multiparent Advanced Generation Inter-cross (MAGIC) lines of Nicotiana attenuata to associate genetic differences with differences in leaf spectra between plants in a field experiment in their natural environment. We measured leaf reflectance spectra with a hand-held spectroradiometer (350–2500 nm) on 616 fully genotyped plants of N. attenuata grown in a randomized block design. We tested three approaches to conducting genome-wide association studies on spectral variants. We introduce a new hierarchical spectral clustering with parallel analysis (HSC-PA) method. This method efficiently captured the variation in our high-dimensional dataset and allowed us to discover a novel association between a locus on chromosome 1 and the 734–1143 nm spectral range, which spans the red-edge and near-infrared regions that are sensitive to leaf structure and photosynthetic activity. This locus contains a candidate gene annotated as carbonic anhydrase, an enzyme involved in CO₂ hydration and regulation of photosynthetic efficiency, suggesting a physiological link between variation in leaf optical properties and carbon assimilation. In contrast, an approach treating single wavelengths as phenotypes identified genetic signals highly consistent with those from HSC-PA, but it suffered from massive statistical redundancy and did not pinpoint significant, interpretable features.
An index-based approach, which reduces complex spectra to a few dimensionless variables, detected two significant associations for ARDSI_Cw (a water-content-related index) with loci on chromosome 1 near genes annotated as a Zeta toxin domain-containing protein and an Exocyst subunit Exo70 family protein. While these findings are biologically plausible, they represent a very narrow subset of the spectral variation captured by HSC-PA. The HSC-PA approach supports a comprehensive understanding of the genetic determinants of leaf spectral variation that is data-driven yet human-interpretable, and it is thus a tool for discovering the genetic differences underlying intraspecific variation, a foundation of biodiversity.