Abstract
Linseed (Linum usitatissimum L.), a member of the Linaceae family, is a versatile crop valued for its oil, fibre, nutritional and medicinal applications. Recognized as a superfood, linseed is rich in omega-3 fatty acids (~55%), lignans, high-quality proteins, dietary fibre and bioactive secondary metabolites. Previously published genome assemblies of linseed are highly fragmented. In this study, we present a telomere-to-telomere (T2T) chromosome-scale genome assembly of the Indian linseed variety T397 using advanced sequencing approaches. The assembly comprises ~595 Mb of genomic sequence, with a scaffold N50 of 32.86 Mb, spanning 15 chromosomes and including 29 telomeres and 15 centromeres. A total of 34 572 protein-coding genes were predicted, with an average length of 2980.7 bp and an average of 5.0 exons per gene. Gene family analysis revealed a considerable number of unique genes in linseed and a close relationship with Manihot esculenta and Ricinus communis. The higher expression of oleosin and FAD3 genes in linseed highlights their roles in oil accumulation and omega-3 fatty acid enrichment. The metabolites found in the seeds were enriched for the biosynthesis of unsaturated fatty acids. Several potential key structural genes and transcription factors that regulate oil metabolism, especially unsaturated fatty acid biosynthesis, have been identified. Overall, the present study provides genomic resources for accelerated genetic studies and improvement of linseed.