Abstract
BACKGROUND: Plants are constantly exposed to stressful environments, including heat and drought stress, which severely impair their growth, development, and productivity. To overcome such challenges, plants have evolved a diverse array of defense mechanisms. Among these strategies, the expression and evolution of heat stress-tolerant proteins are crucial: such proteins protect cellular structures, maintain cellular homeostasis, and help the plant overcome stress conditions. Although several studies have sought to identify heat- and cold-stress-tolerant proteins, studies based on the physicochemical properties of the proteins remain scarce. We therefore used melting temperature to identify heat- and cold-tolerant proteins in Arabidopsis thaliana.
RESULTS: The study elucidated the thermal properties of the entire A. thaliana proteome using the melting temperature (Tm) and the melting temperature index (TI). In total, 48,359 protein sequences were analyzed, and each protein was assigned to one of three groups according to its melting temperature (Tm < 55°C, 55-65°C, and > 65°C). The TI of the A. thaliana proteome ranged from -15.6008 (Tm < 55°C) to 9.605 (Tm > 65°C). Of the analyzed proteins, 22,826 fell in the 55-65°C Tm group, 20,640 in the > 65°C group, and only 4,893 in the < 55°C group. The mediator of RNA polymerase II transcription subunit-like protein had the highest TI (9.60), while the NADH dehydrogenase 5B subunit had the lowest (-15.60). Amino acid composition analysis revealed that the frequencies of Ala, Asp, Glu, Gly, Lys, Gln, and Val increased with increasing Tm, while those of Cys, Phe, and Trp decreased. The molecular mass of the A. thaliana proteome ranged from 0.149 to 611.888 kDa, and proteins in the 55-65°C Tm group showed the highest average molecular mass. Machine learning analysis indicated that molecular mass was positively correlated with the Tm of the proteins. Codon usage analysis showed that codon pairs occurred in a Tm group-specific manner, with the ATG-ATG and CAA-CAA pairs predominant. Relative synonymous codon usage across the three Tm groups revealed that AGA (Arg) and CCA (Pro) were the preferred codons in the low and high Tm group DNA sequences, respectively. Codon context analysis revealed Tm group-specific preferences in codon pairing and a distinct clustering pattern in the high Tm protein group. The nucleotide positions of the codons also varied among the Tm groups. An evolutionary analysis revealed that gene duplication was the predominant evolutionary feature, and all of the studied genes in the three Tm groups had undergone duplication. The study underscores the role of amino acid composition, molecular mass, and codon usage in determining the thermal stability of proteins in A. thaliana.
CONCLUSION: The study reflects the evolution of high Tm-adapting genes through gene duplication, highlighting the role of gene and genome evolution in encoding high Tm proteins for stress resilience.
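
The relative synonymous codon usage (RSCU) values mentioned in the results can be illustrated with a minimal sketch. The abstract does not state which RSCU formula or software was used, so the code below assumes the common definition: the observed count of a codon divided by the mean count of all codons encoding the same amino acid. The function name `rscu`, the use of the standard genetic code, and the exclusion of stop codons are assumptions made for illustration, not the authors' pipeline.

```python
from collections import Counter, defaultdict

# Standard genetic code in the conventional T, C, A, G ordering (assumption:
# nuclear coding sequences; organellar genes may use a different table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

# Group codons by the amino acid they encode, leaving out stop codons.
SYNONYMS = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    if amino_acid != "*":
        SYNONYMS[amino_acid].append(codon)


def rscu(cds_sequences):
    """Return {codon: RSCU} pooled over an iterable of in-frame coding sequences.

    Hypothetical helper: RSCU = codon count * number of synonymous codons
    / total count for that amino acid. Amino acids that never occur are omitted.
    """
    counts = Counter()
    for seq in cds_sequences:
        seq = seq.upper().replace("U", "T")
        # Walk the sequence codon by codon; a trailing partial codon is ignored.
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    values = {}
    for amino_acid, codons in SYNONYMS.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values
```

Under this definition, a value above 1 means the codon is used more often than expected under equal usage of its synonyms. For the toy input `rscu(["AGAAGAAGACGT"])`, arginine is encoded four times across six synonymous codons, so AGA scores 3 × 6 / 4 = 4.5. Running the function separately on the coding sequences of each Tm group would give per-group values that could then be compared.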