Abstract
This study focuses on the TATA-box binding protein of transcription initiation factor TFIID (1tba) and combines structural, sequence, and machine learning-based analyses to investigate its evolutionary relationships. By examining the TATA-box binding protein-like fold, the study aims to identify similarities with other known protein folds, shedding light on its evolutionary relationships and functional connections. To validate these relationships, a support vector machine (SVM) based algorithm was developed and complemented by a second machine learning technique, random forest, to ensure robust and reliable results. The integrated findings from the structural, sequence, and machine learning analyses revealed several domain folds that are evolutionarily related to the TATA-box binding protein-like fold (1tba, B). The SVM-based method developed in this study serves as a valuable tool for identifying novel or functionally similar TATA-box binding proteins, providing deeper insight into their evolutionary and structural relationships. This work not only advances our understanding of the TATA-box binding protein family but also demonstrates the power of integrating computational and machine learning approaches in protein evolution research.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1038/s41598-026-38883-z.