Abstract
MOTIVATION: Long terminal repeat retrotransposons (LTR-RTs) make up a significant portion of the repetitive sequences in many plant species. LTR-RTs are functionally important: they can affect the function of gene families and contribute to the formation of new genes. Investigating the abundance and activity of LTR-RTs is essential for understanding the evolutionary dynamics of species and the mechanisms that drive genome evolution. While current software can predict and initially classify LTR-RTs, there is a strong need for more comprehensive and efficient software that characterizes and quantifies the LTR-RTs involved in burst events and supports their detailed downstream classification, especially given the surging demand for genome annotation. RESULTS: We developed a pipeline called Volcano to accurately classify LTR-RTs and characterize burst families in plants. To distinguish different clades of LTR-RTs, we implemented an improved depth-first search algorithm. Volcano can also quantify LTR-RT expression using RNA-seq data. By analyzing LTR-RTs in three genomes from the Asteraceae family, we observed that larger genomes tend to contain a greater number of LTR-RTs, and that Volcano effectively categorizes them at the clade level. AVAILABILITY AND IMPLEMENTATION: Volcano can be downloaded from https://github.com/Suosihe/volcano_LTR.