Abstract
Conventional small noncoding RNA sequencing (sncRNA-seq) methods suffer from extensive ligation bias, and the resulting low quantitative accuracy commonly limits functional investigation of microRNAs (miRNAs) and PIWI-interacting RNAs (piRNAs). Here, we develop 4NBoost, a single-tube sncRNA-seq protocol that incorporates quantitative exogenous RNA spike-ins and is designed to minimize bias in the estimated absolute abundance of miRNA and piRNA transcripts. With 4NBoost, we profile sncRNA expression across 20 murine tissues, 18 macaque tissues, 24 widely used cell lines, and 4 Arabidopsis tissues to establish a comprehensive quantitative reference atlas. Comparing our data with existing small RNA databases reveals substantial biases in miRNA abundance, strand selection, and tissue-specific expression at both the individual and family levels. To extend the utility of 4NBoost, we use machine learning to model and correct biases in conventional datasets, effectively recovering ground-truth transcript abundances. All 4NBoost data and the accompanying bias-correction model are freely available via SmRNAQuant (http://wulg-lab.sibcb.ac.cn/SmRNAQuant/), a web-based repository for exploring sncRNA expression. Together, 4NBoost, the bias-correction model, and SmRNAQuant provide powerful resources to advance sncRNA research.
