Abstract
RNA-seq, widely used for gene expression profiling, provides nucleotide-level genome coverage and summary gene expression values. Low-expressed genes are generally ignored because of their unfavorable signal-to-noise ratio; however, these genes may offer crucial information, for example in detecting rare cells in bulk tissues. In this study, we applied an approach that transforms the expression levels of low-expressed genes into a robust dichotomized on/off state by leveraging similarities in transcript coverage shape. Applying this approach to three human cancer cohorts from The Cancer Genome Atlas (TCGA), chosen based on tissue morphology and anatomic site, we identified the "offonome": genes near the detection limit that are consistently or occasionally off across samples. Genes in the offonome spectrum proved useful for supervised and unsupervised applications, including characterizing oncogenic pathways and identifying rare populations of cells in bulk tissue. Interrogating the offonome is relevant to bulk tumor analyses such as TCGA and may expedite gene investigation in low-input settings such as single-cell RNA-seq.