Abstract
To enable fast and sensitive fusion detection, which is critical for clinical oncology testing, we developed Fuzzion2, a pattern-matching program for detecting targeted gene fusions that employs an index of frequency minimizers and fuzzy matching to accommodate sequence variations. Running against 21,736 reference patterns representing chimeric fusions or internal tandem duplications, Fuzzion2 can analyze an unmapped RNA sequencing (RNA-seq) sample in minutes, with a sensitivity exceeding that of state-of-the-art de novo fusion detection methods, as demonstrated by dilution experiments. A comprehensive analysis of 23,478 RNA-seq samples from pediatric cancer, adult cancer, and normal tissues showed cancer type specificity for non-kinase fusions after accounting for multi-tissue recurrences caused by readthrough transcription, germline structural variations, index hopping, and circular RNA expression. Application of Fuzzion2 revealed distinct landscapes of pediatric and adult cancers, and its curated fusion patterns can inform the interpretation of fusions detected by other methods.
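
As a rough illustration of the minimizer-index idea mentioned above, the following Python sketch selects, in each window of consecutive k-mers, the k-mer that is rarest across the pattern set, and uses shared minimizers to shortlist candidate patterns for a read. This is not Fuzzion2's implementation: the function names, the parameters `k`, `w`, and `min_shared`, the tie-breaking rule, and the toy sequences are all assumptions made for illustration, and the sketch assumes that "frequency minimizer" means the least frequent k-mer in a window.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def frequency_minimizers(seq, k, w, freq):
    """Pick one k-mer per window of w consecutive k-mers.

    Assumed rule: the k-mer with the lowest count in `freq` wins;
    ties are broken lexicographically, then by position.
    Returns a sorted list of (position, k-mer) pairs.
    """
    ks = kmers(seq, k)
    chosen = set()
    for start in range(len(ks) - w + 1):
        best = min(range(start, start + w),
                   key=lambda i: (freq.get(ks[i], 0), ks[i], i))
        chosen.add((best, ks[best]))
    return sorted(chosen)


def build_index(patterns, k, w):
    """Map each minimizer k-mer to the names of patterns containing it."""
    freq = Counter(km for p in patterns.values() for km in kmers(p, k))
    index = {}
    for name, p in patterns.items():
        for _, km in frequency_minimizers(p, k, w, freq):
            index.setdefault(km, set()).add(name)
    return index, freq


def candidate_patterns(read, index, freq, k, w, min_shared=1):
    """Shortlist patterns sharing at least min_shared minimizers with read."""
    hits = Counter()
    for km in {km for _, km in frequency_minimizers(read, k, w, freq)}:
        for name in index.get(km, ()):
            hits[name] += 1
    return {name for name, n in hits.items() if n >= min_shared}


# Hypothetical toy patterns; real patterns would span fusion junctions.
patterns = {"A_B": "ACGTACGTTTGACCAGGT", "C_D": "GGGGCCCCAAAATTTTGG"}
index, freq = build_index(patterns, k=4, w=3)
```

In a sketch like this, shortlisted patterns would then be compared to the read with a tolerant (fuzzy) alignment step, since a read carrying a few mismatches can still share minimizers with its pattern in the unaffected windows.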