Abstract
Chronic lymphocytic leukemia (CLL) cells may bear mutations in IGHV genes, and the 2% cutoff discriminates two subsets, unmutated (U)-CLL and mutated (M)-CLL, with different clinical courses. IGHV genes may also accumulate additional ongoing mutations, a phenomenon known as intraclonal diversification (ID). Here, through an original bioinformatic workflow for NGS data, we used the inverse Simpson Index (iSI) as a measure of diversity among IGHV sequences to dichotomize cases by ID level into ID(high) (iSI ≥ 1.2) vs. ID(low) (iSI < 1.2), both in CLL (n = 983) and in other lymphoproliferative disorders (LPD; n = 127). In CLL, ID(high) cases accounted for 14.6% and were overrepresented in M-CLL (P = 0.0028), while higher percentages were documented in GC-derived LPD. In M-CLL (n = 396), ID(high) patients (n = 69) experienced a longer time to first treatment than ID(low) patients (P = 0.015), and multivariate analyses (n = 299) confirmed ID as an independent variable. IGHV gene mutations of ID(high) cases had molecular signatures indicating ongoing activity of the activation-induced cytidine deaminase (AID)/Polη-dependent machinery; consistently, ID(high) M-CLL expressed higher levels of AID transcripts than ID(low) M-CLL (P = 0.012). In conclusion, we propose a robust NGS protocol to quantitatively evaluate ID in CLL, demonstrating that: i) all CLL patients presented ID, although to varying degrees; ii) a high degree of ID has clinical relevance, identifying an M-CLL subset with better outcome.
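As an illustration of the diversity measure named above, the following is a minimal sketch of the standard inverse Simpson Index, iSI = 1 / Σ p_i², where p_i is the fraction of reads carrying the i-th distinct IGHV sequence, followed by the 1.2 dichotomization. It is not the authors' workflow: the abstract does not describe how reads are filtered or grouped, so the function names and the per-sequence read counts used as input are assumptions made only for this example.

```python
ISI_THRESHOLD = 1.2  # cutoff stated in the abstract


def inverse_simpson_index(read_counts):
    """Return 1 / sum(p_i^2) for a list of per-sequence read counts.

    read_counts is a hypothetical input: one non-negative count per
    distinct IGHV sequence observed in a sample.
    """
    total = sum(read_counts)
    if total <= 0:
        raise ValueError("at least one read is required")
    # p_i is the relative frequency of each distinct sequence
    return 1.0 / sum((count / total) ** 2 for count in read_counts if count > 0)


def classify_id(read_counts, threshold=ISI_THRESHOLD):
    """Label a sample ID(high) if iSI >= threshold, otherwise ID(low)."""
    if inverse_simpson_index(read_counts) >= threshold:
        return "ID(high)"
    return "ID(low)"


if __name__ == "__main__":
    # Made-up counts for illustration only, not data from the study.
    for counts in ([100], [95, 3, 2], [80, 10, 10]):
        print(counts, round(inverse_simpson_index(counts), 3), classify_id(counts))
```

Under this definition a sample consisting of a single sequence has iSI = 1, and the index grows as reads spread across more subclonal variants, so the 1.2 cutoff separates samples dominated by one sequence from those with appreciable intraclonal spread.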