Abstract
Unraveling the gene regulatory networks that underlie diseases is essential for understanding the mechanisms of these conditions. Although various computational strategies have been developed, existing approaches to network-based prediction and classification rely on pre-estimated gene networks. A pre-estimated gene network, however, cannot provide biologically meaningful explanations for classifying cell lines into particular clinical states, because no information about the clinical status of the cell lines enters the network estimation. To classify cell lines effectively and to ensure that the classification is biologically valid, we develop a computational strategy for network-based multi-class classification, referred to as GRN-multiClassifier. GRN-multiClassifier estimates the gene network by simultaneously minimizing the network estimation error and the negative log-likelihood of multinomial logistic regression. That is, our strategy estimates a gene network optimized for the multi-class classification of cell lines into specific clinical conditions. Monte Carlo simulations demonstrate the efficacy of GRN-multiClassifier. We applied our strategy to the network-based classification of acute leukemia cell lines into three distinct categories of acute leukemia, where it shows outstanding classification performance. The identified acute leukemia markers are strongly supported by existing literature. Our findings suggest that pathways involving the inhibition of ACTB and the molecular interactions "HBA1&HBB," "HBB&HBA1," "IGKV1-5&IGHV4-31," "IGHV4-31&IGKV1-5," "HLA-DRA&CD74" and "ACTB&ACTB" could offer significant insights into the underlying mechanism of acute leukemia.
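To make the idea of a jointly minimized objective concrete, the following is a minimal sketch and not the GRN-multiClassifier implementation itself. It assumes a linear network model in which each gene's expression is predicted from the other genes through an edge-weight matrix `B`, assumes that the network-predicted expression is what the classifier receives as input, and combines the two terms with a hypothetical trade-off weight `lam`. The function name `joint_objective`, all argument names, and the omission of sparsity penalties or constraints on `B` are illustrative assumptions.

```python
import math


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def joint_objective(X, y, B, W, bias, lam=1.0):
    """Illustrative joint loss: network estimation error + lam * multinomial NLL.

    X    : n x p expression matrix (rows = cell lines, columns = genes)
    y    : length-n list of class indices in 0..K-1
    B    : p x p edge weights; B[j][k] is the assumed effect of gene j on gene k
    W    : p x K classifier weights (hypothetical parameterization)
    bias : length-K intercepts
    lam  : hypothetical weight balancing the two terms
    """
    n = len(X)
    # Network-predicted expression; also used as classifier input (assumption).
    Z = matmul(X, B)
    # Network estimation error: squared residual of X ~ X B, scaled by 1/(2n).
    recon = sum(
        (x - z) ** 2 for x_row, z_row in zip(X, Z) for x, z in zip(x_row, z_row)
    ) / (2 * n)
    # Negative log-likelihood of multinomial logistic regression, averaged over samples.
    nll = 0.0
    for row, label in zip(matmul(Z, W), y):
        scores = [s + b for s, b in zip(row, bias)]
        m = max(scores)  # subtract the max for a numerically stable log-sum-exp
        log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
        nll += log_norm - scores[label]
    nll /= n
    return recon + lam * nll, recon, nll


# Toy data: 3 cell lines, 2 genes, 3 classes (values are made up for illustration).
X = [[1.0, 2.0], [0.0, 1.0], [2.0, 0.0]]
y = [0, 1, 2]
B0 = [[0.0, 0.0], [0.0, 0.0]]
W0 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
total, recon, nll = joint_objective(X, y, B0, W0, [0.0, 0.0, 0.0])
```

With all parameters set to zero, every class receives equal probability, so the classification term equals log 3 for three classes. Minimizing the combined value over `B`, `W`, and `bias` is what lets the clinical labels influence the estimated network, in contrast to a two-stage pipeline that fixes `B` before classification.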