Prediction of BRAF V600E variant from cancer gene expression data

Abstract

BACKGROUND: BRAF inhibitors have been approved for the treatment of melanoma, non-small cell lung cancer, and colon cancer. Real-time polymerase chain reaction and next-generation sequencing have been used clinically to detect BRAF variants and select patients likely to respond to BRAF inhibitors. Predicting BRAF variants from gene expression data might serve as an alternative test when direct variant sequencing is not feasible. In this study, we built a prediction model to detect BRAF V600 variants from mRNA gene expression data in various cancer types.

METHODS: We adopted penalized logistic regression as the prediction model for BRAF V600E variants. Bootstrap resampling was performed ten times, stratified on the combination of the target variable and cancer type. Data preprocessing included k-nearest-neighbor imputation for missing values, Yeo-Johnson transformation for skewness correction, centering and scaling for standardization, and the synthetic minority over-sampling technique (SMOTE) for class imbalance. Hyperparameters were optimized by grid search, with models selected by area under the precision-recall curve.

RESULTS: The area under the receiver operating characteristic curve on the test set was 0.98 in thyroid carcinoma, 0.90 in colon adenocarcinoma, and 0.85 in cutaneous melanoma. The area under the precision-recall curve on the test set was 0.98 in thyroid carcinoma, 0.71 in colon adenocarcinoma, and 0.65 in cutaneous melanoma.

CONCLUSIONS: Our penalized logistic regression model can predict BRAF V600E variants with good performance in thyroid carcinoma, cutaneous melanoma, and colon adenocarcinoma.
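
The abstract reports two metrics: area under the receiver operating characteristic curve and area under the precision-recall curve. It does not say how they were computed, so the following is only a minimal sketch of the textbook definitions in plain Python. The function names `roc_auc` and `average_precision` and the toy labels and scores are hypothetical and are not taken from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.

    Sums precision at each distinct score threshold, weighted by the
    increase in recall at that threshold.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive sample")
    # Rank samples from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    tp = 0      # true positives found so far
    ap = 0.0
    i = 0
    while i < len(ranked):
        # Group samples that share the same score into one threshold.
        j = i
        tp_group = 0
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp_group += ranked[j][1]
            j += 1
        tp += tp_group
        precision = tp / j              # j samples are predicted positive
        recall_gain = tp_group / n_pos  # recall increase at this threshold
        ap += recall_gain * precision
        i = j
    return ap


# Toy example (made-up values): 2 variant-positive samples among 10.
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

print(round(roc_auc(toy_labels, toy_scores), 4))            # 0.9375
print(round(average_precision(toy_labels, toy_scores), 4))  # 0.8333
```

In this toy case a single negative ranked above one positive lowers the precision-recall area more than the ROC area. Precision depends on how rare the positive class is, which is one general reason the two metrics can differ on the same test set.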
