A Comparison of High Dimensional Variable Selection Methods with Missing Covariates in a Prostate Cancer Study

Abstract

Prostate cancer is the most common cancer in American men. Biological research has shown that dozens of specific genes are correlated with prostate cancer and with whether a case is benign or non-benign. In this paper, we apply a penalized logistic regression model with different penalty functions to select genes associated with benign versus non-benign cases, using data from a prostate cancer study. The tuning parameter is chosen by cross-validation. Some genes that biological research has classified as prostate cancer genes have missing values; to include them in the analysis, we use multiple imputation to create complete data sets. We analyze the prostate cancer data by comparing the selection results obtained from the completely observed data alone with those obtained from the imputed data. We also conduct a simulation study to validate our proposed method.
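
The abstract names neither the penalty functions nor the imputation model, so the following is only a minimal sketch of the general workflow under stated assumptions: a lasso (L1) penalty fitted by proximal gradient descent, k-fold cross-validation over a grid of tuning parameters, and a deliberately crude imputation that draws each missing value from a normal distribution matched to the observed values of its column. All function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink v toward zero by t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_lasso_logistic(X, y, lam, step=0.1, iters=300):
    """Proximal gradient descent for L1-penalized logistic regression.
    Returns (intercept, coefficients); the intercept is not penalized."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(iters):
        resid = [sigmoid(b0 + sum(b * x for b, x in zip(beta, row))) - yi
                 for row, yi in zip(X, y)]
        b0 -= step * sum(resid) / n
        for j in range(p):
            grad_j = sum(r * row[j] for r, row in zip(resid, X)) / n
            beta[j] = soft_threshold(beta[j] - step * grad_j, step * lam)
    return b0, beta

def neg_log_lik(X, y, b0, beta):
    # Held-out loss used to compare tuning parameters.
    total = 0.0
    for row, yi in zip(X, y):
        p_hat = sigmoid(b0 + sum(b * x for b, x in zip(beta, row)))
        p_hat = min(max(p_hat, 1e-12), 1 - 1e-12)
        total -= yi * math.log(p_hat) + (1 - yi) * math.log(1 - p_hat)
    return total

def choose_lambda(X, y, lambdas, k=5):
    """Return the lambda with the lowest k-fold held-out negative log-likelihood."""
    n = len(X)
    best_lam, best_loss = None, float("inf")
    for lam in lambdas:
        loss = 0.0
        for fold in range(k):
            train = [i for i in range(n) if i % k != fold]
            test = [i for i in range(n) if i % k == fold]
            b0, beta = fit_lasso_logistic([X[i] for i in train],
                                          [y[i] for i in train], lam)
            loss += neg_log_lik([X[i] for i in test],
                                [y[i] for i in test], b0, beta)
        if loss < best_loss:
            best_lam, best_loss = lam, loss
    return best_lam

def impute_once(X, rng):
    """Crude stand-in for one imputation: fill each None with a normal draw
    using the observed column mean and standard deviation.
    Assumes every column has at least one observed value."""
    filled = [list(row) for row in X]
    for j in range(len(X[0])):
        obs = [row[j] for row in X if row[j] is not None]
        mean = sum(obs) / len(obs)
        sd = math.sqrt(sum((v - mean) ** 2 for v in obs) / max(len(obs) - 1, 1))
        for row in filled:
            if row[j] is None:
                row[j] = rng.gauss(mean, sd)
    return filled

def selection_frequency(X_missing, y, lam, m=5, seed=0):
    """Fit on m imputed data sets; count how often each coefficient is nonzero."""
    rng = random.Random(seed)
    counts = [0] * len(X_missing[0])
    for _ in range(m):
        _, beta = fit_lasso_logistic(impute_once(X_missing, rng), y, lam)
        for j, b in enumerate(beta):
            counts[j] += int(b != 0.0)
    return counts

def make_toy_data(n=120, p=6, seed=1):
    """Synthetic data in which only the first two covariates affect the outcome."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(2.0 * r[0] - 2.0 * r[1]) else 0 for r in X]
    return X, y
```

A typical use would be to call `choose_lambda` on the completely observed data, then pass the chosen value to `selection_frequency` on the data with missing entries (marked as `None`) and compare which covariates are selected in each case. A real analysis would likely replace the column-wise normal draws with an imputation model that conditions on the other covariates.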
