Abstract
BACKGROUND: Missing values, whether caused by dropouts or by other reasons, are a major challenge in clinical and epidemiological studies. If not dealt with appropriately, they can impair the validity of study findings, bias effect estimates, and reduce statistical power. In this article, we present statistical methods for dealing with missing values, as an aid to the assessment of scientific publications, and compare their suitability for minimizing bias and improving the precision of estimates.

METHODS: A variety of methods for dealing with missing values are presented and discussed on the basis of publications retrieved by a selective search, as well as examples from the authors' personal experience.

RESULTS: When reading a scientific article, one should ascertain how missing values are dealt with, what assumptions are made, and what methods are applied. The underlying mechanism (missing completely at random [MCAR], missing at random [MAR], or missing not at random [MNAR]) determines the choice of suitable analytic methods. Excluding incomplete observations leads to biased estimates unless the mechanism is MCAR. Simple imputation methods, such as mean or regression imputation, generally underestimate the variance, because they treat the imputed values as if they had been observed and thus neglect the uncertainty of the imputation. In contrast, under an MAR mechanism, multiple imputation yields reliable results, because it replaces each missing value several times and thereby takes proper account of estimation uncertainty.

CONCLUSION: Multiple imputation is an effective method for minimizing the bias caused by missing values, but it requires a meticulous examination of the underlying assumptions and of the results.
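
The two variance claims in the RESULTS can be illustrated with a minimal sketch. It is not taken from the article: the data are simulated, values are deleted under an assumed MCAR mechanism, and the function names `mean_impute` and `pool_rubin` are hypothetical. The first function shows why mean imputation shrinks the sample variance; the second combines estimates from several imputed data sets using the commonly used pooling formulas known as Rubin's rules, in which the total variance is the within-imputation variance plus an inflated between-imputation variance.

```python
import random
import statistics


def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    observed_mean = statistics.fmean(observed)
    return [observed_mean if v is None else v for v in values]


def pool_rubin(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules.

    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)      # average within-imputation variance
    between = statistics.variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Simulated example (assumption): 500 values, roughly 30% deleted completely at random.
rng = random.Random(0)
complete = [rng.gauss(50, 10) for _ in range(500)]
with_gaps = [None if rng.random() < 0.3 else v for v in complete]

observed = [v for v in with_gaps if v is not None]
imputed = mean_impute(with_gaps)

var_observed = statistics.variance(observed)
var_imputed = statistics.variance(imputed)

# Every imputed value sits exactly at the mean, so it adds nothing to the
# sum of squared deviations while increasing n: the variance must shrink.
print(f"variance, observed values only: {var_observed:.2f}")
print(f"variance, after mean imputation: {var_imputed:.2f}")

# Pooling three hypothetical estimates, each with its own variance.
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(f"pooled estimate: {pooled_estimate:.2f}, total variance: {total_variance:.3f}")
```

In the pooling step, the between-imputation term is what single imputation lacks: it carries the uncertainty about the imputed values into the final standard error.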