Abstract
Eye movement studies of reading require careful cleaning of raw data into a processed format that can then be analyzed. Researchers make many decisions while cleaning and analyzing these datasets, so a wide range of approaches is possible. At present, little is known about the consequences of these decisions; in a worst-case scenario, specific approaches to cleaning and analysis could "create" effects that would otherwise not be present in the data. Here, we addressed these issues by conducting a multiverse analysis covering a range of reasonable and defensible analyses that researchers in this field might conduct. We examined a total of 1,890 different data cleaning and analytic pipelines to explore how these decisions influence perhaps the most well-known effect in eye movements and reading research: the word frequency effect. More specifically, we explored their impact on the size of the word frequency effect during sentence reading (Lee et al., Journal of Experimental Psychology: Learning, Memory, and Cognition, 2025). The frequency effect was extremely robust and present in almost all cases, but its magnitude varied substantially, with 36% of the size of the effect being due to specific choices made during data cleaning and analysis. Based on our findings, we set out recommendations for further work and greater transparency in the field.