Abstract
INTRODUCTION: Bedtime procrastination is defined as deliberately delaying sleep when no external circumstances prevent it. One of the most frequently used scales in this field is the Bedtime Procrastination Scale (BPS). The original form of the scale consists of nine items rated on a 5-point Likert scale. The BPS has been applied in many cultures, both in the language in which it was developed and in adaptations to other languages. This study aims to examine the reliability coefficients reported for the BPS in different studies using meta-analytic methods and to determine the average effect size for the scale. METHOD: The Scopus, ProQuest, Web of Science, ScienceDirect, EBSCO, and Google Scholar databases were searched for the period 2014 to 2025 using the keyword "Bedtime Procrastination Scale," and analyses were performed on 128 reliability coefficients (127 for α and 11 studies for ω). The Bonett transformation was used to obtain the average reliability coefficient. RESULTS: Cronbach's alpha (α) was estimated at 0.855 [95% CI (0.843, 0.865)], and McDonald's omega (ω) was estimated at 0.867 [95% CI (0.834, 0.894)]. No evidence of publication or reporting bias was found for either reliability coefficient; however, the magnitude of heterogeneity indicated that moderator analyses were warranted to explain systematic variability across studies. The moderator analyses showed that mean age, SD of age, region, and sample group were significant moderators of Cronbach's alpha, whereas only sample group was a significant moderator of McDonald's omega. DISCUSSION: Overall, the findings indicate that the Bedtime Procrastination Scale demonstrates high and acceptable reliability across studies for both Cronbach's alpha and McDonald's omega.
Although age, region, and sample type emerged as significant moderators of Cronbach's alpha, a substantial proportion of heterogeneity remained unexplained, indicating that variability in reliability cannot be attributed to a single set of study characteristics. Although reliability was generally adequate, the observed heterogeneity and wide prediction intervals suggest that caution is warranted when the scale is used in high-stakes or critical decision-making contexts. Recommendations are provided for both researchers and practitioners.
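
The Bonett transformation mentioned in the METHOD section can be illustrated with a minimal sketch. It assumes the commonly cited form L = ln(1 - α) with approximate sampling variance 2k / ((k - 1)(n - 2)), where k is the number of items and n is the sample size. The function names and the example values (α = 0.85, k = 9, n = 200) are hypothetical and are not taken from the study; the sketch covers only the transformation of a single coefficient, not the pooling model used in the analysis.

```python
import math


def bonett_transform(alpha, k, n):
    """Return (L, var): L = ln(1 - alpha) and its approximate sampling variance.

    Assumed form of the Bonett transformation; k = number of items, n = sample size.
    """
    L = math.log(1.0 - alpha)
    var = 2.0 * k / ((k - 1) * (n - 2))
    return L, var


def back_transform(L):
    """Map a transformed value back to the alpha metric."""
    return 1.0 - math.exp(L)


def alpha_ci(alpha, k, n, z=1.96):
    """Approximate confidence interval for a single alpha, built on the transformed scale."""
    L, var = bonett_transform(alpha, k, n)
    half = z * math.sqrt(var)
    # L decreases as alpha increases, so L + half gives the lower alpha bound.
    return back_transform(L + half), back_transform(L - half)


# Hypothetical single-study example: a nine-item scale, n = 200, alpha = 0.85.
lower, upper = alpha_ci(0.85, k=9, n=200)
```

In a meta-analysis, the transformed values and their variances would be pooled on the transformed scale and the pooled estimate back-transformed to the α metric.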