Abstract
Workers frequently struggle to acquire, maintain, and use personal protective equipment (PPE) during infectious disease outbreaks. Strategic PPE distribution, guidance, and interventions can help address these challenges, but the effectiveness of these measures depends on timely characterization of how these challenges manifest across the U.S. workforce, data that no U.S. public health surveillance system currently provides. This article describes a mechanism for generating such data by using a machine learning (ML) model to detect various PPE concerns in workplace safety complaints submitted to the U.S. Occupational Safety and Health Administration (OSHA). A publicly available dataset of 78,770 OSHA complaints received during the COVID-19 pandemic was used to assess the feasibility of this approach. Results demonstrate that these OSHA complaints contained a substantial variety and number of PPE concerns, and that an ML model trained on these data was capable of detecting three types of PPE concerns with at least 90% precision and 90% recall: unavailable or inaccessible PPE, lack of PPE use among workers, and inadequate enforcement of PPE use. Furthermore, analyses of ML-facilitated detections were shown to elucidate national and industry-specific trends in worker PPE concerns. Although further development is needed to accurately detect a broader set of PPE concerns, the results of this study suggest that machine learning can help efficiently repurpose OSHA complaints to generate insightful real-time data on worker PPE concerns during future outbreaks.