Abstract
This study aims to develop comprehensive real operational datasets from three distinct building types (a large-scale office, an auditorium, and a hospital) for Automated Fault Detection and Diagnosis (AFDD), focusing on Air Handling Units (AHUs) equipped with Constant Air Volume (CAV) systems. Although a consistent methodological framework was followed, data collection and preparation were adapted to each building's operational characteristics. The key procedures were: (1) raw data collection customized to each building's requirements; (2) identification and removal of missing or duplicated data points; (3) systematic annotation of operational conditions and fault categories; and (4) division of each dataset into training, validation, and test subsets according to that building's data features. The resulting datasets enable researchers and developers to train, refine, and evaluate machine learning and diagnostic models for AFDD in AHU systems. Facility operators can then integrate the validated AFDD models into existing management systems to automate fault detection and support reliable AHU operation.
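As an illustration only, preparation steps (2) to (4) might look like the following minimal Python sketch. The field names (`timestamp`, `supply_air_temp`), the fault label `stuck_damper`, the synthetic readings, and the chronological 70/15/15 split are all assumptions made for this example; the abstract does not specify the actual schema, fault categories, or split strategy used for any of the three buildings.

```python
from datetime import datetime, timedelta


def clean(records):
    """Step 2: drop rows with missing values and rows with a repeated timestamp."""
    seen, out = set(), []
    for row in sorted(records, key=lambda r: r["timestamp"]):
        if any(value is None for value in row.values()):
            continue  # missing measurement
        if row["timestamp"] in seen:
            continue  # duplicated data point
        seen.add(row["timestamp"])
        out.append(row)
    return out


def annotate(rows, fault_windows):
    """Step 3: label each row 'normal' or with the fault active at its timestamp."""
    labelled = []
    for row in rows:
        label = "normal"
        for start, end, name in fault_windows:
            if start <= row["timestamp"] < end:
                label = name
                break
        labelled.append({**row, "label": label})
    return labelled


def chronological_split(rows, train=0.7, val=0.15):
    """Step 4 (assumed strategy): split time-ordered rows without shuffling."""
    n = len(rows)
    i = round(n * train)
    j = i + round(n * val)
    return rows[:i], rows[i:j], rows[j:]


# Synthetic one-minute readings (hypothetical values, not from the datasets).
t0 = datetime(2024, 1, 1)
raw = [
    {"timestamp": t0 + timedelta(minutes=k), "supply_air_temp": 18.0 + 0.1 * k}
    for k in range(20)
]
raw.append(dict(raw[3]))  # a duplicated data point
raw.append({"timestamp": t0 + timedelta(minutes=20), "supply_air_temp": None})

# One hypothetical fault interval covering minutes 10 to 14.
fault_windows = [(t0 + timedelta(minutes=10), t0 + timedelta(minutes=15), "stuck_damper")]

rows = annotate(clean(raw), fault_windows)
train_set, val_set, test_set = chronological_split(rows)
print(len(train_set), len(val_set), len(test_set))
```

A chronological split is used here because it keeps later observations out of the training subset, which is a common precaution for time-series data. Other strategies, such as splitting by fault episode, would fit the same outline.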