Abstract
The robustness of machine learning (ML) models for voice disorder detection was evaluated using reverberation-augmented recordings, highlighting the impact of data quality. Voice features commonly used in vocal health assessment were extracted from steady vowel samples (135 pathological, 49 controls) and used to train and test six ML classifiers. Detection performance was evaluated on clean recordings and under two simulated room reverberation conditions (short = 0.48 s, long = 1.82 s). Support Vector Machine and k-Nearest Neighbors maintained reliable accuracy under short (acceptable) reverberation, whereas Random Forest achieved the highest accuracy on clean data but failed to generalize to the augmented room conditions. Training and testing ML models on augmented data is therefore essential to improve their reliability in real-world voice assessments.
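As a rough illustration of the augmentation step described above, the sketch below simulates room reverberation by convolving a signal with a synthetic impulse response (exponentially decaying white noise whose energy drops 60 dB after the chosen reverberation time). This is a minimal stand-in; the paper does not specify how its room conditions were simulated, and the function names, sample rate, and test tone here are illustrative assumptions.

```python
import numpy as np

def synth_rir(rt60, sr=16000):
    """Hypothetical synthetic room impulse response: exponentially
    decaying white noise. The decay rate is chosen so the amplitude
    falls by 60 dB (a factor of 1000) after rt60 seconds."""
    n = int(sr * rt60)
    t = np.arange(n) / sr
    rng = np.random.default_rng(0)
    # ln(1000) ~= 6.908, so exp(-6.908 * t / rt60) gives -60 dB at t = rt60
    h = rng.standard_normal(n) * np.exp(-6.908 * t / rt60)
    return h / np.max(np.abs(h))

def add_reverb(signal, rt60, sr=16000):
    """Convolve a dry signal with the synthetic impulse response,
    trim to the original length, and peak-normalize."""
    wet = np.convolve(signal, synth_rir(rt60, sr))[: len(signal)]
    return wet / np.max(np.abs(wet))

sr = 16000
# 1-second 150 Hz tone as a crude stand-in for a sustained vowel
vowel = np.sin(2 * np.pi * 150 * np.arange(sr) / sr)
short = add_reverb(vowel, rt60=0.48, sr=sr)  # "short" condition
long_ = add_reverb(vowel, rt60=1.82, sr=sr)  # "long" condition
```

Features would then be extracted from the clean and reverberant versions alike, so that classifiers can be trained or tested under each condition.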