Abstract
BACKGROUND: Speech sound disorders are common in children and are associated with an increased risk of academic reading difficulties. The COVID-19 pandemic further highlighted the need for remote and digitalized assessment tools. In South Korea, standardized instruments such as the Urimal Test of Articulation and Phonation and the Assessment of Phonology and Articulation for Children are widely used but have limitations, including reliance on face-to-face evaluation and the absence of automated scoring. OBJECTIVE: This study aimed to develop and establish the content validity of an articulation assessment tool that overcomes these limitations and can be integrated into digital therapeutics (DTx). METHODS: A 3-round modified Delphi survey was conducted between July and September 2025 with 92% (23/25) of the invited experts, comprising 52.2% (12/23) physiatrists and 47.8% (11/23) speech-language pathologists, with a mean professional experience of 10.69 (SD 5.09) years. All participants (23/23, 100%) completed all rounds. Panelists evaluated the appropriateness of word lists, phonological environments, and scoring criteria. Quantitative analyses, including calculation of the content validity ratio (CVR), content validity index (CVI), median, and IQR, were performed. Consensus thresholds were set at a CVR of ≥0.39, a CVI of ≥0.78, a median of ≥3.5, and an IQR of ≤1.0; items were retained only when all 4 criteria were satisfied. Although formal qualitative analysis was not performed, the research team internally reviewed and synthesized core keywords and themes from the experts' open-ended responses to guide item refinement. RESULTS: The experts' feedback was summarized into 4 key areas: (1) modernization of word stimuli, (2) expansion of phonological coverage, (3) refinement of scoring criteria to reduce ambiguity, and (4) enhancement of result interpretability through visualization.
In round 2, a revised 35-word list was evaluated across 25 items, of which 20 (80%) met all consensus criteria. In total, 20% (5/25) of the items failed to meet at least one threshold, including phonological environment adequacy (CVR=0.48; CVI=0.74), scoring redundancy (CVR=0.13; CVI=0.57), usefulness of proportion of whole-word correctness or percentage of word proximity (CVR=0.39; CVI=0.70), contribution of mean phonological length (CVR=0.22; CVI=0.61), and usefulness of feature-based indexes (CVR=0.30; CVI=0.65; IQR=2). Items that reached consensus showed CVR values of 0.57 to 0.91, CVI values of 0.78 to 0.96, a median score of 4, and IQR values of 0 to 1. In round 3, all remaining items achieved consensus. CONCLUSIONS: This Delphi study developed a novel articulation assessment tool with robust content validity. The tool includes updated word stimuli, diverse analysis indexes, and visualization features, thereby enhancing its clinical utility and suitability for integration into artificial intelligence-based DTx. By standardizing and digitalizing articulation assessments, this tool has the potential to support personalized and accessible interventions for children with speech sound disorders.
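The four consensus thresholds reported above are straightforward to operationalize. The sketch below is not the study's actual scoring code; it assumes the standard Lawshe formulation of CVR, (n_e − N/2)/(N/2), where n_e is the number of panelists rating an item essential, and the common item-level CVI, the share of panelists rating an item 3 or 4 on a 4-point scale. The example panel ratings are hypothetical.

```python
from statistics import median, quantiles

def cvr(n_essential: int, n_panel: int) -> float:
    """Lawshe's content validity ratio: (n_e - N/2) / (N/2)."""
    half = n_panel / 2
    return (n_essential - half) / half

def item_cvi(ratings: list[int]) -> float:
    """Item-level CVI: proportion rating 3 or 4 on a 4-point scale."""
    return sum(r >= 3 for r in ratings) / len(ratings)

def meets_consensus(ratings: list[int], n_essential: int) -> bool:
    """Apply the study's four retention thresholds:
    CVR >= 0.39, CVI >= 0.78, median >= 3.5, IQR <= 1.0."""
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    return (cvr(n_essential, len(ratings)) >= 0.39
            and item_cvi(ratings) >= 0.78
            and median(ratings) >= 3.5
            and (q3 - q1) <= 1.0)

# Hypothetical item from a 23-member panel:
# 16 panelists rate 4, five rate 3, two rate 2; 20 deem it essential.
ratings = [4] * 16 + [3] * 5 + [2] * 2
print(meets_consensus(ratings, n_essential=20))  # all 4 criteria pass
```

With these thresholds, the joint rule is deliberately conservative: an item with an acceptable median and IQR can still be dropped if its CVR or CVI falls short, which matches the round 2 results in which 5 items failed on CVR/CVI despite the panel's generally high ratings.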