Abstract
PURPOSE: To evaluate the feasibility of using large multimodal generative models for feature-targeted synthetic augmentation in macular hole detection and segmentation on color fundus photographs. METHODS: We assembled an internal development set of 10 macular hole and 50 normal fundus images from open-source datasets and generated feature-targeted macular hole images using two commercial multimodal engines, Nanobanana Pro and ChatGPT-5. Six augmentation strategies were compared for binary classification with ResNet-50 and for U-Net-based segmentation. External performance was evaluated on two independent datasets, JSIEC and RFMiD. RESULTS: Both Nanobanana Pro and ChatGPT-5 produced visually plausible macular hole–like lesions, and Nanobanana Pro images received higher realism and training suitability scores than ChatGPT-5 images. On JSIEC, combined Nanobanana Pro plus ChatGPT-5 augmentation increased the area under the receiver operating characteristic curve (ROC-AUC) for macular hole detection from 0.78 (baseline) to 0.83, and on RFMiD from 0.79 to 0.86; however, these ROC-AUC differences were not statistically significant by DeLong’s test. For segmentation, the same combined augmentation improved the Dice similarity coefficient from 0.50 to 0.63 on JSIEC and from 0.47 to 0.59 on RFMiD, with statistically significant differences in paired per-image comparisons. CONCLUSION: Feature-targeted synthetic augmentation with multimodal generative models showed promising but not statistically significant gains in macular hole detection and more consistent improvements in segmentation under severe data scarcity. Accordingly, the findings should be interpreted as exploratory, directionally favorable trends rather than definitive evidence of improved detection.
These exploratory findings support the potential of clinician-guided generative augmentation as a practical tool for rare retinal diseases, but larger studies with transparent generative backends and prospective validation are needed before clinical deployment. CLINICAL TRIAL NUMBER: Not applicable. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12886-026-04740-w.