Autonomous language-image generation loops converge to generic visual motifs


Abstract

Autonomous AI-to-AI creative systems promise new frontiers in machine creativity, yet we show that they systematically converge toward generic outputs. We built iterative feedback loops between Stable Diffusion XL (SDXL; image generation) and the Large Language and Vision Assistant (LLaVA; image description), forming autonomous text → image → text → image cycles. Across 700 trajectories spanning diverse prompts and 7 temperature settings, each run for 100 iterations, all runs converged to nearly identical visuals, a phenomenon we term "visual elevator music." Quantitative analysis revealed just 12 dominant motifs with commercially safe aesthetics, such as stormy lighthouses and palatial interiors. This convergence persisted across model pairs, indicating structural limits in cross-modal AI creativity. The effect mirrors human cultural transmission, where iterated learning amplifies cognitive biases; here, however, diversity collapses entirely as the loops gravitate toward high-probability attractors in the training data. Our findings expose hidden homogenizing tendencies in current architectures and underscore the need for anti-convergence mechanisms and sustained human-AI interplay to preserve creative diversity.
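The loop structure the abstract describes can be sketched in a few lines. The sketch below is illustrative only: `generate_image` and `describe_image` are hypothetical placeholders standing in for SDXL and LLaVA calls (running the real models requires GPU inference and is not shown), and the collapse-to-a-motif behavior is deliberately hard-coded to mimic the attractor dynamics the paper reports, not derived from any model.

```python
def generate_image(prompt: str) -> str:
    # Placeholder for SDXL (text -> image). Here the "image" is just a
    # canonical rendering of the prompt so the loop stays runnable.
    return f"image({prompt.lower().strip()})"

def describe_image(image: str, temperature: float) -> str:
    # Placeholder for LLaVA (image -> text). Caption models drift toward
    # high-probability motifs; we mimic that drift by snapping any complex
    # "image" to one generic motif. `temperature` is unused in this sketch
    # but mirrors the 7 settings varied in the experiments.
    if len(image) > 40:
        return "a stormy lighthouse at dusk"
    return image.removeprefix("image(").removesuffix(")")

def run_loop(seed_prompt: str, iterations: int = 100,
             temperature: float = 0.7) -> list[str]:
    """Run one autonomous text -> image -> text -> image trajectory."""
    prompts = [seed_prompt]
    for _ in range(iterations):
        image = generate_image(prompts[-1])
        prompts.append(describe_image(image, temperature))
    return prompts

trajectory = run_loop("an intricate surrealist cityscape of impossible staircases")
# The trajectory reaches a fixed point: later prompts repeat the same motif.
```

Even in this toy form, the design point is visible: once a description falls into a generic phrase, every subsequent image/description pair reproduces it, which is the fixed-point behavior the paper quantifies across 700 real trajectories.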
