Comparing Large Language Models and Human Programmers for Generating Programming Code

比较大型语言模型和人类程序员在生成程序代码方面的表现

阅读:1

Abstract

The performance of seven large language models (LLMs) in generating programming code using various prompt strategies, programming languages, and task difficulties is systematically evaluated. GPT-4 substantially outperforms other LLMs, including Gemini Ultra and Claude 2. The coding performance of GPT-4 varies considerably with different prompt strategies. In most LeetCode and GeeksforGeeks coding contests evaluated in this study, GPT-4, employing the optimal prompt strategy, outperforms 85 percent of human participants in a competitive environment, many of whom are students and professionals with moderate programming experience. GPT-4 demonstrates strong capabilities in translating code between different programming languages and in learning from past errors. The computational efficiency of the code generated by GPT-4 is comparable to that of human programmers. GPT-4 is also capable of handling broader programming tasks, including front-end design and database operations. These results suggest that GPT-4 has the potential to serve as a reliable assistant in programming code generation and software development. A programming assistant is designed based on an optimal prompt strategy to facilitate the practical use of LLMs for programming.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。