Welcome to the world of Code Intelligence!
Code Intelligence is an exciting field focused on automating code completion and generation. The ultimate objective is to develop intelligent models capable of generating code based on specific requirements. This repository serves as a comprehensive collection of the latest research and advancements in this domain.
- OctoPack: Instruction Tuning Code Large Language Models (paper, github, open-source)
- PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback (paper, closed-source)
- CodeGen2.5: Small, but mighty (github, blog, open-source)
- Phi-1: Textbooks Are All You Need (paper, 2023, closed-source)
- WizardCoder: Empowering Code Large Language Models with Evol-Instruct (github, paper, 2023, open-source)
- CodeT5+: Open Code Large Language Models for Code Understanding and Generation (github, paper, 2023, open-source)
- StarCoder: May the source be with you! (github, paper, 2023, open-source)
- CodeGen2: Lessons for Training LLMs on Programming and Natural Languages (github, paper, 2023, open-source)
- Replit-code-v1-3b (github, twitter, 2023, open-source)
- GPT-4 (paper, 2023, closed-source)
- SantaCoder: don't reach for the stars! (github, paper, 2022, open-source)
- CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X (github, paper, 2022, open-source)
- CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis (github, paper, 2022, open-source)
- Codex: Evaluating Large Language Models Trained on Code (2021, closed-source)
- Tuning Models of Code with Compiler-Generated Reinforcement Learning Feedback (paper, 2023)
- CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (github, paper, 2022)
- Demystifying GPT Self-Repair for Code Generation (paper, 2023)
- Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation (paper, 2023)
- Teaching Large Language Models to Self-Debug (paper, 2023)
- CodeT: Code Generation with Generated Tests (github, paper, 2022)
- APPS: Python Code Generation from programming-contest problems. (Execution-Based)
- HumanEval: Python Code Completion (Execution-Based)
- MBPP: Python Code Completion (Execution-Based)
- MultiPL-E: Multi-language Code Completion (Execution-Based)
- HumanEval-Plus: The same problems as HumanEval, but with many more test cases. (Execution-Based)
- HumanEvalPack: Extends HumanEval to bug fixing and code explanation. (Execution-Based)
- CodeXGLUE (Text-Code): A large benchmark for code generation. (BLEU-Based)
- Concode: Java Code Completion. (BLEU-Based)
- ClassEval: Python Class-level Code Completion. (Execution-Based)
- DS-1000: Python data science code completion and insertion. (Execution-Based)
- CoNaLa: Statement-level Python Code Generation. (BLEU-Based)
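The execution-based benchmarks above typically report pass@k. A minimal sketch of the unbiased pass@k estimator introduced in the Codex paper (the function name here is illustrative):

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the Codex paper.

    n: total samples generated per problem
    c: samples that pass all of the problem's tests
    k: evaluation budget
    """
    if n - c < k:
        return 1.0  # every size-k draw must contain a passing sample
    # 1 - C(n-c, k) / C(n, k), computed as a numerically stable running product
    return 1.0 - math.prod(1.0 - k / i for i in range(n - c + 1, n + 1))
```

For k = 1 this reduces to c / n, e.g. `pass_at_k(10, 3, 1)` is 0.3, matching the single-sample pass rates reported on HumanEval.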
- bigcode/Stack: The pre-training data of StarCoder.
- CodeAlpaca: 20K instruction-following examples generated by text-davinci-003.
- LeetCode-Solution-Python: Solutions and explanations for most LeetCode problems.
| Model | HumanEval Pass@1 |
| --- | --- |
| **w/o SFT** | |
| CodeGen-16B-Multi | 18.3 |
| CodeGen-16B-Mono | 29.3 |
| CodeGen2.5-7B-Multi | 28.4 |
| CodeGen2.5-7B-Mono | 33.4 |
| CodeGeeX-13B | 22.9 |
| Replit-code-v1-3B | 17.1 |
| LLaMA-13B | 15.8 |
| LLaMA-33B | 21.7 |
| LLaMA-65B | 23.7 |
| StarCoderBase-15B | 30.1 |
| StarCoder-15B | 33.6 |
| **w/ SFT** | |
| InstructCodeT5+ | 35.0 |
| CodeGen2.5-7B-instruct | 36.2 |
| OctoCoder-15B | 45.8 |
| WizardLM-30B 1.0 | 37.8 |
| WizardCoder-15B 1.0 | 57.3 |
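The HumanEval Pass@1 numbers above come from execution-based scoring: a generated completion counts only if it passes the problem's hidden tests. A minimal, unsandboxed sketch of such a check, assuming HumanEval's convention of a `check(candidate)` test function (real harnesses run this in an isolated process with timeouts and resource limits):

```python
def passes_tests(candidate_src: str, test_src: str, entry_point: str) -> bool:
    """Return True if the generated solution passes the benchmark's tests.

    WARNING: exec() runs untrusted model output; real evaluation harnesses
    sandbox this in a separate process with timeouts and resource limits.
    """
    env: dict = {}
    try:
        exec(candidate_src, env)        # define the candidate function
        exec(test_src, env)             # define check(candidate)
        env["check"](env[entry_point])  # raises AssertionError on failure
        return True
    except Exception:
        return False

# Toy problem illustrating the HumanEval test format
solution = "def add(a, b):\n    return a + b\n"
tests = "def check(candidate):\n    assert candidate(1, 2) == 3\n"
passes_tests(solution, tests, "add")  # True for this toy problem
```

A run of this check over n samples per problem yields the pass counts that the pass@k estimator aggregates into the scores in the table.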