
A repo built for the purpose of benchmarking the performance of agents, regardless of how they are set up and how they work.

waynehamadi/Auto-GPT-Benchmarks

 
 


Auto-GPT Benchmark

A repo built to benchmark the performance of agents far and wide, regardless of how they are set up or how they work.

Scores:

Radar chart for each agent coming soon!

Detailed results

⚠️ These results are still evolving. We will publish an official benchmark result very soon.

Interface

Task Auto-GPT gpt-engineer mini-agi smol-developer
Write File tbd
Read File tbd
Search File tbd

Code

Task Auto-GPT gpt-engineer mini-agi smol-developer
Debug Simple Typo With Guidance tbd
Debug Simple Typo Without Guidance tbd
Basic Code Generation tbd
Create Simple Web Server tbd

Memory

Task Auto-GPT
Basic Memory
Remember Multiple Ids
Remember Multiple Ids With Noise
Remember Multiple Phrases With Noise


Languages

  • Python 100.0%