Stars
MCP server for browser automation using Playwright
Create and run high-performance macOS and Linux VMs on Apple Silicon, with built-in support for AI agents.
Docker Android - Run QEMU Android in a Docker! X11 Forwarding! CI/CD for Android!
SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation
Official repo for paper DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning.
[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
GUICourse: From General Vision Language Models to Versatile GUI Agents
A live-streamed development of RL tuning for LLM agents
The official Python SDK for Model Context Protocol servers and clients
Finetune Llama 4, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
All-in-one Web Agent framework for post-training. Start building with a few clicks!
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes from 6 mobile devices, spanning 6 types of cross-app tasks, 20…
Keeps searching, reading webpages, and reasoning until it finds the answer (or exceeds the token budget)
Make websites accessible for AI agents
E2B Desktop Sandbox for LLMs. E2B Sandbox with desktop graphical environment that you can connect to any LLM for secure computer use.
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Universal LLM Deployment Engine with ML Compilation
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and reproducibility.
🌎💪 BrowserGym, a Gym environment for web task automation
WebLINX is a benchmark for building web navigation agents with conversational capabilities
Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis