- Los Angeles, CA
- shenglih.github.io
Stars
harshakokel / PlanBench
Forked from karthikv792/LLMs-Planning
An extensible benchmark for evaluating large language models on planning
The official code release for Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization
Intuitive, type-safe expression quotations for Lean 4.
tpgh24 / ag4masses
Forked from google-deepmind/alphageometry
Making Google Deepmind's AlphaGeometry accessible to the Masses
Synthetic question-answering dataset to formally analyze the chain-of-thought output of large language models on a reasoning task.
An extensible benchmark for evaluating large language models on planning
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
RAVEN: A Dataset for Relational and Analogical Visual rEasoNing
An AI agent that solves Raven's Progressive Matrices
Official code for CVPR 2022 paper "Rethinking Visual Geo-localization for Large-Scale Applications"
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches.
WebChatGPT: A browser extension that augments your ChatGPT prompts with web results.
Failure archive for ChatGPT and similar models
🦜🔗 Build context-aware reasoning applications
Code for Cicero, an AI agent that plays the game of Diplomacy with open-domain natural language negotiation.