udosreis/black-spatula-project

 
 

The Black Spatula Project

Overview

The goal of this project is to evaluate whether AI models (initially OpenAI's "o1" and possibly "o1-pro") can reliably identify factual, logical, and mathematical errors in published scientific papers. We will measure:

  • Number of errors detected
  • Severity of identified errors
  • False positive rate
  • Effort required to verify AI findings
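The four measures above could be tallied along the following lines. This is only a minimal sketch, not the project's actual evaluation code: the `Finding` record, its field names, the severity labels, and the `summarize` helper are all hypothetical. It also assumes that "false positive rate" means the share of AI-flagged findings that a human reviewer rejects, and that verification effort is recorded in minutes per finding.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Finding:
    """One issue flagged by the model (hypothetical record layout)."""
    paper_id: str
    severity: str          # assumed scale: "minor", "major", "critical"
    confirmed: bool        # True if a human reviewer verified a real error
    review_minutes: float  # time the reviewer spent checking this finding


def summarize(findings):
    """Aggregate the four measures over a list of findings."""
    total = len(findings)
    confirmed = [f for f in findings if f.confirmed]
    rejected = total - len(confirmed)
    return {
        # Number of errors detected: findings a reviewer confirmed.
        "errors_detected": len(confirmed),
        # Severity of identified errors: counts per label, confirmed only.
        "severity_counts": dict(Counter(f.severity for f in confirmed)),
        # Assumed definition: rejected findings / all flagged findings.
        "false_positive_share": rejected / total if total else 0.0,
        # Effort to verify: mean reviewer time across all findings.
        "mean_review_minutes": (
            sum(f.review_minutes for f in findings) / total if total else 0.0
        ),
    }


# Made-up example data, for illustration only.
example = [
    Finding("paper-a", "critical", True, 10.0),
    Finding("paper-a", "minor", False, 20.0),
    Finding("paper-b", "minor", True, 30.0),
    Finding("paper-b", "major", True, 20.0),
]
summary = summarize(example)
print(summary)
```

Counting review time for rejected findings as well as confirmed ones is a deliberate choice here, since the cost of checking false alarms is part of the effort the project wants to measure.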

Background

This project is named after a recent high-profile paper on black plastic kitchen utensils that contained a simple but consequential math error. This mistake, which passed peer review, could have been flagged by an AI reviewer.

About

Verifying scientific papers using LLMs

Languages

  • Jupyter Notebook 78.1%
  • Python 21.9%