
PicoGPT, original code by jaymody

You've seen openai/gpt-2.

You've seen karpathy/minGPT.

You've even seen karpathy/nanoGPT!

But have you seen picoGPT??!?

picoGPT is an unnecessarily tiny and minimal implementation of GPT-2 in plain NumPy. The entire forward pass is 40 lines of code. There is an accompanying blog post about picoGPT.
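
To give a flavor of what those 40 lines look like, here is a minimal sketch of a single pre-norm transformer block in plain NumPy. It mirrors the structure of what gpt2.py implements, but it is illustrative rather than the repo's exact code; all names and shapes below are made up for the example.

```python
import numpy as np

def gelu(x):
    # GPT-2's tanh approximation of the GELU activation
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def softmax(x):
    # numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def layer_norm(x, g, b, eps=1e-5):
    # normalize each position to zero mean / unit variance, then scale and shift
    mu, var = x.mean(axis=-1, keepdims=True), x.var(axis=-1, keepdims=True)
    return g * (x - mu) / np.sqrt(var + eps) + b

def causal_self_attention(x, w_qkv, w_out):
    # single-head causal self-attention: each token attends only to its past
    q, k, v = np.split(x @ w_qkv, 3, axis=-1)
    mask = (1 - np.tri(x.shape[0])) * -1e10  # -inf above the diagonal
    return softmax(q @ k.T / np.sqrt(q.shape[-1]) + mask) @ v @ w_out

def transformer_block(x, p):
    # pre-norm residual block: attention sublayer, then position-wise MLP
    x = x + causal_self_attention(layer_norm(x, p["g1"], p["b1"]), p["w_qkv"], p["w_out"])
    x = x + gelu(layer_norm(x, p["g2"], p["b2"]) @ p["w_fc"]) @ p["w_proj"]
    return x

# toy shapes: 4 tokens, embedding size 8, MLP hidden size 32
rng = np.random.default_rng(0)
n_seq, n_embd = 4, 8
p = {
    "g1": np.ones(n_embd), "b1": np.zeros(n_embd),
    "g2": np.ones(n_embd), "b2": np.zeros(n_embd),
    "w_qkv": 0.02 * rng.standard_normal((n_embd, 3 * n_embd)),
    "w_out": 0.02 * rng.standard_normal((n_embd, n_embd)),
    "w_fc": 0.02 * rng.standard_normal((n_embd, 4 * n_embd)),
    "w_proj": 0.02 * rng.standard_normal((4 * n_embd, n_embd)),
}
x = rng.standard_normal((n_seq, n_embd))
print(transformer_block(x, p).shape)  # (4, 8)
```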

picoGPT features:

  • Fast? ❌ Nah, picoGPT is megaSLOW 🐌
  • Training code? ❌ Error, 4️⃣0️⃣4️⃣ not found
  • Batch inference? Sort of ✅
  • top-p sampling? Sort of ✅ (a sketch follows this list)
  • Readable? ✅ gpt2.py is; gpt2_pico.py, less so
  • Smol??? ✅✅✅✅✅✅ YESS!!! TEENIE TINY in fact 🤏
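
For the curious: top-p (nucleus) sampling keeps the smallest set of tokens whose cumulative probability exceeds p, then samples from that renormalized set. Here is a standalone NumPy sketch of the idea, not the repo's exact implementation:

```python
import numpy as np

def top_p_sample(logits, p=0.9, seed=None):
    # nucleus sampling: keep the smallest set of tokens whose cumulative
    # probability exceeds p, then sample from the renormalized set
    rng = np.random.default_rng(seed)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]        # token ids, most probable first
    cum = np.cumsum(probs[order])
    keep = order[: np.searchsorted(cum, p) + 1]
    return rng.choice(keep, p=probs[keep] / probs[keep].sum())

logits = np.array([2.0, 1.0, 0.5, -1.0])  # toy logits over a 4-token vocab
print(top_p_sample(logits, p=0.8, seed=0))
```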

A quick breakdown of each of the files (a usage sketch follows the list):

  • encoder.py contains the code for OpenAI's BPE Tokenizer, taken straight from their gpt-2 repo.
  • utils.py contains the code to download and load the GPT-2 model weights, tokenizer, and hyper-parameters.
  • gpt2.py contains the actual GPT model and generation code, which we can run as a Python script.
  • gpt2_pico.py is the same as gpt2.py, but in even fewer lines of code. Why? Because why not 😎👍.
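
Assuming this fork keeps the upstream jaymody/picoGPT interfaces (the load_encoder_hparams_and_params helper in utils.py and the generate function in gpt2.py), the pieces compose roughly as shown below; treat the exact names and signatures as assumptions about this fork.

```python
# Illustrative wiring of the repo's files, assuming the upstream
# jaymody/picoGPT interfaces; this fork may differ.
from utils import load_encoder_hparams_and_params  # downloads weights on first use
from gpt2 import generate

encoder, hparams, params = load_encoder_hparams_and_params("124M", "models")
ids = encoder.encode("Alan Turing theorized that computers would one day become")
out_ids = generate(ids, params, hparams["n_head"], n_tokens_to_generate=40)
print(encoder.decode(out_ids))
```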

Dependencies

pip install -r requirements.txt

Tested on Python 3.9.10.

Usage

python gpt2.py
> "Alan Turing theorized that computers would one day become"

Which generates

so intelligent that they would be able to think for themselves, and he predicted that computers would one day be able to simulate human thought.

You can also control the model size (one of ["124M", "355M", "774M", "1558M"]).
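
With the upstream fire-based CLI, the model size is passed as a flag, e.g. (this fork's interactive prompt may accept it differently):

python gpt2.py --model_size "355M"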
