In this tutorial we're going to check out some of the biggest, baddest LLMs... but running them on a GPU!
- Create a virtual environment
python -m venv gpullm
- Activate it:
- Windows:
.\gpullm\Scripts\activate
- Mac:
source gpullm/bin/activate
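If you're not sure the activation worked, a quick check like the one below can tell you. It's only a sketch and not part of the repo: the helper name `in_virtualenv` is made up for this example, and it relies on the fact that inside an environment created with `python -m venv`, `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the environment folder,
    # while sys.base_prefix still points at the base Python install.
    return sys.prefix != sys.base_prefix


if in_virtualenv():
    print(f"Virtual environment active: {sys.prefix}")
else:
    print("No virtual environment active, run the activate command first")
```

Run it with the same `python` you plan to use for the rest of the steps; the printed path should end in `gpullm` if everything lined up.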
- Install PyTorch with CUDA Support
N.B. I've included the lib in the requirements.txt file, but the command below was the latest install command at the time of writing this README.
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
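Once the install finishes, it's worth confirming that PyTorch can actually see your GPU before pulling down a huge model. The snippet below is a minimal sketch, not something shipped in the repo: `cuda_status` is a hypothetical helper name, and it assumes the GPU you care about is device index 0.

```python
import importlib.util


def cuda_status() -> str:
    # Bail out early if torch isn't importable in this environment.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"

    import torch

    # is_available() is False when no usable NVIDIA driver/GPU is found
    # or when a CPU-only build of PyTorch was installed.
    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA device is visible"

    # Assumption: the GPU of interest is device 0.
    name = torch.cuda.get_device_name(0)
    return f"CUDA OK: {name} (PyTorch built for CUDA {torch.version.cuda})"


print(cuda_status())
```

If it reports that no CUDA device is visible, double check your NVIDIA driver and that you installed the CUDA build of PyTorch rather than the CPU-only one.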
- Download and install CUDA (I've used 11.7 for this tutorial): https://developer.nvidia.com/cuda-11-7-0-download-archive
- Download and install the matching cuDNN version (v8.9.1): https://developer.nvidia.com/rdp/cudnn-archive
- Clone this repo
git clone https://github.com/nicknochnack/Falcon40B
- Go into the directory
cd Falcon40B
- Start up Jupyter by running
jupyter lab
in a terminal or command prompt
- Hit
Ctrl + Enter
to run through the notebook!
- Go back to my YouTube channel and like and subscribe 😉...no seriously...please! lol
- PyTorch Installation: main guide leveraged to handle GPU support.
- Langchain HF Pipelines: the HF Pipelines class is used in order to pass the local LLM to a chain.
- Falcon 40B Instruct Model Card: check out the model details.
👨🏾‍💻 Author: Nick Renotte
📅 Version: 1.x
📜 License: This project is licensed under the MIT License