Using Falcon40B Instruct...and any other Open Source LLMs on GPU via HuggingFace

In this tutorial we're going to be checking out some of the biggest, baddest LLMs...but running them on a GPU!

See it live and in action 📺

Create a virtual environment: python -m venv gpullm
Activate it:
- Windows: .\gpullm\Scripts\activate
- Mac: source gpullm/bin/activate
Install PyTorch with CUDA support. N.B. I've included the lib in the requirements.txt file, but this is the latest installer as of creating this README:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
- Download and install CUDA (I've used 11.7 for this tutorial): https://developer.nvidia.com/cuda-11-7-0-download-archive
- Download and install the matching cuDNN version (v8.9.1): https://developer.nvidia.com/rdp/cudnn-archive
Clone this repo: git clone https://github.com/nicknochnack/Falcon40B
Go into the directory: cd Falcon40B
Start Jupyter by running jupyter lab in a terminal or command prompt
Hit Ctrl + Enter to run through the notebook!
Go back to my YouTube channel and like and subscribe 😉...no seriously...please! lol
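
Before running the notebook, it can be worth confirming that the CUDA build of PyTorch from the install step actually sees your GPU. The snippet below is a minimal sketch of that check and is not part of the repo; the helper name describe_gpu is made up for illustration. It only uses standard PyTorch calls and falls back to a plain message if PyTorch is missing.

```python
import importlib.util


def describe_gpu() -> str:
    """Return a one-line summary of what PyTorch can see (hypothetical helper)."""
    # torch is only importable once the install step above has been run
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"

    import torch

    # False here usually points at a CPU-only wheel or a driver/CUDA mismatch
    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__} is installed, but no CUDA device is visible"

    name = torch.cuda.get_device_name(0)
    return f"PyTorch {torch.__version__} (CUDA {torch.version.cuda}) sees: {name}"


print(describe_gpu())
```

If the message says no CUDA device is visible, re-running the pip3 install command with the cu117 index URL inside the activated environment is a reasonable first thing to try.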

- PyTorch Installation: main guide leveraged to handle GPU support.

- Langchain HF Pipelines: the HF Pipelines class is used in order to pass the local LLM to a chain.
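
To make the HF Pipelines idea concrete, here is a rough sketch of wrapping a local text-generation pipeline so LangChain can use it in a chain. It is not copied from Falcon.ipynb, so treat the details as assumptions: the helper name build_falcon_chain, the prompt template, and max_new_tokens are illustrative, the import paths match 2023-era LangChain releases (newer versions have moved these classes), and device_map="auto" assumes the accelerate package is installed.

```python
MODEL_ID = "tiiuae/falcon-40b-instruct"  # Hugging Face Hub id for Falcon 40B Instruct


def build_falcon_chain(model_id: str = MODEL_ID, max_new_tokens: int = 200):
    """Wrap a local HF pipeline in a LangChain chain (hypothetical helper)."""
    # Heavy imports stay inside the function so defining it needs no GPU stack.
    import torch
    from transformers import AutoTokenizer, pipeline

    # Assumption: 2023-era LangChain import paths; adjust for newer releases.
    from langchain.chains import LLMChain
    from langchain.llms import HuggingFacePipeline
    from langchain.prompts import PromptTemplate

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    generator = pipeline(
        "text-generation",
        model=model_id,
        tokenizer=tokenizer,
        torch_dtype=torch.bfloat16,   # half-size weights to fit GPU memory
        trust_remote_code=True,       # Falcon shipped custom modelling code
        device_map="auto",            # let accelerate place layers on the GPU(s)
        max_new_tokens=max_new_tokens,
    )

    # HuggingFacePipeline is the adapter that lets a chain call the local model
    llm = HuggingFacePipeline(pipeline=generator)
    prompt = PromptTemplate(
        input_variables=["question"],
        template="Question: {question}\nAnswer:",
    )
    return LLMChain(llm=llm, prompt=prompt)


# Example usage (downloads the full model, so only run this on a big GPU box):
# chain = build_falcon_chain()
# print(chain.run("What is a falcon?"))
```

Passing a smaller model id to the helper is an easy way to try the same wiring on a modest GPU before committing to the 40B download.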

👨🏾‍💻 Author: Nick Renotte
📅 Version: 1.x
📜 License: This project is licensed under the MIT License
