# DeepSeek-R1 Local Inference

This project provides a simple setup for running the DeepSeek-R1 model locally on your machine. It supports both GPU (with CUDA) and CPU environments.
## Table of Contents

- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Running the Model](#running-the-model)
  - [On GPU](#on-gpu)
  - [On CPU](#on-cpu)
- [Performance Considerations](#performance-considerations)
- [Troubleshooting](#troubleshooting)
- [License](#license)
- [Acknowledgments](#acknowledgments)
## Prerequisites

Before running the model, ensure you have the following installed:

- Python 3.8 or higher
- pip (Python package manager)
- Git (optional, for cloning the repository)
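
An interpreter older than the minimum version is a common source of confusing install failures, so it is worth checking up front. A quick way to confirm the 3.8 minimum from inside Python:

```python
import sys

# Fail fast if the interpreter is older than the supported minimum (3.8).
assert sys.version_info >= (3, 8), f"Python 3.8+ required, found {sys.version}"
print(f"Python {sys.version.split()[0]} OK")
```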
## Installation

1. Clone the repository (optional):

   ```bash
   git clone https://github.com/your-username/your-repo-name.git
   cd your-repo-name
   ```

2. Create a virtual environment (recommended):
   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows, use venv\Scripts\activate
   ```
3. Install the required dependencies:
   ```bash
   pip install torch transformers
   ```

   If you have a GPU and want to use CUDA, install the CUDA-enabled build of PyTorch:
   ```bash
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
   ```

   If you only have a CPU, install the CPU-only build:
   ```bash
   pip install torch --index-url https://download.pytorch.org/whl/cpu
   ```

## Running the Model

### On GPU

If your machine has a CUDA-capable GPU, the script will automatically use it for faster inference.
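
The contents of `run_deepseek_r1.py` are not reproduced in this README. A minimal sketch consistent with the behavior described here might look like the following; the model ID (a distilled R1 checkpoint) and the prompt are assumptions, not taken from the repository:

```python
# run_deepseek_r1.py -- minimal sketch; the model ID and prompt are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed distilled checkpoint

# Use the GPU automatically when CUDA is available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

print("Loading model...")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)
print("Model loaded successfully.")

print("Generating response...")
prompt = "Who are you?"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Strip the prompt tokens so only the generated continuation is printed.
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(f'Response: "{response}"')
```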
Run the inference script:
```bash
python run_deepseek_r1.py
```

Example output:
```
Loading model...
Model loaded successfully.
Generating response...
Response: "I am DeepSeek-R1, a large language model designed to assist with various tasks."
```

### On CPU

If your machine does not have a GPU, the model will run on the CPU. Note that this is considerably slower than GPU inference.
Run the inference script:
```bash
python run_deepseek_r1.py
```

Example output:
```
Loading model...
Model loaded successfully.
Generating response...
Response: "I am DeepSeek-R1, a large language model designed to assist with various tasks."
```

## Performance Considerations

- **GPU:** For the best performance, use a machine with a CUDA-compatible GPU, and make sure the CUDA-enabled build of PyTorch is installed.
- **CPU:** Running the model on a CPU is possible but significantly slower, especially for a model as large as DeepSeek-R1. Ensure your machine has sufficient RAM (at least 16 GB recommended).
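
If memory is tight, loading the weights in half precision noticeably reduces the footprint. The snippet below is a sketch of one way to do this, not part of the repository's script; if you also install the `accelerate` package (`pip install accelerate`), `device_map="auto"` lets `transformers` place the weights across the available devices automatically. The model ID is the same assumed checkpoint as above:

```python
import torch
from transformers import AutoModelForCausalLM

MODEL_ID = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed checkpoint

# float16 halves the memory needed for the weights versus float32;
# device_map="auto" (requires the accelerate package) spreads the weights
# across GPU and CPU as capacity allows.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
)
```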
## Troubleshooting
1. **CUDA not available.** If you have a GPU but PyTorch is not detecting it, ensure you have installed the CUDA-enabled build of PyTorch. Verify CUDA availability:
   ```python
   import torch
   print(torch.cuda.is_available())  # Should return True
   ```

2. **Out of memory.** If you encounter out-of-memory errors, try reducing the batch size or generation length, loading the weights in half precision (see Performance Considerations above), or using a smaller model.
3. **Slow performance on CPU.** If the model runs too slowly on your CPU, make sure PyTorch is using all available cores (see the sketch after this list), or consider upgrading your hardware or using a smaller model.
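
PyTorch usually picks a sensible thread count on its own, but it can be worth confirming. A quick sketch for checking and pinning the intra-op thread count:

```python
import os
import torch

# Report how many threads PyTorch is using for intra-op parallelism.
print("Current threads:", torch.get_num_threads())

# Pin the thread count to the number of available cores (a reasonable default).
torch.set_num_threads(os.cpu_count() or 1)
print("Now using:", torch.get_num_threads())
```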
## License

This project is licensed under the MIT License. See the LICENSE file for details.
## Acknowledgments

- Hugging Face for the `transformers` library.
- DeepSeek for the DeepSeek-R1 model.