This repository provides two Dockerfiles for running Llama (or Llama 2) in a Docker container:
- GPU Version: Uses an NVIDIA CUDA base image for GPU-accelerated inference (x86_64 + NVIDIA GPU).
- CPU Version: Uses a plain Ubuntu base image for CPU-only inference, suitable for quick tests or Apple Silicon fallback.
GPU Version:
- Based on `nvidia/cuda:11.8.0-devel-ubuntu22.04`.
- Installs PyTorch with CUDA 11.8 support (`torch==2.0.1+cu118`, etc.).
- Requires an x86_64 environment with NVIDIA drivers and the NVIDIA Container Toolkit installed.
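For orientation, here is a minimal sketch of what a GPU Dockerfile along these lines could look like. The `server.py` entrypoint and port are illustrative assumptions, not the repo's actual file layout:

```dockerfile
FROM nvidia/cuda:11.8.0-devel-ubuntu22.04

# Python toolchain on top of the CUDA base image
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*

# CUDA 11.8 PyTorch wheels from the official PyTorch wheel index
RUN pip3 install --no-cache-dir \
        torch==2.0.1+cu118 \
        --extra-index-url https://download.pytorch.org/whl/cu118

WORKDIR /app
COPY . /app
EXPOSE 8080

# Hypothetical entrypoint; the repo's actual server module may differ
CMD ["python3", "server.py"]
```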
CPU Version:
- Based on `ubuntu:22.04`.
- Installs CPU-only PyTorch wheels.
- Suitable for local testing or Apple Silicon Docker (ARM), though inference will be slower.
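The CPU variant follows the same shape; a sketch under the same assumptions (hypothetical `server.py` entrypoint), differing only in base image and wheel index:

```dockerfile
FROM ubuntu:22.04

RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*

# CPU-only PyTorch wheels from the official CPU wheel index
RUN pip3 install --no-cache-dir \
        torch==2.0.1 \
        --index-url https://download.pytorch.org/whl/cpu

WORKDIR /app
COPY . /app
EXPOSE 8080
CMD ["python3", "server.py"]
```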
Within Jarvis, we frequently need natural-language processing and generative AI to:
- Interpret user instructions (“Buy SOL if RSI < 30…”).
- Generate human-like responses for trading insights or DeFi management tasks.
- Rapidly prototype AI-driven features without reconfiguring GPU servers.
By bundling Llama in a Docker container, we simplify setup and ensure consistent deployments across development, staging, and production.
- Jarvis Core: Primary inference engine for prompt-based instructions, turning plain-English commands into structured actions.
- DeFi Management: Automated liquidity provisioning and yield-farming instructions via Raydium or other protocols.
- Internal Tools: Prototyping and testing new AI-driven components quickly and reliably.
- Requirements:
  - An x86_64 machine with an NVIDIA GPU (local or cloud).
  - NVIDIA Container Toolkit installed (see the smoke test after these steps to verify).
- Build & run:
  ```bash
  cd scripts
  ./build_and_run_gpu.sh
  ```
- Test the endpoint:
  ```bash
  curl -X POST http://localhost:8080/generate \
    -H "Content-Type: application/json" \
    -d '{"prompt": "Hello, Jarvis!", "max_new_tokens": 30}'
  ```
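If the container starts but cannot see a GPU, a common way to confirm that the NVIDIA Container Toolkit is wired up correctly (independent of this repo) is to run `nvidia-smi` inside a bare CUDA image:

```bash
# Should print the host's GPU table; if this fails, fix the driver or
# Container Toolkit setup before debugging the Llama image itself.
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
```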