An AI-powered chat application featuring Rancho from "3 Idiots" that generates consistent images during conversations. The system maintains visual consistency across related images and automatically generates images when contextually appropriate.
- Character-based Chat: Interactive conversations with Rancho's persona from "3 Idiots"
- Automatic Image Generation: Contextually aware image creation without explicit commands
- Image Consistency: Maintains visual consistency across related images
- Simple Web Interface: Easy-to-use chat interface
- Debug Information: Visibility into the system's decision-making process
- Python 3.8+
- HuggingFace Account and API Token
- Sufficient disk space for ML models
- CUDA-capable GPU (recommended)
- Clone the repository:
```
git clone https://github.com/devadethanr/chat-with-an-AI-character.git
cd chat-with-an-AI-character
```
- Create and activate a virtual environment:
```
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```
- Install dependencies:
```
pip install fastapi uvicorn torch transformers diffusers pillow python-dotenv huggingface-hub
```
- Create a `.env` file in the project root:
```
HUGGINGFACE_HUB_TOKEN=your_token_here
```
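A minimal sketch of how the token might be read at startup (assumes python-dotenv from the install step, with a plain-environment fallback):

```python
# Sketch: read HUGGINGFACE_HUB_TOKEN from .env (assumes python-dotenv,
# installed above; falls back to plain environment variables if absent).
import os

try:
    from dotenv import load_dotenv
    load_dotenv()  # reads .env from the current working directory
except ImportError:
    pass

token = os.getenv("HUGGINGFACE_HUB_TOKEN")
if token is None:
    print("Warning: HUGGINGFACE_HUB_TOKEN is not set")
```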
```
.
├── main.py            # FastAPI backend application
├── models.py          # Pydantic models and dataclasses
├── memory_manager.py  # Image memory management and related helpers
├── utils.py           # Utility functions and helpers
├── test_chat.html     # Simple web interface
├── requirements.txt   # Python dependencies
└── README.md          # Project documentation
```
The application uses the following models:
- LLM: `mistralai/Mistral-7B-Instruct-v0.2`
- Image Generation: `runwayml/stable-diffusion-v1-5`

These can be configured in `main.py` by modifying the model constants:
```python
LLM_MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.2"
IMAGE_MODEL_NAME = "runwayml/stable-diffusion-v1-5"
```
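As a rough sketch of how these constants feed into the pipelines (the exact loading code lives in `main.py` and may differ):

```python
# Sketch: wire the model constants into transformers/diffusers pipelines.
# Calling load_pipelines() downloads both models, so it needs significant
# disk space and ideally a CUDA GPU (see Prerequisites).
LLM_MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.2"
IMAGE_MODEL_NAME = "runwayml/stable-diffusion-v1-5"

def load_pipelines():
    """Initialize the chat LLM and the Stable Diffusion image pipeline."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from diffusers import StableDiffusionPipeline

    tokenizer = AutoTokenizer.from_pretrained(LLM_MODEL_NAME)
    llm = AutoModelForCausalLM.from_pretrained(
        LLM_MODEL_NAME, torch_dtype=torch.float16, device_map="auto"
    )
    image_pipe = StableDiffusionPipeline.from_pretrained(
        IMAGE_MODEL_NAME, torch_dtype=torch.float16
    )
    return tokenizer, llm, image_pipe
```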
- Start the backend server:
```
python main.py
```
The server will run on `http://localhost:8000`.
- In another terminal, run the basic conversation test:
```
python test_chat.py
```
Creates a new chat message and generates a response.

Request body:
```json
{
  "user_input": "string",
  "context": "string"
}
```

Response:
```json
{
  "response": "string",
  "image_url": "string (optional)",
  "debug_info": "object (optional)"
}
```
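A minimal client call using only the standard library; the endpoint path (`/chat` here) is an assumption, so check the routes in `main.py`:

```python
# Hypothetical client for the chat endpoint. The "/chat" path is an
# assumption, not confirmed by the README; verify it against main.py.
import json
import urllib.request

def send_message(user_input, context="", base_url="http://localhost:8000"):
    payload = json.dumps({"user_input": user_input, "context": context}).encode()
    req = urllib.request.Request(
        base_url + "/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # {"response": ..., "image_url": ..., ...}
```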
- Uses an `ImageMemoryManager` to track generated images
- Maintains context and metadata for each image
- Uses keyword extraction and similarity matching for consistency
- Automatically detects need for image generation
- Uses context patterns and keywords
- Maintains conversation flow without explicit commands
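The trigger detection described above could be sketched with a simple keyword pattern (the real logic lives in the backend and may be more sophisticated):

```python
# Illustrative sketch of keyword-based image-trigger detection; the
# trigger words below are examples, not the project's actual list.
import re

IMAGE_TRIGGERS = re.compile(
    r"\b(show|draw|picture|photo|looks? like|imagine)\b", re.IGNORECASE
)

def should_generate_image(user_input: str) -> bool:
    """Return True when the message implies the user wants to see something."""
    return bool(IMAGE_TRIGGERS.search(user_input))
```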
- Tracks conversation history
- Maintains image generation context
- Ensures consistent visual elements across related images
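A toy version of keyword extraction and similarity matching for image memory (illustrative only; the project's `memory_manager.py` may work differently):

```python
# Illustrative sketch of an image memory: store keywords per generated
# image and retrieve the most similar prior context by keyword overlap.
from dataclasses import dataclass, field

STOPWORDS = {"a", "an", "the", "of", "in", "on", "with", "and"}

def extract_keywords(text: str) -> set:
    return {w for w in text.lower().split() if w not in STOPWORDS}

@dataclass
class ImageMemory:
    entries: list = field(default_factory=list)  # (keywords, image_url) pairs

    def add(self, context: str, image_url: str) -> None:
        self.entries.append((extract_keywords(context), image_url))

    def most_similar(self, context: str):
        """Return the stored image whose context best overlaps (Jaccard)."""
        kws = extract_keywords(context)
        best, best_score = None, 0.0
        for stored_kws, url in self.entries:
            union = kws | stored_kws
            score = len(kws & stored_kws) / len(union) if union else 0.0
            if score > best_score:
                best, best_score = url, score
        return best
```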
- Requires significant computational resources for image generation
- Image generation may take several seconds
- Limited to the knowledge cutoff date of the language model
- Image consistency is based on text descriptions and may vary
Common issues and solutions:
- CUDA/GPU issues:
```
# Check CUDA availability
python -c "import torch; print(torch.cuda.is_available())"
```
- Memory issues:
- Reduce batch sizes
- Use CPU if GPU memory is insufficient
- Clear cache periodically
- Model loading issues:
- Check internet connection
- Verify HuggingFace token
- Ensure sufficient disk space
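Periodic cache clearing (from the memory tips above) can be done with a small helper, assuming PyTorch; it is a no-op when no GPU is present:

```python
# Sketch: free GPU memory between generations. Safe to call without a GPU
# (or without torch installed, thanks to the guarded import).
import gc

try:
    import torch
except ImportError:
    torch = None

def clear_gpu_cache() -> None:
    gc.collect()  # drop Python-side references first
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()  # release cached CUDA allocations
```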
### Submitted to Bhabha AI; please feel free to pull the repo