Please feel free to Buy Me A Coffee to help support this project.
NEW: Join Soupy's Discord Server to try it out.
Soupy Remastered is a completely locally run bot for Discord. It uses a Flux/BLIP-2/Gradio backend for image-related tasks, and an LM Studio backend for chat-related tasks. It has a number of neat functions, such as:
There are multiple versions of soupy, some of them are old, some of them use Dall-E and/or ChatGPT.
- soupy-remastered.py: Newest version of soupy with all of the above functions, totally local. This also requires the updated env variables, characters.txt, styles.txt, themes.txt, interject.py, and soupy-gradio-v2-works.py.
- soupy-gradio-v2-works.py: Gradio backend. Loads the image models, transformers, and so on. It also has a WebUI, which I mostly use for debugging purposes. You can easily disable it if you want.
- .env: This is extremely important to the proper functioning of Soupy.
- soupy-flux.py: Older version of this bot that generates images. No LLM functionality. Works fine. Requires soupy-gradio.py.
- soupy-solr.py: This version features user profiles, requires Solr installation and setup, has chat history logging, and rich interactive chatting. It also includes Flux image generations, and OpenAI/DALL-E 3 image generation.
- soupy-classic.py: This version is only ChatGPT-based chat and DALL-E 3 image generation. It does not require Solr and does not create user profiles.
Before setting up Soupy, ensure you have the following installed on your system:
- Flux: Used for generating images.
- LM Studio: The LLM backend. I highly recommend you use Lexi Llama Uncensored. It's what the prompts are tuned to. You can use whatever you want though, probably to good effect.
- BLIP-2: For seeing and responding to images in chat. This feature is currently commented out. If you want to re-enable it, go to line
456
in soupy-gradio-v2-works.py and uncomment that whole block. It is commented out because, on my system, when BLIP is enabled, it slows image generation down by about 300%. This is due to memory swapping. It's being worked on. - Gradio: For loading the backend image-related models.
- Python 3.8+
- Virtual Environment Manager (optional but recommended)
- Transformers
- Cuda 11.7
- All the imports/requirements
- 32gb of system RAM probably works, 64gb is preferred
- 24gb GPU. Maybe a 12gb or 16gb card would work, I don't know for sure.
- For me, the LLM runs on a different system than the image-related functions. The LLM is on a 16gb M1 Mac Mini. So, your results may vary here.
You will need to open up the .env file and insert your keys and tokens, as appropriate. Don't mess with the prompts too much, unless you want to, in which case you should mess with them as much as you want.
Some of the prompting is done in Soupy itself, some of it is in the .env. Have a look around. The behaviors are all set in the .env, like the bot's personality and what not.
Update the necessary .env variables, such as your Discord token and the URLs to your Flux/Gradio setup and the LM Studio setup. Alternatively, it would not be super hard to make a few changes and use OpenAI as the backend, since LM Studio mimics the OpenAI API.
Probably the requirements.txt you'll need:
absl-py==2.1.0
accelerate==0.33.0
aiohttp==3.10.11
aiofiles==24.1.0
beautifulsoup4==4.12.3
colorama==0.4.6
colorlog==6.9.0
discord.py==2.4.0
python-dotenv==1.0.0
fastapi==0.115.6
geopy==2.4.1
gradio==4.44.1
html2text==2024.2.26
numpy==2.0.0
openai==1.58.1
optimum==1.22.0
pillow==10.4.0
pytz==2023.3.post1
requests==2.32.3
torch==2.4.0+cu118
torchvision==0.19.0+cu118
torchaudio==2.4.0+cu118
transformers==4.46.3
trafilatura==2.0.0
timezonefinder==6.4.1
uvicorn==0.30.6
rembg==2.0.61
grpcio==1.68.0
Personally, my setup is as such: The image functions run on a system with 64gb of RAM and a 3090. The LLM runs on an Apple Silicon Mac on the same network. If you look at the .env, you'll see where to set your URLs and such for your own personal setup.
For the LLM, I personally use Lexi 8B 5Q GGUF which is based on llama, is reasonably fast, and is pretty compliant with the right prompting.
My future plans for Soupy-Remastered are to re-integrate long-term memory, but this time in the form of a little SQL database. It won't be RAG, but in my opinion you can do RAG-like searches with an LLM-backend and plain text databases more accurately and with fewer resources. I have no specific timeline for this functionality.
The Gradio script I use is a little funky and takes a while to load. Just let it do its thing.
You obviously need to know how to set up a Discord bot with the correct permissions.
The Random
button will choose random words and phrases from the three files characters.txt
, styles.txt
, and themes.txt
, and then send them to the LLM for the creation of an image prompt.
It actually runs pretty good. I do this in my spare time. I'm not a developer. Installation is the most challenging part. If you need help, reach out and I'll see if I can help.
simply chatting
: this is what soupy thinks about its own repo, smdh. soupy is content-aware of what is behind a url, it browses there.
/flux <prompt>
image generation:
Hitting the Fancy button, which sends the current prompt (e.g., a weird animal
), to the LLM for additional processing, which is then sent to Flux:
The Random button triggers a function that chooses from random keywords located in characters.txt
, styles.txt
, and themes.txt
. It then sends those results to the LLM for the generation of a new description, which is then sent to Flux.
/search <query>
An example of the /search
command, which takes your search and sends it to the LLM for processing, and then returns results using BeautifulSoup and natural language processing:
Soupy is a chatbot for Discord that can generate images with a local image generator (Flux) and/or with DALL-E 3. For chatting, it uses a combination of JSONs, ChatGPT, and a local search engine to engage in conversation with its users. It will index your user's chat messages, and use those messages to create profiles of users. It will also index every channel on your server to which it has access.
There are multiple versions of soupy.
- soupy-flux.py: This version of soupy is ONLY the Flux image generation functionality. It requires soupy-gradio.py to be run simultaneously.
- soupy-solr.py: This version features user profiles, requires Solr installation and setup, has chat history logging, and rich interactive chatting. It also includes Flux image generations, and OpenAI/DALL-E 3 image generation.
- soupy-classic.py: This version is only chat and DALL-E 3 image generation. It does not require Solr and does not create user profiles.
Soupy requires OpenAI API access to the ChatGPT models. Therefore, the chat portion of Soupy uses real money. The DALL-E 3 image generation does, too. You can skip DALL-E 3 generation and only use Flux locally.
The initial setup, wherein the channel history from your server will be downloaded and indexed and all of the users on your server will have profiles made of them costs money via ChatGPT's API. Some day I will also support local LLMs, but not yet.
To get Flux working, I strongly suggest you start here, with the official Flux repository. But once you have Flux up-and-running, you can use soupy-gradio.py
, included in this repository.
- Image Generation: Use a local text-to-image model, Flux, or use OpenAI's DALL-E 3, or use both. The Flux functionality is more robust than the DALL-E 3 functionality, and I recommend you use Flux. Currently, the Flux model used is Schnell, but you can modify this fairly easily.
- Interactive Commands: There are a variety of of amazing commands like
!flux
(local image model),!generate
(DALL-E 3),!analyze
(ChatGPT), and!transform
(ChatGPT) to perform a range of cool actions. - User Profile Management: Maintain detailed user profiles by indexing messages and interactions using Solr and ChatGPT, allowing for personalized responses and interactions. This uses ChatGPT and requires ChatGPT API access.
- Customizable Behavior: Tailor Soupy's responses and functionalities through environment variables to fit the unique needs of your Discord server. Do this with the
BEHAVIOUR
variable in the.env
. But be careful with how you change it. Its wording is important to keeping Soupy on-track.
Before setting up Soupy, ensure you have the following installed on your system:
- Python 3.8+
- Git
- Apache Solr
- Virtual Environment Manager (optional but recommended)
- For Flux image generation, a local transformers setup
- ChatGPT API access
Begin by cloning the Soupy repository to your local machine:
git clone https://github.com/sneezeparty/soupy.git
cd soupy
It's recommended to use a virtual environment to manage dependencies.
python -m venv soupy
Activate the virtual environment:
On macOS and Linux:
source soupy/bin/activate
On Windows:
soupy\Scripts\activate
Install the required Python packages using pip
:
I strongly recommend these specific versions of PyTorch, with regards to soupy-gradio.py:
pip install torch==2.0.1+cu117 torchvision==0.15.2+cu117 torchaudio==2.0.2+cu117 optimum-quanto==0.2.4 --extra-index-url https://download.pytorch.org/whl/cu117
And then:
pip install -r requirements.txt
Create a .env
file in the root directory of the project and populate it with the necessary environment variables:
DISCORD_TOKEN=your_discord_bot_token
OPENAI_API_KEY=your_openai_api_key
CHANNEL_IDS=00,11
MAX_TOKENS=2500
MAX_TOKENS_RANDOM=75
MODEL_CHAT=gpt-4o-mini
UPDATE_INTERVAL_MINUTES=61
TRANSFORM="You give detailed and accurate descriptions, be specific in whatever ways you can, such as but not limited to colors, species, poses, orientations, objects, and contexts."
BEHAVIOUR="You are Soupy Dafoe, a sarcastic and witty Discord chatbot. You recall past interactions and conversations to inform your responses. Your replies are concise, straightforward, and infused with a bit of sarcasm, much like Jules from \"Pulp Fiction.\" You are not overly positive and avoid asking questions unless necessary. Prioritize the most recent five messages when formulating your responses, especially if not directly mentioned. If the latest message is brief, focus your reply accordingly and consider ignoring extensive chat history. Integrate the user's profile information subtly to tailor your responses without making it the main focus. Be conversational, stay in the moment, and avoid being too random or wordy. Remember, you're kind of a jerk, but in a human-like way."
Please note that Soupy will have access to all channels that it can access. But it will respond to all messages in the channels specified above. Otherwise, it will only respond randomly, or when @tagged.
Within the script, search for "/absolute/directory/of/your/script/" and replace this with the absolute directory of the location of your script.
- DISCORD_TOKEN: Your Discord bot token.
- OPENAI_API_KEY: API key for accessing OpenAI services.
- CHANNEL_IDS: Comma-separated list of Discord channel IDs that the bot will monitor.
- MAX_TOKENS: Maximum number of tokens for standard responses.
- MAX_TOKENS_RANDOM: Maximum number of tokens for random responses.
- MODEL_CHAT: The OpenAI model used for chat functionalities.
- UPDATE_INTERVAL_MINUTES: Interval in minutes for updating user profiles.
- TRANSFORM: Instructions for transforming image descriptions. Uses OpenAI API.
- BEHAVIOUR: Defines the chatbot's personality and response style. This variable is extremely important. As it is currently written, it works well. Modify it carefully.
Apache Solr is used for indexing and searching messages and user profiles. Follow these steps to install and configure Solr for Soupy.
-
Download Solr: Visit the Apache Solr website and download the latest stable release. You could also use some package managers -- see your distro's information.
-
Extract the Package
-
Install Solr as a Service: Follow documentation on the exact steps for this process. It's not hard, though. You can do it.
-
Verify Installation:
Open your browser and navigate to
http://localhost:8983/solr
to access the Solr admin interface.
Soupy requires a single Solr core with specific fields to index user profiles effectively.
- Create a Core for Soupy:
bin/solr create -c soupy
Add the necessary fields to the soupy
core to store user profiles.
-
Access Solr Admin Interface:
Navigate to
http://localhost:8983/solr
and select thesoupy
core. -
Define Fields:
- Go to the "Schema" tab.
- Click on "Add Field".
- For each field listed above, enter the field name, type, and other attributes as specified.
- For multiValued fields (like nicknames), ensure you check the "MultiValued" option.
Alternatively, schema/fields can be created from the command line with commands similiar to this one:
curl -X POST -H 'Content-type:application/json' \
http://localhost:8983/solr/soupy/schema \
-d '{
"add-field": {
"name": "id",
"type": "string",
"indexed": true,
"stored": true,
"required": true,
"multiValued": false
}
}'
or this one
curl -X POST -H "Content-Type: application/json" \
"http://localhost:8983/solr/soupy/schema" \
-d '{
"add-field":{
"name":"user_problems",
"type":"text_general",
"indexed":true,
"stored":true
}
}'
Add the necessary fields to the soupy
core to store user profiles and channel information.
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="username" type="string" indexed="true" stored="true"/>
<field name="nicknames" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="join_date" type="date" indexed="true" stored="true"/>
<field name="political_party" type="string" indexed="true" stored="true"/>
<field name="user_job_career" type="text_general" indexed="true" stored="true"/>
<field name="user_family_friends" type="text_general" indexed="true" stored="true"/>
<field name="user_activities" type="text_general" indexed="true" stored="true"/>
<field name="opinions_about_games" type="text_general" indexed="true" stored="true"/>
<field name="opinions_about_movies" type="text_general" indexed="true" stored="true"/>
<field name="opinions_about_music" type="text_general" indexed="true" stored="true"/>
<field name="opinions_about_television" type="text_general" indexed="true" stored="true"/>
<field name="opinions_about_life" type="text_general" indexed="true" stored="true"/>
<field name="opinions_about_food" type="text_general" indexed="true" stored="true"/>
<field name="general_opinions" type="text_general" indexed="true" stored="true"/>
<field name="opinions_about_politics" type="text_general" indexed="true" stored="true"/>
<field name="personality_traits" type="text_general" indexed="true" stored="true"/>
<field name="hobbies" type="text_general" indexed="true" stored="true"/>
<field name="user_interests" type="text_general" indexed="true" stored="true"/>
<field name="user_problems" type="text_general" indexed="true" stored="true"/>
<field name="tech_interests" type="text_general" indexed="true" stored="true"/>
<field name="opinions_about_technology" type="text_general" indexed="true" stored="true"/>
<field name="sports_interests" type="text_general" indexed="true" stored="true"/>
<field name="opinions_about_sports" type="text_general" indexed="true" stored="true"/>
<field name="book_preferences" type="text_general" indexed="true" stored="true"/>
<field name="opinions_about_books" type="text_general" indexed="true" stored="true"/>
<field name="art_interests" type="text_general" indexed="true" stored="true"/>
<field name="opinions_about_art" type="text_general" indexed="true" stored="true"/>
<field name="health_concerns" type="text_general" indexed="true" stored="true"/>
<field name="health_habits" type="text_general" indexed="true" stored="true"/>
<field name="science_interests" type="text_general" indexed="true" stored="true"/>
<field name="opinions_about_science" type="text_general" indexed="true" stored="true"/>
<field name="travel_preferences" type="text_general" indexed="true" stored="true"/>
<field name="travel_experiences" type="text_general" indexed="true" stored="true"/>
<field name="food_preferences" type="text_general" indexed="true" stored="true"/>
<field name="opinions_about_food" type="text_general" indexed="true" stored="true"/>
<field name="last_updated" type="date" indexed="true" stored="true"/>
<field name="channel_id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="username" type="string" indexed="true" stored="true"/>
<field name="content" type="text_general" indexed="true" stored="true"/>
<field name="timestamp" type="pdate" indexed="true" stored="true"/>
-
Commit Changes:
After adding all fields, commit the changes to make them effective.
After completing the installation and configuration steps, you can start the bot using the following commands. The first run will take a while, depending on the activity on your server and the number of users. It could take minutes, or hours. The terminal output will tell you what it's up to.
python soupy-solr.py
OR
python soupy-flux.py
AND
python gradio-soupy.py
Ensure that you are in the virtual environment and the correct directory where soupy
is located.
gradio-soupy.py
is the Gradio-based back-end for Flux. You can also access this via a browser.
Generate an image using the Flux model with support for various modifiers and interactive buttons for further customization.
And with the --fancy modifier, or with the "Rewrite" button for example:
Modifiers:
--wide
: Generates a wide image (1920x1024).--tall
: Generates a tall image (1024x1920).--small
: Generates a small image (512x512).--fancy
: Elaborates the prompt to be more creative and detailed. This uses ChatGPT's via API.--seed <number>
: Use a specific seed for image generation.
Usage:
!flux A mystical forest with glowing plants --tall
After generating an image with the !flux
command, Soupy provides interactive buttons for further customization:
Remix
: Generates a new image based on the existing prompt, with a new random seed.Rewrite
: Elaborates the prompt to enhance creativity and detail. This uses ChatGPT's API (same as the--fancy
modifier).Wide
: Adjusts the image dimensions to a wide format.Tall
: Adjusts the image dimensions to a tall format.
Generate an image using DALL-E 3 based on a text prompt with optional modifiers. This may be deprecated soon.
Modifiers:
--wide
: Generates a wide image (1920x1024).--tall
: Generates a tall image (1024x1920).
Usage:
!generate A futuristic city skyline at sunset --wide
Analyze an attached image based on provided instructions, such as translating text within the image or identifying objects and their attributes.
Usage:
!analyze Identify all the animals in this image.
!analyze Describe this image forensically.
Attach an image when using this command.
Ask the Magic 8-Ball a question. Does not use an LLM or any ML.
Usage:
!8ball Will I get an A on my exam?
Fetch and display the current time in a specified city.
Usage:
!whattime New York
This project is licensed under the MIT License.
MIT License Copyright (c) 2024 sneezeparty
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- OpenAI: For providing powerful language models that drive Soupy's conversational abilities.
- Apache Solr: For enabling efficient data indexing and search capabilities.
- Hugging Face: For offering state-of-the-art models used in the Flux image generation pipeline.
- Gradio: For facilitating the creation of interactive web interfaces for image generation.
- -Black Forest Labs: For Flux, which is awesome.
If you encounter any issues or have questions, feel free to open an issue in the GitHub Issues section of the repository.
Buy Me A Coffee to help support this project.