🦄 The next-generation Multi-Modal Multi-Agent framework. 🤖
We are dedicated to developing a universal multi-modal multi-agent framework. Multi-modal agents are powerful agents capable of understanding and generating information across various modalities, including text, images, audio, and video. These agents are designed to automatically complete complex tasks that involve multiple modalities of input and output. Our framework also aims to support multi-agent collaboration, which allows for a more comprehensive and nuanced understanding of complex scenarios, leading to more effective problem-solving and task completion.
- Build, manage, and deploy your AI agents.
- Multi-modal agents: agents can interact with users through text, audio, images, and video.
- Vector database and knowledge embeddings.
- UI for chatting with AI agents.
- Multi-agent collaboration: you can create an agent company for complex tasks, such as drawing comics. (Coming soon)
- Fine-tuning and RLHF. (Coming soon)
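To give a feel for the intended workflow, here is a purely illustrative sketch; the `metaagent` import and the `Agent`, `add_knowledge`, and `chat` names are hypothetical placeholders, not the project's actual API:

```python
# Illustrative only: the class and method names below are hypothetical
# stand-ins for the framework's API, not the real MetaAgent interface.
from metaagent import Agent  # hypothetical import

# An agent that consumes text and images and replies with text and audio.
agent = Agent(
    name="assistant",
    input_modalities=["text", "image"],
    output_modalities=["text", "audio"],
)
agent.add_knowledge("docs/")  # embed local documents into the vector store
reply = agent.chat("Describe this picture.", images=["photo.png"])
print(reply.text)
```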
Comics Company: create a comic about Elon landing on Mars.
Make sure your Python version is 3.10 and your CUDA version is 12.2.
```bash
git clone https://github.com/ZhihaoAIRobotic/MetaAgent.git
cd MetaAgent
conda create -n metaagent python=3.10
conda activate metaagent
sudo apt-get update && sudo apt-get install -y portaudio19-dev
poetry install
```
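After installation, you can sanity-check the environment. The snippet below assumes PyTorch ends up in the environment (via `poetry install` or a manual `pip install torch`); it simply reports the Python, PyTorch, and CUDA versions:

```python
# Quick environment check: confirms Python 3.10 and a CUDA 12.x build.
import sys
import torch

print(f"Python:  {sys.version.split()[0]}")        # expect 3.10.x
print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA build: {torch.version.cuda}")     # expect 12.2
    print(f"GPU: {torch.cuda.get_device_name(0)}")
```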
- KokoroTTS: see the installation tutorial at https://huggingface.co/hexgrad/Kokoro-82M#usage
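For reference, a minimal KokoroTTS sketch adapted from the linked tutorial; the `kokoro` package (plus `soundfile` and a system `espeak-ng`) is an upstream dependency whose API may change, so treat the tutorial as authoritative:

```python
# Minimal KokoroTTS sketch, adapted from the linked Hugging Face tutorial.
# Assumes `pip install kokoro soundfile` and espeak-ng installed on the system.
import soundfile as sf
from kokoro import KPipeline

pipeline = KPipeline(lang_code='a')  # 'a' selects American English
generator = pipeline("Hello from MetaAgent!", voice='af_heart')

# The pipeline yields (graphemes, phonemes, audio) chunks at 24 kHz.
for i, (graphemes, phonemes, audio) in enumerate(generator):
    sf.write(f'kokoro_{i}.wav', audio, 24000)
```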
- ParlerTTS: `poetry install`
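Likewise, a minimal ParlerTTS sketch based on the upstream parler-tts project; the model name and the `parler_tts` package are upstream details, not part of this repository, and may change:

```python
# Minimal ParlerTTS sketch based on the upstream parler-tts README.
import torch
import soundfile as sf
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
repo = "parler-tts/parler-tts-mini-v1"  # upstream model, subject to change
model = ParlerTTSForConditionalGeneration.from_pretrained(repo).to(device)
tokenizer = AutoTokenizer.from_pretrained(repo)

prompt = "Hello, welcome to MetaAgent."
description = "A calm female voice with clear articulation."

# ParlerTTS conditions on a text description of the voice plus the prompt.
input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

audio = model.generate(input_ids=input_ids, prompt_input_ids=prompt_ids)
sf.write("parler_out.wav", audio.cpu().numpy().squeeze(), model.config.sampling_rate)
```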
Install the frontend dependencies:

```bash
cd frontend
npm install
```
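Once the dependencies are installed, start the frontend with the script defined in `frontend/package.json` (commonly `npm run dev` or `npm start`); check that file for the exact command, as the script name is not specified here.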