Information technology institution
- Cairo
- https://www.linkedin.com/in/mahmoud-yasser-22m/
Starred repositories
A collection of pre-trained, state-of-the-art models in the ONNX format
Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation
Repository with code examples of mlflow
Build your own layer-2 virtual switch in less than 300 lines of code
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
A curated list of resources dedicated to Natural Language Processing (NLP)
DeerFlow is a community-driven Deep Research framework, combining language models with tools like web search, crawling, and Python execution, while contributing back to the open-source community.
Real-time face swap and one-click video deepfake with only a single image
Suna - Open Source Generalist AI Agent
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
Collection of Kaggle Solutions and Ideas
The fast, Pythonic way to build MCP servers and clients
Learn Low Level Design (LLD) and prepare for interviews using free resources.
Comprehensive guide to learn RAG from basics to advanced.
DevOps Roadmap for 2025, with learning resources
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
A list of tools, papers and code related to Fake Audio Detection.
Code and Slides
Learn System Design concepts and prepare for interviews using free resources.
OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
The Python library for real-time communication
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think!
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching
This repository contains the Hugging Face Agents Course.
A PyTorch Implementation of "Neural Speech Synthesis with Transformer Network"
SGLang is a fast serving framework for large language models and vision language models.
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched