Starred repositories
Textbook on reinforcement learning from human feedback
SOTA Open-Source Browser Agent for autonomously performing complex tasks on the web
[WIP] Resources for AI engineers. Also contains supporting materials for the book AI Engineering (Chip Huyen, 2025)
Play with OpenAI's new Realtime API in your browser
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Make websites accessible for AI agents
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
BlackHole is a modern macOS audio loopback driver that allows applications to pass audio to other applications with zero additional latency.
Python tool for converting files and office documents to Markdown.
Examples and guides for using the Gemini API
Go bindings for the PortAudio audio I/O library
A microphone input stream for the gopxl/beep library
A little package that brings sound to any Go application. Suitable for playback and audio-processing.
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Run native macOS workloads on Kubernetes
♪ A low-level library to play sound on multiple platforms ♪
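Latest 2021 roundup of recommended reading for engineers: computer science, software technology, entrepreneurship, ideas, mathematics, and biographies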
Robust Speech Recognition via Large-Scale Weak Supervision
The world’s first real-time, distributed, cloud-edge collaborative multimodal AI Agent Framework that simultaneously supports C/C++/Go/Python/JS/TS
📚 Freely available programming books
AI app store powered by 24/7 desktop history. open source | 100% local | dev friendly | 24/7 screen, mic recording