Starred repositories
An efficient OpenFST-based tool for calculating WER and aligning two transcript sequences.
Build real-time multimodal AI applications 🤖🎙️📹
This is a GitHub repository of the abandonware Sequitur G2P by Bisani & Ney
A font family with a great monospaced variant for programmers.
dtinth / comic-mono-font
Forked from shannpersand/comic-shanns
A legible monospace font... the very typeface you’ve been trained to recognize since childhood
Joint speech-language model - respond directly to audio!
Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!
Unicode Standard tokenization routines and orthography profile segmentation
Things you can do with the token embeddings of an LLM
Effortlessly deploy Docker Compose apps in production with zero downtime using an opinionated template 🛠️
🚀 Zero Downtime Deployment for Docker Compose
Example of Python code that implements graceful shutdown while using asyncio, threading and multiprocessing
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
Visualisation of speech data
This package contains functions for converting wav files into auditory representations and comparing them
This repository provides solutions for Google Cloud Labs, offering easy-to-understand approaches to solving problems. It is designed to help learners quickly grasp key concepts and apply practical …
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
Weighted Cross-entropy for Low-Resource Languages in Multilingual Speech Recognition
Fine-Tune Whisper with Transformers and PEFT
Unofficial implementation of the WaveNeXt vocoder