Starred repositories
An Open Source Machine Learning Framework for Everyone
Port of OpenAI's Whisper model in C/C++
A library for efficient similarity search and clustering of dense vectors.
Collection of various algorithms in mathematics, machine learning, computer science and physics implemented in C++ for educational purposes.
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Unsupervised text tokenizer for Neural Network-based text generation.
Conversion between Traditional and Simplified Chinese
Facebook AI Research's Automatic Speech Recognition Toolkit
Transformer-related optimization, including BERT and GPT
A C++ standalone library for machine learning
LightSeq: A High Performance Library for Sequence Processing and Generation
A high-quality speech analysis, manipulation and synthesis system
An open-source implementation of a sequence-to-sequence based speech processing engine
Bolt is a deep learning library with high performance and heterogeneous flexibility.
Dataset, streaming, and file system extensions maintained by TensorFlow SIG-IO
Connectionist Temporal Classification (CTC) decoder with dictionary and language model.
Lightweight, super fast C/C++ (& Python) library for sequence alignment using edit (Levenshtein) distance.
Automatic differentiation with weighted finite-state transducers.
speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription.
NiuTensor is an open-source toolkit developed by a joint team from the NLP Lab at Northeastern University and the NiuTrans Team. It provides tensor utilities to create and train neural networks.
A fast parallel implementation of RNN Transducer.
Fast and customizable text tokenization library with BPE and SentencePiece support