Stars
Retrieval and Retrieval-augmented LLMs
Overrides the built-in pinyin data in pypinyin with the pinyin data files from pinyin-data and phrase-pinyin-data
A linguistically grounded and comprehensive Chinese TTS dataset, efficiently covering Chinese polyphonic characters, phonological changes, and more.
Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese text to Zhuyin or Pinyin (INTERSPEECH 2022)
Officially recommended ChatTTS resource collection project, gathering related resources from across the web and frequently asked questions
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
A generative speech model for daily dialogue.
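A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, including base models, domain-specific fine-tuning and applications, datasets, and tutorials.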
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Ongoing research training transformer models at scale
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Spherical Merge Pytorch/HF format Language Models with minimal feature loss.
CodeGeeX2: A More Powerful Multilingual Code Generation Model
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
Transformer models from BERT to GPT-4, environments from Hugging Face to OpenAI. Fine-tuning, training, and prompt engineering examples. A bonus section with ChatGPT, GPT-3.5-turbo, GPT-4, and DALL…
Slides and assignments for Hung-yi Lee's Spring 2021/2022/2023 machine learning courses
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
Codes for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models".
A Chinese-language introductory tutorial for LangChain