Virgil-L: 46 starred repositories written in Python

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 139,974 28,074 Updated Feb 22, 2025

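Chinese and English sensitive words, language detection, Chinese and international mobile/landline number location and carrier lookup, gender inference from names, phone number extraction, ID card number extraction, email extraction, Chinese and Japanese name databases, Chinese abbreviation database, character decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English approximations of Chinese pronunciation, Wang Feng lyrics generator, occupation name lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, continuous English text segmentation, various Chinese word vectors, company name collection, classical poetry database, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, …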

Python 71,122 14,692 Updated May 10, 2024

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 41,453 5,087 Updated Feb 20, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 38,925 5,829 Updated Feb 23, 2025

PyTorch Tutorial for Deep Learning Researchers

Python 30,821 8,171 Updated Aug 15, 2023

A high-quality, one-stop open-source data extraction tool for converting PDF to Markdown and JSON.

Python 26,310 2,014 Updated Feb 22, 2025

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 18,603 1,330 Updated Feb 21, 2025

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search…

Python 12,365 2,991 Updated Feb 21, 2025

100+ Pre-trained Chinese Word Vectors

Python 11,926 2,324 Updated Oct 30, 2023

Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)

Python 9,824 1,392 Updated Jul 31, 2023

Python bindings for llama.cpp

Python 8,674 1,059 Updated Jan 29, 2025

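A summary of the knowledge a natural language processing (NLP) engineer needs to build up, including interview questions, fundamentals, and engineering skills, to improve core competitiveness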

Python 7,097 1,196 Updated Aug 24, 2022

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Python 6,798 456 Updated Jan 3, 2025

Example models using DeepSpeed

Python 6,308 1,068 Updated Feb 14, 2025

Chinese natural language processing datasets, material for everyday experiments. Additions via pull requests are welcome.

Python 4,366 790 Updated Nov 21, 2023

Facilitating the design, comparison and sharing of deep text matching models.

Python 3,852 898 Updated Aug 2, 2024

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pre-training (PT), supervised fine-tuning (SFT), RLHF, DPO, ORPO, and GRPO.

Python 3,601 531 Updated Feb 20, 2025

Using GPT to parse PDF

Python 3,250 233 Updated Aug 7, 2024

A Chinese-language tutorial on data structures and algorithms in Python

Python 2,858 809 Updated Jun 16, 2024

Image to prompt with BLIP and CLIP

Python 2,773 433 Updated May 15, 2024

View model summaries in PyTorch!

Python 2,701 124 Updated Feb 10, 2025

One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda

Python 1,800 169 Updated Apr 15, 2024

Emu Series: Generative Multimodal Models from BAAI

Python 1,683 86 Updated Sep 27, 2024

Efficient Retrieval Augmentation and Generation Framework

Python 1,460 135 Updated Jan 9, 2025

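PyTorch implementation of BERT for seq2seq tasks using the UniLM scheme; it now also handles automatic summarization, text classification, sentiment analysis, NER, part-of-speech tagging, and other tasks, supports the T5 model, and supports GPT-2 for article continuation.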

Python 1,292 209 Updated Jun 18, 2022

An elegant PyTorch implementation of transformers

Python 1,273 163 Updated Feb 15, 2025

Recipes to train reward model for RLHF.

Python 1,186 84 Updated Feb 9, 2025

Implement Li Hang's Statistical Learning Methods the hard way: a hardcore Python implementation of the book.

Python 1,173 283 Updated Jun 4, 2022

A Collection of BM25 Algorithms in Python

Python 1,107 93 Updated Oct 8, 2024

🚁 Insurance industry corpus, chatbot

Python 1,022 343 Updated Jul 12, 2024