Stars
A High-efficiency Open-source Toolkit for Table-to-Latex Task
A Comprehensive Toolkit for High-Quality PDF Content Extraction
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
Have a natural voice conversation with an LLM
meoww-bot / glider
Forked from nadoo/glider
glider is a forward proxy with multi-protocol support, and also a DNS/DHCP server with ipset management features (like dnsmasq).
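Use Glider to turn proxy nodes into a crawler proxy pool that switches IP every second. This project includes a usage tutorial and provides a way to convert Clash subscriptions into the format glider supports.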
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
PDF scientific paper translation and bilingual comparison based on font rules and deep learning, preserving formula and figure layout
Official release of FacialFlowNet: Advancing Facial Optical Flow Estimation with a Diverse Dataset and a Decomposed Model (ACMMM2024)
Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
Tools to scrape publication metadata from pubmed, arxiv, medrxiv and chemrxiv.
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation
reveal.js on steroids! Get beautiful reveal.js presentations from any Markdown file
⚡ Insanely fast AI voice assistant with <500ms response times
A blazing fast inference solution for text embeddings models
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI
code for "TVG: A Training-free Transition Video Generation Method with Diffusion Models"
Fantastic Data Engineering for Large Language Models
[ACM MM 2024] This is the official code for "AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding"
the AI-native open-source embedding database
Efficient Triton Kernels for LLM Training