- Tencent
- China
Starred repositories
[CVPR 2025] "Towards Universal Soccer Video Understanding".
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
DeepEP: an efficient expert-parallel communication library
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Everything about the SmolLM2 and SmolVLM family of models
Pretraining code for a large-scale depth-recurrent language model
A minimal, easy-to-read PyTorch reimplementation of the Qwen2 series—without the complexity of larger frameworks.
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
⚡ TabPFN: Foundation Model for Tabular Data ⚡
Open source implementation of "A Self-Supervised Descriptor for Image Copy Detection" (SSCD).
Eagle Family: Exploring Model Designs, Data Recipes and Training Strategies for Frontier-Class Multimodal LLMs
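小红书 (xiaohongshu, rednote) AI operations assistant with two parts: generation of Xiaohongshu-style content (including images) and automatic publishing. Automatic publishing uses Selenium for RPA-style simulated clicks to post the generated text, cover image, and content images.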
An implementation of the TrueSkill rating system for Python
A Comprehensive Benchmark for Document Parsing and Evaluation
Align Anything: Training All-modality Model with Feedback
The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"
This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training data, instruction fine-tuning data, and In-Context learning …
Deep learning software for colorizing black and white images with a few clicks.
[Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
Synthetic data generation pipelines for text-rich images.
💥 Blazing fast terminal file manager written in Rust, based on async I/O.
Python tool for converting files and office documents to Markdown.
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
ETL, Analytics, Versioning for Unstructured Data
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.