Stars
A quick intro to IPSec on the kernel side with STUN and UDP hole punching
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
GPT4V-level open-source multi-modal model based on Llama3-8B
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Hackable and optimized Transformers building blocks, supporting a composable construction.
A high-throughput and memory-efficient inference and serving engine for LLMs
A state-of-the-art-level open visual language model | multimodal pretrained model
A convenient and user-friendly anime-style image data processing library that integrates various advanced anime-style image processing models
A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment
Set a prompt for each divided region
AnimateDiff for AUTOMATIC1111 Stable Diffusion WebUI
Segment Anything in High Quality [NeurIPS 2023]
A Gradio web UI for Large Language Models with support for multiple inference backends.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.