Starred repositories
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.
Fast, Lightweight, Unified Engine for Text2Image Diffusion Models
🎥 Python and OpenCV-based scene cut/transition detection program & library.
Understanding R1-Zero-Like Training: A Critical Perspective
A lightweight library for building Multimodal Agents. Give LLMs superpowers like memory, knowledge, tools and reasoning.
😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
Official Implementation of "KBLaM: Knowledge Base augmented Language Model"
Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
StarVector is a foundation model for SVG generation that transforms vectorization into a code generation task. Using a vision-language modeling architecture, StarVector processes both visual and te…
Enhanced ChatGPT Clone: Features Agents, DeepSeek, Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message se…
Train your AI self, amplify you, bridge the world
tulip-berkeley / open_clip
Forked from mlfoundations/open_clip
An open source implementation of CLIP (With TULIP Support)
Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks
🪄 Create rich visualizations with AI
A Datacenter Scale Distributed Inference Serving Framework
Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.
A simple routine to download and parse all PDF attachments in a Zotero library.
"AI-Researcher: Fully-Automated Scientific Discovery with LLM Agents" & "Open-Sourced Alternative to Google AI Co-Scientist"
Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are commi…
Source code for ICLR2025 paper "NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation".
Use PEFT or Full-parameter to finetune 500+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Llama3.2-Vision, Llava…
Official repository of "Visual-RFT: Visual Reinforcement Fine-Tuning"
Laminar - open-source all-in-one platform for engineering AI products. Create a data flywheel for your AI app. Traces, Evals, Datasets, Labels. YC S24.
R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning