  • National Yang Ming Chiao Tung University
  • Taiwan

Cosmos is a world model development platform that consists of world foundation models, tokenizers, and a video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…

Jupyter Notebook 7,804 503 Updated Mar 21, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 1,697 113 Updated Mar 25, 2025

Witness the aha moment of VLM with less than $3.

Python 3,384 263 Updated Mar 1, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 5,635 549 Updated Mar 25, 2025

Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"

Python 251 5 Updated Mar 12, 2025

Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"

Python 33 1 Updated Feb 10, 2025

PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models

Python 256 12 Updated Jan 2, 2024

[CVPR 2025 🔥] A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

Python 52 1 Updated Mar 3, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 42,601 6,457 Updated Mar 25, 2025

High Quality Video Reasoning Segmentation

Python 17 2 Updated Mar 8, 2025

A fork of Distrobox that supports rootless docker

Shell 2 Updated Oct 16, 2024

[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model

13 Updated Jul 20, 2024

YOLOv12: Attention-Centric Real-Time Object Detectors

Python 1,299 162 Updated Mar 18, 2025

[CVPR 2024] PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding.

Python 215 5 Updated Feb 11, 2025

Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025)

Python 23 Updated Mar 23, 2025

COCO API - Dataset @ http://cocodataset.org/

Jupyter Notebook 6,205 3,768 Updated Apr 17, 2024

[CVPR25] Official repository for the paper: "SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation"

Python 62 Updated Mar 23, 2025

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Python 25,018 11,719 Updated Jun 7, 2024

[CVPR 2025] MatAnyone: Stable Video Matting with Consistent Memory Propagation

Python 876 45 Updated Mar 12, 2025

ViCaS: A Dataset for Combining Holistic and Pixel-level Video Understanding using Captions with Grounded Segmentation (CVPR'25)

Python 11 Updated Mar 14, 2025

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 141,887 28,393 Updated Mar 25, 2025

Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher toward implementing your research ideas

Python 3,973 571 Updated Mar 24, 2025

🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Python 984 64 Updated Mar 19, 2025

[ECCV 2024] SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding

Python 54 3 Updated Oct 22, 2024

LLaVA-Interactive-Demo

Python 367 29 Updated Jul 25, 2024

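Hello Algo: a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; the English version is in progress.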

Java 110,981 13,833 Updated Mar 20, 2025

Master programming by recreating your favorite technologies from scratch.

Markdown 363,026 33,710 Updated Sep 3, 2024

This repo includes ChatGPT prompt curation to use ChatGPT and other LLM tools better.

JavaScript 121,984 16,372 Updated Mar 18, 2025

Official inference repo for FLUX.1 models

Python 21,009 1,484 Updated Feb 6, 2025

🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers (OpenAI / Claude 3 / Gemini / Ollama / DeepSeek / Qwen), Knowledge Base (file upload / knowledge managemen…

TypeScript 58,169 12,330 Updated Mar 25, 2025