MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 13,715 983 Updated Jan 16, 2025

Video Annotation Tool

Vue 185 20 Updated Jun 18, 2024

Data annotation toolbox supports image, audio and video data.

Python 963 95 Updated Jan 16, 2025

Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction

Python 70 3 Updated Jan 15, 2025

A generative world for general-purpose robotics & embodied AI learning.

Python 22,815 1,869 Updated Jan 12, 2025

[ECCV 2024] The official PyTorch implementation of the "Part2Object: Hierarchical Unsupervised 3D Instance Segmentation".

Python 18 1 Updated Sep 12, 2024

Official implementation of paper VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format

Python 26 Updated Jan 8, 2025

Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance. Accepted to ACL 2024.

Python 146 8 Updated Dec 8, 2024

VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)

Python 279 32 Updated Aug 15, 2024

Retrieval and Retrieval-augmented LLMs

Python 8,244 601 Updated Jan 16, 2025

Awesome Online Action Detection

54 5 Updated Dec 9, 2024

[ECCV 2024] The official PyTorch implementation of the "Plain-Det: A Plain Multi-Dataset Object Detector".

Python 24 2 Updated Dec 8, 2024
Python 65 5 Updated Dec 6, 2024

Vector (and Scalar) Quantization, in PyTorch

Python 2,825 231 Updated Jan 10, 2025

[CVPR2024] OneFormer3D: One Transformer for Unified Point Cloud Segmentation

Python 382 34 Updated Oct 23, 2024

OpenMMLab's next-generation platform for general 3D object detection.

Python 5,454 1,568 Updated Jul 10, 2024

MambaOut: Do We Really Need Mamba for Vision?

Python 2,116 37 Updated Oct 22, 2024

Official implementation of "Spherical Mask: Coarse-to-Fine 3D Point Cloud Instance Segmentation with Spherical Representation"

Python 61 5 Updated Apr 28, 2024

Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024

Jupyter Notebook 1,430 80 Updated Jun 28, 2024

K-Quant: A Platform of Temporal Financial Knowledge-enhanced Quantitative Investment

Python 54 5 Updated Nov 1, 2024

A little spider that helps you get your own paper list from https://arxiv.org/ every day.

Python 43 11 Updated Oct 15, 2019

Mask3D predicts accurate 3D semantic instances achieving state-of-the-art on ScanNet, ScanNet200, S3DIS and STPLS3D.

Python 575 112 Updated Oct 29, 2023

Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)

Python 1,796 195 Updated Jan 16, 2025