Stars
HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation
Janus-Series: Unified Multimodal Understanding and Generation Models
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
An open-source implementation for fine-tuning the Qwen2-VL and Qwen2.5-VL series by Alibaba Cloud.
[ICCV2023] Official Implementation of "UniTR: A Unified and Efficient Multi-Modal Transformer for Bird’s-Eye-View Representation"
Official PyTorch implementation of "BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework"
[ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
SECOND for KITTI/NuScenes object detection
[ECCV 2024] OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection
The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".
[AAAI-25] Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
[CVPR 2022] PointCLIP: Point Cloud Understanding by CLIP
[CVPR 2023] Hierarchical Supervision and Shuffle Data Augmentation for 3D Semi-Supervised Object Detection
[AAAI 2024] BEV-MAE: Bird's Eye View Masked Autoencoders for Point Cloud Pre-training in Autonomous Driving Scenarios
[ECCV 2024] Embodied Understanding of Driving Scenarios
HEDNet (NeurIPS 2023) & SAFDNet (CVPR 2024 Oral)
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
🤖 PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+
[ICCV 2021] Official PyTorch implementation of "Discriminative Region-based Multi-Label Zero-Shot Learning"; SOTA results on NUS-WIDE and OpenImages
LAVIS - A One-stop Library for Language-Vision Intelligence
Official implementation of "Open-Vocabulary Multi-Label Classification via Multi-Modal Knowledge Transfer".
[ICCV 2023] SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection