lizhaoliu-Lec
🎯
Focusing
  • South China University of Technology
  • Guangzhou/China


[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Python 3,492 288 Updated Aug 14, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 11,509 999 Updated Oct 8, 2024

Enable macOS HiDPI and have a native setting.

Shell 8,683 995 Updated Jul 3, 2024

Official implementation of the ECCV 2024 paper "Prioritized Semantic Learning for Zero-shot Instance Navigation"

Python 13 2 Updated Sep 25, 2024
7 Updated Jul 16, 2024

MambaOut: Do We Really Need Mamba for Vision?

Python 1,983 34 Updated Jun 6, 2024

This is the source code for Detecting Machine-Generated Texts by Multi-Population Aware Optimization for Maximum Mean Discrepancy (ICLR 2024).

Python 40 2 Updated Aug 12, 2024

Awesome-LLM-3D: a curated list of resources on multi-modal large language models in the 3D world

1,055 73 Updated Oct 8, 2024

Grok open release

Python 49,478 8,325 Updated Aug 30, 2024
Jupyter Notebook 359 45 Updated Dec 5, 2023

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 7,746 455 Updated May 3, 2024

SAMPro3D: Locating SAM Prompts in 3D for Zero-Shot Scene Segmentation

Python 94 8 Updated Jan 12, 2024

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)

Python 708 45 Updated Jul 29, 2024

Official implementation of ICCV 2023 paper "3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment"

Python 183 10 Updated Sep 7, 2023

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Python 753 37 Updated Jun 2, 2024

Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.

Python 10,363 1,549 Updated Oct 7, 2024
1 Updated Jan 19, 2024

RelTR: Relation Transformer for Scene Graph Generation: https://arxiv.org/abs/2201.11460v2

Python 251 51 Updated Aug 20, 2024

Dual Regression Compression for SR Models

Python 5 Updated Jan 8, 2024
Python 15 Updated Dec 13, 2023

Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR 2023

Python 243 15 Updated Jun 7, 2023
Jupyter Notebook 755 72 Updated Aug 7, 2024

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

Python 498 25 Updated Jun 11, 2024
Python 342 13 Updated Jul 29, 2024
Python 8,373 491 Updated Oct 9, 2024

Mask3D predicts accurate 3D semantic instances achieving state-of-the-art on ScanNet, ScanNet200, S3DIS and STPLS3D.

Python 538 108 Updated Oct 29, 2023

This is the official PyTorch implementation of the paper Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP.

Jupyter Notebook 681 61 Updated Oct 17, 2023

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python 479 29 Updated May 8, 2024

Open-Set Grounded Text-to-Image Generation

Python 1,990 148 Updated Mar 6, 2024

Grounded Language-Image Pre-training

Python 2,187 192 Updated Jan 24, 2024