Qwen2-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.

Python 2,461 138 Updated Oct 4, 2024

✨✨Latest Advances on Multimodal Large Language Models

12,028 769 Updated Oct 6, 2024

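A collection of AIGC-interview/CV-interview/LLMs-interview questions and answers, which also includes new ideas, new questions, new resources, and new projects from work and research.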

1,597 165 Updated Sep 29, 2024

This repository compiles a list of papers related to the application of video technology in the field of robotics! Star⭐ the repo and follow me if you like what you see🤩.

113 6 Updated Aug 12, 2024

A large-scale benchmark and learning environment.

Python 1,124 229 Updated Aug 6, 2024

ALFRED - A Benchmark for Interpreting Grounded Instructions for Everyday Tasks

C 366 83 Updated Jul 25, 2024
Python 25 3 Updated Sep 22, 2024

Simple and easily configurable grid world environments for reinforcement learning

Python 2,090 604 Updated Sep 3, 2024

[ICML 2024] Official code repository for 3D embodied generalist agent LEO

Python 346 30 Updated Jul 30, 2024

Code for the Paper M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models.

Python 8 Updated Aug 16, 2024

AI2-THOR Data Collection Tool Based On Keyboard Interaction

Python 54 10 Updated Jun 21, 2024
Python 3 Updated Jul 16, 2024

[NeurIPS 2023] We use large language models as a commonsense world model and heuristic policy within Monte-Carlo Tree Search, enabling better-reasoned decision-making for daily task planning problems.

Python 156 16 Updated May 23, 2024

A curated list for vision-and-language navigation. ACL 2022 paper "Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions"

360 19 Updated May 2, 2024

A generative world for general-purpose robotics & embodied AI learning.

332 10 Updated Mar 1, 2024

Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World

Python 116 6 Updated Mar 17, 2024

Official implementation for "A Simple LLM Framework for Long-Range Video Question-Answering"

Python 81 4 Updated Mar 20, 2024

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Jupyter Notebook 14,879 1,379 Updated Sep 5, 2024

Inference code for Llama models

Python 55,851 9,513 Updated Aug 18, 2024

This repo contains source code for Glance and Focus: Memory Prompting for Multi-Event Video Question Answering (Accepted in NeurIPS 2023)

Python 20 6 Updated Jun 28, 2024

A comprehensive list of papers using large language/multi-modal models for Robotics/RL, including papers, code, and related websites

2,830 231 Updated Sep 9, 2024

Recent LLM-based CV and related works. Welcome to comment/contribute!

832 35 Updated Jun 5, 2024

calflops is designed to calculate FLOPs, MACs, and parameters for a wide range of neural networks, such as Linear, CNN, RNN, GCN, and Transformer models (BERT, LLaMA, and other large language models)

Python 509 16 Updated Jun 27, 2024

Reference implementations of several LangChain agents as Streamlit apps

Python 1,259 621 Updated Aug 4, 2024

[CVPR 2023] Official code for "Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations"

Python 51 Updated Aug 8, 2023

ChatGLM2-6B: An Open Bilingual Chat LLM | Open-source bilingual conversational language model

Python 15,703 1,852 Updated Jun 27, 2024

GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.

Python 15,179 2,287 Updated Sep 25, 2024