- Technology Expert, HUAWEI
- livic.top
Stars
Janus-Series: Unified Multimodal Understanding and Generation Models
User-friendly Desktop Client App for AI Models/LLMs (GPT, Claude, Gemini, Ollama...)
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Open 3D Engine (O3DE) is an Apache 2.0-licensed multi-platform 3D engine that enables developers and content creators to build AAA games, cinema-quality 3D worlds, and high-fidelity simulations wit…
This is the top-level repository for the Accel-Sim framework.
Vulkan-Sim is a GPU architecture simulator for Vulkan ray tracing based on GPGPU-Sim and Mesa.
Generation of diagrams like flowcharts or sequence diagrams from text in a similar manner as markdown
Ongoing research on training Gaussian splatting at scale with a distributed system
🏘️ Scaling Embodied AI by Procedurally Generating Interactive 3D Houses
A flexible, high-performance 3D simulator for Embodied AI research.
Generating Daylight-driven Architectural Design via Diffusion Models
✨✨Latest Advances on Multimodal Large Language Models
Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
One stop solution for all Vulkan samples
Code release for https://kovenyu.com/WonderWorld/
OmniGibson: a platform for accelerating Embodied AI research built upon NVIDIA's Omniverse engine. Join our Discord for support: https://discord.gg/bccR5vGFEx
Productive, portable, and performant GPU programming in Python.
A generative world for general-purpose robotics & embodied AI learning.
openvla / openvla
Forked from TRI-ML/prismatic-vlms
OpenVLA: An open-source vision-language-action model for robotic manipulation.
Official repository for "iVideoGPT: Interactive VideoGPTs are Scalable World Models" (NeurIPS 2024), https://arxiv.org/abs/2405.15223
CUDA accelerated rasterization of gaussian splatting
The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models".
[CVPR 2024 Oral, Best Paper Runner-Up] Code for "pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction" by David Charatan, Sizhe Lester Li, Andrea Tagliasacch…
Production First and Production Ready End-to-End Speech Recognition Toolkit