Embodied agents powered by vision-language models (VLMs) and large language models (LLMs).
- Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
- VoxPoser: extracts affordances and constraints from large language models and vision-language models to compose 3D value maps, which motion planners use to synthesize trajectories zero-shot for everyday manipulation tasks (a minimal value-map sketch follows this list).
- combine with an end-to-end large-model training framework, such as UniAD.
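
A minimal sketch of the value-map idea, under stated assumptions: the grid size and the names `gaussian_map`, `compose_value_map`, and `greedy_waypoints` are illustrative simplifications, not VoxPoser's actual interfaces. It only shows how an affordance map (attraction toward a target voxel) and constraint maps (repulsion from obstacles) can be combined into one 3D value map that a simple planner climbs to produce waypoints.

```python
import numpy as np

GRID = (50, 50, 50)  # voxelized workspace (assumed resolution for illustration)

def gaussian_map(center_vox, shape=GRID, sigma=4.0):
    """Dense 3D map peaking at `center_vox` and falling off with distance."""
    zz, yy, xx = np.meshgrid(*[np.arange(s) for s in shape], indexing="ij")
    d2 = (zz - center_vox[0]) ** 2 + (yy - center_vox[1]) ** 2 + (xx - center_vox[2]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def compose_value_map(target_vox, obstacle_voxes, avoid_weight=2.0):
    """Affordance (attraction to the target) minus constraints (repulsion from obstacles)."""
    value = gaussian_map(target_vox)                           # e.g. "move toward the blue tray"
    for obs in obstacle_voxes:
        value -= avoid_weight * gaussian_map(obs, sigma=3.0)   # e.g. "stay away from the vase"
    return value

def greedy_waypoints(value_map, start_vox, n_steps=30):
    """Toy trajectory synthesis: hill-climb over the value map from the start voxel."""
    pos = np.array(start_vox)
    path = [tuple(int(c) for c in pos)]
    for _ in range(n_steps):
        best, best_val = None, -np.inf
        for delta in np.ndindex(3, 3, 3):                      # 26-connected neighborhood + self
            cand = pos + np.array(delta) - 1
            if np.any(cand < 0) or np.any(cand >= np.array(GRID)):
                continue
            v = value_map[tuple(cand)]
            if v > best_val:
                best, best_val = cand, v
        if np.array_equal(best, pos):                          # local maximum reached
            break
        pos = best
        path.append(tuple(int(c) for c in pos))
    return path

if __name__ == "__main__":
    vmap = compose_value_map(target_vox=(40, 40, 10), obstacle_voxes=[(25, 25, 10)])
    print(greedy_waypoints(vmap, start_vox=(5, 5, 10))[:5])
```

In the full system the target and obstacle entities would come from LLM-generated code grounded by a VLM, rather than being hard-coded voxel coordinates as above.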
Because the language-model output stays the same throughout the task, it can be cached, and the generated code re-evaluated against closed-loop visual feedback, which enables fast MPC-style replanning and makes VoxPoser robust to online disturbances (see the sketch below).
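
A minimal sketch of that caching-and-replanning loop, with hypothetical placeholders: `query_llm_once`, `detect_voxels`, and `plan_first_waypoint` stand in for the real LLM query, perception, and value-map planner, and are not VoxPoser's actual API. The point it illustrates is that the expensive LLM call happens once, while the cheap generated code is re-executed every control step against fresh detections.

```python
import numpy as np

def query_llm_once(instruction):
    """Expensive LLM call (illustrative): returns code mapping detections to a target/obstacle spec.
    Called a single time per task; the returned program is cached and re-run cheaply."""
    return "spec = {'target': detections['blue tray'], 'avoid': [detections['vase']]}"

def detect_voxels(step):
    """Stub perception: fresh object detections each control step (objects may move)."""
    return {"blue tray": (40, 40, 10 + step), "vase": (25, 25, 10)}

def plan_first_waypoint(spec, current):
    """Stand-in for value-map composition + trajectory synthesis: return only the first
    waypoint toward the target, receding-horizon style."""
    target = np.array(spec["target"])
    step = np.clip(target - np.array(current), -1, 1)          # move at most one voxel per axis
    return tuple(int(c) for c in np.array(current) + step)

instruction = "Sort the paper trash into the blue tray."
generated_code = query_llm_once(instruction)                   # cached: the LLM is not queried again
current = (5, 5, 10)

for t in range(10):                                            # closed-loop MPC iterations
    detections = detect_voxels(t)                              # fresh visual feedback
    scope = {"detections": detections}
    exec(generated_code, {}, scope)                            # cheap re-evaluation of cached code
    current = plan_first_waypoint(scope["spec"], current)
    # execute only the first waypoint, then re-observe and replan on the next iteration
    print(f"t={t}: move to voxel {current}")
```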
"Sort the paper trash into the blue tray."
"Close the top drawer."