Starred repositories

LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects

12 Updated Jan 7, 2025

VisionTasker introduces a novel two-stage framework combining vision-based UI understanding and LLM task planning for mobile task automation in a step-by-step manner.

Python 50 7 Updated Oct 17, 2024

🎬 ScreenToGif allows you to record a selected area of your screen, edit and save it as a gif or video.

C# 24,225 2,202 Updated Dec 15, 2024

beamer template collection

TeX 259 81 Updated Sep 1, 2023

Building a comprehensive and handy list of papers for GUI agents

Python 156 8 Updated Jan 6, 2025

Official style files for papers submitted to venues of the Association for Computational Linguistics

TeX 824 197 Updated Jan 7, 2025

GitHub page for "Large Language Model-Brained GUI Agents: A Survey"

CSS 90 6 Updated Jan 7, 2025

Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.

Jupyter Notebook 804 42 Updated Jan 8, 2025

An annotated implementation of the Transformer paper.

Jupyter Notebook 5,880 1,253 Updated Apr 7, 2024

Boost LaTeX typesetting efficiency with preview, compile, autocomplete, colorize, and more.

TypeScript 10,873 536 Updated Jan 6, 2025

⏰ Collaboratively track deadlines of conferences recommended by CCF (website, Python CLI, WeChat applet) / If you find it useful, please star this project, thanks~

Vue 6,693 458 Updated Jan 8, 2025

GPT-4V in Wonderland: LMMs as Smartphone Agents

Python 130 2 Updated Jul 17, 2024
Jupyter Notebook 10 1 Updated Aug 8, 2024

Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.

Python 549 52 Updated Nov 20, 2024

UGround: Universal GUI Visual Grounding for GUI Agents

Python 128 6 Updated Jan 7, 2025

A hands-on, step-by-step Huggingface Transformers course; the course videos are updated in sync on Bilibili and YouTube

Jupyter Notebook 2,326 327 Updated Jul 15, 2024
Python 79 4 Updated Feb 5, 2024

The code of the paper "NExT-Chat: An LMM for Chat, Detection and Segmentation".

Python 230 9 Updated Feb 5, 2024

A list of awesome papers and resources of recommender system on large language model (LLM).

1,522 128 Updated Aug 15, 2024

📖A curated list of Awesome LLM/VLM Inference Papers with codes, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉

3,144 211 Updated Jan 8, 2025

An awesome & curated list of best LLMOps tools for developers

Shell 4,219 413 Updated Dec 24, 2024

[Embodied-AI-Survey-2024] Paper list and projects for Embodied AI

993 72 Updated Dec 31, 2024

🚀🚀 "Large model": train a small 26M-parameter GPT entirely from scratch in just 3 hours! 🌏

Python 4,106 492 Updated Dec 13, 2024

[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).

Python 681 90 Updated Nov 13, 2024

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Python 7,139 720 Updated Aug 12, 2024

BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

Python 7,059 788 Updated Aug 24, 2023

Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)

Python 109 11 Updated Oct 30, 2024

GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes from 6 mobile devices, spanning 6 types of cross-app tasks, 20…

Python 79 4 Updated Nov 12, 2024

WONDERBREAD benchmark + dataset for BPM tasks

Jupyter Notebook 22 5 Updated Oct 20, 2024

VisionDroid

Python 7 2 Updated Apr 2, 2024