- Harbin Institute of Technology
- Weihai, China
- asdfo123.github.io
- in/xinye-li-5503a3283
Stars
Official PyTorch implementation of the paper "Dataset Distillation with Neural Characteristic Function" (NCFM, Rating: 555) in CVPR 2025.
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
GUI Grounding for Professional High-Resolution Computer Use
A simple screen parsing tool towards pure vision based GUI agent
Code for the paper 🌳 Tree Search for Language Model Agents
Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models"
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Since the emergence of ChatGPT in 2022, accelerating Large Language Models has become increasingly important. Here is a list of papers on accelerating LLMs, currently focusing mainly on infer…
Janus-Series: Unified Multimodal Understanding and Generation Models
[ACL 2024] Learning to Edit: Aligning LLMs with Knowledge Editing
A framework for few-shot evaluation of language models.
Course materials for MIT6.5940: TinyML and Efficient Deep Learning Computing
List of papers on hallucination detection in LLMs.
GitHub page for "Large Language Model-Brained GUI Agents: A Survey"
A generative world for general-purpose robotics & embodied AI learning.
Evaluating the Ripple Effects of Knowledge Editing in Language Models
WikiFactDiff is a factual atomic knowledge update dataset for LLMs. It describes the evolution of factual knowledge between two dates.
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…