Commit
readme
Chaoyun Zhang committed Feb 14, 2024
1 parent 91eeab3 commit 56777ee
Showing 1 changed file with 6 additions and 4 deletions.
10 changes: 6 additions & 4 deletions README.md
@@ -24,14 +24,15 @@
- <b>ActAgent 👾</b>, responsible for iteratively executing actions on the selected application until the task is successfully concluded within that application.
- <b>Control Interaction 🎮</b>, tasked with translating actions from AppAgent and ActAgent into interactions with the application and its UI controls. The targeted controls must be compatible with the Windows **UI Automation** API.

-Both agents leverage the multi-modal capabilities of GPT-Vision to comprehend the application UI and fulfill the user's request. For more details, please consult our [technical report](./assets/UFO_paper.pdf).
+Both agents leverage the multi-modal capabilities of GPT-Vision to comprehend the application UI and fulfill the user's request. For more details, please consult our [technical report](https://arxiv.org/abs/2402.07939).
<h1 align="center">
<img src="./assets/framework.png"/>
</h1>
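
The division of labor described above can be illustrated with a minimal Python sketch. It is not UFO's actual code: the names `Action`, `ControlInteraction`, `run_act_agent`, and `propose` are hypothetical, the "controls" are plain Python callables rather than Windows UI Automation elements, and `propose` stands in for whatever model call decides the next step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Action:
    """One step proposed by an agent (hypothetical structure)."""
    control: str        # label of the target control, e.g. "Delete"
    operation: str      # operation name, e.g. "click"
    argument: str = ""  # optional payload, e.g. text to type


@dataclass
class ControlInteraction:
    """Translates agent actions into calls on UI controls.

    Assumption: a real implementation would go through Windows UI
    Automation; here each operation maps to a plain Python callable.
    """
    handlers: Dict[str, Callable[[Action], None]]
    log: List[str] = field(default_factory=list)

    def execute(self, action: Action) -> bool:
        handler = self.handlers.get(action.operation)
        if handler is None:
            # Unsupported operation: record it and report failure.
            self.log.append(f"unsupported:{action.operation}")
            return False
        handler(action)
        self.log.append(f"{action.operation}:{action.control}")
        return True


def run_act_agent(
    propose: Callable[[int], Optional[Action]],
    interaction: ControlInteraction,
    max_steps: int = 10,
) -> int:
    """Execute proposed actions until propose() returns None or the cap is hit."""
    steps = 0
    while steps < max_steps:
        action = propose(steps)
        if action is None:  # the agent considers the task finished
            break
        interaction.execute(action)
        steps += 1
    return steps
```

In this sketch the step cap (`max_steps`) is an added safeguard against an agent that never reports completion; the README itself does not state how termination is bounded.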


## 🆕 News
- 📅 2024-02-10 UFO is released on GitHub🎈. Happy Chinese New Year🐉!
+- 📅 2024-02-14 Our [technical report](https://arxiv.org/abs/2402.07939) is online!


## 💥 Highlights
@@ -122,7 +123,7 @@ You may use them to debug, replay, or analyze the agent output.

## 🎬 Demo Examples

-We present two demo videos in which UFO completes user requests on Windows OS. For more case studies, please consult our [technical report](./assets/UFO_paper.pdf).
+We present two demo videos in which UFO completes user requests on Windows OS. For more case studies, please consult our [technical report](https://arxiv.org/abs/2402.07939).

#### 1️⃣🗑️ Example 1: Deleting all notes on a PowerPoint presentation.
In this example, we will demonstrate how to efficiently use UFO to delete all notes on a PowerPoint presentation with just a few simple steps. Explore this functionality to enhance your productivity and work smarter, not harder!
@@ -143,7 +144,7 @@ https://github.com/microsoft/UFO/assets/11352048/aa41ad47-fae7-4334-8e0b-ba71c4f

## 📊 Evaluation

-Please consult the [WindowsBench](./assets/UFO_paper.pdf) provided in Section A of the Appendix within our technical report. Here are some tips (and requirements) to aid in completing your request:
+Please consult the [WindowsBench](https://arxiv.org/pdf/2402.07939.pdf) provided in Section A of the Appendix within our technical report. Here are some tips (and requirements) to aid in completing your request:

- Before UFO executes your request, ensure that the targeted application is active (though it may be minimized).
- Occasionally, requests to GPT-V may trigger content safety measures. UFO will retry regardless, but adjusting the size or scale of the application window may help. We are actively working on this issue.
@@ -153,12 +154,13 @@ Please consult the [WindowsBench](./assets/UFO_paper.pdf) provided in Section A


## 📚 Citation
-Our technical report paper can be found [here](./assets/UFO_paper.pdf).
+Our technical report paper can be found [here](https://arxiv.org/abs/2402.07939).
If you use UFO in your research, please cite our paper:
```
@article{ufo,
title={UFO: A UI-Focused Agent for Windows OS Interaction},
author={Chaoyun Zhang and Liqun Li and Shilin He and Xu Zhang and Bo Qiao and Si Qin and Minghua Ma and Yu Kang and Qingwei Lin and Saravan Rajmohan and Dongmei Zhang and Qi Zhang},
+journal={arXiv preprint arXiv:2402.07939},
year={2024}
}
```