diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index c282e9a1..508be8a6 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -9,6 +9,9 @@ When you submit a pull request, a CLA-bot will automatically determine whether y to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repositories using our CLA. +## note +You should submit your pull request to the `pre-release` branch, not the `main` branch. + This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. \ No newline at end of file diff --git a/README.md b/README.md index 7aaeb4fb..b45863c7 100644 --- a/README.md +++ b/README.md @@ -49,7 +49,7 @@ Both agents leverage the multi-modal capabilities of GPT-Vision to comprehend th 3. **Extended Application Interaction:** UFO now goes beyond UI controls, allowing interaction with your application through keyboard inputs and native APIs! Presently, we support Word ([examples](/ufo/prompts/apps/word/api.yaml)), with more to come soon. Customize and build your own interactions. 4. **Control Filtering:** Streamline LLM's action process by using control filters to remove irrelevant control items. Enable them in [config_dev.yaml](/ufo/config/config_dev.yaml) under the `control filtering` section at the bottom. - 📅 2024-03-25: **New Release for v0.0.1!** Check out our exciting new features. - 1. We now support creating your help documents for each Windows application to become an app expert. Check the [README](https://microsoft.github.io/UFO/creating_app_agent/help_document_provision/) for more details! + 1. 
We now support creating your help documents for each Windows application to become an app expert. Check the [documentation](https://microsoft.github.io/UFO/creating_app_agent/help_document_provision/) for more details! 2. UFO now supports RAG from offline documents and online Bing search. 3. You can save the task completion trajectory into its memory for UFO's reference, improving its future success rate! 4. You can customize different GPT models for AppAgent and ActAgent. Text-only models (e.g., GPT-4) are now supported! @@ -141,7 +141,7 @@ UFO also supports other LLMs and advanced configurations, such as customize your If you want to enhance UFO's ability with external knowledge, you can optionally configure it with an external database for retrieval augmented generation (RAG) in the `ufo/config/config.yaml` file. We provide the following options for RAG to enhance UFO's capabilities: -- [Offline Help Document](https://microsoft.github.io/UFO/advanced_usage/reinforce_appagent/learning_from_help_document/)* Enable UFO to retrieve information from offline help documents. +- [Offline Help Document](https://microsoft.github.io/UFO/advanced_usage/reinforce_appagent/learning_from_help_document/) Enable UFO to retrieve information from offline help documents. - [Online Bing Search Engine](https://microsoft.github.io/UFO/advanced_usage/reinforce_appagent/learning_from_bing_search/): Enhance UFO's capabilities by utilizing the most up-to-date online search results. - [Self-Experience](https://microsoft.github.io/UFO/advanced_usage/reinforce_appagent/experience_learning/): Save task completion trajectories into UFO's memory for future reference. - [User-Demonstration](https://microsoft.github.io/UFO/advanced_usage/reinforce_appagent/learning_from_demonstration/): Boost UFO's capabilities through user demonstration. 
diff --git a/documents/docs/about/CONTRIBUTING.md b/documents/docs/about/CONTRIBUTING.md index c282e9a1..3ac034b3 100644 --- a/documents/docs/about/CONTRIBUTING.md +++ b/documents/docs/about/CONTRIBUTING.md @@ -9,6 +9,9 @@ When you submit a pull request, a CLA-bot will automatically determine whether y to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repositories using our CLA. +!!! note + You should submit your pull request to the `pre-release` branch, not the `main` branch. + This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. \ No newline at end of file diff --git a/documents/docs/index.md b/documents/docs/index.md index 2ca81f32..bac79968 100644 --- a/documents/docs/index.md +++ b/documents/docs/index.md @@ -24,7 +24,7 @@ - AppAgent 👾, responsible for iteratively executing actions on the selected applications until the task is successfully concluded within a specific application. -- Control Interaction 🎮, is tasked with translating actions from HostAgent and AppAgent into interactions with the application and its UI controls. It's essential that the targeted controls are compatible with the Windows **UI Automation** or **Win32** API. +- Application Automator 🎮, is tasked with translating actions from HostAgent and AppAgent into interactions with the application and through UI controls, native APIs or AI tools. Check out more details [here](./automator/overview.md). Both agents leverage the multi-modal capabilities of Visual Language Model (VLM) to comprehend the application UI and fulfill the user's request. 
For more details, please consult our [technical report](https://arxiv.org/abs/2402.07939).