UFO currently supports processing help documents in XML format, as this is the default format for official help documents of Microsoft apps. More formats will be supported in the future.
You can write a dedicated document for a specific task of an app in a file named, for example, task.xml
. Note that it should be accompanied by a metadata file with the same prefix, but with the .meta
extension, i.e., task.xml.meta
. This metadata file should have a title
describing the task at a high level and a Content-Summary
field summarizing the content of the help document. These two files are used for similarity search with user requests, so please write them carefully. The ppt-copilot.xml and ppt-copilot.xml.meta are examples of a help document and its metadata.
Once you have all help documents and metadata ready, put all of them into a folder. There can be sub-folders for the help documents, but please ensure that each help document and its corresponding metadata are placed in the same directory.
Once you have all documents ready in a folder named path_of_the_docs
, you can easily create an offline indexer to support RAG for UFO. Follow these steps:
# assume you are in the cloned UFO folder
python -m learner --app <app_name> --docs <path_of_the_docs>
Replace app_name
with the name of the application, such as PowerPoint or WeChat.
Note: Ensure the
app_name
is accurately defined as it is used to match the offline indexer in online RAG.
Replace path_of_the_docs
with the full path to the folder containing all your documents.
This command will create an offline indexer for all documents in the path_of_the_docs
folder using Faiss and embedding with sentence transformer (more embeddings will be supported soon). The created index by default will be placed here.