Commit 52a5484

GenAI Flow
1 parent d1e141f commit 52a5484

9 files changed, +425 -0 lines changed

_config.yml

Lines changed: 3 additions & 0 deletions
@@ -85,6 +85,9 @@ navigation:
 libraries/radwordsprocessing/editing/find-and-replace:
 title: Find and Replace
 position: 6
+libraries/radwordsprocessing/features/gen-ai-powered-document-insights:
+title: GenAI-powered Document Insights
+position: 7
 libraries/radwordsprocessing/concepts:
 title: Concepts
 position: 6

libraries/radpdfprocessing/overview.md

Lines changed: 3 additions & 0 deletions
@@ -32,6 +32,8 @@ The API of RadPdfProcessing contains two different editors, [RadFixedDocumentEd
 * Digital signatures
 * Signing a document with digital signature.
 * Validate digital signature of already signed document.
+* GenAI-powered Document Insights
+* Accessibility Support

 The document model of the library provides support for:

@@ -59,6 +61,7 @@ The document model of the library provides support for:
 |[**JavaScript Actions and Trigger Events**]({%slug radpdfprocessing-model-javascript-actions%})|As of Q4 2024 you can import or export the javascript actions associated with pages, form fields, etc. so that they can be executed when the exported document is opened with Adobe Acrobat. |
 |[**Accessibility Support**]({%slug create-accessible-pdf-documents%})|Offers accessibility support of documents to users with disabilities.|
 | [**Viewer Preferences**]({%slug radpdfprocessing-features-viewer-preferences%}) | Control how PDF documents are displayed and behave in PDF viewers, including window behavior, UI visibility, and print settings. |
+|**GenAI-powered Document Insights**|Enables you to easily extract insights from PDF documents using Large Language Models (LLMs). This functionality enables you to summarize document content and ask questions about it, with the AI providing relevant answers based on the document's content. [Read More]({%slug radpdfprocessing-features-gen-ai-powered-document-insights-overview%})|

 # See Also

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+---
+title: CompleteContextQuestionProcessor
+description: CompleteContextQuestionProcessor class enables you to ask questions about a Word document and receive answers based on the entire document content.
+page_title: CompleteContextQuestionProcessor
+slug: radwordsprocessing-features-gen-ai-powered-document-insights-complete-context-question-processor
+tags: ai, document, analysis, question, processor, complete, context
+published: True
+position: 5
+---
+<style>
+table, th, td {
+border: 1px solid;
+}
+table th:first-of-type {
+width: 30%;
+}
+table th:nth-of-type(2) {
+width: 70%;
+}
+</style>
+
+# CompleteContextQuestionProcessor
+
+The **CompleteContextQuestionProcessor** class enables you to ask questions about a Word document and receive answers based on the entire document content. This processor sends the complete document text to the AI model, which is suitable for smaller documents or when you need to ensure that the AI model has access to all the information in the document. This class inherits from the abstract **AIProcessorBase** class, which provides common functionality for all AI processors.
+
+The **CompleteContextQuestionProcessor** is ideal for the following scenarios:
+
+1. **Small Documents**: When the document is small enough to fit within the token limit of the AI model.
+2. **Holistic Understanding**: When the question requires understanding the entire document context.
+3. **Simplicity**: When you don't need the advanced embedding functionality of [PartialContextQuestionProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-partial-context-question-processor%}).
+
+However, if you're working with larger documents or want to optimize token usage, you should use the [PartialContextQuestionProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-partial-context-question-processor%}#when-to-use-partialcontextquestionprocessor) instead.
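
As a rough illustration of the choice described above, the sketch below picks a processor from an estimated token count. `ProcessorSelector`, `ProcessorChoice`, `reservedTokens`, and the four-characters-per-token estimate are assumptions made for this example and are not part of the Telerik API; a real tokenizer for the target model would give a more reliable count.

```csharp
using System;

// Hypothetical names for illustration only; not part of the Telerik API.
public enum ProcessorChoice
{
    CompleteContext,
    PartialContext
}

public static class ProcessorSelector
{
    // Assumption: roughly four characters per token. This is a coarse
    // heuristic and varies by model, tokenizer, and language.
    private const int AssumedCharsPerToken = 4;

    // Estimates the token count of a text by rounding up length / 4.
    public static int EstimateTokens(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        return (text.Length + AssumedCharsPerToken - 1) / AssumedCharsPerToken;
    }

    // Suggests the complete-context processor only when the whole document,
    // plus a reserve for the question and prompt, fits the model input limit.
    public static ProcessorChoice Choose(string documentText, int modelMaxInputTokenLimit, int reservedTokens)
    {
        long estimated = (long)EstimateTokens(documentText) + reservedTokens;
        return estimated <= modelMaxInputTokenLimit
            ? ProcessorChoice.CompleteContext
            : ProcessorChoice.PartialContext;
    }
}
```

For example, under these assumptions a 400-character document with a 200-token limit and a 50-token reserve would map to `CompleteContext`, while a 4,000-character document with the same limit would map to `PartialContext`.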
+
+## Public API
+
+|Property|Description|
+|---|---|
+|**Settings**|Gets the settings for the AI question-answering process. Returns [CompleteContextProcessorSettings]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-complete-context-question-processor%}#completecontextprocessorsettings).|
+
+|Method|Description|
+|---|---|
+|**public Task<string> AnswerQuestion(ISimpleTextDocument document, string question)**|Answers a question using the provided document. Parameters: **document** - The document containing the text to process, **question** - The question to answer. Returns a task that represents the asynchronous operation. The task result contains the answer to the question.|
+
+>caution **Security Warning:** The output produced by this API is generated by a Large Language Model (LLM). As such, the content should be considered untrusted and may include unexpected or unsafe data. It is strongly recommended to properly sanitize or encode all output before displaying it in a user interface, logging, or using it in any security-sensitive context.
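
One way to act on the warning above is to encode the answer before it reaches HTML or a log line. The following is a minimal sketch that uses only `System.Net.WebUtility` from the .NET base class library; `LlmOutputSanitizer` and its methods are hypothetical helper names, not part of the Telerik API, and the appropriate encoding always depends on where the text ends up.

```csharp
using System.Net;

// Hypothetical helper for illustration only; not part of the Telerik API.
public static class LlmOutputSanitizer
{
    // HTML-encodes the model output so that any markup it contains is
    // rendered as literal text instead of being interpreted by a browser.
    public static string ToSafeHtml(string answer)
    {
        return WebUtility.HtmlEncode(answer ?? string.Empty);
    }

    // Replaces carriage returns and line feeds with visible escape sequences
    // so that the output cannot start a forged entry in a line-based log.
    public static string ToSingleLogLine(string answer)
    {
        return (answer ?? string.Empty)
            .Replace("\r", "\\r")
            .Replace("\n", "\\n");
    }
}
```

The string returned by **AnswerQuestion** would be passed through a helper like this at the point of display or logging, rather than stored in encoded form.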
+
+## CompleteContextProcessorSettings
+
+The **CompleteContextProcessorSettings** class provides configuration options for the question-answering process.
+
+### Settings Properties
+
+* **ModelMaxInputTokenLimit**: Gets or sets the maximum input token limit the model allows.
+* **TokenizationEncoding**: Gets or sets the tokenization encoding.
+* **ModelId**: Gets or sets the ID of the model.
+
+## Usage Example
+
+The following example demonstrates how to use the **CompleteContextQuestionProcessor** to ask questions about a Word document, including working with specific document pages. To set up the AI client used in this example, see the [AI Provider Setup]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-prerequisites%}#ai-provider-setup) section:
+
+#### __[C#] Example 1: Using CompleteContextQuestionProcessor__
+
+<snippet id='libraries-flow-features-gen-ai-ask-questions-using-complete-context'/>
+
+## See Also
+
+* [GenAI-powered Document Insights Overview]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-overview%})
+* [Prerequisites]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-prerequisites%})
+* [SummarizationProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-summarization-processor%})
+* [PartialContextQuestionProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-partial-context-question-processor%})

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+---
+title: Getting Started
+description: Learn how to use the GenAI-powered Document Insights functionality to summarize a Word document with WordsProcessing.
+page_title: Overview
+slug: radwordsprocessing-features-gen-ai-powered-document-insights-getting-started
+tags: ai, document, analysis, overview, word, flow, processing, genai, powered, insights
+published: True
+position: 2
+---
+
+# Getting Started
+
+The following example demonstrates how to use the GenAI-powered Document Insights functionality to summarize a Word document and ask questions about it:
+
+>note The following code snippet is valid for Azure OpenAI 9.3. The specific **IChatClient** initialization may differ depending on the version you use.
+
+>important For .NET {{site.mindotnetversion}}+ (Target OS Windows) with [Packages for .NET Framework and .NET {{site.mindotnetversion}} and .NET {{site.maxdotnetversion}} for Windows]({%slug available-nuget-packages%}#packages-for-net-framework-and-net-{{site.mindotnetversion}}-and-net-{{site.maxdotnetversion}}-for-windows), an [IEmbeddingsStorage]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-partial-context-question-processor%}#implementing-custom-iembeddingsstorage) implementation is required for the [PartialContextQuestionProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-partial-context-question-processor%}).
+
+#### __[C#] Example 1: Using GenAI-powered Document Insights__
+
+<snippet id='libraries-flow-features-gen-ai-getting-started'/>
+
+When you run this code, the AI will process your document, generate a summary, and answer your questions.
+
+<!-- >note A sample runnable project is available in the Document Processing SDK: [AIConnectorDemo](https://github.com/telerik/document-processing-sdk/tree/master/WordsProcessing/AIConnectorDemo). -->
+
+## See Also
+
+* [Prerequisites]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-prerequisites%})
+* [SummarizationProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-summarization-processor%})
+* [PartialContextQuestionProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-partial-context-question-processor%})
+* [Custom IEmbeddingsStorage Implementation]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-partial-context-question-processor%}#implementing-custom-iembeddingsstorage)
+* [CompleteContextQuestionProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-complete-context-question-processor%})

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+---
+title: Overview
+description: Learn more about the GenAI-powered Document Insights feature of the WordsProcessing library.
+page_title: Overview
+slug: radwordsprocessing-features-gen-ai-powered-document-insights-overview
+tags: ai, document, analysis, overview, word, processing, genai, powered, insights
+published: True
+position: 0
+---
+
+# GenAI-powered Document Insights Overview
+
+The GenAI-powered Document Insights feature enables you to easily extract insights from Word documents using Large Language Models (LLMs). This functionality allows you to summarize document content and ask questions about the document, with the AI providing relevant answers based on the document's content.
+
+## Key Features
+
+* **Extract Document Insights**: Quickly understand the key points of lengthy documents.
+* **Efficient Information Retrieval**: Ask specific questions about your documents and receive accurate answers.
+* **Token Optimization**: Reduce token usage by only sending relevant portions of the document to the AI model as shown in the [PartialContextQuestionProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-partial-context-question-processor%}#when-to-use-partialcontextquestionprocessor) section.
+* **Multiple LLM Support**: Compatible with different AI providers including Azure OpenAI, OpenAI, and Ollama as described in the [Prerequisites]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-prerequisites%}#ai-provider-setup).
+
+The GenAI-powered Document Insights feature includes three main components:
+
+|Processor|Description|
+|----|----|
+|**[SummarizationProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-summarization-processor%})**|Generates concise summaries of Word documents.|
+|**[CompleteContextQuestionProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-complete-context-question-processor%})**|Answers questions by providing the entire document content to the AI model.|
+|**[PartialContextQuestionProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-partial-context-question-processor%})**|Answers questions by providing only the relevant portions of the document to the AI model.|
+
+## See Also
+
+* [Prerequisites]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-prerequisites%})
+* [Getting Started]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-getting-started%})
+* [SummarizationProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-summarization-processor%})
+* [PartialContextQuestionProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-partial-context-question-processor%})
+* [CompleteContextQuestionProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-complete-context-question-processor%})

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
+---
+title: PartialContextQuestionProcessor
+description: PartialContextQuestionProcessor class enables you to ask questions about a Word document and receive answers based on the most relevant parts of the document content.
+page_title: PartialContextQuestionProcessor
+slug: radwordsprocessing-features-gen-ai-powered-document-insights-partial-context-question-processor
+tags: ai, document, analysis, question, processor, partial, context, embeddings
+published: True
+position: 4
+---
+<style>
+table, th, td {
+border: 1px solid;
+}
+table th:first-of-type {
+width: 65%;
+}
+table th:nth-of-type(2) {
+width: 10%;
+}
+table th:nth-of-type(3) {
+width: 25%;
+}
+</style>
+
+# PartialContextQuestionProcessor
+
+The **PartialContextQuestionProcessor** class enables you to ask questions about a Word document and receive answers based on the most relevant parts of the document content. This processor uses embeddings to identify and send only the relevant portions of the document to the AI model, making it more efficient for token usage and more suitable for large documents. This class inherits from the abstract **AIProcessorBase** class, which provides common functionality for all AI processors.
+
+The **PartialContextQuestionProcessor** is ideal for the following scenarios:
+
+1. **Large Documents**: When the document exceeds the token limit of the AI model and cannot be processed in a single call.
+2. **Efficient Token Usage**: When you want to minimize token consumption and optimize costs.
+3. **Specific Questions**: When questions are targeted at specific information within the document rather than requiring complete document understanding.
+
+## Public API and Configuration
+
+|Constructor|Platform|Description|
+|---|---|---|
+|**PartialContextQuestionProcessor(IChatClient chatClient, int modelMaxInputTokenLimit, ISimpleTextDocument document)**|_Specific*_ |Creates an instance with built-in embeddings storage|
+|**PartialContextQuestionProcessor(IChatClient chatClient, IEmbeddingsStorage embeddingsStorage, int modelMaxInputTokenLimit, ISimpleTextDocument document)**|Any|Creates an instance with custom embeddings storage|
+
+> _*Specific_: Available on .NET {{site.mindotnetversion}}+ (Target OS Windows) with the [Packages for .NET Framework and .NET {{site.mindotnetversion}} and .NET {{site.maxdotnetversion}} for Windows]({%slug available-nuget-packages%}#packages-for-net-framework-and-net-{{site.mindotnetversion}}-and-net-{{site.maxdotnetversion}}-for-windows). This constructor uses **DefaultEmbeddingsStorage** internally, while the cross-platform constructor requires a custom implementation of **IEmbeddingsStorage**, as shown in the [Custom IEmbeddingsStorage Setup]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-partial-context-question-processor%}#implementing-custom-iembeddingsstorage) section.
+
+### Properties and Methods
+
+|Member|Type|Description|
+|---|---|---|
+|**Settings**|Property|Gets the **PartialContextProcessorSettings** for configuring the AI process|
+|**AnswerQuestion(string question)**|Method|Returns an answer to the question using relevant document context|
+
+>caution **Security Warning:** The output produced by this API is generated by a Large Language Model (LLM). As such, the content should be considered untrusted and may include unexpected or unsafe data. It is strongly recommended to properly sanitize or encode all output before displaying it in a user interface, logging, or using it in any security-sensitive context.
+
+### PartialContextProcessorSettings
+
+The settings class provides configuration options for the question-answering process:
+
+* **ModelMaxInputTokenLimit**: Maximum input token limit the model allows
+* **TokenizationEncoding**: Tokenization encoding used
+* **ModelId**: ID of the AI model
+* **MaxNumberOfEmbeddingsSent**: Maximum number of context chunks sent (default: 30)
+* **EmbeddingTokenSize**: Size in tokens of each context chunk (default: 300)
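
To see how the last two settings relate to the model limit, the sketch below multiplies them out. It assumes that the retrieved context is at most **MaxNumberOfEmbeddingsSent** chunks of **EmbeddingTokenSize** tokens each, which the documentation above does not state explicitly; `EmbeddingBudget` and `reservedTokens` are hypothetical names introduced for this illustration and are not part of the Telerik API.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the Telerik API.
public static class EmbeddingBudget
{
    // Upper bound on the tokens contributed by retrieved context chunks,
    // assuming each chunk holds at most embeddingTokenSize tokens.
    public static int MaxContextTokens(int maxNumberOfEmbeddingsSent, int embeddingTokenSize)
    {
        if (maxNumberOfEmbeddingsSent < 0 || embeddingTokenSize < 0)
        {
            throw new ArgumentOutOfRangeException("Settings values must be non-negative.");
        }

        // checked() raises OverflowException instead of silently wrapping.
        return checked(maxNumberOfEmbeddingsSent * embeddingTokenSize);
    }

    // True when the worst-case context plus a reserve for the question and
    // prompt still fits within the model input token limit.
    public static bool FitsWithin(
        int modelMaxInputTokenLimit,
        int maxNumberOfEmbeddingsSent,
        int embeddingTokenSize,
        int reservedTokens)
    {
        long total = (long)MaxContextTokens(maxNumberOfEmbeddingsSent, embeddingTokenSize) + reservedTokens;
        return total <= modelMaxInputTokenLimit;
    }
}
```

With the documented defaults, 30 chunks of 300 tokens give a worst case of 9,000 context tokens. Under this assumption, a model limited to 8,192 input tokens would call for a lower **MaxNumberOfEmbeddingsSent** or **EmbeddingTokenSize**.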
+
+## Usage Examples
+
+#### Example 1: Using PartialContextQuestionProcessor with Default Embeddings Storage
+
+This example demonstrates how to use the **PartialContextQuestionProcessor** with the built-in embeddings storage on .NET {{site.mindotnetversion}}+ (Target OS Windows) with the [Packages for .NET Framework and .NET {{site.mindotnetversion}} and .NET {{site.maxdotnetversion}} for Windows]({%slug available-nuget-packages%}#packages-for-net-framework-and-net-{{site.mindotnetversion}}-and-net-{{site.maxdotnetversion}}-for-windows). For setting up the AI client, see the [AI Provider Setup]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-prerequisites%}#ai-provider-setup) section:
+
+<snippet id='libraries-flow-features-gen-ai-ask-questions-using-partial-context'/>
+
+#### Example 2: Using PartialContextQuestionProcessor with Custom Embeddings (.NET Standard/.NET Framework)
+
+This example demonstrates how to use the **PartialContextQuestionProcessor** with a custom embeddings storage implementation as described in the [Custom IEmbeddingsStorage Setup]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-partial-context-question-processor%}#implementing-custom-iembeddingsstorage) section:
+
+<snippet id='libraries-flow-features-gen-ai-ask-questions-using-partial-context-iembeddingsstorage'/>
+
+### Implementing custom IEmbeddingsStorage
+
+The code snippet below shows a sample custom implementation, **OllamaEmbeddingsStorage**:
+
+>note This example requires the following NuGet packages:
+> * **LangChain**
+> * **LangChain.Databases.Sqlite**
+> * **Microsoft.Extensions.AI.Ollama**
+> * **Telerik.Windows.Documents.AIConnector**
+> * **Telerik.Windows.Documents.Fixed**
+
+1. Install Ollama from [ollama.com](https://ollama.com/).
+2. Pull the model you want to use.
+3. Start the Ollama server.
+
+<snippet id='libraries-pdf-features-gen-ai-ask-questions-using-partial-context-ollama-embeddings-storage'/>
+
+#### Example 3: Processing Specific Pages
+
+<snippet id='libraries-flow-features-gen-ai-summarize-process-specific-pages'/>
+
+#### Example 4: Optimizing Embeddings Settings
+
+<snippet id='libraries-flow-features-gen-ai-summarize-optimize-embeddings-storage'/>
+
+## See Also
+
+* [GenAI-powered Document Insights Overview]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-overview%})
+* [Prerequisites]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-prerequisites%})
+* [SummarizationProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-summarization-processor%})
+* [CompleteContextQuestionProcessor]({%slug radwordsprocessing-features-gen-ai-powered-document-insights-complete-context-question-processor%})
