diff --git a/content/develop/ai/langcache/api-examples.md b/content/develop/ai/langcache/api-examples.md new file mode 100644 index 000000000..7c94eeba2 --- /dev/null +++ b/content/develop/ai/langcache/api-examples.md @@ -0,0 +1,122 @@ +--- +alwaysopen: false +categories: +- docs +- develop +- ai +description: Learn to use the Redis LangCache API for semantic caching. +hideListLinks: true +linktitle: API and SDK examples +title: Use the LangCache API and SDK +weight: 10 +--- + +Use the LangCache API from your client app to store and retrieve LLM, RAG, or agent responses. + +To access the LangCache API, you need: + +- LangCache API base URL +- LangCache service API key +- Cache ID + +When you call the API, you need to pass the LangCache API key in the `Authorization` header as a Bearer token and the Cache ID as the `cacheId` path parameter. + +For example, to search the cache using `cURL`: + +```bash +curl -s -X POST "https://$HOST/v1/caches/$CACHE_ID/entries/search" \ + -H "accept: application/json" \ + -H "Authorization: Bearer $API_KEY" \ + -d '{ "prompt": "What is semantic caching" }' +``` + +- The example expects several variables to be set in the shell: + + - **$HOST** - the LangCache API base URL + - **$CACHE_ID** - the Cache ID of your cache + - **$API_KEY** - the LangCache API key + +{{% info %}} +This example uses `cURL` and Linux shell scripts to demonstrate the API; you can use any standard REST client or library. +{{% /info %}} + +You can also use the [LangCache SDKs](#langcache-sdk) for JavaScript and Python to access the API. + +## API examples + +### Search LangCache for similar responses + +Use `POST /v1/caches/{cacheId}/entries/search` to search the cache for matching responses to a user prompt. + +```sh +POST https://[host]/v1/caches/{cacheId}/entries/search +{ + "prompt": "User prompt text" +} +``` + +Place this call in your client app right before you call your LLM's REST API. 
If LangCache returns a response, you can send that response back to the user instead of calling the LLM. + +If LangCache does not return a response, you should call your LLM's REST API to generate a new response. After you get a response from the LLM, you can [store it in LangCache](#store-a-new-response-in-langcache) for future use. + +You can also scope the responses returned from LangCache by adding an `attributes` object to the request. LangCache will only return responses that match the attributes you specify. + +```sh +POST https://[host]/v1/caches/{cacheId}/entries/search +{ + "prompt": "User prompt text", + "attributes": { + "customAttributeName": "customAttributeValue" + } +} +``` + +### Store a new response in LangCache + +Use `POST /v1/caches/{cacheId}/entries` to store a new response in the cache. + +```sh +POST https://[host]/v1/caches/{cacheId}/entries +{ + "prompt": "User prompt text", + "response": "LLM response text" +} +``` + +Place this call in your client app after you get a response from the LLM. This will store the response in the cache for future use. + +You can also store the responses with custom attributes by adding an `attributes` object to the request. + +```sh +POST https://[host]/v1/caches/{cacheId}/entries +{ + "prompt": "User prompt text", + "response": "LLM response text", + "attributes": { + "customAttributeName": "customAttributeValue" + } +} +``` + +### Delete cached responses + +Use `DELETE /v1/caches/{cacheId}/entries/{entryId}` to delete a cached response from the cache. + +You can also use `DELETE /v1/caches/{cacheId}/entries` to delete multiple cached responses at once. If you provide an `attributes` object, LangCache will delete all responses that match the attributes you specify. 
+ +```sh +DELETE https://[host]/v1/caches/{cacheId}/entries +{ + "attributes": { + "customAttributeName": "customAttributeValue" + } +} +``` +## LangCache SDK + +If your app is written in JavaScript or Python, you can also use the LangCache Software Development Kits (SDKs) to access the API. + +To learn how to use the LangCache SDKs: + +- [LangCache SDK for JavaScript](https://www.npmjs.com/package/@redis-ai/langcache) +- [LangCache SDK for Python](https://pypi.org/project/langcache/) diff --git a/content/develop/ai/langcache/api-reference.md b/content/develop/ai/langcache/api-reference.md index 114075155..f5f708924 100644 --- a/content/develop/ai/langcache/api-reference.md +++ b/content/develop/ai/langcache/api-reference.md @@ -1,129 +1,9 @@ --- -alwaysopen: false -categories: -- docs -- develop -- ai -description: Learn to use the Redis LangCache API for semantic caching. -hideListLinks: true -linktitle: API and SDK reference -title: LangCache API and SDK reference -weight: 10 ---- - -You can use the LangCache API from your client app to store and retrieve LLM, RAG, or agent responses. - -To access the LangCache API, you need: - -- LangCache API base URL -- LangCache service API key -- Cache ID - -When you call the API, you need to pass the LangCache API key in the `Authorization` header as a Bearer token and the Cache ID as the `cacheId` path parameter. - -For example, to check the health of the cache using `cURL`: - -```bash -curl -s -X GET "https://$HOST/v1/caches/$CACHE_ID/health" \ - -H "accept: application/json" \ - -H "Authorization: Bearer $API_KEY" -``` - -- The example expects several variables to be set in the shell: - - - **$HOST** - the LangCache API base URL - - **$CACHE_ID** - the Cache ID of your cache - - **$API_KEY** - The LangCache API token - -{{% info %}} -This example uses `cURL` and Linux shell scripts to demonstrate the API; you can use any standard REST client or library. 
-{{% /info %}} - -You can also use the [LangCache SDKs](#langcache-sdk) for Javascript and Python to access the API. - -## API examples - -### Check cache health - -Use `GET /v1/caches/{cacheId}/health` to check the health of the cache. - -```sh -GET https://[host]/v1/caches/{cacheId}/health -``` - -### Search LangCache for similar responses - -Use `POST /v1/caches/{cacheId}/entries/search` to search the cache for matching responses to a user prompt. - -```sh -POST https://[host]/v1/caches/{cacheId}/entries/search -{ - "prompt": "User prompt text" -} -``` - -Place this call in your client app right before you call your LLM's REST API. If LangCache returns a response, you can send that response back to the user instead of calling the LLM. - -If LangCache does not return a response, you should call your LLM's REST API to generate a new response. After you get a response from the LLM, you can [store it in LangCache](#store-a-new-response-in-langcache) for future use. - -You can also scope the responses returned from LangCache by adding an `attributes` object to the request. LangCache will only return responses that match the attributes you specify. - -```sh -POST https://[host]/v1/caches/{cacheId}/entries/search -{ - "prompt": "User prompt text", - "attributes": { - "customAttributeName": "customAttributeValue" - } -} -``` - -### Store a new response in LangCache - -Use `POST /v1/caches/{cacheId}/entries` to store a new response in the cache. - -```sh -POST https://[host]/v1/caches/{cacheId}/entries -{ - "prompt": "User prompt text", - "response": "LLM response text" -} -``` - -Place this call in your client app after you get a response from the LLM. This will store the response in the cache for future use. - -You can also store the responses with custom attributes by adding an `attributes` object to the request. 
- -```sh -POST https://[host]/v1/caches/{cacheId}/entries -{ - "prompt": "User prompt text", - "response": "LLM response text", - "attributes": { - "customAttributeName": "customAttributeValue" - } -} -``` - -### Delete cached responses - -Use `DELETE /v1/caches/{cacheId}/entries/{entryId}` to delete a cached response from the cache. - -You can also use `DELETE /v1/caches/{cacheId}/entries` to delete multiple cached responses at once. If you provide an `attributes` object, LangCache will delete all responses that match the attributes you specify. - -```sh -DELETE https://[host]/v1/caches/{cacheId}/entries -{ - "attributes": { - "customAttributeName": "customAttributeValue" - } -} -``` -## LangCache SDK - -If your app is written in Javascript or Python, you can also use the LangCache Software Development Kits (SDKs) to access the API. - -To learn how to use the LangCache SDKs: - -- [LangCache SDK for Javascript](https://www.npmjs.com/package/@redis-ai/langcache) -- [LangCache SDK for Python](https://pypi.org/project/langcache/) +Title: LangCache REST API +linkTitle: API reference +layout: apireference +type: page +params: + sourcefile: ./api.yaml + sortOperationsAlphabetically: false +--- \ No newline at end of file diff --git a/content/develop/ai/langcache/api-reference/api.yaml b/content/develop/ai/langcache/api-reference/api.yaml new file mode 100644 index 000000000..b8bbd2b55 --- /dev/null +++ b/content/develop/ai/langcache/api-reference/api.yaml @@ -0,0 +1,513 @@ +openapi: 3.0.1 +info: + title: Redis LangCache Service + description: API for managing a [Redis LangCache](https://redis.io/docs/latest/develop/ai/langcache/) service. + contact: + name: Redis + email: support@redis.com + version: '1.0' +tags: + - name: Cache Entries + description: Operations for creating, searching, and deleting cache entries. 
+servers: + - url: http://localhost:8080 + description: Generated server URL +paths: + /v1/caches/{cacheId}/entries/search: + post: + tags: + - Cache Entries + summary: Search the cache + description: Searches the cache for entries that match the prompt and attributes. If no entries are found, this endpoint returns an empty array. + operationId: search + parameters: + - $ref: '#/components/parameters/cacheId' + requestBody: + content: + application/json: + schema: + $ref: '#/components/schemas/SearchEntriesRequest' + required: true + responses: + '200': + description: Cache search completed successfully + content: + 'application/json': + schema: + $ref: '#/components/schemas/SearchEntriesResponse' + '400': + description: Bad Request + content: + 'application/json': + schema: + $ref: '#/components/schemas/BadRequestErrorResponseContent' + '401': + description: Unauthorized + content: + 'application/json': + schema: + $ref: '#/components/schemas/AuthenticationErrorResponseContent' + '403': + description: Forbidden + content: + 'application/json': + schema: + $ref: '#/components/schemas/ForbiddenErrorResponseContent' + '500': + description: An unexpected error occurred + content: + 'application/json': + schema: + $ref: '#/components/schemas/InternalServerErrorResponseContent' + '503': + description: An internal error occurred + content: + 'application/json': + schema: + $ref: '#/components/schemas/ServiceUnavailableErrorResponseContent' + /v1/caches/{cacheId}/entries: + post: + tags: + - Cache Entries + summary: Add a new entry to the cache + description: Adds an entry to the cache with a prompt and response. 
+ operationId: set + parameters: + - $ref: '#/components/parameters/cacheId' + requestBody: + content: + application/json: + schema: + $ref: '#/components/schemas/SetEntryRequest' + required: true + responses: + '201': + description: Cache entry added to the cache successfully + content: + 'application/json': + schema: + $ref: '#/components/schemas/SetEntryResponse' + '400': + description: Bad Request + content: + 'application/json': + schema: + $ref: '#/components/schemas/BadRequestErrorResponseContent' + '401': + description: Unauthorized + content: + 'application/json': + schema: + $ref: '#/components/schemas/AuthenticationErrorResponseContent' + '403': + description: Forbidden + content: + 'application/json': + schema: + $ref: '#/components/schemas/ForbiddenErrorResponseContent' + '500': + description: An unexpected error occurred + content: + 'application/json': + schema: + $ref: '#/components/schemas/InternalServerErrorResponseContent' + '503': + description: An internal error occurred + content: + 'application/json': + schema: + $ref: '#/components/schemas/ServiceUnavailableErrorResponseContent' + delete: + tags: + - Cache Entries + summary: Delete multiple cache entries + description: Deletes multiple cache entries based on specified attributes. If no attributes are provided, all entries in the cache are deleted. 
+ operationId: deleteQuery + parameters: + - $ref: '#/components/parameters/cacheId' + requestBody: + content: + application/json: + schema: + $ref: '#/components/schemas/DeleteEntriesRequest' + required: true + responses: + '200': + description: Cache entries successfully deleted + content: + 'application/json': + schema: + $ref: '#/components/schemas/DeleteEntriesResponse' + '400': + description: Bad Request + content: + 'application/json': + schema: + $ref: '#/components/schemas/BadRequestErrorResponseContent' + '401': + description: Unauthorized + content: + 'application/json': + schema: + $ref: '#/components/schemas/AuthenticationErrorResponseContent' + '403': + description: Forbidden + content: + 'application/json': + schema: + $ref: '#/components/schemas/ForbiddenErrorResponseContent' + '500': + description: An unexpected error occurred + content: + 'application/json': + schema: + $ref: '#/components/schemas/InternalServerErrorResponseContent' + '503': + description: An internal error occurred + content: + 'application/json': + schema: + $ref: '#/components/schemas/ServiceUnavailableErrorResponseContent' + /v1/caches/{cacheId}/entries/{entryId}: + delete: + tags: + - Cache Entries + summary: Delete a single cache entry + description: Deletes a single cache entry by the entry ID. 
+ operationId: delete + parameters: + - $ref: '#/components/parameters/cacheId' + - $ref: '#/components/parameters/entryId' + responses: + '204': + description: Cache entry successfully deleted + '400': + description: Bad Request + content: + 'application/json': + schema: + $ref: '#/components/schemas/BadRequestErrorResponseContent' + '401': + description: Unauthorized + content: + 'application/json': + schema: + $ref: '#/components/schemas/AuthenticationErrorResponseContent' + '403': + description: Forbidden + content: + 'application/json': + schema: + $ref: '#/components/schemas/ForbiddenErrorResponseContent' + '404': + description: Cache entry not found + content: + 'application/json': + schema: + $ref: '#/components/schemas/NotFoundErrorResponseContent' + '500': + description: An unexpected error occurred + content: + 'application/json': + schema: + $ref: '#/components/schemas/InternalServerErrorResponseContent' + '503': + description: An internal error occurred + content: + 'application/json': + schema: + $ref: '#/components/schemas/ServiceUnavailableErrorResponseContent' +components: + parameters: + cacheId: + name: cacheId + in: path + description: The cache ID. + required: true + schema: + type: string + entryId: + name: entryId + in: path + description: The ID of the cache entry to delete. + required: true + schema: + type: string + schemas: + SearchEntriesRequest: + required: + - prompt + type: object + properties: + prompt: + type: string + description: The prompt to search for in the cache. + example: How does semantic caching work? + similarityThreshold: + minimum: 0 + maximum: 1 + type: number + description: The minimum similarity threshold for the cache entry (normalized cosine similarity). + format: float + example: 0.9 + attributes: + type: object + additionalProperties: + type: string + description: Key-value pairs of attributes that filter the cache entries. If provided, this endpoint only returns entries that contain all given attributes. 
+ example: + language: en + topic: ai + description: Request for searching a cache entry + SearchEntriesResponse: + required: + - data + type: object + properties: + data: + type: array + items: + $ref: '#/components/schemas/CacheEntry' + description: Array of cache entries matching the search criteria. This array is empty if no entries match the search criteria. + description: Response representing the result of a successful cache entries search operation + CacheEntry: + required: + - attributes + - similarity + - id + - prompt + - response + type: object + properties: + id: + type: string + description: Unique identifier for the cache entry. + example: myIndex:5b84acef3ce360988d1b35adbaaaccb164569b6d79fab04fd888b5fea03fb8f2 + prompt: + type: string + description: The prompt associated with the cache entry. + example: Tell me how semantic caching works + response: + type: string + description: The response associated with the cache entry. + example: Semantic caching stores and retrieves data based on meaning, not exact matches. + attributes: + type: object + additionalProperties: + type: string + description: The key-value pairs of attributes that are associated with the cache entry. + example: + language: en + topic: ai + similarity: + type: number + description: The similarity score between the search prompt and this entry's prompt. + format: float + example: 0.95 + description: A cache entry + SetEntryRequest: + required: + - prompt + - response + type: object + properties: + prompt: + example: How does semantic caching work? + type: string + description: The prompt for the entry. + response: + example: Semantic caching stores and retrieves data based on meaning, not exact matches. + type: string + description: The response to the prompt for the entry. + attributes: + type: object + additionalProperties: + type: string + description: Key-value pairs of attributes to be associated with the entry. These can be used for filtering when searching for entries. 
All attribute names that can be associated with an entry must be defined during cache creation. + example: + language: en + topic: ai + ttlMillis: + type: integer + description: The entry's time-to-live, in milliseconds. + format: int64 + description: Request to add a cache entry to the cache + SetEntryResponse: + required: + - entryId + type: object + properties: + entryId: + type: string + description: The ID of the entry that was added to the cache. + description: Response representing a successful cache entry addition + DeleteEntriesRequest: + required: + - attributes + type: object + properties: + attributes: + type: object + additionalProperties: + type: string + description: Key-value pairs of attributes associated with the cache entries to delete. If provided, this endpoint only deletes entries that contain all given attributes. If not provided, this endpoint deletes all entries in the cache. + example: + language: en + topic: ai + description: Request to delete cache entries based on specified attributes + DeleteEntriesResponse: + required: + - deletedEntriesCount + type: object + properties: + deletedEntriesCount: + type: integer + description: The number of cache entries successfully deleted. + format: int64 + example: 42 + description: Response indicating the result of a cache entries deletion operation + BadRequestErrorResponseContent: + type: object + properties: + title: + type: string + description: A short summary of the problem type. + example: Invalid Request + status: + type: integer + default: 400 + description: The HTTP status code generated by the origin server. + detail: + type: string + description: An explanation specific to this problem. 
+ type: + $ref: '#/components/schemas/BadRequestErrorUri' + required: + - status + - title + - type + BadRequestErrorUri: + type: string + enum: + - /errors/invalid-request + AuthenticationErrorResponseContent: + type: object + properties: + title: + type: string + description: A short summary of the problem type. + status: + type: integer + default: 401 + description: The HTTP status code generated by the origin server. + detail: + type: string + description: An explanation specific to this problem. + type: + $ref: '#/components/schemas/AuthenticationErrorUri' + required: + - status + - title + - type + AuthenticationErrorUri: + type: string + enum: + - /errors/unauthenticated + ForbiddenErrorResponseContent: + type: object + properties: + title: + type: string + description: A short summary of the problem type. + example: Unauthorized + status: + type: integer + default: 403 + description: The HTTP status code generated by the origin server. + detail: + type: string + description: An explanation specific to this problem. + type: + $ref: '#/components/schemas/ForbiddenErrorUri' + required: + - status + - title + - type + ForbiddenErrorUri: + type: string + enum: + - /errors/unauthorized + ServiceUnavailableErrorResponseContent: + type: object + properties: + title: + type: string + description: A short summary of the problem type. + example: Service Unavailable + status: + type: integer + default: 503 + description: The HTTP status code generated by the origin server. + detail: + type: string + description: An explanation specific to this problem. 
+ type: + $ref: '#/components/schemas/ServiceUnavailableErrorUri' + required: + - status + - title + - type + ServiceUnavailableErrorUri: + type: string + enum: + - /errors/cache/unexpected-error + - /errors/cache/authentication + - /errors/embeddings/unauthorized + - /errors/embeddings/too-many-requests + - /errors/embeddings/unexpected-error + NotFoundErrorResponseContent: + type: object + properties: + title: + type: string + description: A short summary of the problem type. + status: + type: integer + default: 404 + description: The HTTP status code generated by the origin server. + detail: + type: string + description: An explanation specific to this problem. + type: + $ref: '#/components/schemas/NotFoundErrorUri' + required: + - status + - title + - type + NotFoundErrorUri: + type: string + enum: + - /errors/not-found + InternalServerErrorResponseContent: + type: object + properties: + title: + type: string + description: A short summary of the problem type. + status: + type: integer + default: 500 + description: The HTTP status code generated by the origin server. + detail: + type: string + description: An explanation specific to this problem. + type: + $ref: '#/components/schemas/InternalServerErrorUri' + required: + - status + - title + - type + InternalServerErrorUri: + type: string + enum: + - /errors/unexpected-error diff --git a/layouts/_default/apireference.html b/layouts/_default/apireference.html index ad0ab238b..a8f58da94 100644 --- a/layouts/_default/apireference.html +++ b/layouts/_default/apireference.html @@ -6,6 +6,7 @@ When creating a new API reference page, you can specify these optional params in the front matter: - sourcefile: The path to the OpenAPI specification file (YAML or JSON) relative to the page. Default: ./openapi.json - backLink: The path to the page to link to in the top left corner. Default: Parent page +- sortOperationsAlphabetically: Whether to sort the operations alphabetically or use the order defined in the OpenAPI spec. 
Default: true Here's what the file structure should look like to use this template: / @@ -25,6 +26,7 @@ params: sourcefile: ./ (optional - only specify if not using ./openapi.json) backLink: (optional - only specify if not using the parent page) + sortOperationsAlphabetically: (optional - default: true) --- Example usage for the Redis Cloud API with all params set: @@ -36,6 +38,7 @@ params: sourcefile: ./openapi.json backLink: operate/rc/api + sortOperationsAlphabetically: true --- --> @@ -106,7 +109,7 @@ spec-url='{{ .Params.sourcefile | default "./openapi.json" }}' scroll-y-offset='#apiReferenceHeader' json-sample-expand-level=all - sort-operations-alphabetically="true" + sort-operations-alphabetically='{{ .Params.sortOperationsAlphabetically | default "true" }}' theme='{ "rightPanel": { "width": "50%"