[Inference API] Fix Azure AI Studio Integration for Completions and Embeddings #119818

Draft: wants to merge 2 commits into main


brendan-jugan-elastic commented:

This draft PR fixes the Inference API integration with Azure AI Foundry (previously Azure AI Studio). The previous integration was broken for both completions and embeddings models due to API changes from Microsoft.

Core Changes:

  • no longer referencing an approved list of providers
  • introducing a required AzureAiStudioDeploymentType to the service settings
    • either azure_ai_model_inference_service or serverless_api
  • modifying auth configuration for each deployment type
  • slight request format modifications after API changes
  • testing and rebranding changes are in progress; I wanted to get some eyes on the implementation while I complete them

Once testing is complete, I will add more detailed docs that describe the deployment types and their configurations, and that show how to use this integration with screenshots from the Azure console.
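
To make the new required setting concrete, here is a minimal sketch of a request-body builder that only accepts the two deployment types listed above. The function name `build_inference_body` is hypothetical and exists only for illustration; it is not part of this PR and says nothing about how the Java implementation validates the field.

```shell
# Hypothetical helper (not part of this PR): builds the request body used in
# the Local Testing examples and rejects any deployment_type other than the
# two values named in the Core Changes list.
build_inference_body() {
  local api_key="$1" target="$2" deployment_type="$3" deployment_name="$4"

  case "$deployment_type" in
    azure_ai_model_inference_service|serverless_api) ;;
    *)
      echo "unsupported deployment_type: $deployment_type" >&2
      return 1
      ;;
  esac

  # printf keeps the sketch dependency-free. Assumption: none of the values
  # contain characters that would need JSON escaping.
  printf '{"service":"azureaistudio","service_settings":{"api_key":"%s","target":"%s","deployment_type":"%s","deployment_name":"%s"}}\n' \
    "$api_key" "$target" "$deployment_type" "$deployment_name"
}
```

The output can be passed to any of the PUT requests in Local Testing via `--data "$(build_inference_body ...)"`, which avoids retyping the body for each deployment type.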

Local Testing:

Embeddings:

PUT http://localhost:9200/_inference/text_embedding/cohere_serverless_embed

curl --location --request PUT 'http://localhost:9200/_inference/text_embedding/cohere_serverless_embed' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "service": "azureaistudio",
  "service_settings": {
    "api_key": "*****",
    "target": "https://example-target.eastus.models.ai.azure.com/embeddings",
    "deployment_type": "serverless_api",
    "deployment_name": "Cohere-embed-v3-english-hmcek"
  }
}'

POST http://localhost:9200/_inference/text_embedding/cohere_serverless_embed

curl --location 'http://localhost:9200/_inference/text_embedding/cohere_serverless_embed' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "input": "What is Elastic?"
}'



PUT http://localhost:9200/_inference/text_embedding/cohere_amlis_embed

curl --location --request PUT 'http://localhost:9200/_inference/text_embedding/cohere_amlis_embed' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "service": "azureaistudio",
  "service_settings": {
    "api_key": "*****",
    "target": "https://example-target/models/embeddings",
    "deployment_type": "azure_ai_model_inference_service",
    "deployment_name": "Cohere-embed-v3-english"
  }
}'

POST http://localhost:9200/_inference/text_embedding/cohere_amlis_embed

curl --location 'http://localhost:9200/_inference/text_embedding/cohere_amlis_embed' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "input": "What is Elastic?"
}'

Completions:

PUT http://localhost:9200/_inference/completion/cohere_serverless_completion

curl --location --request PUT 'http://localhost:9200/_inference/completion/cohere_serverless_completion' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "service": "azureaistudio",
  "service_settings": {
    "api_key": "*****",
    "target": "https://example-target.eastus.models.ai.azure.com/chat/completions",
    "deployment_type": "serverless_api",
    "deployment_name": "Cohere-command-r"
  }
}'

POST http://localhost:9200/_inference/completion/cohere_serverless_completion

curl --location 'http://localhost:9200/_inference/completion/cohere_serverless_completion' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "input": "What is Elastic?"
}'



PUT http://localhost:9200/_inference/completion/cohere_amlis_completion

curl --location --request PUT 'http://localhost:9200/_inference/completion/cohere_amlis_completion' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "service": "azureaistudio",
  "service_settings": {
    "api_key": "*****",
    "target": "https://example-target.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview",
    "deployment_type": "azure_ai_model_inference_service",
    "deployment_name": "Cohere-command-r"
  }
}'

POST http://localhost:9200/_inference/completion/cohere_amlis_completion

curl --location 'http://localhost:9200/_inference/completion/cohere_amlis_completion' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic *****' \
--data '{
  "input": "What is Elastic?"
}'

Related Issues:

Helpful Links:

brendan-jugan-elastic changed the title from "WIP(azure_ai_foundry): fix implementation for completions and embeddings" to "[(WIP) Inference API] Fix Azure AI Foundry Integration for Completions and Embeddings" on Jan 9, 2025

timgrein (Contributor) left a comment:

Nice, good stuff! 👏 Gave it a first pass and left some comments, already looking good

brendan-jugan-elastic changed the title from "[(WIP) Inference API] Fix Azure AI Foundry Integration for Completions and Embeddings" to "[Inference API] Fix Azure AI Studio Integration for Completions and Embeddings" on Jan 10, 2025

brendan-jugan-elastic (Author) commented:

Note: I'm waiting to complete the Azure AI Studio -> Azure AI Foundry renaming until tomorrow. The above commits contain all of the functional/test changes for this Inference API fix.

timgrein (Contributor) left a comment:

I mainly left comments on what we need to address regarding the transport-level changes. We can also sync on that; it can be a bit confusing in the beginning.
