[Inference API] Default eis endpoint #119694
Conversation
Pinging @elastic/ml-core (Team:ML)
Hi @maxhniebergall, I've created a changelog YAML for you.
…inference/services/elastic/completion/EISCompletionServiceSettingsTests.java Co-authored-by: Tim Grein <[email protected]>
…inference/services/elastic/completion/EISCompletionModelTests.java Co-authored-by: Tim Grein <[email protected]>
Force-pushed from f8972cf to 986654b
@maxhniebergall please enable the option "Allow edits and access to secrets by maintainers" on your PR. For more information, see the documentation.
if (DEFAULT_EIS_ENDPOINT_IDS.contains(inferenceEntityId)) {
    // Reject ids reserved for the default EIS endpoints
    parsedModelListener.onFailure(
        new ElasticsearchStatusException(
            "[{}] is a reserved inference Id. Cannot create a new inference endpoint with a reserved Id",
            RestStatus.BAD_REQUEST,
            inferenceEntityId
        )
    );
    return;
}
Nothing wrong here, but I don't quite understand what's going on. Can you explain:
- What is this, loosely?
- How could this happen?
- What is the result of it happening? Does ES fail to boot?
Yeah sure:
What is this, loosely?
During a PUT request to persist a new inference endpoint, this code checks whether the inferenceEntityId
that the user is requesting for the new endpoint matches one of the default ids we've reserved. We don't allow that because it could cause clashes, and we wouldn't know how to handle inference requests against that inference endpoint.
How could this happen?
This can happen when a user is creating a new inference endpoint like this:
PUT http://localhost:9200/_inference/.eis-alpha-1
{
"service": "elastic",
"task_type": "completion",
"service_settings": {
...
}
}
It's protecting us from a user trying to use a reserved value for their inference endpoint id.
What is the result of it happening? Does ES fail to boot?
The user's PUT request will result in a 400 error. I don't believe this code path would be executed during boot up.
Thanks!
Nice! Let's move this check to TransportPutInferenceModelAction; then the check will apply to all the services.
ModelRegistry knows about the default ids via the addDefaultIds(defaultIds) function, which is called by the InferencePlugin at node startup.
In TransportPutInferenceModelAction#masterOperation the same check can be made by querying ModelRegistry, as in the sketch below.
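A minimal sketch of what that could look like. Note that containsDefaultConfigId on ModelRegistry and getInferenceEntityId on the request are assumed names for illustration; the real accessors may differ:

// Sketch only: a service-agnostic reserved-id guard in TransportPutInferenceModelAction.
// modelRegistry is assumed to expose the ids registered via addDefaultIds(defaultIds)
// at node startup through a hypothetical containsDefaultConfigId(...) lookup.
@Override
protected void masterOperation(
    Task task,
    PutInferenceModelAction.Request request,
    ClusterState state,
    ActionListener<PutInferenceModelAction.Response> listener
) {
    String inferenceEntityId = request.getInferenceEntityId();

    if (modelRegistry.containsDefaultConfigId(inferenceEntityId)) {
        listener.onFailure(
            new ElasticsearchStatusException(
                "[{}] is a reserved inference Id. Cannot create a new inference endpoint with a reserved Id",
                RestStatus.BAD_REQUEST,
                inferenceEntityId
            )
        );
        return;
    }

    // ... existing masterOperation logic continues here
}

That way every service gets the guard for free, instead of each service re-implementing it.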
@elasticmachine test this please
@elasticmachine test this please
private static final EnumSet<TaskType> supportedTaskTypes = EnumSet.of(TaskType.SPARSE_EMBEDDING, TaskType.COMPLETION);
private static final String SERVICE_NAME = "Elastic";
private static final String DEFAULT_EIS_CHAT_COMPLETION_MODEL_ID_V1 = "rainbow-sprinkles";
private static final String DEFAULT_EIS_CHAT_COMPLETION_ENDPOINT_ID_V1 = ".eis-alpha-1";
Can we name this the same way we name other default endpoints? I think we settled on modelId-providerName, which would mean .rainbow-sprinkles-elastic?
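Applied to the constant above, the suggested convention would look like this (a sketch of the rename only; the constant name is taken from the existing code):

// Hypothetical rename following the modelId-providerName convention
private static final String DEFAULT_EIS_CHAT_COMPLETION_ENDPOINT_ID_V1 = ".rainbow-sprinkles-elastic";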
    return List.of(v1DefaultCompletionModel());
}

private ElasticInferenceServiceCompletionModel v1DefaultCompletionModel() {
Is this initialization of the default endpoint conditional on the health check? We want to add an ACL in the gateway which will dynamically control whether EIS is available or not. If a customer does not have access to EIS, will it skip creating the default endpoint?
That's a great point. The current implementation does not consider that. I believe I also read that individual models might be available separately for different customers, which means we'd need EIS enabled for the customer and this particular model enabled too.
Since it sounds like we're going to keep the information in memory to determine what is available to this cluster, I wonder if we could pass that information along via the ElasticInferenceServiceComponents
in the constructor 🤔 (sketched below). Although if we ever go with a caching approach, then the default inference endpoints should attempt to be created on each call to retrieve the defaults 🤔
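A minimal sketch of that idea, assuming a hypothetical isAuthorized(modelId) accessor on ElasticInferenceServiceComponents; the actual ACL/availability mechanism is still open:

// Sketch only: skip creating the default EIS endpoint when this cluster lacks access.
// components.isAuthorized(...) is a placeholder for whatever the gateway ACL exposes.
private List<Model> defaultConfigs() {
    if (components.isAuthorized(DEFAULT_EIS_CHAT_COMPLETION_MODEL_ID_V1) == false) {
        return List.of(); // no EIS access, or this model isn't enabled: no default endpoint
    }
    return List.of(v1DefaultCompletionModel());
}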
I think we should postpone merging this PR until the logic to determine whether EIS is available, and which models are available, has been merged.
Closing this in favor of this PR: #120847
This PR adds the first-iteration model id for the Elastic Inference Service.
Model id:
rainbow-sprinkles
The default endpoint id:
.rainbow-sprinkles-elastic