[ML] Fix loss of context in the inference API for streaming APIs #118999

Merged

Conversation

jonathan-buttner
Contributor

@jonathan-buttner jonathan-buttner commented Dec 18, 2024

This PR fixes an issue where the X-elastic-product header was sometimes not returned. I believe this was a concurrency issue of some kind, because when testing with the _stream API the header is usually not returned, whereas when testing with the _unified API it usually is.

I believe the issue relates to the fact that the rest handler's listener doesn't immediately return the response. The onResponse method creates a new listener that is called some time later (maybe on a different thread?).

One thing I noticed is that the headers are only returned once, which is why I only preserve the context on the first action listener.
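
The suspected failure mode can be illustrated outside of Elasticsearch. The following is a minimal sketch, assuming the header lives in per-thread state; `LostContextDemo`, `HEADER`, and `headerSeenByDeferredCallback` are hypothetical names for illustration, and a plain `ThreadLocal` stands in for the real thread context. A value set on the calling thread is not visible to a callback that runs later on another thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class LostContextDemo {
    // Hypothetical stand-in for a per-thread context that carries response headers.
    static final ThreadLocal<String> HEADER = new ThreadLocal<>();

    /** Sets the header on the calling thread, then reads it from a callback on another thread. */
    static String headerSeenByDeferredCallback() {
        ExecutorService otherThread = Executors.newSingleThreadExecutor();
        try {
            HEADER.set("Elasticsearch");
            // The supplier runs on the executor's worker thread, which has its own
            // (empty) copy of the thread-local.
            return CompletableFuture.supplyAsync(HEADER::get, otherThread).join();
        } finally {
            HEADER.remove();
            otherThread.shutdown();
        }
    }

    public static void main(String[] args) {
        HEADER.set("Elasticsearch");
        System.out.println("same thread:  " + HEADER.get());
        HEADER.remove();
        System.out.println("other thread: " + headerSeenByDeferredCallback());
    }
}
```

In this sketch the deferred callback observes `null`, which mirrors a response being written without the header when the listener runs under a different context.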

TODO

  • Add some unit tests ✅ (but they don't seem to help)
    • Find some other way to test this
  • Pass in a SetOnce instead of the ThreadPool to avoid a potential issue where the createComponents() call gets moved after the route retrieval. ✅
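
The ordering concern in the last item can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the `SetOnce` class here is a minimal hypothetical write-once holder written for the example, and `Handler`, `wire`, and the `"inference-pool"` value are made up. The point is that a handler which holds the holder, rather than the value, works regardless of whether the value is populated before or after the handler is constructed.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

class SetOnceDemo {
    /** Minimal write-once holder (hypothetical stand-in written for this example). */
    static final class SetOnce<T> implements Supplier<T> {
        private final AtomicReference<T> ref = new AtomicReference<>();

        void set(T value) {
            Objects.requireNonNull(value);
            if (!ref.compareAndSet(null, value)) {
                throw new IllegalStateException("value was already set");
            }
        }

        @Override
        public T get() {
            return ref.get();
        }
    }

    /** Hypothetical handler that only dereferences the dependency when a request arrives. */
    static final class Handler {
        private final Supplier<String> dependency;

        Handler(Supplier<String> dependency) {
            this.dependency = dependency;
        }

        String handle() {
            return "handled with " + dependency.get();
        }
    }

    /** Wires the handler either after or before the dependency becomes available. */
    static String wire(boolean componentsFirst) {
        SetOnce<String> holder = new SetOnce<>();
        Handler handler;
        if (componentsFirst) {
            holder.set("inference-pool");
            handler = new Handler(holder);
        } else {
            handler = new Handler(holder);   // value not available yet
            holder.set("inference-pool");    // populated later, before first use
        }
        return handler.handle();
    }
}
```

Both orderings produce the same result, because the lookup is deferred until `handle()` runs.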

Fixes #119000

@elasticsearchmachine
Collaborator

Hi @jonathan-buttner, I've created a changelog YAML for you.

@elasticsearchmachine
Collaborator

Hi @jonathan-buttner, I've updated the changelog YAML for you.

@jonathan-buttner jonathan-buttner changed the title [ML] Fix lose of context in the inference API for streaming APIs [ML] Fix loss of context in the inference API for streaming APIs Dec 19, 2024
var responseConsumer = new AsyncInferenceResponseConsumer();
request.setOptions(RequestOptions.DEFAULT.toBuilder().setHttpAsyncResponseConsumerFactory(() -> responseConsumer).build());
var latch = new CountDownLatch(1);
client().performRequestAsync(request, new ResponseListener() {
    @Override
    public void onSuccess(Response response) {
        if (responseConsumerCallback != null) {
Contributor Author

Adding a way to get the response so we can check the headers

@@ -469,7 +477,7 @@ public void testSupportedStream() throws Exception {

var input = IntStream.range(1, 2 + randomInt(8)).mapToObj(i -> randomAlphanumericOfLength(5)).toList();
try {
- var events = streamInferOnMockService(modelId, TaskType.COMPLETION, input);
+ var events = streamInferOnMockService(modelId, TaskType.COMPLETION, input, VALIDATE_ELASTIC_PRODUCT_HEADER_CONSUMER);
Contributor Author

Unfortunately, even if I remove my change, this test still passes. I opted to keep these tests in case something goes badly wrong in the future, but they wouldn't have caught the original issue 😞

@@ -115,6 +121,12 @@ private void initializeStream(InferenceAction.Response response) {
)
);
});

nextBodyPartListener = ContextPreservingActionListener.wrapPreservingContext(
Contributor Author

When I manually test this, it fixes the issue: I always get the header back as expected. Without the fix, the _stream API usually does not return the header. I believe this is because we create a new listener that may get executed with a different thread context. I think if we responded in the onResponse implementation of this class, it would probably avoid the issue. But that's not how this was designed, so I'm keeping the structure as it was.
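
The idea behind wrapping the listener can be sketched in plain Java. This is a simplified illustration, not the Elasticsearch implementation: a `ThreadLocal` stands in for the thread context, a `Consumer` stands in for the action listener, and the `wrapPreservingContext` method below is a hypothetical helper written for this example that only borrows the name. The wrapper captures the context when it is created and restores it around the delegate call, so the delegate sees the original context even on another thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

class PreservingListenerDemo {
    // Hypothetical stand-in for the per-thread context.
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    /** Captures the caller's context now and restores it around the delegate when invoked. */
    static <T> Consumer<T> wrapPreservingContext(Consumer<T> delegate) {
        final String captured = CONTEXT.get();
        return value -> {
            String previous = CONTEXT.get();
            CONTEXT.set(captured);
            try {
                delegate.accept(value);
            } finally {
                // Put back whatever the executing thread had before the call.
                if (previous == null) {
                    CONTEXT.remove();
                } else {
                    CONTEXT.set(previous);
                }
            }
        };
    }

    /** Invokes a listener on another thread and returns the context value it observed. */
    static String contextSeenOnOtherThread(boolean preserve) {
        AtomicReference<String> seen = new AtomicReference<>();
        Consumer<String> listener = chunk -> seen.set(CONTEXT.get());
        ExecutorService otherThread = Executors.newSingleThreadExecutor();
        try {
            CONTEXT.set("product-header");
            Consumer<String> toCall = preserve ? wrapPreservingContext(listener) : listener;
            CompletableFuture.runAsync(() -> toCall.accept("chunk"), otherThread).join();
            return seen.get();
        } finally {
            CONTEXT.remove();
            otherThread.shutdown();
        }
    }
}
```

In this sketch, calling `contextSeenOnOtherThread(false)` yields `null`, while `contextSeenOnOtherThread(true)` yields the value that was set on the originating thread.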

@jonathan-buttner jonathan-buttner marked this pull request as ready for review December 19, 2024 17:33
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

Member

@maxhniebergall maxhniebergall left a comment

LGTM. Thanks for finding this fix for this!

Member

@davidkyle davidkyle left a comment

LGTM

@jonathan-buttner jonathan-buttner merged commit 7ba3cb9 into elastic:main Dec 23, 2024
16 checks passed
@jonathan-buttner jonathan-buttner deleted the ml-streaming-lost-context branch December 23, 2024 12:57
jonathan-buttner added a commit to jonathan-buttner/elasticsearch that referenced this pull request Dec 23, 2024
…stic#118999)

* Adding context preserving fix

* Update docs/changelog/118999.yaml

* Update docs/changelog/118999.yaml

* Using a setonce and adding a test

* Updating the changelog

(cherry picked from commit 7ba3cb9)

# Conflicts:
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferencePlugin.java
jonathan-buttner added a commit to jonathan-buttner/elasticsearch that referenced this pull request Dec 23, 2024
…stic#118999)

* Adding context preserving fix

* Update docs/changelog/118999.yaml

* Update docs/changelog/118999.yaml

* Using a setonce and adding a test

* Updating the changelog

(cherry picked from commit 7ba3cb9)

# Conflicts:
#	x-pack/plugin/inference/qa/inference-service-tests/src/javaRestTest/java/org/elasticsearch/xpack/inference/InferenceBaseRestTest.java
#	x-pack/plugin/inference/qa/inference-service-tests/src/javaRestTest/java/org/elasticsearch/xpack/inference/InferenceCrudIT.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferencePlugin.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/rest/RestUnifiedCompletionInferenceAction.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/rest/RestUnifiedCompletionInferenceActionTests.java
@jonathan-buttner
Contributor Author

💚 All backports created successfully

Status Branch Result
8.x
8.17
8.16

Questions?

Please refer to the Backport tool documentation

Labels
>bug :ml Machine learning Team:ML Meta label for the ML team v8.16.3 v8.17.1 v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ML] [Inference API] The _stream API doesn't consistently return the elastic product header
4 participants