-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add index_options to semantic_text field mappings #119967
base: main
Are you sure you want to change the base?
Add index_options to semantic_text field mappings #119967
Conversation
e096a61
to
342d769
Compare
342d769
to
d822301
Compare
Hi @kderusso, I've created a changelog YAML for you. |
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
Pinging @elastic/search-eng (Team:SearchOrg) |
Pinging @elastic/search-relevance (Team:Search - Relevance) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good start to this! I have a bunch of comments, but they're mostly interrelated, so it's not as much as it seems.
@@ -0,0 +1,5 @@ | |||
pr: 119967 | |||
summary: Add `index_options` to `semantic_text` field mappings | |||
area: Relevance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mapping
is probably a more appropriate area for this change
For dense vector models, the configuration of `index_options` models the <<dense-vector-index-options>> configuration and defaults for dense vectors. | ||
There are currently no supported index options for sparse models. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For dense vector models, the configuration of `index_options` models the <<dense-vector-index-options>> configuration and defaults for dense vectors. | |
There are currently no supported index options for sparse models. | |
For dense vector models, the configuration of `index_options` specifies the <<dense-vector-index-options>> configuration and defaults for dense vectors. | |
There are currently no supported index options for sparse vector models. |
try { | ||
builder.field(key, value); | ||
} catch (IOException e) { | ||
throw new UncheckedIOException(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get why you're doing this, but it would be nice if we didn't have to use UncheckedIOException
here. The method already throws IOException
. Can we iterate over the map without using forEach
so that we can throw IOException
directly?
return asMap().toString(); | ||
} | ||
|
||
public abstract Map<String, Object> asMap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we implement asMap()
in the base class and handle type
in there? We could have an abstract asMap(Map<String, Object> map)
method for writing the rest of the params that vary by implementation.
@@ -186,6 +194,101 @@ private void validateFieldNotPresent(String field, Object fieldValue) { | |||
} | |||
} | |||
|
|||
public abstract static class IndexOptions implements ToXContentObject { | |||
|
|||
protected final String type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this private
?
test_runner_features: [ capabilities ] | ||
capabilities: | ||
- method: GET | ||
path: /_inference | ||
capabilities: [ default_elser_2 ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
- do: | ||
indices.create: | ||
index: test-index-options | ||
body: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to specify the index setting to use the new format for this test:
settings:
index:
mapping:
semantic_text:
use_legacy_format: false
- match: { "test-index-options.mappings.properties.semantic_field.index_options.m": 16 } | ||
- match: { "test-index-options.mappings.properties.semantic_field.index_options.ef_construction": 100 } | ||
- match: { "test-index-options.mappings.properties.semantic_field.index_options.confidence_interval": 1.0 } | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you index a doc with dense vectors here to test that the index options apply successfully? You'll also need to specify the use_legacy_format
index setting to use the new format when doing so.
- not_exists: test-index-options.mappings.properties.semantic_field.index_options.m | ||
- not_exists: test-index-options.mappings.properties.semantic_field.index_options.ef_construction | ||
- not_exists: test-index-options.mappings.properties.semantic_field.index_options.confidence_interval | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, can you index a doc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the tests that rely on the semantic text format to index docs (Supports index options
, Supports partial index options
, Index options on sparse_vector fields are ignored
), can you replicate them in 10_semantic_text_field_mapping_bwc.yml
using the legacy format?
Adds
index_options
support forsemantic_text
fields using dense models.Example: