Do not serialize EsIndex in plan #119580
base: main
Hi @idegtiarenko, I've created a changelog YAML for you.
@@ -2644,7 +2644,6 @@ private void assertEmptyEsRelation(LogicalPlan plan) {
assertThat(plan, instanceOf(EsRelation.class));
EsRelation esRelation = (EsRelation) plan;
assertThat(esRelation.output(), equalTo(NO_FIELDS));
assertTrue(esRelation.index().mapping().isEmpty());
Not sure if it is possible to replace this assertion in assertEmptyEsRelation with any alternative?
Pinging @elastic/es-analytical-engine (Team:Analytics)
# Conflicts:
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/EsRelation.java
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/physical/EsQueryExec.java
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/physical/EsSourceExec.java
Great find - left a question around the usage of IndexModes map; not clear why that has to be serialized?
IndexMode indexMode,
Map<String, IndexMode> indexNameWithModes,
Why is this map needed?
It is used in LocalExecutionPlanner:
Lines 564 to 571 in 2ff1d76
Map<String, IndexMode> indicesWithModes = localSourceExec.indexNameWithModes();
if (indicesWithModes.size() != 1) {
    throw new IllegalArgumentException("can't plan [" + join + "], found more than 1 index");
}
var entry = indicesWithModes.entrySet().iterator().next();
if (entry.getValue() != IndexMode.LOOKUP) {
    throw new IllegalArgumentException("can't plan [" + join + "], found index with mode [" + entry.getValue() + "]");
}
If this is the only thing that requires that: I believe this check could happen much earlier. Potentially already in the analyzer on the coordinator node. We shouldn't need to wait until the local planning to determine that an index isn't a lookup index.
Could you maybe check if it's feasible to move that to an earlier place and, thus, remove the indexNameWithModes map altogether from the EsRelation and related classes?
Happy to assist with moving that to an earlier stage!
There already is an index mode check in the Analyzer:
elasticsearch/x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
Lines 239 to 255 in 4ac8d55
if (plan.indexMode().equals(IndexMode.LOOKUP)) {
    String indexResolutionMessage = null;
    var indexNameWithModes = esIndex.indexNameWithModes();
    if (indexNameWithModes.size() != 1) {
        indexResolutionMessage = "invalid ["
            + table
            + "] resolution in lookup mode to ["
            + indexNameWithModes.size()
            + "] indices";
    } else if (indexNameWithModes.values().iterator().next() != IndexMode.LOOKUP) {
        indexResolutionMessage = "invalid ["
            + table
            + "] resolution in lookup mode to an index in ["
            + indexNameWithModes.values().iterator().next()
            + "] mode";
    }
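For readers who want to try the check outside the Elasticsearch code base, here is a minimal, self-contained sketch of the same validation logic. It is not the Analyzer code itself: the `LookupIndexCheck` class, the `validate` method, and the two-value `IndexMode` enum are hypothetical stand-ins introduced only for illustration. It shows that the check needs nothing beyond the resolved index names and their modes, which is why it can run on the coordinator before any plan is sent to data nodes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LookupIndexCheck {
    // Hypothetical stand-in for Elasticsearch's IndexMode; only two modes are modeled here.
    enum IndexMode { STANDARD, LOOKUP }

    /**
     * Returns an error message when the resolution is not exactly one LOOKUP index,
     * or null when the resolution is valid. Message wording follows the quoted snippet.
     */
    static String validate(String table, Map<String, IndexMode> indexNameWithModes) {
        if (indexNameWithModes.size() != 1) {
            return "invalid [" + table + "] resolution in lookup mode to ["
                + indexNameWithModes.size() + "] indices";
        }
        IndexMode mode = indexNameWithModes.values().iterator().next();
        if (mode != IndexMode.LOOKUP) {
            return "invalid [" + table + "] resolution in lookup mode to an index in ["
                + mode + "] mode";
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, IndexMode> two = new LinkedHashMap<>();
        two.put("a", IndexMode.LOOKUP);
        two.put("b", IndexMode.LOOKUP);

        // A single lookup index passes; the other two cases produce a message.
        System.out.println(validate("languages", Map.of("languages", IndexMode.LOOKUP)));
        System.out.println(validate("logs", Map.of("logs", IndexMode.STANDARD)));
        System.out.println(validate("a,b", two));
    }
}
```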
private final EsIndex index;
private final String indexName;
private final IndexMode indexMode;
private final Map<String, IndexMode> indexNameWithModes;
private final List<Attribute> attrs;
private final boolean frozen;
frozen is no longer needed - since you're refactoring the classes, please remove this field from the Es classes.
# Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java
@@ -126,12 +129,13 @@ public void testDeeplyNestedFields() throws IOException {
* with a single root field that has many children, grandchildren etc.
*/
public void testDeeplyNestedFieldsKeepOnlyOne() throws IOException {
ByteSizeValue expected = ByteSizeValue.ofBytes(9425804);
This test file beautifully demonstrates the improvements from the change. Awesome, in this particularly bad case we got a reduction by 99.996%!
public static final TransportVersion ESQL_SKIP_ES_INDEX_SERIALIZATION = def(8_823_00_0);
public static final TransportVersion ESQL_REMOVE_ES_RELATION_FROZEN = def(8_824_00_0);
nit: couldn't that be a single new transport version?
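To illustrate the nit, the sketch below gates two wire-format changes (dropping the full index payload and dropping the `frozen` flag) behind one version constant. Everything here is an assumption made for illustration: `SingleVersionGate`, `ESQL_SLIM_ES_RELATION`, and the use of `DataOutputStream` are hypothetical and are not the Elasticsearch `StreamOutput` or `TransportVersion` APIs.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class SingleVersionGate {
    // Hypothetical id: one constant that covers both wire-format changes.
    static final int ESQL_SLIM_ES_RELATION = 8_823_00_0;

    /** Writes the new compact form to peers on the new version, the legacy form otherwise. */
    static byte[] write(int peerVersion, String indexName, String legacyIndexPayload, boolean frozen) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (peerVersion >= ESQL_SLIM_ES_RELATION) {
                // New format: only the index name; no full index payload, no frozen flag.
                out.writeUTF(indexName);
            } else {
                // Legacy format: older peers still expect both fields.
                out.writeUTF(legacyIndexPayload);
                out.writeBoolean(frozen);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String legacy = "idx+mapping"; // placeholder for a serialized index with its mapping
        System.out.println("new peer: " + write(ESQL_SLIM_ES_RELATION, "idx", legacy, false).length + " bytes");
        System.out.println("old peer: " + write(ESQL_SLIM_ES_RELATION - 1, "idx", legacy, false).length + " bytes");
    }
}
```

With a single constant there is one branch per reader and writer, rather than two nested version checks for changes that ship in the same PR.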
Certain plan classes (such as EsRelation, EsSourceExec, EsQueryExec) contain and serialize the entire EsIndex instance. This instance might contain a huge mapping that is never used in the plan. This change replaces EsIndex usage with name and indexNameWithModes to minimize the size of the serialized plan.

Closes: #112998
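
As a rough illustration of why this matters, the sketch below compares the serialized size of a node that carries a full field mapping with one that carries only the name and the index modes. It is a toy model under stated assumptions: `PlanSizeSketch`, its methods, the string-valued maps, and the `DataOutputStream` encoding are hypothetical and do not reflect the actual ES|QL wire format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

final class PlanSizeSketch {
    // Writes a map as: entry count, then each key and value as UTF strings.
    private static void writeMap(DataOutputStream out, Map<String, String> map) throws IOException {
        out.writeInt(map.size());
        for (Map.Entry<String, String> e : map.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeUTF(e.getValue());
        }
    }

    /** Size in bytes when only the name and the index modes are written. */
    static int slimSize(String name, Map<String, String> indexNameWithModes) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(name);
            writeMap(out, indexNameWithModes);
            return out.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Size in bytes when the whole mapping is written as well. */
    static int fullSize(String name, Map<String, String> mapping, Map<String, String> indexNameWithModes) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(name);
            writeMap(out, indexNameWithModes);
            writeMap(out, mapping); // grows with the number of mapped fields
            return out.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> mapping = new LinkedHashMap<>();
        for (int i = 0; i < 10_000; i++) {
            mapping.put("field_" + i, "keyword");
        }
        Map<String, String> modes = Map.of("idx", "standard");
        System.out.println("full: " + fullSize("idx", mapping, modes) + " bytes");
        System.out.println("slim: " + slimSize("idx", modes) + " bytes");
    }
}
```

In this toy model the slim form stays constant as fields are added, while the full form grows linearly with the mapping.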