Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL: investigate Exceptions unwrapping in API #119692

Closed
bpintea opened this issue Jan 7, 2025 · 2 comments · Fixed by #120164
Closed

ESQL: investigate Exceptions unwrapping in API #119692

bpintea opened this issue Jan 7, 2025 · 2 comments · Fixed by #120164
Labels
:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@bpintea
Copy link
Contributor

bpintea commented Jan 7, 2025

Description

ESQL wraps currently exceptions bubbled from downstream (afte change). We should look into ways of reverting this behavior, which would allow more accurate monitoring of the infrastructure and better pinpointing of failures.

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jan 7, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@bpintea
Copy link
Contributor Author

bpintea commented Jan 14, 2025

After further consideration, the wrapping should remain in place, since it can be the only logged trace of an ESQL request failure. Without it, one would need to run ES on debug logging to get an indication of what's wrong.
The log message produced by the listener could be modified to make it a bit more obvious that the failure is handled by it. But otherwise, the logging is as complete as Logger#log(..., Throwable) can make it.

For a NoShardAvailableActionException failure, the listener is fed by fieldcaps with an instantiated exception, so there is practically no stack trace:

{
    "error": {
        "root_cause": [
            {
                "type": "no_shard_available_action_exception",
                "reason": "index [test2] has no active shard copy",
                "stack_trace": "org.elasticsearch.action.NoShardAvailableActionException: index [test2] has no active shard copy\n"
            }
        ],
        "type": "no_shard_available_action_exception",
        "reason": "index [test2] has no active shard copy",
        "stack_trace": "org.elasticsearch.action.NoShardAvailableActionException: index [test2] has no active shard copy\n"
    },
    "status": 503
}

producing a log entry:

[2025-01-14T18:00:34,700][WARN ][o.e.x.e.a.EsqlResponseListener] [runTask-0] Request failed with status [SERVICE_UNAVAILABLE]: org.elasticsearch.action.NoShardAvailableActionException: index [test2] has no active shard copy

But as a different example, a bug such as the query in #119950, produces a more verbose log entry:

[2025-01-14T23:49:50,082][DEBUG][o.e.x.e.a.EsqlResponseListener] [runTask-0] Request failed with status [BAD_REQUEST]: [test/mbbAcolTQ-OjXSImKW4AWg] org.elasticsearch.index.query.QueryShardException: failed to create query
: Cannot invoke "Object.toString()" because "value" is null
        at [email protected]/org.elasticsearch.index.query.SearchExecutionContext.toQuery(SearchExecutionContext.java:567)
        at org.elasticsearch.xpack.esql.planner.EsPhysicalOperationProviders$DefaultShardContext.toQuery(EsPhysicalOperationProviders.java:293)
        at org.elasticsearch.xpack.esql.planner.EsPhysicalOperationProviders.lambda$querySupplier$2(EsPhysicalOperationProviders.java:154)
        at org.elasticsearch.compute.lucene.LuceneOperator.lambda$weightFunction$0(LuceneOperator.java:440)
        at org.elasticsearch.compute.lucene.LuceneSliceQueue.create(LuceneSliceQueue.java:66)
        at org.elasticsearch.compute.lucene.LuceneOperator$Factory.<init>(LuceneOperator.java:102)
        at org.elasticsearch.compute.lucene.LuceneSourceOperator$Factory.<init>(LuceneSourceOperator.java:59)
        at org.elasticsearch.xpack.esql.planner.EsPhysicalOperationProviders.sourcePhysicalOperation(EsPhysicalOperationProviders.java:198)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planEsQueryNode(LocalExecutionPlanner.java:252)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:218)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planFieldExtractNode(LocalExecutionPlanner.java:277)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:196)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planProject(LocalExecutionPlanner.java:621)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:208)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planExchangeSink(LocalExecutionPlanner.java:326)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:240)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:179)
        at org.elasticsearch.xpack.esql.plugin.ComputeService.runCompute(ComputeService.java:472)
        at org.elasticsearch.xpack.esql.plugin.ComputeService$DataNodeRequestExecutor.lambda$runBatch$1(ComputeService.java:719)
        at [email protected]/org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:257)
        at [email protected]/org.elasticsearch.action.ActionRunnable$2.accept(ActionRunnable.java:58)
        at [email protected]/org.elasticsearch.action.ActionRunnable$2.accept(ActionRunnable.java:55)
        at [email protected]/org.elasticsearch.action.ActionRunnable$4.doRun(ActionRunnable.java:101)
        at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
        at org.elasticsearch.xpack.esql.plugin.ComputeService.lambda$acquireSearchContexts$19(ComputeService.java:552)
        at [email protected]/org.elasticsearch.core.AbstractRefCounted$1.closeInternal(AbstractRefCounted.java:125)
        at [email protected]/org.elasticsearch.core.AbstractRefCounted.decRef(AbstractRefCounted.java:77)
        at [email protected]/org.elasticsearch.action.support.RefCountingRunnable.close(RefCountingRunnable.java:113)
        at org.elasticsearch.xpack.esql.plugin.ComputeService.acquireSearchContexts(ComputeService.java:565)
        at org.elasticsearch.xpack.esql.plugin.ComputeService$DataNodeRequestExecutor.runBatch(ComputeService.java:716)
        at org.elasticsearch.xpack.esql.plugin.ComputeService$DataNodeRequestExecutor.start(ComputeService.java:686)
        at org.elasticsearch.xpack.esql.plugin.ComputeService.runComputeOnDataNode(ComputeService.java:755)
        at org.elasticsearch.xpack.esql.plugin.ComputeService$DataNodeRequestHandler.messageReceived(ComputeService.java:827)
        at org.elasticsearch.xpack.esql.plugin.ComputeService$DataNodeRequestHandler.messageReceived(ComputeService.java:803)
        at [email protected]/org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:579)
        at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
        at [email protected]/org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$3.onResponse(SecurityServerTransportInterceptor.java:632)
        at [email protected]/org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$3.onResponse(SecurityServerTransportInterceptor.java:621)
        at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:640)
        at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:634)
        at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.FieldAndDocumentLevelSecurityRequestInterceptor.intercept(FieldAndDocumentLevelSecurityRequestInterceptor.java:79)
        at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.ValidateRequestInterceptor.intercept(ValidateRequestInterceptor.java:20)
        at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:638)
        at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:634)
        at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.BulkShardRequestInterceptor.intercept(BulkShardRequestInterceptor.java:85)
        at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:638)
....
Caused by: java.lang.NullPointerException: Cannot invoke "Object.toString()" because "value" is null
        at [email protected]/org.elasticsearch.index.mapper.NumberFieldMapper$NumberType.objectToDouble(NumberFieldMapper.java:1493)
        at [email protected]/org.elasticsearch.index.mapper.NumberFieldMapper$NumberType.objectToLong(NumberFieldMapper.java:1508)
        at [email protected]/org.elasticsearch.index.mapper.NumberFieldMapper$NumberType$7.parse(NumberFieldMapper.java:1217)
        at [email protected]/org.elasticsearch.index.mapper.NumberFieldMapper$NumberType$7.termsQuery(NumberFieldMapper.java:1262)
        at [email protected]/org.elasticsearch.index.mapper.NumberFieldMapper$NumberFieldType.termsQuery(NumberFieldMapper.java:1733)
        at [email protected]/org.elasticsearch.index.query.TermsQueryBuilder.doToQuery(TermsQueryBuilder.java:347)
        at [email protected]/org.elasticsearch.index.query.AbstractQueryBuilder.toQuery(AbstractQueryBuilder.java:119)
        at org.elasticsearch.xpack.esql.querydsl.query.SingleValueQuery$Builder.doToQuery(SingleValueQuery.java:187)
        at [email protected]/org.elasticsearch.index.query.AbstractQueryBuilder.toQuery(AbstractQueryBuilder.java:119)
        at [email protected]/org.elasticsearch.index.query.SearchExecutionContext.toQuery(SearchExecutionContext.java:559)

bpintea added a commit that referenced this issue Jan 16, 2025
This applies minor logging changes removing some duplication and making it obvious in the log message that the entry is produced by ESQL (EsqlResponseListener).

It also removes some code duplication in the way query actions produce the RestChannelConsumer.

Closes #119692.
bpintea added a commit to bpintea/elasticsearch that referenced this issue Jan 16, 2025
This applies minor logging changes removing some duplication and making it obvious in the log message that the entry is produced by ESQL (EsqlResponseListener).

It also removes some code duplication in the way query actions produce the RestChannelConsumer.

Closes elastic#119692.
elasticsearchmachine pushed a commit that referenced this issue Jan 16, 2025
This applies minor logging changes removing some duplication and making it obvious in the log message that the entry is produced by ESQL (EsqlResponseListener).

It also removes some code duplication in the way query actions produce the RestChannelConsumer.

Closes #119692.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants