Describe the bug
The input schema and the schema used for projection differ in HashJoin cases because we might add an additional alias expression for the hash join. Because `schema` lacks the additional alias column that `input_schema` has, a schema mismatch is possible. In `stats_projection()` we ignore the error, so we don't hit the issue there, but I think we should not swallow the error and should instead fix the schema and column index mismatch.
datafusion/datafusion/physical-plan/src/projection.rs
Lines 278 to 283 in 2464703
To Reproduce
Take this query for example,
If we add the code to `ProjectExec::try_new()`, we can see that `input_schema` has an additional column, `CAST(t1.id AS Int64)`.
If we use `schema` for the projection, the column indices no longer match, which can make `Column`'s `bounds_check` fail.
The error is caused by an out-of-bounds index: `sid` is created at index 4 for `input_schema`, but the `schema` we use in the projection has only 4 columns.
thread 'tokio-runtime-worker' panicked at datafusion/physical-plan/src/projection.rs:102:44:
called `Result::unwrap()` on an `Err` value: Internal("PhysicalExpr Column references column 'sid' at index 4 (zero-based) but input schema only has 4 columns: [\"name\", \"id\", \"product\", \"sid\"]")
Expected behavior
I think we should either recreate the expressions with new indices that match `schema`, or use `input_schema` for the projection. I'm not sure which is the right choice yet.
Additional context
No response
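To make the index mismatch described above concrete, here is a minimal, self-contained sketch in plain Rust. It does not use DataFusion: `ColumnRef`, this `bounds_check`, and `reindex_by_name` are hypothetical stand-ins written for illustration, and the position of the `CAST(t1.id AS Int64)` column inside `input_schema` is an assumption (the issue only says that `sid` ends up at index 4). The last helper sketches the first option from "Expected behavior", rebuilding the column reference so its index matches the narrower schema.

```rust
/// Hypothetical stand-in for a physical column reference: a name plus a
/// zero-based index into some schema. This is not DataFusion's actual type.
#[derive(Debug, Clone, PartialEq)]
struct ColumnRef {
    name: String,
    index: usize,
}

/// Toy version of a bounds check: the index must be valid for the schema
/// the expression is evaluated against.
fn bounds_check(col: &ColumnRef, schema: &[&str]) -> Result<(), String> {
    if col.index < schema.len() {
        Ok(())
    } else {
        Err(format!(
            "column '{}' at index {} (zero-based) but schema only has {} columns",
            col.name,
            col.index,
            schema.len()
        ))
    }
}

/// One possible fix: recreate the reference by looking the name up in the
/// target schema. Returns None if the name is absent. This assumes column
/// names are unique within the schema, which may not hold in general.
fn reindex_by_name(col: &ColumnRef, schema: &[&str]) -> Option<ColumnRef> {
    schema
        .iter()
        .position(|field| *field == col.name)
        .map(|index| ColumnRef {
            name: col.name.clone(),
            index,
        })
}

/// Schemas shaped like the ones in this issue. Placing the CAST column at
/// index 3 is an assumption made so that `sid` lands at index 4.
fn example_schemas() -> (Vec<&'static str>, Vec<&'static str>) {
    let input_schema = vec!["name", "id", "product", "CAST(t1.id AS Int64)", "sid"];
    let schema = vec!["name", "id", "product", "sid"];
    (input_schema, schema)
}
```

With these definitions, a `ColumnRef` for `sid` at index 4 passes `bounds_check` against the five-column `input_schema` but fails against the four-column `schema`, while the reference returned by `reindex_by_name` passes against `schema`. Whether a name-based lookup is appropriate in the real code depends on how duplicate or aliased column names are handled there.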