-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(substrait): remove dependency on datafusion default features #13594
feat(substrait): remove dependency on datafusion default features #13594
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @notfilippo -- this looks like a good change to me.
I wonder if there is some way we can add a CI test for this (to avoid breaking it in the future) -- perhaps something like
datafusion/.github/workflows/rust.yml
Line 81 in ddee471
run: cargo check --all-targets --no-default-features -p datafusion-common |
I think we need to
index 50bebc5b4..0b2338cd3 100644
--- a/.github/workflows/rust.yml
+++ b/.github/workflows/rust.yml
@@ -80,9 +80,12 @@ jobs:
- name: Check datafusion-common without default features
run: cargo check --all-targets --no-default-features -p datafusion-common
- - name: Check datafusion-functions
+ - name: Check datafusion-functions without default features
run: cargo check --all-targets --no-default-features -p datafusion-functions
+ - name: Check datafusion-substrait without default features
+ run: cargo check --all-targets --no-default-features -p datafusion-substrait
+
- name: Check workspace in debug mode
run: cargo check --all-targets --workspace
I tested with andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$ cargo check --all-targets --no-default-features -p datafusion-substrait
warning: function `coerce_file_schema_to_view_type` is never used
--> datafusion/core/src/datasource/file_format/mod.rs:428:15
|
428 | pub(crate) fn coerce_file_schema_to_view_type(
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(dead_code)]` on by default
warning: function `coerce_file_schema_to_string_type` is never used
--> datafusion/core/src/datasource/file_format/mod.rs:492:15
|
492 | pub(crate) fn coerce_file_schema_to_string_type(
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
warning: `datafusion` (lib) generated 2 warnings
Checking datafusion-substrait v43.0.0 (/Users/andrewlamb/Software/datafusion/datafusion/substrait)
error[E0432]: unresolved import `datafusion::datasource::physical_plan::ParquetExec`
--> datafusion/substrait/tests/cases/roundtrip_physical_plan.rs:25:61
|
25 | use datafusion::datasource::physical_plan::{FileScanConfig, ParquetExec};
| ^^^^^^^^^^^ no `ParquetExec` in `datasource::physical_plan`
|
note: found an item that was configured out
--> /Users/andrewlamb/Software/datafusion/datafusion/core/src/datasource/physical_plan/mod.rs:34:25
|
34 | pub use self::parquet::{ParquetExec, ParquetFileMetrics, ParquetFileReaderFactory};
| ^^^^^^^^^^^
note: the item is gated behind the `parquet` feature
--> /Users/andrewlamb/Software/datafusion/datafusion/core/src/datasource/physical_plan/mod.rs:33:7
|
33 | #[cfg(feature = "parquet")]
| ^^^^^^^^^^^^^^^^^^^
error[E0432]: unresolved import `datafusion_substrait::physical_plan`
--> datafusion/substrait/tests/cases/roundtrip_physical_plan.rs:29:27
|
29 | use datafusion_substrait::physical_plan::{consumer, producer};
| ^^^^^^^^^^^^^ could not find `physical_plan` in `datafusion_substrait`
|
note: found an item that was configured out
--> /Users/andrewlamb/Software/datafusion/datafusion/substrait/src/lib.rs:79:9
|
79 | pub mod physical_plan;
| ^^^^^^^^^^^^^
note: the item is gated behind the `physical` feature
--> /Users/andrewlamb/Software/datafusion/datafusion/substrait/src/lib.rs:78:7
|
78 | #[cfg(feature = "physical")]
| ^^^^^^^^^^^^^^^^^^^^
error[E0425]: cannot find function `parquet_test_data` in module `datafusion::test_util`
--> datafusion/substrait/tests/cases/roundtrip_physical_plan.rs:154:43
|
154 | let testdata = datafusion::test_util::parquet_test_data();
| ^^^^^^^^^^^^^^^^^ help: a function with a similar name exists: `arrow_test_data`
|
::: /Users/andrewlamb/Software/datafusion/datafusion/common/src/test_util.rs:201:1
|
201 | pub fn arrow_test_data() -> String {
| ---------------------------------- similarly named function `arrow_test_data` defined here
|
note: found an item that was configured out
--> /Users/andrewlamb/Software/datafusion/datafusion/core/src/test_util/mod.rs:60:39
|
60 | pub use datafusion_common::test_util::parquet_test_data;
| ^^^^^^^^^^^^^^^^^
note: the item is gated behind the `parquet` feature
--> /Users/andrewlamb/Software/datafusion/datafusion/core/src/test_util/mod.rs:59:7
|
59 | #[cfg(feature = "parquet")]
| ^^^^^^^^^^^^^^^^^^^
error[E0599]: no method named `register_parquet` found for struct `datafusion::prelude::SessionContext` in the current scope
--> datafusion/substrait/tests/cases/roundtrip_physical_plan.rs:145:9
|
145 | ctx.register_parquet("data", "tests/testdata/data.parquet", explicit_options)
| ^^^^^^^^^^^^^^^^
|
help: there is a method `register_arrow` with a similar name
|
145 | ctx.register_arrow("data", "tests/testdata/data.parquet", explicit_options)
| ~~~~~~~~~~~~~~
error[E0599]: no method named `register_parquet` found for struct `datafusion::prelude::SessionContext` in the current scope
--> datafusion/substrait/tests/cases/roundtrip_physical_plan.rs:155:9
|
155 | ctx.register_parquet(
| ----^^^^^^^^^^^^^^^^
|
help: there is a method `register_arrow` with a similar name
|
155 | ctx.register_arrow(
| ~~~~~~~~~~~~~~
Some errors have detailed explanations: E0425, E0432, E0599.
For more information about an error, try `rustc --explain E0425`.
error: could not compile `datafusion-substrait` (test "substrait_integration") due to 5 previous errors
andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$ |
Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look |
Thanks @alamb – Weirdly enough I didn't receive any pings from GitHub about activity on this issue / PR... Anyway thanks for pointing out the issue with the roundtrip test. The fix I posted should be enough. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @notfilippo -- makes sense to me
Thanks again @notfilippo |
Which issue does this PR close?
Closes #13593
What changes are included in this PR?
physical
feature, which enables production and consumption of physical substrait plans.Are these changes tested?
Yes, running
cargo tree -p datafusion-substrait
with and without the--no-default-features
shows thatparquet
and many other dependencies are not included.Are there any user-facing changes?
No, the new feature is enabled by default. Users who might want to benefit from this change will have to use
default-features = false
when including thedatafusion-substrait
crate.