Obligation to hardcode full path in local Delta Lake #555
Comments
Can confirm, I encountered this issue. It wasn't fun to debug: I kept thinking the error had something to do with my local Spark shell configuration. Quite annoying when trying to tweak various configurations via |
Do we have any update on this? |
This seems like a good start task if anyone wants to pick it up! |
Do we know of any particular reason for the existing restriction? Wouldn't relative paths cause issues for queries like: SELECT * FROM delta.myrelativepath? It could be either a Delta table at myrelativepath or a table called "myrelativepath" in the delta database. |
Linking this to #1572 |
* [FlinkSQL_PR_1] Flink Delta Sink - Table API UPDATED (delta-io#389)
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
  Co-authored-by: Paweł Kubit <[email protected]>
  Co-authored-by: Krzysztof Chmielewski <[email protected]>
* [FlinkSQL_PR_2] - SQL Support for Delta Source connector. (delta-io#487)
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
* [FlinkSQL_PR_3] - Delta catalog skeleton (delta-io#503)
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
* [FlinkSQL_PR_4] - Delta catalog - Interactions with DeltaLog. Create and get table. (delta-io#506)
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
* [FlinkSQL_PR_5] - Delta catalog - DDL option validation. (delta-io#509)
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
* [FlinkSQL_PR_6] - Delta catalog - alter table + tests. (delta-io#510)
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
* [FlinkSQL_PR_7] - Delta catalog - Restrict Delta Table factory to work only with Delta Catalog + tests. (delta-io#514)
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
* [FlinkSQL_PR_8] - Delta Catalog - DDL/Query hint validation + tests. (delta-io#520)
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
* [FlinkSQL_PR_9] - Delta Catalog - Adding Flink's Hive catalog as decorated catalog. (delta-io#524)
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
* [FlinkSQL_PR_10] - Table API support SELECT with filter on partition column. (delta-io#528)
  * [FlinkSQL_PR_10] - Table API support SELECT with filter on partition column.
  ---------
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
  Co-authored-by: Scott Sandre <[email protected]>
* [FlinkSQL_PR_11] - Delta Catalog - cache DeltaLog instances in DeltaCatalog. (delta-io#529)
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
* [FlinkSQL_PR_12] - UML diagrams. (delta-io#530)
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
* [FlinkSQL_PR_13] - Remove mergeSchema option from SQL API. (delta-io#531)
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
* [FlinkSQL_PR_14] - SQL examples. (delta-io#535)
  Signed-off-by: Krzysztof Chmielewski <[email protected]>
* remove duplicate function after rebasing against master
---------
Signed-off-by: Krzysztof Chmielewski <[email protected]>
Signed-off-by: Krzysztof Chmielewski <[email protected]>
Co-authored-by: kristoffSC <[email protected]>
Co-authored-by: Paweł Kubit <[email protected]>
Co-authored-by: Krzysztof Chmielewski <[email protected]>
This reverts commit e036171.
From what I understand, we use absolute paths because that's the only way to disambiguate the SQL grammar for commands like VACUUM, which explicitly recognize a path as different from an identifier while lexing (one has a leading |
Hello! I found out that you cannot query a local Delta Lake with SQL directly on the files unless you use the full path (from root to the Delta Lake directory). I know that Delta Lake is not meant for standalone clusters or non-distributed file systems, but for testing (and perhaps, in the future, for reaching a wider audience) it's not a bad idea to create a Delta Lake in a local directory to try it out when you are working in Visual Studio Code or another IDE that changes the working directory to the folder you have opened and lets you use relative paths.
If you use a relative path such as data/delta-test (an example) and query that Delta Lake from a PySpark session, you get:
`pyspark.sql.utils.AnalysisException: Unsupported data source type for direct query on files: delta;;`
But if you do /home/gabmartini/data/delta-test it works.
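A possible workaround, sketched here under some assumptions, is to resolve the relative path to an absolute one before putting it into the SQL text. The helper name `delta_path_query` is hypothetical (it is not part of Delta Lake or PySpark), and the sketch assumes the table is addressed with the delta.`<path>` form:

```python
import os


def delta_path_query(relative_path: str) -> str:
    """Build a path-based Delta query from a path relative to the working directory.

    Hypothetical helper for illustration only; not a Delta Lake or PySpark API.
    """
    # Resolve against the current working directory (e.g. the folder opened in the IDE).
    absolute_path = os.path.abspath(relative_path)
    # Path-based table reference: delta.`<absolute path>`
    return f"SELECT * FROM delta.`{absolute_path}`"


if __name__ == "__main__":
    query = delta_path_query("data/delta-test")
    print(query)
    # With a SparkSession named `spark` that has Delta Lake configured (assumed),
    # the query could then be run as: spark.sql(query).show()
```

This keeps relative paths in your own code while the SQL statement only ever sees the absolute form.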
This is very frustrating for beginners, to be honest. Perhaps an explicit remark in the documentation would help clear that up.
Thanks and keep up the good work!