Support unparsing `LogicalPlan::Extension` to SQL text #13753

goldmedal · 2024-12-13T09:28:31Z

Is your feature request related to a problem or challenge?

LogicalPlan::Extension allows users to implement their own custom logical plan nodes. I think it makes sense to let them define custom unparsing behavior for these nodes.

Describe the solution you'd like

I would like to introduce a new method for UserDefinedLogicalNode

    /// User-defined nodes can override this method to provide a custom
    /// implementation for the unparser.
    fn unparse(
        &self,
        unparser: &Unparser,
        query: &mut Option<QueryBuilder>,
        select: &mut Option<SelectBuilder>,
        relation: &mut Option<RelationBuilder>,
        table_with_joins: &mut Option<TableWithJoinsBuilder>,
    ) -> Result<()> {
        not_impl_err!("custom unparsing not implemented")
    }

Then, we can handle LogicalPlan::Extension in the unparser like

    LogicalPlan::Extension(extension) => {
        extension.node.unparse(self, plan, query, Some(select), Some(relation), None)
    },
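
A minimal sketch of how the proposed default method and the dispatch could fit together. Everything below is a simplified stand-in rather than the real DataFusion API: the builders are reduced to a single `SelectBuilder`, the error is a plain struct instead of `not_impl_err!`, and `TopK`, `OpaqueNode`, and `unparse_extension` are hypothetical names used only for illustration.

```rust
// Hypothetical stand-in for the unparser's builder (assumption).
#[derive(Debug, Default)]
struct SelectBuilder {
    projection: Vec<String>,
}

// Hypothetical stand-in for DataFusion's "not implemented" error (assumption).
#[derive(Debug, PartialEq)]
struct NotImplemented(String);

trait UserDefinedLogicalNode {
    fn name(&self) -> &str;

    /// Default behavior: the node does not support custom unparsing.
    fn unparse(&self, _select: &mut SelectBuilder) -> Result<(), NotImplemented> {
        Err(NotImplemented(format!(
            "custom unparsing not implemented for {}",
            self.name()
        )))
    }
}

/// A node that keeps the default and therefore cannot be unparsed.
struct OpaqueNode;

impl UserDefinedLogicalNode for OpaqueNode {
    fn name(&self) -> &str {
        "OpaqueNode"
    }
}

/// A node that overrides `unparse` and writes into the builder.
struct TopK {
    column: String,
}

impl UserDefinedLogicalNode for TopK {
    fn name(&self) -> &str {
        "TopK"
    }

    fn unparse(&self, select: &mut SelectBuilder) -> Result<(), NotImplemented> {
        select.projection.push(self.column.clone());
        Ok(())
    }
}

/// Plays the role of the `LogicalPlan::Extension` match arm.
fn unparse_extension(
    node: &dyn UserDefinedLogicalNode,
) -> Result<SelectBuilder, NotImplemented> {
    let mut select = SelectBuilder::default();
    node.unparse(&mut select)?;
    Ok(select)
}
```

With these stand-ins, calling `unparse_extension` on a `TopK` fills the builder, while calling it on `OpaqueNode` falls through to the default method and returns the not-implemented error.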

However, the required builders haven't been made public yet, so UserDefinedLogicalNode can't access them.
As per the comment in ast.rs, we planned to move the builders to sqlparser-rs.

datafusion/datafusion/sql/src/unparser/ast.rs

Lines 18 to 22 in 6ac1999

    
    //! This file contains builders to create SQL ASTs. They are purposefully
    //! not exported as they will eventually be move to the SQLparser package.
    //!
    //!
    //! See <https://github.com/apache/datafusion/issues/8661>

I'm contemplating whether this move is required. When implementing a new unparsing feature, we may need to add some additional helper functions to the builder (e.g., SelectBuilder::already_projection). If we move it to the upstream crate, it would be difficult to add helper functions when required.
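
To illustrate the kind of helper meant here, a rough sketch follows. The `SelectBuilder` below is a hypothetical, heavily simplified stand-in, not the real builder in ast.rs, and the body of `already_projection` is an assumption about what such a helper might check.

```rust
/// Hypothetical, simplified stand-in for the unparser's `SelectBuilder`.
#[derive(Default)]
struct SelectBuilder {
    projection: Vec<String>,
}

impl SelectBuilder {
    /// Ordinary builder method: sets the projected columns.
    fn projection(&mut self, columns: Vec<String>) -> &mut Self {
        self.projection = columns;
        self
    }

    /// Unparser-specific helper (assumed behavior): reports whether a
    /// projection has already been set on this SELECT.
    fn already_projection(&self) -> bool {
        !self.projection.is_empty()
    }
}
```

A helper like this only makes sense in the context of the unparser's own traversal, which is why adding it would be harder if the builders lived in an upstream crate.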

I would prefer to move the builders to datafusion-common and make them public, both for users and for datafusion-expr, where UserDefinedLogicalNode is located.

Describe alternatives you've considered

No response

Additional context

No response


goldmedal · 2024-12-13T09:30:13Z

@alamb @phillipleblanc @sgrebnov @jayzhan211 @findepi
What do you think?

jayzhan211 · 2024-12-13T11:25:24Z

I would say we should make them public in datafusion-sql (or upstream them to sql-parser), but not move them to datafusion-common. I think we should eliminate the dependency on sql-parser from datafusion-common, so we need to move the functions and structs that depend on sql-parser out of datafusion-common and into datafusion-sql.

alamb · 2024-12-13T12:39:32Z

> I would say we should make them public in datafusion-sql (or upstream them to sql-parser), but not move them to datafusion-common. I think we should eliminate the dependency on sql-parser from datafusion-common, so we need to move the functions and structs that depend on sql-parser out of datafusion-common and into datafusion-sql.

I agree that making the builders / etc public in datafusion-sql is better than dumping more stuff in datafusion-common

Another potential thought might be to move the unparsing code into its own crate, given its now non-trivial complexity. For example, datafusion-unparser.

Another potential approach, which would avoid making UserDefinedLogicalNode (which is in datafusion-expr) depend on the unparser / sqlparser, might be:

1. Create a new Unparseable trait
2. Have the unparser try to downcast the UserDefinedLogicalNode

So something like

    pub trait Unparseable {
        /// User-defined nodes can override this method to provide a custom
        /// implementation for the unparser.
        fn unparse(
            &self,
            unparser: &Unparser,
            query: &mut Option<QueryBuilder>,
            select: &mut Option<SelectBuilder>,
            relation: &mut Option<RelationBuilder>,
            table_with_joins: &mut Option<TableWithJoinsBuilder>,
        ) -> Result<()> {
            not_impl_err!("custom unparsing not implemented")
        }
    }

And then the unparser could do something like:

    let user_defined_local_node: &dyn UserDefinedLogicalNode = ...;
    // is the user defined node unparseable?
    let Some(unparseable) = user_defined_local_node
      .as_any()
      .downcast_ref::<Unparseable>() else {
      return plan_err!("Node type {} does not implement Unparseable", user_defined_local_node.name())
    };

    let sql_nodes = unparseable.unparse(unparser, ....)

🤔

phillipleblanc · 2024-12-13T13:54:54Z

AFAIK Rust doesn't allow downcasting from a dyn object to a trait (i.e. Unparseable) - it needs to be the concrete type.

Creating a new datafusion-unparser crate makes sense to me, we could make the builders public there. However, having a dependency from datafusion-expr to datafusion-unparser still feels a bit weird (and also not possible due to circular dependencies?). Perhaps instead of adding an unparse function on UserDefinedLogicalNode, we could add a registry on the Unparser object that takes UserDefinedLogicalNode and tries to unparse it, similar to how the optimizer rules work.

i.e.

pub trait UserDefinedLogicalNodeUnparser {
    fn unparse(
        &self,
        node: &dyn UserDefinedLogicalNode,
        query: &mut Option<QueryBuilder>,
        select: &mut Option<SelectBuilder>,
        relation: &mut Option<RelationBuilder>,
        table_with_joins: &mut Option<TableWithJoinsBuilder>,
    ) -> Result<()>;
}

And then the Unparser has a Vec<Arc<dyn UserDefinedLogicalNodeUnparser>> that it uses to try to unparse UserDefinedLogicalNode
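
A rough sketch of how such a registry could behave, under stated assumptions: the traits and builder are reduced stand-ins for the real DataFusion types, `UnparseOutcome`, `with_extension_unparser`, and `unparse_extension` are hypothetical names, and returning an outcome (instead of the `Result<()>` above) is one possible way for an unparser to signal "this is not my node".

```rust
use std::any::Any;
use std::sync::Arc;

/// Stand-in for the real trait; only what the sketch needs (assumption).
trait UserDefinedLogicalNode {
    fn as_any(&self) -> &dyn Any;
    fn name(&self) -> &str;
}

/// Stand-in for the unparser's builder (assumption).
#[derive(Default)]
struct SelectBuilder {
    from: Option<String>,
}

/// Hypothetical: lets an unparser decline a node without raising an error.
enum UnparseOutcome {
    Handled,
    NotHandled,
}

trait UserDefinedLogicalNodeUnparser {
    fn unparse(
        &self,
        node: &dyn UserDefinedLogicalNode,
        select: &mut SelectBuilder,
    ) -> Result<UnparseOutcome, String>;
}

#[derive(Default)]
struct Unparser {
    extension_unparsers: Vec<Arc<dyn UserDefinedLogicalNodeUnparser>>,
}

impl Unparser {
    fn with_extension_unparser(
        mut self,
        unparser: Arc<dyn UserDefinedLogicalNodeUnparser>,
    ) -> Self {
        self.extension_unparsers.push(unparser);
        self
    }

    /// Tries each registered unparser in order; the first one that handles
    /// the node wins.
    fn unparse_extension(
        &self,
        node: &dyn UserDefinedLogicalNode,
        select: &mut SelectBuilder,
    ) -> Result<(), String> {
        for unparser in &self.extension_unparsers {
            if let UnparseOutcome::Handled = unparser.unparse(node, select)? {
                return Ok(());
            }
        }
        Err(format!("no registered unparser handled node {}", node.name()))
    }
}

struct RemoteScan {
    table: String,
}

impl UserDefinedLogicalNode for RemoteScan {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn name(&self) -> &str {
        "RemoteScan"
    }
}

struct UnknownNode;

impl UserDefinedLogicalNode for UnknownNode {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn name(&self) -> &str {
        "UnknownNode"
    }
}

struct RemoteScanUnparser;

impl UserDefinedLogicalNodeUnparser for RemoteScanUnparser {
    fn unparse(
        &self,
        node: &dyn UserDefinedLogicalNode,
        select: &mut SelectBuilder,
    ) -> Result<UnparseOutcome, String> {
        // Downcast to the concrete node type this unparser knows about.
        let Some(scan) = node.as_any().downcast_ref::<RemoteScan>() else {
            return Ok(UnparseOutcome::NotHandled);
        };
        select.from = Some(scan.table.clone());
        Ok(UnparseOutcome::Handled)
    }
}
```

The appeal of this shape is that the node trait itself knows nothing about SQL builders: all of the unparsing knowledge lives in the registered unparser, which downcasts to the concrete node type it was written for.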

alamb · 2024-12-13T18:00:24Z

> AFAIK Rust doesn't allow downcasting from a dyn object to a trait (i.e. Unparseable) - it needs to be the concrete type.

I think in my example unparseable would be &dyn Unparseable 👍

    let user_defined_local_node: &dyn UserDefinedLogicalNode = ...;
    // is the user defined node unparseable?
    let Some(unparseable) = user_defined_local_node // <---- this is `&dyn Unparseable` I think
      .as_any()
      .downcast_ref::<Unparseable>() else {
      return plan_err!("Node type {} does not implement Unparseable", user_defined_local_node.name())
    };

    let sql_nodes = unparseable.unparse(unparser, ....)

phillipleblanc · 2024-12-16T02:01:10Z

I tried playing around with this in the Rust playground and wasn't able to get it to downcast properly: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=96cc916e29db395f4ec61374d9a2ce33

    use std::any::Any;

    trait UserDefinedLogicalNode: Any {
        fn as_any(&self) -> &dyn Any;
    }

    trait Unparseable: Any {
        fn as_any(&self) -> &dyn Any;
    }

    struct MyCustomNode {}

    impl UserDefinedLogicalNode for MyCustomNode {
        fn as_any(&self) -> &dyn Any { self }
    }

    impl Unparseable for MyCustomNode {
        fn as_any(&self) -> &dyn Any { self }
    }

    fn main() {
        let s = MyCustomNode {};
        let user_defined_local_node: &dyn UserDefinedLogicalNode = &s;

        // This fails to compile - "error[E0782]: expected a type, found a trait"
        // replacing with `.downcast_ref::<dyn Unparseable>()` fails to compile due to it being unsized
        // and replacing with `.downcast_ref::<&dyn Unparseable>()` compiles but doesn't work
        if let Some(_) = user_defined_local_node.as_any().downcast_ref::<Unparseable>() {
            println!("Downcast worked");
        } else {
            println!("Downcast didn't work")
        }
    }
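
For contrast, a minimal sketch of the downcast that does work: `downcast_ref` compares the `TypeId` of the value stored behind the `dyn Any` with the requested type, so the target has to be the concrete type (`MyCustomNode`), not a trait. The `table` field, `OtherNode`, and `table_of` are hypothetical names added for illustration.

```rust
use std::any::Any;

trait UserDefinedLogicalNode {
    fn as_any(&self) -> &dyn Any;
}

struct MyCustomNode {
    table: String,
}

struct OtherNode;

impl UserDefinedLogicalNode for MyCustomNode {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl UserDefinedLogicalNode for OtherNode {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Returns the table name if `node` is a `MyCustomNode`, otherwise `None`.
fn table_of(node: &dyn UserDefinedLogicalNode) -> Option<&str> {
    node.as_any()
        // The target is a concrete, sized type, so this compiles and matches.
        .downcast_ref::<MyCustomNode>()
        .map(|custom| custom.table.as_str())
}
```

This is why a design that needs per-node behavior ends up asking some code that knows the concrete type to do the downcast, rather than asking the unparser to discover a trait implementation at runtime.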

goldmedal · 2024-12-16T07:12:46Z

I summarized the dependencies in the proposal; it will be like this

As @phillipleblanc said, Rust can't downcast to a dyn trait. I feel his proposal is good if we don't want to add the method to UserDefinedLogicalNode directly.

> And then the Unparser has a Vec<Arc<dyn UserDefinedLogicalNodeUnparser>> that it uses to try to unparse UserDefinedLogicalNode

I guess this idea is similar to ExprPlanner for the logical planner.

datafusion/datafusion/expr/src/planner.rs

Lines 97 to 98 in bd2c975

    
    /// This trait allows users to customize the behavior of the SQL planner
    pub trait ExprPlanner: Debug + Send + Sync {

So, datafusion-unparser will provide the builders and the trait, UserDefinedLogicalNodeUnparser.

alamb · 2024-12-16T15:18:52Z

Sounds like we have a plan!

goldmedal · 2024-12-22T10:14:58Z

> I guess this idea is similar to ExprPlanner for the logical planner.
>
> datafusion/datafusion/expr/src/planner.rs
>
> Lines 97 to 98 in bd2c975
>
>     /// This trait allows users to customize the behavior of the SQL planner
>     pub trait ExprPlanner: Debug + Send + Sync {
>
> So, datafusion-unparser will provide the builders and the trait, UserDefinedLogicalNodeUnparser.

I followed the design proposed by @phillipleblanc in #13880. I found that we don't need to expose the builders to datafusion-expr. We can just make them public in datafusion-sql so users can use them directly, so datafusion-unparser isn't required anymore.

goldmedal added the enhancement New feature or request label Dec 13, 2024

goldmedal self-assigned this Dec 22, 2024

goldmedal mentioned this issue Dec 22, 2024

Introduce UserDefinedLogicalNodeUnparser for User-defined Logical Plan unparsing #13880

Merged

goldmedal closed this as completed in #13880 Dec 25, 2024
