The database schema used when generating SQL #34

Open
zhangnianlong opened this issue Jan 6, 2025 · 2 comments

Comments

@zhangnianlong

Hello, shouldn't the database schema used for SQL generation be the pruned schema produced by the earlier schema-selection step? I'm a bit confused, because the code still appears to use the complete database schema when generating SQL. Looking forward to your reply, thank you.

@jeffeben21

I'm seeing the same, unless I'm misunderstanding the code. The tentative_schema is only used during SS steps (select_tables and select_columns), with candidate generation and revision both configured to use the complete schema. Wouldn't this mean the SS agent has no effect, or am I misunderstanding?

From reading the code, it looks like the CG agent does use the LSH-retrieved value overrides (if retrieved) and only considers the VectorDB-retrieved context based on this (please correct me if wrong here), but the actual schema is complete. Is this correct, and if so is SS having any impact in the current implementation?

@zhangnianlong
Author

> I'm seeing the same, unless I'm misunderstanding the code. The tentative_schema is only used during SS steps (select_tables and select_columns), with candidate generation and revision both configured to use the complete schema. Wouldn't this mean the SS agent has no effect, or am I misunderstanding?
>
> From reading the code, it looks like the CG agent does use the LSH-retrieved value overrides (if retrieved) and only considers the VectorDB-retrieved context based on this (please correct me if wrong here), but the actual schema is complete. Is this correct, and if so is SS having any impact in the current implementation?

Hello, I have changed 'complete' to 'tentative' in the CG phase code, so that the line now reads:
`"DATABASE_SCHEMA": state.get_schema_string(schema_type="tentative")`
I think this is reasonable. Otherwise the schema pruning done in the earlier step would be pointless, since its result would never be used when generating SQL.
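
To make the effect of that one-word change concrete, here is a minimal, self-contained sketch. It is not the project's actual code: `PipelineState`, its fields, the schema string format, and `build_cg_request` are hypothetical stand-ins, and only the `get_schema_string(schema_type=...)` call and the `"DATABASE_SCHEMA"` key are taken from the line quoted above. The fallback to the complete schema when pruning produced nothing is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PipelineState:
    """Hypothetical stand-in for the pipeline state (not the project's class)."""

    complete_schema: Dict[str, List[str]]
    tentative_schema: Dict[str, List[str]] = field(default_factory=dict)

    def get_schema_string(self, schema_type: str = "complete") -> str:
        """Render the chosen schema as one `table(col, ...)` line per table."""
        if schema_type == "complete":
            schema = self.complete_schema
        elif schema_type == "tentative":
            # Assumption: if schema selection produced nothing, fall back
            # to the complete schema instead of sending an empty prompt.
            schema = self.tentative_schema or self.complete_schema
        else:
            raise ValueError(f"unknown schema_type: {schema_type!r}")
        return "\n".join(
            f"{table}({', '.join(columns)})" for table, columns in schema.items()
        )


def build_cg_request(
    state: PipelineState, question: str, schema_type: str = "tentative"
) -> Dict[str, str]:
    """Build the (hypothetical) prompt variables for candidate generation."""
    return {
        "QUESTION": question,
        "DATABASE_SCHEMA": state.get_schema_string(schema_type=schema_type),
    }


state = PipelineState(
    complete_schema={
        "users": ["id", "name", "email"],
        "orders": ["id", "user_id", "total"],
    },
    # Pretend select_tables / select_columns kept only this subset.
    tentative_schema={"users": ["id", "name"]},
)

pruned = build_cg_request(state, "List all user names.")
full = build_cg_request(state, "List all user names.", schema_type="complete")
```

In this sketch `pruned["DATABASE_SCHEMA"]` contains only the `users` table, while `full["DATABASE_SCHEMA"]` also lists `orders`, which is the difference the schema-selection step is meant to make to the generation prompt.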
