-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SELECT * FROM subquery ignores ordering #13904
Comments
I think SQL standard doesn't enforce such ordering, but it's possible most engines implemented it as ordered |
A subquery is a derived table/relation. A relation is unordered, and you can't rely on its tuples being ordered; different database implementations or queries have different behaviors. In #12003, DataFusion intentionally optimized away unnecessary ordering in subqueries. Some discussions of other databases: |
@jonahgao is correct -- this is not a bug. If the top-level query doesn't specify an ordering, the engine is free to optimize the subquery ordering away. |
@TheBuilderJR thanks for opening the issue and please let us know if anything still needed. as @ozankabak correctly mentioned the if there is no top level ordering the engine cannot guarantee the final order. The LIMIT clause gets evaluated after the ORDER clause, so if you rewrite your query
it should work |
Describe the bug
Generally for almost all other sql query engines I expect
to maintain the same order as the subquery, eg.
But in datafusion, it doesn't! It seemingly returns a randomly ordered list.
To Reproduce
Run a SELECT * FROM over an ordered subquery
Expected behavior
ordering is preserved
Additional context
No response
The text was updated successfully, but these errors were encountered: