Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to parse SQL (postgres) with commented out semicolon #4762

Closed
dbittenbender opened this issue Feb 18, 2025 · 3 comments · Fixed by #4772
Closed

Failed to parse SQL (postgres) with commented out semicolon #4762

dbittenbender opened this issue Feb 18, 2025 · 3 comments · Fixed by #4772

Comments

@dbittenbender
Copy link

dbittenbender commented Feb 18, 2025

-- COMMENT
-------------------------------------------------------------------------------
TRUNCATE table1
;


-- COMMENT
-------------------------------------------------------------------------------
WITH
    table5 AS (
        SELECT DISTINCT
            col1,
            col2,
            col3
        FROM
            table6 cdep
    ),
    table4 AS (
        SELECT
            col1,
            col2,
            col3 --
        FROM
            table5 ltt
    ),
    table3 AS (
        SELECT
            col1,
            col2,
            col3
        FROM
            table4
    ),
    table2 AS (
        SELECT
            col1,
            col2,
            col3
        FROM
            table3
    )
INSERT INTO
    table1 (
        col1,
        col2,
        col3
    )
SELECT
    col1,
    col2,
    col3
FROM
    table2
;

-- vacuum after
-- VACUUM VERBOSE
-- ANALYZE table1
-- ;

The above SQL is a simplified example as I can't post client code. However, it fails the same way.

I am using read dialect of postgres.

I am first calling parse(sql_stmt, read=dialect) with the entire statement expecting to get all of the statements returned as a List. That returns 3 sub-statements: TruncateTable, Insert, and Semicolon. The first 2 are correct. However, the entire 3rd statement (Semicolon) is the commented out lines at the bottom of the full sql statement.

Then I try to parse each one of the sub-statements: ast = parse_one(sql_stmt, read=dialect). When I get to the 3rd statement (Semicolon), it fails with:

ParseError("No expression was parsed from '/* vacuum after */ /* VACUUM VERBOSE */ /* ANALYZE table1 */ /* ; */'")

It seems like when it parses the end of the file, it get's thrown off by the commented out semicolon.

@dbittenbender
Copy link
Author

Also seeing a ton of log messages like this:

INFO:sqlglot:Applying array index offset (1)
INFO:sqlglot:Applying array index offset (1)
INFO:sqlglot:Applying array index offset (1)
INFO:sqlglot:Applying array index offset (-1)
INFO:sqlglot:Applying array index offset (-1)
INFO:sqlglot:Applying array index offset (-1)

Not sure where it's coming from or if it's relevant

@VaggelisD
Copy link
Collaborator

Hey @dbittenbender,

Regarding the comments, this is expected and is happening for the following reasons:

  • For the parse example, the comments are appended to the previous semicolon (support introduced here)
  • For the parse_one example, the failure is correct in that there's indeed no valid expression to be parsed (comments are attached to their nearby expressions, so we need at least 1); What does look faulty is that even if we add a semicolon at the end the parse error is there, so I can add a solution to mitigate that.

As far as the array indexes are concerned, SQLGlot has logic to canonicalize array indexes between dialects (some are 0-based while others are 1-based etc), but I'd expect to see these during transpilation. Does this lead to incorrect roundtrips in Postgres? Would you be able to find a repro?

@dbittenbender
Copy link
Author

Sorry for the delay in response. I was away. Thanks for your help!

I will skip the Semicolon types.

For the array indexes, it's not causing any problems that I am aware of, I just never saw that many in the logs before so i wasn't sure if it was related.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants