pg_query Parser Patches for Postgres 13.8 #4

Draft
wants to merge 9 commits into
base: REL_13_STABLE

Conversation

lfittl
Owner

@lfittl lfittl commented Nov 2, 2022

Do not merge. This PR only exists to track the patches that are applied for pg_query (13-latest branch) on top of Postgres 13.8.

lfittl and others added 9 commits November 2, 2022 14:58
Because pg_stat_statements uses $1, $2, etc. in place of constants, the
parser needs to accept these parameter references in additional
locations.

Examples:

CREATE USER test PASSWORD $1;
ALTER USER test ENCRYPTED PASSWORD $2;
SET SCHEMA $3;
SET ROLE $4;
SET SESSION AUTHORIZATION $5;
SET TIME ZONE $6;
SELECT EXTRACT($1 FROM TIMESTAMP $2);
SELECT DATE $1;
SELECT INTERVAL $1;
SELECT INTERVAL $1 YEAR;
SELECT INTERVAL (6) $1;

This is for compatibility with Postgres 9.6 and older, which used ?
as the replacement character in pg_stat_statements.

Note that this intentionally breaks use of ? as an operator in some
uncommon cases.

This patch will likely be removed with the next major parser release, and
should be considered deprecated.

This is helpful for tracking the extent of tokens in the scan output,
as this is made available by pg_query for uses such as syntax highlighting.

For syntax highlighting and extracting comments from a query, it's very
helpful to know the exact location of each comment in the query string.

Previously the lexer discarded all comments as whitespace, making it
impossible to determine where they are located in the query string. With
this change, the lexer returns them as SQL_COMMENT/C_COMMENT tokens.
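
The idea of returning comments as tokens with locations can be illustrated with a small standalone scanner. This is only a sketch, not pg_query's or Postgres's actual lexer: the `Tok` struct, the `TOK_*` names, and `next_token` are hypothetical, and the scanner ignores string literals and quoted identifiers, so a `--` inside quotes would be misread as a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds; the names echo the patch, not pg_query's enum. */
typedef enum { TOK_EOF, TOK_SQL_COMMENT, TOK_C_COMMENT, TOK_OTHER } TokKind;

typedef struct {
    TokKind kind;
    size_t  start; /* byte offset of the first character */
    size_t  end;   /* byte offset one past the last character */
} Tok;

static int is_space(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/* True if s begins a "--" line comment or a block comment. */
static int starts_comment(const char *s)
{
    return (s[0] == '-' && s[1] == '-') || (s[0] == '/' && s[1] == '*');
}

/* Scan one token starting at *pos and advance *pos past it. */
static Tok next_token(const char *sql, size_t *pos)
{
    assert(sql != NULL && pos != NULL);
    size_t i = *pos;

    while (is_space(sql[i]))
        i++;

    Tok t = { TOK_EOF, i, i };
    if (sql[i] == '\0') {
        *pos = i;
        return t;
    }

    if (sql[i] == '-' && sql[i + 1] == '-') {
        /* Line comment: runs up to, but not including, the newline. */
        t.kind = TOK_SQL_COMMENT;
        while (sql[i] != '\0' && sql[i] != '\n')
            i++;
    } else if (sql[i] == '/' && sql[i + 1] == '*') {
        /* Block comment: Postgres allows nesting, so track depth. */
        int depth = 0;
        t.kind = TOK_C_COMMENT;
        while (sql[i] != '\0') {
            if (sql[i] == '/' && sql[i + 1] == '*') {
                depth++;
                i += 2;
            } else if (sql[i] == '*' && sql[i + 1] == '/') {
                depth--;
                i += 2;
                if (depth == 0)
                    break;
            } else {
                i++;
            }
        }
    } else {
        /* Anything else: a run of characters up to whitespace or a comment. */
        t.kind = TOK_OTHER;
        do {
            i++;
        } while (sql[i] != '\0' && !is_space(sql[i]) && !starts_comment(sql + i));
    }

    t.end = i;
    *pos = i;
    return t;
}
```

A caller would loop on `next_token` until it sees `TOK_EOF`; the `start` and `end` offsets of the comment tokens are what a syntax highlighter or comment extractor would consume.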

This seems like an oversight in the commit that added support for
FETCH FIRST... WITH TIES, and causes the parsetree to always have
limitOption = LIMIT_OPTION_COUNT, even when no LIMIT/OFFSET is specified.

This frees up the memory allocated to memory contexts that are kept
for future allocations. This behaves similarly to changing aset.c's
MAX_FREE_CONTEXTS to 0, but only does the cleanup when called, and
allows the freelist approach to be used during Postgres operations.
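
The trade-off described here, keeping a freelist for reuse while offering an explicit cleanup call, can be sketched in plain C. This is a simplified illustration and not the code in aset.c: `Block`, `MAX_FREE_BLOCKS`, `block_acquire`, `block_release`, and `freelist_cleanup` are hypothetical names, and the overflow policy (free the released block when the list is full) is an assumption made for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cap, playing the role that MAX_FREE_CONTEXTS plays in aset.c. */
#define MAX_FREE_BLOCKS 4

typedef struct Block {
    struct Block *next;
    char          payload[256];
} Block;

static Block *freelist = NULL;
static int    num_free = 0;

/* Reuse a cached block if one is available, otherwise allocate a new one. */
static Block *block_acquire(void)
{
    if (freelist != NULL) {
        Block *b = freelist;
        freelist = b->next;
        num_free--;
        return b;
    }
    return malloc(sizeof(Block));
}

/* Keep the block for future reuse unless the freelist is already full. */
static void block_release(Block *b)
{
    assert(b != NULL);
    if (num_free < MAX_FREE_BLOCKS) {
        b->next = freelist;
        freelist = b;
        num_free++;
    } else {
        free(b);
    }
}

/* Explicit cleanup: return every cached block to the system allocator. */
static void freelist_cleanup(void)
{
    while (freelist != NULL) {
        Block *b = freelist;
        freelist = b->next;
        free(b);
    }
    num_free = 0;
}
```

Setting the cap to 0 would disable caching entirely; the explicit cleanup function instead keeps caching during normal operation and releases the memory only when the caller asks for it.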

This allows other source units to have the accompanying functions for
the already exported plpgsql_adddatum.

This is a pg_query-specific patch that ensures we can use the split
function on the regression test files. Zero-length delimiters fail
at the scanner level in Postgres, and thus need to be removed.

In the latest version of Apple's macOS SDK, <sys/socket.h>
fails to compile if "REF" is #define'd as something.
Apple may or may not agree that this is a bug, and even if
they do accept the bug report I filed, they probably won't
fix it very quickly.  In the meantime, our back branches will all
fail to compile gram.y.  v15 and HEAD currently escape the problem
thanks to the refactoring done in 98e93a1, but that's purely
accidental.  Moreover, since that patch removed a widely-visible
inclusion of <netdb.h>, back-patching it seems too likely to break
third-party code.

Instead, change the token's code name to REF_P, following our usual
convention for naming parser tokens that are likely to have symbol
conflicts.  The effects of that should be localized to the grammar
and immediately surrounding files, so it seems like a safer answer.
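
The kind of conflict being avoided can be reproduced in a few lines, assuming the token code is visible to other headers as a preprocessor macro. Everything below is a stand-in: `sys_record` is a hypothetical substitute for whatever the SDK header declares, and the value 258 is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in for a generated parser header that exposes token codes as macros.
 * Had this been "#define REF 258", the struct below would expand to
 * "int 258;" and fail to compile. With the _P suffix there is no clash.
 */
#define REF_P 258

/* Stand-in for a system header that uses REF as an ordinary identifier. */
struct sys_record {
    int REF;
};

/* Grammar-side code refers to the token by its renamed code. */
static int is_ref_token(int token)
{
    return token == REF_P;
}

/* Code using the system header keeps working with the plain name. */
static int ref_field(const struct sys_record *r)
{
    assert(r != NULL);
    return r->REF;
}
```

Because only the C-level name of the token changes, the SQL keyword that users type is unaffected; the rename is visible only to code that includes the grammar's headers.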

Per project policy that we want to keep recently-out-of-support
branches buildable on modern systems, back-patch all the way to 9.2.

Discussion: https://postgr.es/m/[email protected]

@lfittl force-pushed the lfittl/pg-query-pg13.8 branch from fd92673 to c3506d3 on November 2, 2022 22:11
lfittl pushed a commit that referenced this pull request Jan 17, 2023
In a similar effort to f736e18 and 110d817, fixup various usages of
string functions where a more appropriate function is available and more
fit for purpose.

These changes include:

1. Use cstring_to_text_with_len() instead of cstring_to_text() when
   working with a StringInfoData and the length can easily be obtained.
2. Use appendStringInfoString() instead of appendStringInfo() when no
   formatting is required.
3. Use pstrdup(...) instead of psprintf("%s", ...)
4. Use pstrdup(...) instead of psprintf(...) (with no formatting)
5. Use appendPQExpBufferChar() instead of appendPQExpBufferStr() when the
   length of the string being appended is 1.
6. Use appendStringInfoChar() instead of appendStringInfo() when no
   formatting is required and the string is 1 char long.
7. Use appendPQExpBufferStr(b, .) instead of appendPQExpBuffer(b, "%s", .)
8. Don't use pstrdup when it's fine to just point to the string constant.

I (David) did find other cases of #8 but opted to use #4 instead as I
wasn't certain enough that applying #8 was ok (e.g. in hba.c).

Author: Ranier Vilela, David Rowley
Discussion: https://postgr.es/m/CAApHDvo2j2+RJBGhNtUz6BxabWWh2Jx16wMUMWKUjv70Ver1vg@mail.gmail.com
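
The reasoning behind items 2, 6, and 7 can be shown with a toy buffer in standard C. This is not the Postgres StringInfo or PQExpBuffer API: `Buf`, `buf_appendf`, `buf_append_str`, and `buf_append_char` are hypothetical stand-ins with a fixed capacity. The formatting variant has to parse a format string on every call, while the plain variants copy bytes directly and produce the same result.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size stand-in for a growable string buffer. */
typedef struct {
    char   data[128];
    size_t len;
} Buf;

static void buf_init(Buf *b)
{
    b->len = 0;
    b->data[0] = '\0';
}

/* Formatting append: the format string is interpreted on every call. */
static void buf_appendf(Buf *b, const char *fmt, ...)
{
    va_list ap;
    size_t  room = sizeof b->data - b->len;
    int     n;

    va_start(ap, fmt);
    n = vsnprintf(b->data + b->len, room, fmt, ap);
    va_end(ap);
    assert(n >= 0 && (size_t) n < room);
    b->len += (size_t) n;
}

/* Plain string append: one strlen and one memcpy, no format parsing. */
static void buf_append_str(Buf *b, const char *s)
{
    size_t n = strlen(s);

    assert(n < sizeof b->data - b->len);
    memcpy(b->data + b->len, s, n + 1);
    b->len += n;
}

/* Single-character append: cheapest option for a 1-char string. */
static void buf_append_char(Buf *b, char c)
{
    assert(b->len + 1 < sizeof b->data);
    b->data[b->len++] = c;
    b->data[b->len] = '\0';
}
```

For example, `buf_appendf(&b, "%s", "SELECT")` followed by `buf_appendf(&b, " ")` yields the same bytes as `buf_append_str(&b, "SELECT")` followed by `buf_append_char(&b, ' ')`. The plain calls also avoid surprises if the appended text ever contains a `%` character.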