After removing collation for [a-z] ranges in r302512, do it here too.
Instead of trying to expand the whole range at the regcomp() stage as we do,
GNU regex allocates a separate set of [start,end] ranges that each character
is checked against, so collation is possible and is turned on for ranges here.

When something like that is implemented, or our obsolete regex code
is replaced with something like TRE, and in case we decide to use
collation in [a-z] ranges, all changes related to r302512 can be backed out,
but for now we need consistency.
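
The comparison that this change switches can be sketched in isolation. Both files test a character against a range by placing three one-character wide strings in a single buffer (start at index 0, candidate at index 2, end at index 4) and comparing them pairwise. The helper name `in_range` and the comparator parameter below are illustrative assumptions, not part of libgnuregex; they only make the `wcscmp`/`wcscoll` choice explicit.

```c
#include <wchar.h>

/* Hypothetical comparator type: wcscmp and wcscoll both match it. */
typedef int (*wcs_cmp_fn)(const wchar_t *, const wchar_t *);

/*
 * Returns nonzero when start <= wc <= end under the given comparator.
 * The buffer holds three NUL-terminated one-character strings:
 *   cmp_buf + 0 -> start, cmp_buf + 2 -> wc, cmp_buf + 4 -> end
 */
static int
in_range(wcs_cmp_fn cmp, wchar_t start, wchar_t wc, wchar_t end)
{
    wchar_t cmp_buf[6] = { start, L'\0', wc, L'\0', end, L'\0' };

    return cmp(cmp_buf, cmp_buf + 2) <= 0
        && cmp(cmp_buf + 2, cmp_buf + 4) <= 0;
}
```

Calling `in_range(wcscmp, L'a', L'B', L'z')` compares wide character values, so `B` falls outside `[a-z]`. Passing `wcscoll` instead defers to the current locale's collation order, where the answer may differ; that difference is what the `#ifdef __FreeBSD__` branches remove.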
Andrey A. Chernov authored and Andrey A. Chernov committed Jul 13, 2016
1 parent f4dc9bf commit f04b8af
Showing 2 changed files with 14 additions and 0 deletions.
9 changes: 9 additions & 0 deletions contrib/libgnuregex/regcomp.c
@@ -2664,7 +2664,11 @@ build_range_exp (bitset_t sbcset, bracket_elem_t *start_elem,
return REG_ECOLLATE;
cmp_buf[0] = start_wc;
cmp_buf[4] = end_wc;
#ifdef __FreeBSD__
if (wcscmp (cmp_buf, cmp_buf + 4) > 0)
#else
if (wcscoll (cmp_buf, cmp_buf + 4) > 0)
#endif
return REG_ERANGE;

/* Got valid collation sequence values, add them as a new entry.
@@ -2706,8 +2710,13 @@ build_range_exp (bitset_t sbcset, bracket_elem_t *start_elem,
for (wc = 0; wc < SBC_MAX; ++wc)
{
cmp_buf[2] = wc;
#ifdef __FreeBSD__
if (wcscmp (cmp_buf, cmp_buf + 2) <= 0
&& wcscmp (cmp_buf + 2, cmp_buf + 4) <= 0)
#else
if (wcscoll (cmp_buf, cmp_buf + 2) <= 0
&& wcscoll (cmp_buf + 2, cmp_buf + 4) <= 0)
#endif
bitset_set (sbcset, wc);
}
}
5 changes: 5 additions & 0 deletions contrib/libgnuregex/regexec.c
@@ -3964,8 +3964,13 @@ check_node_accept_bytes (const re_dfa_t *dfa, int node_idx,
{
cmp_buf[0] = cset->range_starts[i];
cmp_buf[4] = cset->range_ends[i];
#ifdef __FreeBSD__
if (wcscmp (cmp_buf, cmp_buf + 2) <= 0
&& wcscmp (cmp_buf + 2, cmp_buf + 4) <= 0)
#else
if (wcscoll (cmp_buf, cmp_buf + 2) <= 0
&& wcscoll (cmp_buf + 2, cmp_buf + 4) <= 0)
#endif
{
match_len = char_len;
goto check_node_accept_bytes_match;
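
The match-time approach described in the commit message, where each character is checked against stored [start,end] pairs instead of a pre-expanded set, might look roughly like the sketch below. `struct range_set` and `range_set_contains` are hypothetical stand-ins for the bracket-set fields used in the regexec.c hunk above; only the loop shape and the `wcscmp` comparison follow the diff.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for the range arrays of a compiled bracket set. */
struct range_set {
    const wchar_t *range_starts;  /* first character of each range */
    const wchar_t *range_ends;    /* last character of each range */
    size_t nranges;               /* number of [start,end] pairs */
};

/* Returns 1 if wc lies inside any stored range, by wide character value. */
static int
range_set_contains(const struct range_set *set, wchar_t wc)
{
    /* Slots 0 and 4 are refilled per range; slot 2 holds the candidate. */
    wchar_t cmp_buf[6] = { L'\0', L'\0', wc, L'\0', L'\0', L'\0' };

    for (size_t i = 0; i < set->nranges; ++i) {
        cmp_buf[0] = set->range_starts[i];
        cmp_buf[4] = set->range_ends[i];
        if (wcscmp(cmp_buf, cmp_buf + 2) <= 0
            && wcscmp(cmp_buf + 2, cmp_buf + 4) <= 0)
            return 1;
    }
    return 0;
}
```

Because the ranges are kept as pairs and never expanded, swapping `wcscmp` for `wcscoll` in the loop would be enough to make them locale-sensitive again, which is why the commit message treats the change as easy to back out.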
