pcre-7.2
bk: 47ff816dvNxEeU5rfP8-N6haqEyBEg
PhilipHazel committed Apr 11, 2008
1 parent 172ede9 commit 1f28038
Showing 69 changed files with 7,469 additions and 3,566 deletions.
72 changes: 72 additions & 0 deletions ChangeLog
@@ -1,6 +1,78 @@
ChangeLog for PCRE
------------------

Version 7.2 19-June-07
---------------------

1. If the fr_FR locale cannot be found for test 3, try the "french" locale,
which is apparently normally available under Windows.

2. Re-jig the pcregrep tests with different newline settings in an attempt
to make them independent of the local environment's newline setting.

3. Add code to configure.ac to remove -g from the CFLAGS default settings.

4. Some of the "internals" tests were previously cut out when the link size
was not 2, because the output contained actual offsets. The recent new
"Z" feature of pcretest means that the offsets can be cut out, making the tests
usable with all link sizes.

5. Implemented Stan Switzer's goto replacement for longjmp() when not using
stack recursion. This gives a massive performance boost under BSD, but just
a small improvement under Linux. However, it saves one field in the frame
in all cases.

6. Added more features from the forthcoming Perl 5.10:

(a) (?-n) (where n is a string of digits) is a relative subroutine or
recursion call. It refers to the nth most recently opened parentheses.

(b) (?+n) is also a relative subroutine call; it refers to the nth next
to be opened parentheses.

(c) Conditions that refer to capturing parentheses can be specified
relatively, for example, (?(-2)... or (?(+3)...

(d) \K resets the start of the current match so that everything before
is not part of it.

(e) \k{name} is synonymous with \k<name> and \k'name' (.NET compatible).

(f) \g{name} is another synonym - part of Perl 5.10's unification of
reference syntax.

(g) (?| introduces a group in which the numbering of parentheses in each
alternative starts with the same number.

(h) \h, \H, \v, and \V match horizontal and vertical whitespace.

7. Added two new request options for pcre_fullinfo(): PCRE_INFO_OKPARTIAL and
PCRE_INFO_JCHANGED.

8. A pattern such as (.*(.)?)* caused pcre_exec() to fail by either not
terminating or by crashing. Diagnosed by Viktor Griph; it was in the code
for detecting groups that can match an empty string.

9. A pattern with a very large number of alternatives (more than several
hundred) was running out of internal workspace during the pre-compile
phase, where pcre_compile() figures out how much memory will be needed. A
bit of new cunning has reduced the workspace needed for groups with
alternatives. The 1000-alternative test pattern now uses 12 bytes of
workspace instead of running out of the 4096 that are available.

10. Inserted some missing (unsigned int) casts to get rid of compiler warnings.

11. Applied patch from Google to remove an optimization that didn't quite work.
The report of the bug said:

pcrecpp::RE("a*").FullMatch("aaa") matches, while
pcrecpp::RE("a*?").FullMatch("aaa") does not, and
pcrecpp::RE("a*?\\z").FullMatch("aaa") does again.

12. If \p or \P was used in non-UTF-8 mode on a character greater than 127
it matched the wrong number of bytes.
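
Items 6(a) and 6(b) can be tried out from a shell. The following is a minimal sketch, not part of the commit. It assumes a perl of version 5.10 or later is on the PATH and uses it as a stand-in, since the syntax comes from Perl 5.10; PCRE 7.2 is expected to accept the same patterns, but that is not exercised here. The function and variable names are made up for illustration.

```shell
#!/bin/sh
# Assumption: perl >= 5.10 is installed; it stands in for PCRE here.

# (?-1) calls the most recently opened capturing group, which at that point
# is group 1 itself, so the pattern recurses to match nested parentheses.
is_balanced() {
    perl -e 'print(($ARGV[0] =~ /^(\((?:[^()]++|(?-1))*+\))$/) ? "yes" : "no")' "$1"
}

balanced=$(is_balanced '(a(b)c)')
unbalanced=$(is_balanced '(a(b)c')

# (?+1) calls the next capturing group to be opened, here (\d\d), so the
# two digits before the hyphen are matched by a group defined after it.
forward=$(perl -e 'print $1 if "12-34" =~ /^(?+1)-(\d\d)$/')

printf '%s\n' "$balanced"     # yes
printf '%s\n' "$unbalanced"   # no
printf '%s\n' "$forward"      # 34
```

The relative form means the group number does not have to be edited when the pattern is embedded inside a larger one that has capturing groups of its own.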


Version 7.1 24-Apr-07
---------------------

7 changes: 6 additions & 1 deletion HACKING
@@ -129,13 +129,18 @@ These items are all just one byte long
OP_ANYBYTE match any single byte, even in UTF-8 mode
OP_SOD match start of data: \A
OP_SOM, start of match (subject + offset): \G
OP_SET_SOM, set start of match (\K)
OP_CIRC ^ (start of data, or after \n in multiline)
OP_NOT_WORD_BOUNDARY \B
OP_WORD_BOUNDARY \b
OP_NOT_DIGIT \D
OP_DIGIT \d
OP_NOT_HSPACE \H
OP_HSPACE \h
OP_NOT_WHITESPACE \S
OP_WHITESPACE \s
OP_NOT_VSPACE \V
OP_VSPACE \v
OP_NOT_WORDCHAR \W
OP_WORDCHAR \w
OP_EODN match end of data or \n at end: \Z
@@ -399,4 +404,4 @@ at compile time, and so does not cause anything to be put into the compiled
data.

Philip Hazel
November 2006
June 2007
5 changes: 4 additions & 1 deletion Makefile.am
@@ -268,6 +268,7 @@ EXTRA_DIST += \
testdata/testinput7 \
testdata/testinput8 \
testdata/testinput9 \
testdata/testinput10 \
testdata/testoutput1 \
testdata/testoutput2 \
testdata/testoutput3 \
@@ -277,12 +278,14 @@ EXTRA_DIST += \
testdata/testoutput7 \
testdata/testoutput8 \
testdata/testoutput9 \
testdata/testoutput10 \
perltest.pl

CLEANFILES += \
testsavedregex \
teststderr \
testtry
testtry \
testNinput

# PCRE demonstration program
noinst_PROGRAMS += pcredemo
14 changes: 8 additions & 6 deletions Makefile.in
@@ -407,7 +407,8 @@ check_SCRIPTS =
dist_noinst_SCRIPTS = RunTest RunGrepTest

# Additional files to delete on 'make clean' and 'make maintainer-clean'.
CLEANFILES = pcre_chartables.c testsavedregex teststderr testtry
CLEANFILES = pcre_chartables.c testsavedregex teststderr testtry \
testNinput
MAINTAINERCLEANFILES = pcre.h.generic

# Additional files to bundle with the distribution, over and above what
@@ -435,11 +436,12 @@ EXTRA_DIST = doc/perltest.txt NON-UNIX-USE HACKING PrepareRelease \
testdata/grepoutput8 testdata/grepoutputN testdata/testinput1 \
testdata/testinput2 testdata/testinput3 testdata/testinput4 \
testdata/testinput5 testdata/testinput6 testdata/testinput7 \
testdata/testinput8 testdata/testinput9 testdata/testoutput1 \
testdata/testoutput2 testdata/testoutput3 testdata/testoutput4 \
testdata/testoutput5 testdata/testoutput6 testdata/testoutput7 \
testdata/testoutput8 testdata/testoutput9 perltest.pl \
$(pcrecpp_man) CMakeLists.txt config-cmake.h.in
testdata/testinput8 testdata/testinput9 testdata/testinput10 \
testdata/testoutput1 testdata/testoutput2 testdata/testoutput3 \
testdata/testoutput4 testdata/testoutput5 testdata/testoutput6 \
testdata/testoutput7 testdata/testoutput8 testdata/testoutput9 \
testdata/testoutput10 perltest.pl $(pcrecpp_man) \
CMakeLists.txt config-cmake.h.in

# These are the header files we'll install. We do not distribute pcre.h because
# it is generated from pcre.h.in.
33 changes: 33 additions & 0 deletions NEWS
@@ -2,6 +2,39 @@ News about PCRE releases
------------------------


Release 7.2 19-Jun-07
---------------------

WARNING: saved patterns that were compiled by earlier versions of PCRE must be
recompiled for use with 7.2 (necessitated by the addition of \K, \h, \H, \v,
and \V).

Correction to the notes for 7.1: the note about shared libraries for Windows is
wrong. Previously, three libraries were built, but each could function
independently. For example, the pcreposix library also included all the
functions from the basic pcre library. The change is that the three libraries
are no longer independent. They are like the Unix libraries. To use the
pcreposix functions, for example, you need to link with both the pcreposix and
the basic pcre library.

Some more features from Perl 5.10 have been added:

(?-n) and (?+n) relative references for recursion and subroutines.

(?(-n) and (?(+n) relative references as conditions.

\k{name} and \g{name} are synonyms for \k<name>.

\K to reset the start of the matched string; for example, (foo)\Kbar
matches bar preceded by foo, but only sets bar as the matched string.

(?| introduces a group where the capturing parentheses in each alternative
start from the same number; for example, (?|(abc)|(xyz)) sets capturing
parentheses number 1 in both cases.

\h, \H, \v, \V match horizontal and vertical whitespace, respectively.
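
The \K, (?| and \h examples in this list can be checked from a shell. The following is a rough sketch, not part of the commit. It assumes a perl of version 5.10 or later is on the PATH and uses it in place of PCRE, because these features were taken from Perl 5.10. The variable names are illustrative only.

```shell
#!/bin/sh
# Assumption: perl >= 5.10 is installed; it stands in for PCRE here.

# \K: "foo" must be present, but only "bar" is reported as the match ($&).
k_match=$(perl -e 'print $& if "foobar" =~ /(foo)\Kbar/')

# (?| ... ): both alternatives capture into group 1.
reset_abc=$(perl -e 'print $1 if "abc" =~ /(?|(abc)|(xyz))/')
reset_xyz=$(perl -e 'print $1 if "xyz" =~ /(?|(abc)|(xyz))/')

# \h matches horizontal whitespace such as a tab.
h_match=$(perl -e 'print "yes" if "a\tb" =~ /^a\hb$/')

printf '%s\n' "$k_match"      # bar
printf '%s\n' "$reset_abc"    # abc
printf '%s\n' "$reset_xyz"    # xyz
printf '%s\n' "$h_match"      # yes
```

Without (?|, the second alternative would capture into group 2, so code reading the result would have to check both groups.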


Release 7.1 24-Apr-07
---------------------

16 changes: 13 additions & 3 deletions NON-UNIX-USE
@@ -46,6 +46,12 @@ The following are generic comments about building the PCRE C library "by hand".
An alternative approach is not to edit config.h, but to use -D on the
compiler command line to make any changes that you need.

NOTE: There have been occasions when the way in which certain parameters in
config.h are used has changed between releases. (In the configure/make
world, this is handled automatically.) When upgrading to a new release, you
are strongly advised to review config.h.generic before re-using what you
had previously.

(2) Copy or rename the file pcre.h.generic as pcre.h.

(3) EITHER:
@@ -127,7 +133,7 @@ for use with VP/Borland: makevp_c.txt, makevp_l.txt, makevp.bat, pcregexp.pas.

COMMENTS ABOUT WIN32 BUILDS

There are two ways of building PCRE using the "congifure, make, make install"
There are two ways of building PCRE using the "configure, make, make install"
paradigm on Windows systems: using MinGW or using Cygwin. These are not at all
the same thing; they are completely different from each other. There is also
some experimental, undocumented support for building using "cmake", which you
@@ -159,7 +165,11 @@ On both MinGW and Cygwin, PCRE should build correctly using:
./configure && make && make install

This should create two libraries called libpcre and libpcreposix, and, if you
have enabled building the C++ wrapper, a third one called libpcrecpp.
have enabled building the C++ wrapper, a third one called libpcrecpp. These are
independent libraries: when you link with libpcreposix or libpcrecpp you must
also link with libpcre, which contains the basic functions. (Some earlier
releases of PCRE included the basic libpcre functions in libpcreposix. This no
longer happens.)

If you want to statically link your program against a non-dll .a file, you must
define PCRE_STATIC before including pcre.h, otherwise the pcre_malloc() and
@@ -274,5 +284,5 @@ $! Locale could not be set to fr
$!
=========================

Last Updated: 24 April 2007
Last Updated: 13 June 2007
****
7 changes: 7 additions & 0 deletions PrepareRelease
@@ -183,6 +183,11 @@ files="\
echo Detrailing
./Detrail $files doc/p* doc/html/*

echo Doing basic configure to get default pcre.h and config.h
# This is in case the caller has set aliases (as I do - PH)
unset cp ls mv rm
./configure >/dev/null

echo Converting pcre.h and config.h to generic forms
cp -f pcre.h pcre.h.generic

@@ -206,4 +211,6 @@ perl <<'END'
close OUT;
END

echo Done

#End
39 changes: 22 additions & 17 deletions RunGrepTest
@@ -204,7 +204,7 @@ echo "---------------------------- Test 49 ------------------------------" >>tes

# Now compare the results.

$cf testtry $srcdir/testdata/grepoutput
$cf $srcdir/testdata/grepoutput testtry
if [ $? != 0 ] ; then exit 1; fi


@@ -219,40 +219,45 @@ if [ $utf8 -ne 0 ] ; then
echo "---------------------------- Test U2 ------------------------------" >>testtry
(cd $srcdir; $valgrind $pcregrep -n -u -C 3 --newline=any "Match" ./testdata/grepinput8) >>testtry

$cf testtry $srcdir/testdata/grepoutput8
$cf $srcdir/testdata/grepoutput8 testtry
if [ $? != 0 ] ; then exit 1; fi

else
echo "Skipping pcregrep UTF-8 tests: no UTF-8 support in PCRE library"
fi


# The tests for various newline values may not work in environments where
# the newlines in the files are not \n.
# We go to some contortions to try to ensure that the tests for the various
# newline settings will work in environments where the normal newline sequence
# is not \n. Do not use exported files, whose line endings might be changed.
# Instead, create an input file using printf so that its contents are exactly
# what we want. Note the messy fudge to get printf to write a string that
# starts with a hyphen.

echo "Testing pcregrep newline settings"
printf "abc\rdef\r\nghi\njkl" >testNinput

echo "---------------------------- Test N1 ------------------------------" >testtry
(cd $srcdir; $valgrind $pcregrep -N CR "^(abc|def|ghi|jkl)" ./testdata/grepinputx) >>testtry
printf "%c--------------------------- Test N1 ------------------------------\r\n" - >testtry
$valgrind $pcregrep -n -N CR "^(abc|def|ghi|jkl)" testNinput >>testtry

echo "---------------------------- Test N2 ------------------------------" >>testtry
(cd $srcdir; $valgrind $pcregrep --newline=crlf "^(abc|def|ghi|jkl)" ./testdata/grepinputx) >>testtry
printf "%c--------------------------- Test N2 ------------------------------\r\n" - >>testtry
$valgrind $pcregrep -n --newline=crlf "^(abc|def|ghi|jkl)" testNinput >>testtry

echo "---------------------------- Test N3 ------------------------------" >>testtry
printf "%c--------------------------- Test N3 ------------------------------\r\n" - >>testtry
pattern=`printf 'def\rjkl'`
(cd $srcdir; $valgrind $pcregrep --newline=cr -F "$pattern" ./testdata/grepinputx) >>testtry
$valgrind $pcregrep -n --newline=cr -F "$pattern" testNinput >>testtry

echo "---------------------------- Test N4 ------------------------------" >>testtry
printf "%c--------------------------- Test N4 ------------------------------\r\n" - >>testtry
pattern=`printf 'xxx\r\njkl'`
(cd $srcdir; $valgrind $pcregrep --newline=crlf -F "$pattern" ./testdata/grepinputx) >>testtry
$valgrind $pcregrep -n --newline=crlf -F "$pattern" testNinput >>testtry

echo "---------------------------- Test N5 ------------------------------" >>testtry
(cd $srcdir; $valgrind $pcregrep -n --newline=any "^(abc|def|ghi|jkl)" ./testdata/grepinputx) >>testtry
printf "%c--------------------------- Test N5 ------------------------------\r\n" - >>testtry
$valgrind $pcregrep -n --newline=any "^(abc|def|ghi|jkl)" testNinput >>testtry

echo "---------------------------- Test N6 ------------------------------" >>testtry
(cd $srcdir; $valgrind $pcregrep -n --newline=anycrlf "^(abc|def|ghi|jkl)" ./testdata/grepinputx) >>testtry
printf "%c--------------------------- Test N6 ------------------------------\r\n" - >>testtry
$valgrind $pcregrep -n --newline=anycrlf "^(abc|def|ghi|jkl)" testNinput >>testtry

$cf testtry $srcdir/testdata/grepoutputN
$cf $srcdir/testdata/grepoutputN testtry
if [ $? != 0 ] ; then exit 1; fi

exit 0
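
The two printf techniques used in the newline tests above can be seen in isolation in the sketch below, which is not part of the commit. The scratch file name and the variable names are hypothetical; RunGrepTest itself writes to testNinput and testtry.

```shell
#!/bin/sh
# Hypothetical scratch file; RunGrepTest uses testNinput instead.
tmp=${TMPDIR:-/tmp}/newline-demo.$$

# printf expands \r and \n itself, so the file holds exactly these bytes
# whatever the platform's native line ending is.
printf 'abc\rdef\r\nghi\njkl' >"$tmp"

# Count the raw bytes: 3 + 1 + 3 + 2 + 3 + 1 + 3 = 16.
nbytes=$(wc -c <"$tmp" | tr -d '[:space:]')

# A format string that begins with '-' may be taken for an option by some
# printf implementations. Supplying the first hyphen through %c avoids that.
header=$(printf '%c--- Test N1 ---' -)

printf '%s\n' "$nbytes"    # 16
printf '%s\n' "$header"    # ---- Test N1 ---
rm -f "$tmp"
```

Generating the input at run time means the test data cannot be altered by a checkout or export step that converts line endings, which is the concern the script's comment raises.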