2021 update: this is no longer maintained. Try https://github.com/MarcelBolten/phpeggy instead.
A PHP code generation plugin for PEG.js.
Fork of php-pegjs.
- PEG.js (known compatible with v0.10.0)
Install PEG.js with the phpegjs plugin:
$ npm install phpegjs
In Node.js, require both the PEG.js parser generator and the phpegjs plugin:
var pegjs = require("pegjs");
var phpegjs = require("phpegjs");
To generate a PHP parser, pass both the phpegjs plugin and your grammar to pegjs.generate:
var parser = pegjs.generate("start = ('a' / 'b')+", {
    plugins: [phpegjs]
});
The method returns the source code of the generated parser as a string. Unlike the original PEG.js, the generated PHP parser is a class, not a function.
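Because the result is a plain string, saving it is an ordinary file write. The sketch below shows one way to do that in Node.js. The helper name saveParser and any output path you pass to it are hypothetical, chosen for illustration; they are not part of phpegjs.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper (not part of phpegjs): writes the string returned by
// pegjs.generate to a file, creating the target directory if needed.
function saveParser(parserSource, outFile) {
  if (typeof parserSource !== "string") {
    throw new TypeError("expected the generated parser source as a string");
  }
  fs.mkdirSync(path.dirname(outFile), { recursive: true });
  fs.writeFileSync(outFile, parserSource, "utf8");
  return outFile;
}
```

For example, saveParser(parser, "build/Parser.php") would store the string produced by the pegjs.generate call above, assuming build/Parser.php is where your PHP code expects to find it.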
Supported options of pegjs.generate:
- cache - if true, makes the parser cache results, avoiding exponential parsing time in pathological cases but making the parser slower (default: false). For PHP, this is strongly recommended for big grammars (like javascript.pegjs or css.pegjs in the example folder).
- allowedStartRules - rules the parser will be allowed to start parsing from (default: the first rule in the grammar).
You can also pass options specific to the PHP PEG.js plugin as follows:
var parser = pegjs.generate("start = ('a' / 'b')+", {
    plugins: [phpegjs],
    phpegjs: { /* phpegjs-specific options */ }
});
Here are the options available to pass this way:
- parserNamespace - namespace of the generated parser (default: PhpPegJs). If the value is '' or null, no namespace will be used (and the generated parser will be compatible with PHP 5.2).
- parserGlobalNamePrefix - prefix to add to all globally defined names, including the parser, its helper functions, and the SyntaxError class. This should only be used if PHP 5.2 compatibility is needed; otherwise the parserNamespace option should be used instead.
- parserClassName - name of the generated parser class (default: Parser). Note that if a parserGlobalNamePrefix is specified, this prefix will be added to the name specified by parserClassName.
- mbstringAllowed - whether to allow usage of PHP's mb_* functions, which depend on the mbstring extension being installed (default: true). This can be disabled for compatibility with a wider range of PHP configurations, but doing so also disables several features of PEG.js (case-insensitive string matching, case-insensitive character classes, and empty character classes). Attempting to use these features with mbstringAllowed: false will cause generate to throw an error.
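As a rough illustration of how the options above fit together, the following sketch assembles a single options object for pegjs.generate. The helper name buildGenerateOptions and the example values MyParsers and MediaQueryParser are assumptions made up for this example; only the option names themselves come from the lists above.

```javascript
// Hypothetical helper (not part of phpegjs). `plugin` is whatever
// require("phpegjs") returns; it is passed in as an argument so the helper
// itself has no dependencies.
function buildGenerateOptions(plugin, phpOptions) {
  return {
    cache: true, // recommended above for big grammars
    plugins: [plugin],
    phpegjs: Object.assign(
      {
        parserNamespace: "MyParsers", // example value, not the default
        parserClassName: "MediaQueryParser", // example value, not the default
        mbstringAllowed: true // the documented default
      },
      phpOptions // caller-supplied overrides win over the values above
    )
  };
}
```

A call such as pegjs.generate(grammar, buildGenerateOptions(phpegjs, { mbstringAllowed: false })) would then request a parser that avoids the mb_* functions, subject to the feature restrictions described above.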
- Save the parser generated by pegjs.generate to a file.
- In PHP code:
include "your.parser.file.php";
try {
    $parser = new PhpPegJs\Parser;
    $result = $parser->parse($input);
} catch (PhpPegJs\SyntaxError $ex) {
    // Handle parsing error
    // [...]
}
You can use the following snippet to format parsing errors:
catch (PhpPegJs\SyntaxError $ex) {
    $message = "Syntax error: " . $ex->getMessage() . ' at line ' . $ex->grammarLine . ' column ' . $ex->grammarColumn . ' offset ' . $ex->grammarOffset;
}
Note that the generated PHP parser will call preg_match_all( '/./us', ... )
on the input string. This may be undesirable for projects that need to
maintain compatibility with PCRE versions that are missing Unicode support
(WordPress, for example). To avoid this call, split the input string into an
array (one array element per UTF-8 character) and pass this array into
$parser->parse() instead of the string input.
See the PEG.js documentation for the grammar syntax, with one difference: action blocks should be written in PHP.
Original PEG.js rule:
media_list = head:medium tail:("," S* medium)* {
    var result = [head];
    for (var i = 0; i < tail.length; i++) {
        result.push(tail[i][2]);
    }
    return result;
}
PHP PEG.js rule:
media_list = head:medium tail:("," S* medium)* {
    $result = array($head);
    for ($i = 0; $i < count($tail); $i++) {
        $result[] = $tail[$i][2];
    }
    return $result;
}
To target both JavaScript and PHP with a single grammar, you can mix the two languages using a special comment syntax:
media_list = head:medium tail:("," S* medium)* {
    /** <?php
    $result = array($head);
    for ($i = 0; $i < count($tail); $i++) {
        $result[] = $tail[$i][2];
    }
    return $result;
    ?> **/

    var result = [head];
    for (var i = 0; i < tail.length; i++) {
        result.push(tail[i][2]);
    }
    return result;
}
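With a mixed grammar like the one above, both targets can in principle be built from the same grammar string. The sketch below is one possible way to do this, not something prescribed by phpegjs: the helper name generateBothParsers is hypothetical, and it relies on the PEG.js output: "source" option to get JavaScript source code rather than a parser object.

```javascript
// Hypothetical helper (not part of phpegjs). `pegjs` and `phpegjs` are the
// modules returned by require("pegjs") and require("phpegjs"); they are
// passed in as arguments so this sketch has no dependencies of its own.
function generateBothParsers(pegjs, phpegjs, grammar) {
  return {
    // Without the plugin, the PHP section is just a JavaScript comment.
    // output: "source" asks PEG.js for source code instead of a parser object.
    js: pegjs.generate(grammar, { output: "source" }),
    // With the plugin, the same grammar yields PHP source code.
    php: pegjs.generate(grammar, { plugins: [phpegjs] })
  };
}
```

Each of the two returned strings can then be written to its own file.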
You can also use the following utility functions in PHP action blocks:
- chr_unicode($code) - returns the character for a Unicode code point (analogue of JavaScript's String.fromCharCode function).
- ord_unicode($code) - returns the Unicode code point of a character (analogue of JavaScript's String.prototype.charCodeAt(0) function).
JavaScript code | PHP analogue |
---|---|
some_var | $some_var |
{f1: "val1", f2: "val2"} | array("f1" => "val1", "f2" => "val2") |
["val1", "val2"] | array("val1", "val2") |
some_array.push("val") | $some_array[] = "val" |
some_array.length | count($some_array) |
some_array.join("") | join("", $some_array) |
some_array1.concat(some_array2) | array_merge($some_array1, $some_array2) |
parseInt("23") | intval("23") |
parseFloat("23.1") | floatval("23.1") |
some_str.length | mb_strlen($some_str, "UTF-8") |
some_str.replace("b", "\b") | str_replace("b", "\b", $some_str) |
String.fromCharCode(2323) | chr_unicode(2323) |