Perlito java regexps generated at runtime need to be double escaped #41

Closed
potyl opened this issue Jan 19, 2016 · 0 comments
potyl commented Jan 19, 2016

This is a case where the Perl code and the generated Java code behave differently. If we build a regexp at runtime in Perl (as a string) and then use it to match, the match fails in Java when the string contains escapes. In the example below, the replacement part of the substitution has to be written as \\s in Perl, but it needs to be \\\\s for the Java output to work.

Example:

#!/usr/bin/env perl
use strict;
use warnings;

my $input = 'a b c';
my $regexp_string = qq{a<SPACE>b\\sc};

my $regexp_clean = $regexp_string;
$regexp_clean =~ s/<SPACE>/\\s/;

print "Trying '$input' =~ /$regexp_clean/\n";
$input =~ /$regexp_clean/ or die "Failed regexp clean: '$input' =~ /$regexp_clean/";
print "Ok '$input' =~ /$regexp_clean/\n";

Perl output:

Trying 'a b c' =~ /a\sb\sc/
Ok 'a b c' =~ /a\sb\sc/

Java output:

Trying 'a b c' =~ /asb\sc/
Exception in thread "main" PlDieException: Failed regexp clean: 'a b c' =~ /asb\sc/
    at PlCORE.die(Main.java:114)
    at Main.main(Main.java:3514)

If we modify the original program to this:

#!/usr/bin/env perl
use strict;
use warnings;

my $input = 'a b c';
my $regexp_string = qq{a<SPACE>b\\sc};

my $regexp_clean = $regexp_string;
$regexp_clean =~ s/<SPACE>/\\\\s/;

print "Trying '$input' =~ /$regexp_clean/\n";
$input =~ /$regexp_clean/ or die "Failed regexp clean: '$input' =~ /$regexp_clean/";
print "Ok '$input' =~ /$regexp_clean/\n";

Then we get a failure in Perl:

Trying 'a b c' =~ /a\\sb\sc/
Failed regexp clean: 'a b c' =~ /a\\sb\sc/ at ./sample.pl line 12.

But a success in Java:

Trying 'a b c' =~ /a\sb\sc/
Ok 'a b c' =~ /a\sb\sc/
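
One possible explanation, sketched here as an assumption and not confirmed against the Perlito source: the outputs above are consistent with the replacement text being passed unquoted to Java's `Matcher.replaceFirst` / `replaceAll`, where a backslash in the replacement escapes the next character. The standalone class below (the name `ReplacementEscapeDemo` and its helper methods are hypothetical, not part of Perlito) reproduces the lost backslash using only `java.util.regex`, and shows that `Matcher.quoteReplacement` keeps it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; it only illustrates java.util.regex behaviour
// and makes no claim about how Perlito actually emits s///.
public class ReplacementEscapeDemo {

    // Passes the replacement text to replaceFirst unchanged.
    // Java treats '\' in a replacement as an escape, so "\s" becomes "s".
    static String naive(String input, String pattern, String replacement) {
        return Pattern.compile(pattern).matcher(input).replaceFirst(replacement);
    }

    // Quotes the replacement first, so '\' and '$' are inserted literally.
    static String quoted(String input, String pattern, String replacement) {
        return Pattern.compile(pattern)
                      .matcher(input)
                      .replaceFirst(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String regexpString = "a<SPACE>b\\sc"; // the characters: a<SPACE>b\sc
        String replacement = "\\s";            // the characters: \s

        String lost = naive(regexpString, "<SPACE>", replacement);
        String kept = quoted(regexpString, "<SPACE>", replacement);

        System.out.println(lost); // asb\sc
        System.out.println(kept); // a\sb\sc

        // Only the pattern that kept its backslash matches the input.
        System.out.println(Pattern.compile(lost).matcher("a b c").find()); // false
        System.out.println(Pattern.compile(kept).matcher("a b c").find()); // true
    }
}
```

If this is indeed the cause, quoting the already-interpolated replacement string on the Java side would let the original Perl program run unchanged, instead of requiring \\\\s in the Perl source.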