re::engine::Plugin - API to write custom regex engines
+=head1 VERSION
+
+Version 0.12
+
=head1 DESCRIPTION
As of perl 5.9.5 it's possible to lexically replace perl's built-in
depending on variable interpolation etc.
When this module is loaded into a scope it inserts a hook into
-C<$^H{regcomp}> (as described in L<perlreapi>) to have each regexp
-constructed in its lexical scope handled by this engine, but it
-differs from other engines in that it also inserts other hooks into
-C<%^H> in the same scope that point to user-defined subroutines to use
-during compilation, execution etc, these are described in
-L</CALLBACKS> below.
+C<$^H{regcomp}> (as described in L<perlreapi> and L<perlpragma>) to
+have each regexp constructed in its lexical scope handled by this
+engine. It differs from other engines in that it also inserts
+other hooks into C<%^H> in the same scope; these point to user-defined
+subroutines to use during compilation, execution, etc., and are
+described in L</CALLBACKS> below.
The callbacks (e.g. L</comp>) then get called with a
L<re::engine::Plugin> object as their first argument. This object
use re::engine::Plugin (
comp => sub {},
exec => sub {},
+ free => sub {},
);
To write a custom engine which imports your functions into the
sub import
{
- # Populates the caller's %^H with our callbacks
+ # Sets the caller's $^H{regcomp} and populates their %^H with our callbacks
re::engine::Plugin->import(
comp => \&comp,
exec => \&exec,
+ free => \&free,
);
}
# Implementation of the engine
sub comp { ... }
sub exec { ... }
+ sub free { ... }
1;
comp => sub {
my $rx = shift;
croak "Your pattern is invalid"
- unless $rx->pattern ~~ /pony/;
+ unless $rx->pattern =~ /pony/;
}
);
=head2 exec
- exec => sub {
- my ($rx, $str) = @_;
+ my $ponies;
+ use re::engine::Plugin(
+ exec => sub {
+ my ($rx, $str) = @_;
- # We always like ponies!
- return 1 if $str ~~ /pony/;
+ # We always like ponies!
+ if ($str =~ /pony/) {
+ $ponies++;
+ return 1;
+ }
- # Failed to match
- return;
- }
+ # Failed to match
+ return;
+ }
+ );
Called when a regex is being executed, i.e. when it's being matched
against something. The scalar being matched against the pattern is
method. The routine should return a true value if the match was
successful, and a false one if it wasn't.
+This callback can also be specified on an individual basis with the
+L</callbacks> method.
+
+=head2 free
+
+ use re::engine::Plugin(
+ free => sub {
+ my ($rx) = @_;
+
+ say 'matched ' . ($ponies // 'no')
+ . ' pon' . (($ponies // 0) > 1 ? 'ies' : 'y');
+
+ return;
+ }
+ );
+
+Called when the regexp structure is freed by the perl interpreter.
+Note that this happens pretty late in the destruction process, but
+still before global destruction kicks in. The only argument this
+callback receives is the C<re::engine::Plugin> object associated
+with the regexp, and its return value is ignored.
+
+This callback can also be specified on an individual basis with the
+L</callbacks> method.
+
=head1 METHODS
=head2 str
- "str" ~~ /pattern/;
+ "str" =~ /pattern/;
# in comp/exec/methods:
my $str = $rx->str;
=head2 mod
my %mod = $rx->mod;
- say "has /ix" if %mod ~~ 'i' and %mod ~~ 'x';
+ say "has /ix" if $mod{i} and $mod{x};
A key-value pair list of the modifiers the pattern was compiled with.
The keys will be zero or more of C<imsxp> and the values will be true
The length specified will be used as a byte length (using
L<SvPV|perlapi/SvPV>), not a character length.
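As a rough illustration of the byte-versus-character distinction noted above, the following sketch compares the two lengths for one string. It is plain Perl and does not use this module's API; the variable names are made up for the example. The byte count shown is that of the UTF-8 encoding, and the count an engine actually sees may differ depending on how perl stores the string internally.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# "pony" followed by U+263A (WHITE SMILING FACE), written as an escape so
# the example does not depend on the source file's encoding.
my $str = "pony\x{263A}";

# Character length: what length() reports for a decoded string.
my $chars = length $str;

# Byte length of the UTF-8 encoding of the same string.
my $bytes = length encode_utf8($str);

printf "%d characters, %d bytes\n", $chars, $bytes;
```

Any length handed to the engine should therefore be computed in bytes whenever the subject string may contain multi-byte characters.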
+=head2 nparens
+
+=head2 gofs
+
+=head2 callbacks
+
+ # A dumb regexp engine that just tests string equality
+ use re::engine::Plugin comp => sub {
+ my ($re) = @_;
+
+ my $pat = $re->pattern;
+
+ $re->callbacks(
+ exec => sub {
+ my ($re, $str) = @_;
+ return $pat eq $str;
+ },
+ );
+ };
+
+Takes a list of key-value pairs of names and subroutines, and replaces the
+callback currently attached to the regular expression for the type given as
+the key with the code reference passed as the corresponding value.
+
+The only valid keys are currently C<exec> and C<free>. See L</exec> and
+L</free> for more details about these callbacks.
+
=head2 num_captures
$re->num_captures(
L<Tie::Hash> methods FETCH, STORE, DELETE, CLEAR, EXISTS, FIRSTKEY,
NEXTKEY and SCALAR.
-=head1 Tainting
+=head1 CONSTANTS
+
+=head2 C<REP_THREADSAFE>
+
+True iff the module could have been built with thread-safety features
+enabled.
+
+=head2 C<REP_FORKSAFE>
+
+True iff this module could have been built with fork-safety features
+enabled. This will always be true except on Windows where it's false
+for perl 5.10.0 and below.
+
+=head1 TAINTING
The only way to untaint an existing variable in Perl is to use it as a
hash key or referencing subpatterns from a regular expression match
my ($re, $paren) = @_;
# This is perl's engine doing the match
- $str ~~ /(.*)/;
+ $str =~ /(.*)/;
# $1 has been untainted
return $1;
L<perlreapi>, L<Taint::Util>
-=head1 TODO / CAVEATS
+=head1 TODO & CAVEATS
I<here be dragons>
=back
+=head1 DEPENDENCIES
+
+L<perl> 5.10.
+
+A C compiler.
+This module may happen to build with a C++ compiler as well, but don't rely on it, as no guarantee is made in this regard.
+
+L<XSLoader> (standard since perl 5.6.0).
+
=head1 BUGS
Please report any bugs that aren't already listed at
L<http://rt.cpan.org/Dist/Display.html?Queue=re-engine-Plugin> to
L<http://rt.cpan.org/Public/Bug/Report.html?Queue=re-engine-Plugin>
-=head1 AUTHOR
+=head1 AUTHORS
-E<AElig>var ArnfjE<ouml>rE<eth> Bjarmason <avar@cpan.org>
+E<AElig>var ArnfjE<ouml>rE<eth> Bjarmason C<< <avar at cpan.org> >>
+
+Vincent Pit C<< <perl at profvince.com> >>
=head1 LICENSE
-Copyright 2007-2008 E<AElig>var ArnfjE<ouml>rE<eth> Bjarmason.
+Copyright 2007,2008 E<AElig>var ArnfjE<ouml>rE<eth> Bjarmason.
+
+Copyright 2009,2010,2011,2013,2014,2015 Vincent Pit.
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.