Regexp::Wildcards - Converts wildcard expressions to Perl regular
expressions.
    use Regexp::Wildcards qw/wc2re/;

    $re = wc2re 'a{b?,c}*' => 'unix';   # Do it Unix style.
    $re = wc2re 'a?,b*'    => 'win32';  # Do it Windows style.
    $re = wc2re '*{x,y}?'  => 'jokers'; # Process the jokers & escape the rest.
In many situations, users want to specify patterns to match but don't
need the full power of regexps. Wildcards are one such set of
simplified rules. This module converts wildcard expressions to Perl
regular expressions, so that you can use them for matching. It handles
the "*" and "?" jokers, as well as Unix bracketed alternatives "{,}",
and uses the backslash ("\") as an escape character. Wrappers are
provided to mimic the behaviour of Windows and Unix shells.
These variables control whether the wildcard jokers and brackets
capture their match. They can be set globally by writing in your program

    $Regexp::Wildcards::CaptureSingle = 1;
    # From then on, the '?' joker is capturing

or set locally via "local"

    {
     local $Regexp::Wildcards::CaptureAny = 1;
     # In this block, the '*' joker is capturing.
    }
    # Back to the situation from before the block
This section also describes how those elements are translated by the
When $CaptureSingle is true, each occurrence of the unescaped "?" joker
is made capturing in the resulting regexp (it is replaced by "(.)").
Otherwise, it is just replaced by ".". The default is the latter.

    'a???b\\??' is translated to 'a(.)(.)(.)b\\?(.)' if $CaptureSingle is true
                                 'a...b\\?.'         otherwise (default)
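The effect of the capturing translation can be checked with plain Perl.
The sketch below does not call the module: it assumes the regexp quoted
above for 'a???b\\??' with $CaptureSingle true, anchors it by hand, and
uses a made-up input string.

```perl
use strict;
use warnings;

# Regexp quoted above for the wildcard 'a???b\\??' when $CaptureSingle is true.
my $re = 'a(.)(.)(.)b\\?(.)';

# The sample input is an assumption; '^' and '$' are added by hand so that
# the whole string has to match.
my @singles = 'axyzb?!' =~ /^$re$/;

print join(',', @singles), "\n";   # prints "x,y,z,!"
```

Each unescaped "?" contributes one capture, while the escaped one only
matches a literal question mark.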
By default $CaptureAny is false, and runs of unescaped "*" jokers are
replaced by a single ".*". When it evaluates to true, each such run of
"*" is made into one capture, which is greedy ("(.*)") for
"$CaptureAny > 0" and non-greedy ("(.*?)") otherwise.

    'a***b\\**' is translated to 'a.*b\\*.*'       if $CaptureAny is false (default)
                                 'a(.*)b\\*(.*)'   if $CaptureAny > 0
                                 'a(.*?)b\\*(.*?)' otherwise
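The difference between the greedy and non-greedy captures shows up only
when the literal part can match at several positions. A minimal sketch,
assuming the two regexps quoted above and a made-up input, with anchors
added by hand:

```perl
use strict;
use warnings;

# Regexps quoted above for 'a***b\\**'.
my $greedy = 'a(.*)b\\*(.*)';     # $CaptureAny > 0
my $lazy   = 'a(.*?)b\\*(.*?)';   # $CaptureAny true but not > 0

# Hypothetical input containing the literal 'b*' twice.
my $input = 'a1b*2b*3';

my @g = $input =~ /^$greedy$/;
my @l = $input =~ /^$lazy$/;

print 'greedy: ', join(' | ', @g), "\n";   # greedy: 1b*2 | 3
print 'lazy:   ', join(' | ', @l), "\n";   # lazy:   1 | 2b*3
```

The greedy form stops at the last literal "b*", the non-greedy form at
the first one.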
If $CaptureBrackets is set to true, valid bracket constructs are made
into "( | )" captures; otherwise they are replaced by non-capturing
alternations ("(?: | )"), which is the default.

    'a{b\\},\\{c}' is translated to 'a(b\\}|\\{c)'   if $CaptureBrackets is true
                                    'a(?:b\\}|\\{c)' otherwise (default)
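Both translations accept the same strings; only the capturing one
reports which alternative was taken. The sketch below assumes the two
regexps quoted above, anchored by hand, and does not call the module.

```perl
use strict;
use warnings;

# Regexps quoted above for the wildcard 'a{b\\},\\{c}'.
my $capturing = 'a(b\\}|\\{c)';     # $CaptureBrackets true
my $grouping  = 'a(?:b\\}|\\{c)';   # default

my @report;
for my $input ('ab}', 'a{c') {
    my @with    = $input =~ /^$capturing$/;  # one capture: the alternative taken
    my @without = $input =~ /^$grouping$/;   # no capture group: success yields (1)
    push @report,
        $input . ': captured=' . join(',', @with) . ' plain=' . join(',', @without);
}
print "$_\n" for @report;
# ab}: captured=b} plain=1
# a{c: captured={c plain=1
```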
"wc2re_jokers" takes as its only argument the wildcard string to
process, and returns the corresponding regular expression where the
jokers "?" and "*" have been translated into their regexp equivalents
(see "VARIABLES" for more details). All other unprotected regexp
metacharacters are escaped.

    # Everything is escaped.
    print 'ok' if wc2re_jokers('{a{b,c}d,e}') eq '\\{a\\{b\\,c\\}d\\,e\\}';
"wc2re_unix" is similar to "wc2re_jokers", but conforms to standard
Unix shell wildcard rules. It successively escapes all unprotected
regexp special characters that don't hold any meaning for wildcards,
turns jokers into their regexp equivalents (see "wc2re_jokers"), and
changes bracketed blocks into (possibly capturing) alternations as
described in "VARIABLES". If brackets are unbalanced, it tries to
substitute as many of them as possible, and then escapes the remaining
"{" and "}". Commas outside of any bracket-delimited block are also
escaped.

    # This is a valid bracket expression, and is completely translated.
    print 'ok' if wc2re_unix('{a{b,c}d,e}') eq '(?:a(?:b|c)d|e)';
The function handles unbalanced bracket expressions by escaping
everything it can't recognize. For example:

    # The first comma is replaced, and the remaining brackets and comma are escaped.
    print 'ok' if wc2re_unix('{a\\{b,c}d,e}') eq '(?:a\\{b|c)d\\,e\\}';

    # All the brackets and commas are escaped.
    print 'ok' if wc2re_unix('{a{b,c\\}d,e}') eq '\\{a\\{b\\,c\\}d\\,e\\}';
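To see what a partially translated expression accepts, the first regexp
above can be tried against a few strings. This is a sketch in plain
Perl: the candidate strings are made up, and the regexp is the one the
example gives for '{a\\{b,c}d,e}', anchored by hand.

```perl
use strict;
use warnings;

# Regexp quoted above for the unbalanced wildcard '{a\\{b,c}d,e}'.
my $re = '(?:a\\{b|c)d\\,e\\}';

# Hypothetical candidates: only the first two spell out one alternative
# followed by the literal tail 'd,e}'.
my @candidates = ('a{bd,e}', 'cd,e}', 'ad,e}', 'a{b');

my @matched = grep { /^$re$/ } @candidates;
print join(' ', @matched), "\n";   # prints "a{bd,e} cd,e}"
```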
"wc2re_win32" works just like the two previous functions, but for
Windows wildcards. Bracketed blocks are no longer handled (which means
that brackets are escaped), but you can provide a comma-separated list
of items.

    # All the brackets are escaped, and commas are seen as list delimiters.
    print 'ok' if wc2re_win32('{a{b,c}d,e}') eq '(?:\\{a\\{b|c\\}d|e\\})';
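Under the Windows rules the same input therefore describes three
literal items rather than nested alternatives. A small sketch, assuming
the regexp quoted above, anchored by hand and tried on made-up strings:

```perl
use strict;
use warnings;

# Regexp quoted above for wc2re_win32('{a{b,c}d,e}').
my $re = '(?:\\{a\\{b|c\\}d|e\\})';

# The three comma-separated items, plus the whole wildcard text itself.
my @candidates = ('{a{b', 'c}d', 'e}', '{a{b,c}d,e}');

my @matched = grep { /^$re$/ } @candidates;
print join(' ', @matched), "\n";   # prints "{a{b c}d e}"
```

The full text '{a{b,c}d,e}' is not matched, because the commas act as
separators and are not part of any item.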
"wc2re" is a generic function that wraps around all the different
rules. The first argument is the wildcard expression, and the second
one is the type of rules to apply, which can be:

'unix', 'win32', 'jokers'
    For one of those raw rule names, "wc2re" simply maps to
    "wc2re_unix", "wc2re_win32" and "wc2re_jokers" respectively.

$^O
    If you supply the Perl operating system name, the call is deferred
    to "wc2re_win32" for $^O equal to 'dos', 'os2', 'MSWin32' or
    'cygwin', and to "wc2re_unix" in all the other cases.

If the type is undefined or not supported, it defaults to 'unix'.
    # Wraps to wc2re_jokers ($re eq 'a\\{b\\,c\\}.*').
    $re = wc2re 'a{b,c}*' => 'jokers';

    # Wraps to wc2re_win32 ($re eq '(?:a\\{b|c\\}.*)')
    #       or wc2re_unix  ($re eq 'a(?:b|c).*')       depending on $^O.
    $re = wc2re 'a{b,c}*' => $^O;
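The selection rules described for the type argument can be summarised
as a small lookup. The sketch below is not the module's code:
"rules_for" is a hypothetical helper written only to restate those
rules, and it returns the name of the rule set rather than a regexp.

```perl
use strict;
use warnings;

# Rule names accepted as-is, and the $^O values listed in the text as
# Windows-like. Everything else falls back to 'unix'.
my %raw        = map { $_ => 1 } qw/unix win32 jokers/;
my %win32_like = map { $_ => 1 } qw/dos os2 MSWin32 cygwin/;

# Hypothetical helper, not part of Regexp::Wildcards.
sub rules_for {
    my ($type) = @_;
    return 'unix'  unless defined $type;     # undefined type
    return $type   if $raw{$type};           # raw rule name
    return 'win32' if $win32_like{$type};    # Windows-like $^O
    return 'unix';                           # any other or unsupported value
}

for my $type ('jokers', 'MSWin32', 'linux', undef) {
    my $label = defined $type ? $type : '(undef)';
    print $label, ' => ', rules_for($type), "\n";
}
# jokers => jokers
# MSWin32 => win32
# linux => unix
# (undef) => unix
```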
These four functions are exported only on request: "wc2re",
"wc2re_unix", "wc2re_win32" and "wc2re_jokers". The variables are not
Text::Balanced, which has been bundled with perl since version 5.7.3
This module does not implement the strange behaviours of the Windows
shell that result from the special handling of the last three
characters (for the file extension). For example, the Windows XP shell
matches "*a" like ".*a", "*a?" like ".*a.?", "*a??" like ".*a.{0,2}"
and so on.
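The practical consequence is that a trailing "?" is mandatory in this
module's translation but optional in the shell behaviour described
above. A sketch in plain Perl, assuming '.*a.' as the joker translation
of '*a?' (from the "?" and "*" rules given earlier) and made-up file
names:

```perl
use strict;
use warnings;

my $module_style  = '.*a.';    # '*' => '.*', '?' => '.'
my $windows_style = '.*a.?';   # behaviour attributed to the Windows XP shell

my @report;
for my $name ('data', 'dataX') {
    my $m = $name =~ /^$module_style$/  ? 'yes' : 'no';
    my $w = $name =~ /^$windows_style$/ ? 'yes' : 'no';
    push @report, $name . ': module=' . $m . ' windows=' . $w;
}
print "$_\n" for @report;
# data: module=no windows=yes
# dataX: module=yes windows=yes
```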
Some modules provide incomplete alternatives as helper functions:

Net::FTPServer has a method for that. Only jokers are translated, and
escaping won't preserve them.

File::Find::Match::Util has a "wildcard" function that compiles a
matcher. It only handles "*".

Text::Buffer has the "convertWildcardToRegex" class method that handles
Vincent Pit, "<perl at profvince.com>"
Please report any bugs or feature requests to "bug-regexp-wildcards at
rt.cpan.org", or through the web interface at
<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Regexp-Wildcards>. I
will be notified, and then you'll automatically be notified of progress
on your bug as I make changes.
You can find documentation for this module with the perldoc command.

    perldoc Regexp::Wildcards
Copyright 2007 Vincent Pit, all rights reserved.

This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.