    use Regexp::Common qw /list/;

    while (<>) {
        /$RE{list}{-pat => '\w+'}/          and print "List of words";
        /$RE{list}{-pat => $RE{num}{real}}/ and print "List of numbers";
    }
Do not use this module directly, but load it via Regexp::Common.
If "-pat=P" is specified, it defines the pattern for each substring in the list. By default, P is "qr/.*?\S/". In Regexp::Common 0.02 or earlier, the default pattern was "qr/.*?/". However, that pattern can match a single space, which caused "a, b, and c" to be parsed as a list of four elements instead of three (with "-word" being "(?:and)"). One consequence of the change is that a list of the form "a,,b" is no longer parsed. Use the pattern "qr/.*?/" to parse such lists, but note the caveat above.
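The difference between the two default element patterns can be checked in isolation. The following is a minimal sketch in core Perl that does not load Regexp::Common; the helper name "is_element" is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Regexp::Common): true if $str as a
# whole can be matched by the element pattern $re.
sub is_element {
    my ($re, $str) = @_;
    return $str =~ /\A(?:$re)\z/ ? 1 : 0;
}

my $old = qr/.*?/;      # default element pattern in 0.02 and earlier
my $new = qr/.*?\S/;    # current default element pattern

# Compare how each pattern treats a normal, a blank and an empty element.
for my $elem ('b', ' ', '') {
    printf "%-4s old=%d new=%d\n", "'$elem'",
        is_element($old, $elem), is_element($new, $elem);
}
```

The older pattern accepts a blank or empty element, while the current default requires at least one non-whitespace character.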
If "-sep=P" is specified, it defines the pattern P to be used as a separator between each pair of substrings in the list, except the final two. By default P is "qr/\s*,\s*/".
If "-lastsep=P" is specified, it defines the pattern P to be used as a separator between the final two substrings in the list. By default P is the same as the pattern specified by the "-sep" flag.
For example:
    $RE{list}{-pat=>'\w+'}                # match a list of word chars
    $RE{list}{-pat=>$RE{num}{real}}       # match a list of numbers
    $RE{list}{-sep=>"\t"}                 # match a tab-separated list
    $RE{list}{-lastsep=>',\s+and\s+'}     # match a proper English list
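How "-sep" and "-lastsep" interact can be sketched with a hand-built pattern. This is an approximation in core Perl, not the module's own code: the helper "list_re" is hypothetical, and it assumes a list has the shape "element, separator, ..., element, last separator, element", so at least two elements are required.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Regexp::Common): builds a list pattern
# from an element pattern, a separator and an optional last separator.
sub list_re {
    my ($pat, $sep, $lastsep) = @_;
    $lastsep = $sep unless defined $lastsep;    # mirrors "-lastsep" defaulting to "-sep"
    return qr/(?:(?:$pat)(?:$sep))*(?:$pat)(?:$lastsep)(?:$pat)/;
}

my $plain   = list_re('\w+', '\s*,\s*');                  # same separator throughout
my $english = list_re('\w+', '\s*,\s*', ',\s+and\s+');    # ", and" before the final element

# Report which sketch accepts each input when matched against the whole string.
for my $input ('a, b, c', 'a, b, and c') {
    printf "%-14s plain=%s english=%s\n", "'$input'",
        ($input =~ /\A$plain\z/   ? 'yes' : 'no'),
        ($input =~ /\A$english\z/ ? 'yes' : 'no');
}
```

Under these assumptions, "a, b, c" is accepted only by the plain pattern and "a, b, and c" only by the English one, because the last separator applies solely between the final two elements.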
Under "-keep":

    $1  captures the entire list
    $2  captures the last separator
For $RE{list}{conj}, if "-word" is not specified, the default pattern is "qr/and|or/".
For example:
    $RE{list}{conj}{-word=>'et'}          # match Jean, Paul, et Sartre
    $RE{list}{conj}{-word=>'oder'}        # match Bonn, Koln oder Hamburg
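The effect of "-word" can be approximated in core Perl as follows. This sketch assumes that the conjunction sits in the last separator, preceded by an optional comma and surrounded by optional whitespace; that shape, the "\w+" element pattern and the helper name "conj_re" are all assumptions made for illustration, not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Regexp::Common): builds a list pattern
# whose final two elements are joined by a conjunction word.
sub conj_re {
    my ($word) = @_;
    $word = '(?:and|or)' unless defined $word;    # default conjunction
    my $pat     = '\w+';                          # assumed element pattern
    my $sep     = '\s*,\s*';                      # separator between earlier elements
    my $lastsep = "\\s*,?\\s*(?:$word)\\s*";      # optional comma, then the word
    return qr/(?:(?:$pat)(?:$sep))*(?:$pat)(?:$lastsep)(?:$pat)/;
}

my %tests = (
    'Jean, Paul, et Sartre'   => conj_re('et'),
    'Bonn, Koln oder Hamburg' => conj_re('oder'),
    'a, b and c'              => conj_re(),
);

# Each input is matched as a whole against the pattern built for it.
for my $input (sort keys %tests) {
    printf "%-26s %s\n", "'$input'",
        ($input =~ /\A$tests{$input}\z/ ? 'matches' : 'no match');
}
```

Each of the three inputs matches its own pattern here, whereas the default conjunction would not accept the "oder" list.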
For a start, there are many common regexes missing. Send them in to regexp-common@abigail.be.
This module is free software, and may be used under any of the following licenses:
    1) The Perl Artistic License.     See the file COPYRIGHT.AL.
    2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2.
    3) The BSD License.               See the file COPYRIGHT.BSD.
    4) The MIT License.               See the file COPYRIGHT.MIT.