Capture the separator after a split



Jul 4, 2013, 9:08 AM

Post #1 of 2 (3049 views)


For once I have a contribution to make, albeit a minor tidbit, instead of asking a question. Note that my serendipitous discovery is actually documented at the very end of the perldoc page for the split function. (The moral equivalent of the fine print; almost nobody reads that far. ;-) )

I made an interesting mistake:

I have two numbers separated by either a + or - sign, for example "2-3". I needed to split the string to capture the two numbers, but I also needed to capture the intervening sign so that I could apply it later.

$nstring = "2-3"; 
@numbers = split(/[+-]/, $nstring);

This gives me @numbers = (2, 3) but does not tell me the sign. So how do I know whether the second number is to be added or subtracted? I tried surrounding the pattern with parentheses, as I would in a !~ or =~ operation:

@numbers = split(/([+-])/, $nstring);

I thought that would leave the sign in $1. I was wrong; $1 remains undefined. (Actually, in my case, it retained the value from a previous matching operation, which had me scratching my head for a while.) A bigger surprise: the array now contains 3 elements instead of 2. When I examined the @numbers array I found (2, -, 3); the text matched by the parenthesized part of the pattern was included in the array. Once I realized this, I was able to use the sign; I just had to know that the second number was in $numbers[2] rather than in $numbers[1].
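A minimal sketch of the behaviour described above, assuming the input always has the form number, sign, number. The names $left, $sign, and $right are mine, chosen only for illustration:

```perl
use strict;
use warnings;

my $nstring = "2-3";

# With a capture group in the pattern, split returns the matched
# separator as its own list element between the two fields.
my ($left, $sign, $right) = split /([+-])/, $nstring;

# Apply the captured sign to the second number.
my $result = $sign eq '-' ? $left - $right : $left + $right;

print "$left $sign $right = $result\n";
```

Unpacking the list straight into named scalars avoids having to remember which array slot holds the sign.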

In retrospect, my idea of trying to capture the separator in $1 was wrongheaded anyway. Consider a string like the one below, in which I have used 2 separators, both the colon and the semicolon:

my $pwline = "Rasputin:unused;1000:513;U-Maxwell"; 
print "$pwline\n";
my @pwparts1 = split(/[:;]/, $pwline); # May be split by either : or ;
my $partcount = @pwparts1;
print "Split $partcount components: ";
print "[", join("] [", @pwparts1), "]\n\n";

my @pwparts2 = split(/([:;])/, $pwline); # May be split by either : or ;
$partcount = @pwparts2;
my $sep = (defined($1)) ? $1 : "(undefined)";

print "Split $partcount components: ";
print "[", join("] [", @pwparts2), "]\n\n";

Here's the output:

Split 5 components: [Rasputin] [unused] [1000] [513] [U-Maxwell]

Split 9 components: [Rasputin] [:] [unused] [;] [1000] [:] [513] [;] [U-Maxwell]

If I could capture it in $1, which separator would go there? The : or ;?
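Since split hands back every separator in order, there is no need to pick one. A rough sketch of sorting the result into fields and separators, assuming the pattern contains exactly one capture group so that the two strictly alternate (the arrays @fields and @seps are my own names):

```perl
use strict;
use warnings;

my $pwline = "Rasputin:unused;1000:513;U-Maxwell";
my @parts  = split /([:;])/, $pwline;

# Even indexes hold fields; odd indexes hold the separator
# that followed the preceding field.
my (@fields, @seps);
for my $i (0 .. $#parts) {
    if ($i % 2 == 0) { push @fields, $parts[$i] }
    else             { push @seps,   $parts[$i] }
}

print "Fields:     @fields\n";
print "Separators: @seps\n";
```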

BTW, the parenthesized pattern is called a "capture group".
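Conversely, when parentheses are needed only for grouping and the separators should stay out of the result, a non-capturing group (?:...) can be used. A small sketch, reusing the sample string from above:

```perl
use strict;
use warnings;

my $pwline = "Rasputin:unused;1000:513;U-Maxwell";

# (?:...) groups the alternation without capturing, so split
# returns only the fields and drops the separators.
my @fields_only = split /(?::|;)/, $pwline;

my $count = @fields_only;
print "Split $count components: [", join("] [", @fields_only), "]\n";
```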
-- Rasputin Paskudniak (In perpetual pursuit of undomesticated, semi-aquatic avians)

Veteran / Moderator

Jul 4, 2013, 10:53 AM

Post #2 of 2 (3043 views)
Re: [rpaskudniak] Capture the separator after a split

In Reply To
If I could capture it in $1, which separator would go there? The : or ;?

The question is a bit rhetorical, since you can't capture it in $1. But the logic in this kind of situation (a list of matches collapsed into a scalar) suggests that you would probably get the last match.

A somewhat similar example under the Perl debugger:

DB<1> $_ = "foo bar baz too"

DB<2> print $1 if (@d = /(.oo)/g)
DB<3> x @d
0 'foo'
1 'too'
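The same experiment as a standalone script, for anyone who wants to try it outside the debugger. This is only a sketch; $text and $last are names I made up. As far as I can tell, $1 keeps the value from the last successful match, so it should end up holding 'too':

```perl
use strict;
use warnings;

my $text = "foo bar baz too";

# In list context, a /g match returns every captured substring.
my @d = $text =~ /(.oo)/g;

# $1 reflects only the most recent successful match.
my $last = $1;

print "All matches: @d\n";
print "\$1: $last\n";
```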

