Splitting a scalar

 



Troy
Novice

May 21, 2002, 8:37 AM

Post #1 of 7 (982 views)

I have a scalar which I want to split into an array containing portions of the original scalar, eg.

I wish to split the scalar:
$346473837209

into an array, so that element[0] contains the first three digits, element[1] contains the next three, etc., i.e.:
@array[0] = 346
@array[1] = 473
@array[2] = 837
@array[3] = 209
etc

TIA - Troy


fashimpaur
User / Moderator

May 22, 2002, 6:50 AM

Post #2 of 7 (975 views)
Re: [Troy] Splitting a scalar

Troy,

I am not sure what you are asking. Are you saying that you want to split the variable name for the scalar? Or are you saying you have a scalar variable containing data that you want split?

i.e.

Code
$x = '$346473837209';            # a scalar holding that data, or ...

$346473837209 = "some data";     # a scalar with that name




Please clarify so we can assist.
Dennis

$a="c323745335d3221214b364d545".
"a362532582521254c3640504c3729".
"2f493759214b3635554c3040606a0",
print unpack"u*",pack "h*",$a,"\n\n";


rGeoffrey
User / Moderator

May 22, 2002, 8:28 AM

Post #3 of 7 (971 views)
Re: [Troy] Splitting a scalar

If you have a string which represents dollars and want to display them in groups of three digits separated with commas you could try this...


Code
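foreach ('$346473837209', '$123456', '$12345', '$1234', '$12') {
    my $source = $_;    # work on a copy (see the PS below)
    my (@output);

    print "original string is '$source'\n";
    $source =~ s/^\$//;                  # drop the leading dollar sign
    while (length $source) {             # length test, so a lone "0" is not dropped
        if (length ($source) < 3) {
            unshift (@output, $source);  # leftover one or two leading digits
            last;
        }
        $source =~ s/(...)$//;           # cut the last three characters off the end
        unshift (@output, $1);
    }
    print 'final string could be $', join (',', @output), "\n";
}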


Which would print these lines...


Code
original string is '$346473837209'
final string could be $346,473,837,209
original string is '$123456'
final string could be $123,456
original string is '$12345'
final string could be $12,345
original string is '$1234'
final string could be $1,234
original string is '$12'
final string could be $12


I guessed that you were dealing with dollars so I made the choice to grab the digits from the right end rather than the left. It would not make a difference in your example with 12 digits, but if you look at the other examples you will see where the answer would be different. This might not be exactly what you need, but it should be enough for you to get there from here.
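If the groups should instead be counted from the left end, as in the original question, something like the following might do. This is only a minimal sketch: split_left is a made-up helper name, not something from this thread, and it assumes the value is a plain string of digits with an optional leading dollar sign.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split a digit string into
# groups of up to three digits, counting from the LEFT end.
sub split_left {
    my ($value) = @_;                 # a private copy, safe to modify
    $value =~ s/^\$//;                # drop a leading dollar sign, if any
    return $value =~ /(\d{1,3})/g;    # in list context: every group captured
}

my @array = split_left('$346473837209');
print join(' ', @array), "\n";        # prints: 346 473 837 209
```

With a length that is not a multiple of three, the short group ends up last rather than first, e.g. '12345' gives 123 and 45.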

PS. The strange-looking $source = $_ is there because Perl was complaining about the substitutions trying to change a read-only variable, and this fixed it. I am sure there is a more elegant solution, but that is left as an exercise for the reader.
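For what it's worth, one common way to write that copy is the copy-then-substitute idiom, which declares the copy and runs the substitution on it in a single statement. A small sketch, where @cleaned is just an illustrative variable name:

```perl
use strict;
use warnings;

my @cleaned;
for ('$346473837209', '$12') {
    # $_ aliases a literal here, so copy it first and substitute on the copy
    (my $source = $_) =~ s/^\$//;
    push @cleaned, $source;
}
print "@cleaned\n";    # prints: 346473837209 12
```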


mhx
Enthusiast / Moderator

May 22, 2002, 11:26 AM

Post #4 of 7 (967 views)
Re: [rGeoffrey] Splitting a scalar

... or with some regex magic: ;-)


Code
@values = qw( $346473837209 $123456 $12345 $1234 $12 );

for( @values ) {
    print;
    s/\d+?(?=(?:\d{3})+$)/$&,/g;
    print " => $_\n";
}


-- mhx

At last with an effort he spoke, and wondered to hear his own words, as if some other will was using his small voice. "I will take the Ring," he said, "though I do not know the way."

-- Frodo



fashimpaur
User / Moderator

May 22, 2002, 11:44 AM

Post #5 of 7 (964 views)
Re: [mhx] Splitting a scalar

mhx,

In an effort to further my understanding of regexes and to aid the
people for whom this section of the forum was designed (i.e. beginners),
could you please explain your 'regex magic'?

Thanks,
Dennis



mhx
Enthusiast / Moderator

May 22, 2002, 12:42 PM

Post #6 of 7 (956 views)
Re: [fashimpaur] Splitting a scalar

Sure, Dennis, I'll explain it! :-)

I think the loop isn't that interesting, so I'll limit my explanations to the regex I'm using.

The regex looks a lot more complicated than it actually is. The only "advanced" feature that I've used is called a "lookahead assertion". But I'll start at the beginning!


Code
s/\d+?(?=(?:\d{3})+$)/$&,/g;


The first part, \d+?, will match at least one digit, but as few digits as possible. Since there's no anchor (like ^) at the beginning of the regex, any leading non-digit characters (notably the dollar sign) are skipped.


Code
s/\d+?(?=(?:\d{3})+$)/$&,/g;


Next comes the (?=...) construct, the so-called zero-width positive lookahead assertion. This means:

a) it doesn't contribute any characters to the match (zero-width)
b) the assertion must be successful (positive)
c) the expression is looking "ahead" (to the right) from exactly this position

Here's a quick example:


Code
$a = 'bar'; 
$a =~ s/b(?=ar)/c/;
print "$a\n"; # prints 'car'
$a =~ s/c(?=r)/b/;
print "$a\n"; # prints 'car' again


In the first regex, a 'b' is replaced by a 'c' if it's immediately followed by 'ar'. Note that the 'ar', although it is required for the expression to match, is not part of the matched string. Only 'b' is matched and replaced by 'c'.
In the second regex (note that $a is now 'car'), the 'c' is not matched because it's not followed immediately by an 'r', which causes the lookahead assertion to fail.

But back to the original regex, let's have a closer look at the pattern inside the lookahead:


Code
s/\d+?(?=(?:\d{3})+$)/$&,/g;


Inside the lookahead, \d{3} will match exactly three digits. This subpattern is grouped using non-capturing parentheses, (?:...), so that the group as a whole can be quantified. I used the non-capturing parens because they're faster than the capturing ones and because I don't need the captured content. But we could also have written


Code
s/\d+?(?=(\d{3})+$)/$&,/g;


This would have worked in exactly the same way. The next step is the key to this solution:


Code
s/\d+?(?=(?:\d{3})+$)/$&,/g;


With the + quantifier on the group and the $ anchor after it, the lookahead asserts one or more sequences of three digits immediately followed by the end of the string.


Code
s/\d+?(?=(?:\d{3})+$)/$&,/g;


So, since this is a global search-and-replace operation (the /g modifier), the regex engine will match every run of digits after which a multiple of three digits is left until the end of the string, and replace the matched characters ($&) with themselves plus a comma.
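To watch this happen, the same pattern can be run as a plain global match that records what each pass grabs. This is only a sketch: trace_matches is a made-up helper name, and '$1234567' is just a sample value.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the text matched on each pass of the
# global match, using the same pattern as the substitution above.
sub trace_matches {
    my ($text) = @_;
    my @steps;
    while ($text =~ /\d+?(?=(?:\d{3})+$)/g) {
        push @steps, $&;    # the digits consumed on this pass
    }
    return @steps;
}

print join(' | ', trace_matches('$1234567')), "\n";    # prints: 1 | 234
```

The first pass takes '1' because six digits (two groups of three) follow it; the second has to extend to '234' before exactly three digits are left. The final '567' is never matched, since the lookahead needs at least one more group of three after the match, which is why no trailing comma appears.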

And that's all the "magic" there is about this regex!

But wait, there's still room for improvement! We also could have solved the problem using an additional positive lookbehind assertion:


Code
s/(?<=\d)(?=(?:\d{3})+$)/,/g;


This makes it even nicer to read (at least for me) because you can more easily see that only a comma is being inserted. (And it's about 30% faster, too.) But I'll leave the explanation of exactly how this works as an exercise for the reader. ;-)
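As a hint for that exercise: (?<=\d) asserts that a digit sits immediately to the left of the current position, and the lookahead asserts that a multiple of three digits remains to the right. Neither assertion consumes any characters, so the match itself is empty and the substitution simply inserts a comma at that position. The sketch below wraps both versions in made-up helper subs (commify_lookahead and commify_lookbehind are illustrative names) to check that they give the same result on the sample values; it makes no attempt to measure speed.

```perl
use strict;
use warnings;

# Hypothetical wrappers around the two substitutions discussed above.
sub commify_lookahead {
    my ($text) = @_;
    $text =~ s/\d+?(?=(?:\d{3})+$)/$&,/g;     # re-insert matched digits plus a comma
    return $text;
}

sub commify_lookbehind {
    my ($text) = @_;
    $text =~ s/(?<=\d)(?=(?:\d{3})+$)/,/g;    # empty match: only a comma is inserted
    return $text;
}

for my $value (qw( $346473837209 $123456 $12345 $1234 $12 )) {
    my $first  = commify_lookahead($value);
    my $second = commify_lookbehind($value);
    print "$value => $first ", ($first eq $second ? '(same)' : '(differs)'), "\n";
}
```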

-- mhx




fashimpaur
User / Moderator

May 23, 2002, 9:14 AM

Post #7 of 7 (945 views)
Re: [mhx] Splitting a scalar

Mhx,

That was a most informative explanation of how a regex works. As I said,
I am constantly looking to improve this area of my coding skills. Regexes
are pretty mystical, and explanations like this really help clear up the confusion.

Thank you so much.
Dennis


 
 

