Copying block of lines satisfying criteria

 



Alkass
Novice

Jan 28, 2011, 8:38 AM

Post #1 of 13 (2930 views)
Copying block of lines satisfying criteria

Dear experts

I have this file, with blocks like these


Code
<event> 
8 1 0.1118800E-04 0.1709000E+03 0.7546772E-02 0.1182204E+00
2 -1 0 0 501 0 0.00000000000E+00 0.00000000000E+00 0.16805428396E+03 0.16805428396E+03 0.00000000000E+00 0. -1.
-1 -1 0 0 0 501 0.00000000000E+00 0.00000000000E+00 -0.86931545385E+02 0.86931545385E+02 0.00000000000E+00 0. 1.
6 2 1 2 502 0 -0.51483403662E+02 -0.21239600002E+02 0.49178667843E+02 0.19061056145E+03 0.17553399473E+03 0. 0.
24 2 3 3 0 0 0.99049100820E+01 -0.67551605018E+02 0.11851367664E+02 0.10500247649E+03 0.78890674788E+02 0. 0.
-5 1 1 2 0 502 0.51483403662E+02 0.21239600002E+02 0.31944070728E+02 0.64375267887E+02 0.47000000000E+01 0. 1.
-15 1 4 4 0 0 0.18959447355E+02 -0.37702518132E+02 0.43453286286E+02 0.60599392204E+02 0.17770000000E+01 0. 1.
16 1 4 4 0 0 -0.90545372735E+01 -0.29849086886E+02 -0.31601918621E+02 0.44403084281E+02 0.00000000000E+00 0. -1.
5 1 3 3 502 0 -0.61388313744E+02 0.46312005016E+02 0.37327300179E+02 0.85608084967E+02 0.47000000000E+01 0. -1.
</event>


What I need, is to look into the lines for the first column being "15" or "-15" and the last one being "1" or "-1" like in the line
-15 1 4 4 0 0 0.18959447355E+02 -0.37702518132E+02 0.43453286286E+02 0.60599392204E+02 0.17770000000E+01 0. 1.
and then copy the whole block between <event> and </event> into another file and then move to the next block...

Is there any example ?

Thanks in advance!


Karazam
User

Jan 28, 2011, 12:24 PM

Post #2 of 13 (2923 views)
Re: [Alkass] Copying block of lines satisfying criteria [In reply to]

Perl's input record separator is stored in the variable $/. By default it's set to newline, but you can change it to whatever suits the occasion.

The tricky part in your case could be the match. By default, the anchors ^ and $ match the start and end of the whole string, which here is a whole block delimited by the event tags, so they won't be useful as they stand.
The following works on the snippet you posted at least.


Code
$/ = '</event>'; 
while (<>) {
print if /\s{6,7}-?15.+\s{1,2}-?1\.\s*\n/;
}
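
In case it helps to see what changing $/ does on its own, here is a minimal sketch. It reads from an in-memory string rather than a real file, and the sample data, the shortened columns, and the helper name read_events are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data: two tiny <event> blocks with shortened columns.
my $data = <<'END';
<event>
 2 -1 0. -1.
 -15 1 0. 1.
</event>
<event>
 2 -1 0. -1.
 16 1 0. -1.
</event>
END

# read_events is an illustrative helper, not part of any module.
sub read_events {
    my ($text) = @_;
    open my $in, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = '</event>';    # each read now returns one whole block
    my @events;
    while (my $record = <$in>) {
        # The newline after the last closing tag comes back as a
        # record of its own, so skip anything without an opening tag.
        next unless $record =~ /<event>/;
        push @events, $record;
    }
    close $in;
    return @events;
}

my @events = read_events($data);
print scalar(@events), " blocks read\n";
```

Using local keeps the change to $/ confined to the helper, so line-by-line reads elsewhere in the script are not affected.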


Hope this helps.


Alkass
Novice

Jan 28, 2011, 12:45 PM

Post #3 of 13 (2918 views)
Re: [Karazam] Copying block of lines satisfying criteria [In reply to]





Thanks a lot! You are a lifesaver! This works as expected!

Thanks!


Alkass
Novice

Jan 28, 2011, 12:53 PM

Post #4 of 13 (2916 views)
Re: [Alkass] Copying block of lines satisfying criteria [In reply to]

Sorry for wasting your time again, but how could I put "15" || "-15" into the same search?

Thanks!


BillKSmith
Veteran

Jan 28, 2011, 3:05 PM

Post #5 of 13 (2910 views)
Re: [Karazam] Copying block of lines satisfying criteria [In reply to]

The /m option solves your problem.


Code
    print if 
/
^\s* # Optional white space at start of line
[-+]?15 # 15
[-\d\sE+.]+ # numeric data
[-+]?1\.? # final 1
\s*$ # Optional white space at end of line
/xms;
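
To see the effect of /m in isolation, here is a small sketch. The record is a made-up, shortened block; only the anchoring behaviour matters.

```perl
use strict;
use warnings;

# Hypothetical one-block record with shortened columns.
my $record = "<event>\n 2 -1 0. -1.\n -15 1 0. 1.\n</event>";

# Without /m, ^ matches only at the very start of the string,
# which here is the "<event>" tag, so the test fails.
my $without_m = ($record =~ /^\s*-15/)  ? 1 : 0;

# With /m, ^ also matches right after every newline,
# so the line that starts with -15 is found.
my $with_m    = ($record =~ /^\s*-15/m) ? 1 : 0;

print "without /m: $without_m, with /m: $with_m\n";
```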

Good Luck,
Bill


Alkass
Novice

Jan 28, 2011, 6:40 PM

Post #6 of 13 (2903 views)
Re: [BillKSmith] Copying block of lines satisfying criteria [In reply to]





Thanks for your effort. Nevertheless, if I change the [-+]?1\.? to [+]?1\.? , then this is not working properly: it also gives me the wrong combination "+-15" "-1". Also, what would be a more flexible syntax for, for example, "15" || "-15" || "13" || "-13" || "11" || "-11"?

Thanks for your time for my naive questions


BillKSmith
Veteran

Jan 28, 2011, 8:06 PM

Post #7 of 13 (2899 views)
Re: [Alkass] Copying block of lines satisfying criteria [In reply to]

Sorry about the confusion. I only meant to show that the /m modifier changes the meaning of the anchors "^" and "$". With it, the regular expression can refer to the start and end of a line.

The RE that I posted has the same logic as Karazam's with regard to the 15 and the 1, but it looks for them at the beginning and end of a line rather than depending on the pattern of whitespace. Both allow any combination of positive or negative 15 at the start and a positive or negative 1 at the end. Both have been tested successfully with the sample data. Failures with other data may be due to the other data in the line.

Your change

Quote
[-+]?1\.? to [+]?1\.?

is valid, but poor syntax. It would not allow -1 at all.

I do not understand your question about flexible syntax. If you want to change the RE to allow 11 or 13 as well as 15, replace the 5 of the 15 with the class [135].


Code
         [-+]?1[135] # optionally signed 11, 13, or 15
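
If the list of wanted values is likely to change, one possible alternative to the character class is an alternation built from a list. This is only a sketch: the @ids list, the variable names, and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical list of wanted first-column values; edit to taste.
my @ids          = (11, 13, 15);
my $alternatives = join '|', @ids;

# Optional sign, one of the listed values, and then a blank or tab
# must follow, so that 151 or 13.5 is not accepted by accident.
my $first_column = qr/^[ \t]*[-+]?(?:$alternatives)(?=[ \t])/m;

# Made-up, shortened sample lines.
my @lines = (
    "-13 1 4 4 0. 1.",
    " 15 1 4 4 0. -1.",
    "16 1 4 4 0. -1.",
    "151 1 4 4 0. 1.",
);

my @hits = grep { $_ =~ $first_column } @lines;
print "$_\n" for @hits;
```

Adding another value then only means adding it to @ids.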


Good Luck,
Bill


Alkass
Novice

Jan 29, 2011, 2:29 AM

Post #8 of 13 (2893 views)
Re: [BillKSmith] Copying block of lines satisfying criteria [In reply to]

 
Thank you all for your helpful examples... So, I see that what I actually need is "+-11" || "+-13" || "+-15" BUT IFF (if and only if ;-) the last is "1" (will the above syntax work?)

The following

Code
 ^\s*        # Optional white space at start of line  
[-+]?1[135] # 15
[-\d\sE+.]+ # numeric data
[-]?1\.? # final 1
\s*$ # Optional white space at end of line
/xms;


returns both +1 and -1 as well....
Hope you will tolerate me once more...

Cheers


(This post was edited by Alkass on Jan 29, 2011, 2:37 AM)


BillKSmith
Veteran

Jan 29, 2011, 8:40 AM

Post #9 of 13 (2886 views)
Re: [Alkass] Copying block of lines satisfying criteria [In reply to]

I was surprised that your code did not work. Investigation showed that the sign of the final 1 is consumed by the previous line of the pattern. To prevent matching '-1', you must explicitly exclude the '-' in the last line of the pattern.


Code
    print if 
/
^\s* # Optional white space at start of line
[-+]?1[135] # optionally signed 11, 13, or 15
[-\d\sE+.]+ # numeric data
\s[^-]1\.? # final 1 (not negative)
\s*$ # Optional white space at end of line
/xms;



The extra whitespace character prevents matching a two digit number. It may be necessary to remove it if real data does not always have that extra space.
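
To make the "consumed sign" effect visible, here is a small sketch that captures what each part of the pattern actually matched. The sample lines are shortened and made up; the pattern is the 15-only version with capture groups added.

```perl
use strict;
use warnings;

# Hypothetical shortened data line ending in -1.
my $line = "15 1 4 4 0 0 0.17770000000E+01 0. -1.";

# Group 1 is the greedy numeric-data class, group 2 the "final 1" part
# with an optional minus sign.
my ($data, $final) = $line =~ /^\s*[-+]?15([-\d\sE+.]+)([-]?1\.?)\s*$/;

# The minus sign ends up in the numeric data, not in the final field.
print "numeric data: '$data'\n";
print "final field:  '$final'\n";

# Making the sign mandatory and placing it outside the optional part
# forces the class to give the minus sign back.
my $minus_line = ($line =~ /^\s*[-+]?15[-\d\sE+.]+\s-1\.?\s*$/) ? 1 : 0;
my $plus_line  = ("15 1 4 4 0. 1." =~ /^\s*[-+]?15[-\d\sE+.]+\s-1\.?\s*$/) ? 1 : 0;
print "line ending in -1 matches: $minus_line\n";
print "line ending in  1 matches: $plus_line\n";
```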
Good Luck,
Bill


Alkass
Novice

Jan 30, 2011, 3:49 AM

Post #10 of 13 (2857 views)
Re: [BillKSmith] Copying block of lines satisfying criteria [In reply to]

Hi again

Thanks for all of your time... So, apparently, since the "first" selection depends on the sign +/- of 13||15 etc., I see that it also selects "1" and "-1"... So maybe my question was wrong: the code should look at the first column (+- 13 | 11 | 15) and, IF there is (-1) in the last column of exactly the same line where the "15" etc. was found, THEN copy the block.

For example, using the above syntax, I get both of these blocks copied

Code
<event> 
8 1 0.1118800E-04 0.1709000E+03 0.7546772E-02 0.1182204E+00
2 -1 0 0 501 0 0.00000000000E+00 0.00000000000E+00 0.17113032152E+04 0.17113032152E+04 0.00000000000E+00 0. -1.
-1 -1 0 0 0 501 0.00000000000E+00 0.00000000000E+00 -0.31050490700E+02 0.31050490700E+02 0.00000000000E+00 0. 1.
6 2 1 2 502 0 -0.18342013328E+03 -0.67308342811E+02 0.87044899872E+03 0.90930057278E+03 0.17599066450E+03 0. 0.
24 2 3 3 0 0 -0.44493974196E+02 -0.46644882636E+02 0.20445984582E+03 0.22864566900E+03 0.79495626199E+02 0. 0.
-5 1 1 2 0 502 0.18342013328E+03 0.67308342811E+02 0.80980372574E+03 0.83305313308E+03 0.47000000000E+01 0. 1.
-15 1 4 4 0 0 -0.61269954060E+02 -0.29782209075E+02 0.96755177553E+02 0.11834571965E+03 0.17770000000E+01 0. 1.
16 1 4 4 0 0 0.16775979864E+02 -0.16862673561E+02 0.10770466826E+03 0.11029994935E+03 0.00000000000E+00 0. -1.
5 1 3 3 502 0 -0.13892615908E+03 -0.20663460175E+02 0.66598915290E+03 0.68065490378E+03 0.47000000000E+01 0. -1.
</event>
<event>
8 2 0.1118800E-04 0.1709000E+03 0.7546772E-02 0.1182204E+00
1 -1 0 0 501 0 0.00000000000E+00 0.00000000000E+00 0.37830007183E+03 0.37830007183E+03 0.00000000000E+00 0. -1.
-2 -1 0 0 0 501 0.00000000000E+00 0.00000000000E+00 -0.48172328166E+02 0.48172328166E+02 0.00000000000E+00 0. 1.
-6 2 1 2 0 502 -0.14296691926E+02 -0.59124341957E+02 0.15507645435E+03 0.24109406496E+03 0.17429158973E+03 0. 0.
-24 2 3 3 0 0 0.33979125153E+02 -0.40302550160E+02 0.16276675240E+03 0.18850086154E+03 0.79124475474E+02 0. 0.
5 1 1 2 502 0 0.14296691926E+02 0.59124341957E+02 0.17505128931E+03 0.18537833504E+03 0.47000000000E+01 0. -1.
15 1 4 4 0 0 0.97391839066E+01 0.21656486010E+02 0.72870798783E+02 0.76662677580E+02 0.17770000000E+01 0. -1.
-16 1 4 4 0 0 0.24239941247E+02 -0.61959036171E+02 0.89895953620E+02 0.11183818396E+03 0.00000000000E+00 0. 1.
-5 1 3 3 0 502 -0.48275817079E+02 -0.18821791797E+02 -0.76902980526E+01 0.52593203413E+02 0.47000000000E+01 0. 1.
</event>



Anyway, how would it be possible to put a line at the end of the file counting how many blocks were copied, i.e. # Number of Events : 10000

thanks again!


(This post was edited by Alkass on Jan 30, 2011, 4:53 AM)


BillKSmith
Veteran

Jan 30, 2011, 3:43 PM

Post #11 of 13 (2835 views)
Re: [Alkass] Copying block of lines satisfying criteria [In reply to]

Requiring the minus sign is easier than excluding it! Give it a try.



In order to count your prints, you will have to use an if statement rather than an if modifier. Put your print statement and the counter in the if block. (Remember to declare and initialize your counter outside the loop.)
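
A rough sketch of that counting approach follows. The sample data, the helper name copy_events, and the $wanted pattern are placeholders made up for illustration (the pattern assumes the 11/13/15 and the -1 must sit on the same line); substitute whatever test you end up using.

```perl
use strict;
use warnings;

# Hypothetical sample data: three shortened <event> blocks.
my $data = <<'END';
<event>
 2 -1 0. -1.
 15 1 0. -1.
</event>
<event>
 -15 1 0. 1.
 16 1 0. -1.
</event>
<event>
 -13 1 0. -1.
 5 1 0. 1.
</event>
END

# Placeholder test: first column +-11/13/15 and last column -1 on one line.
my $wanted = qr/^[ \t]*[-+]?1[135][ \t][^\n]*[ \t]-1\.?[ \t]*$/m;

# copy_events is an illustrative helper, not part of any module.
sub copy_events {
    my ($text) = @_;
    my $copied = '';
    my $count  = 0;    # declared and initialised outside the loop
    open my $in, '<', \$text or die "Cannot read sample data: $!";
    local $/ = '</event>';
    while (my $record = <$in>) {
        if ($record =~ $wanted) {    # if statement, not an if modifier
            $copied .= $record;
            $count++;
        }
    }
    close $in;
    return ($copied . "\n# Number of Events : $count\n", $count);
}

my ($output, $count) = copy_events($data);
print $output;
```

With a real file you would print each record to an output filehandle instead of appending to a string, and print the summary line after the loop.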
Good Luck,
Bill


Alkass
Novice

Jan 30, 2011, 4:41 PM

Post #12 of 13 (2831 views)
Re: [BillKSmith] Copying block of lines satisfying criteria [In reply to]

This is really weird! If I do

Code
     ^\s*        # Optional white space at start of line  
[=~+]?1[135] # 15
[-\d\sE+.]+ # numeric data
[=~-]?1.\.? # final 1 (not negative)
\s*$ # Optional white space at end of line


it correctly returns only those blocks with the + for the first column and the -1 for the last one. If I do


Code
     ^\s*        # Optional white space at start of line  
[=~-]?1[135] # 15
[-\d\sE+.]+ # numeric data
[=~+]?1.\.? # final 1 (not negative)
\s*$ # Optional white space at end of line


it collects ALL the blocks...

Can you think of a reason why?


BillKSmith
Veteran

Jan 30, 2011, 6:34 PM

Post #13 of 13 (2821 views)
Re: [Alkass] Copying block of lines satisfying criteria [In reply to]

You brought back the problem of the greedy operator at the end of the previous pattern. The minus sign on the 1 is mandatory. Get rid of the question mark.

I do not understand why you are using character classes for the signs. The = and ~ characters only confuse the issue.

Try to keep your comments up to date. That is why we use the /x modifier. Yeah, I know I am also guilty of not doing that. :(

Your code expects some character between the final -1 and its decimal point. There never is any. Your font may have misled you.


Code
    print if 
/
^\s* # Optional white space at start of line
\+?1[135] # 11, 13, or 15 with an optional plus sign
[-\d\sE+.]+ # numeric data
\-1\.? # final -1
\s*$ # Optional white space at end of line
/xms;
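
One thing to watch with patterns of this shape: \s inside the numeric-data class also matches newlines, so a match can start on a 15 line and finish on a later line that ends in -1. If both conditions must hold on the same line, a possible variant (a sketch only, with made-up shortened data, either sign allowed on the first column, and names of my own choosing) restricts the class to blanks and tabs:

```perl
use strict;
use warnings;

# Hypothetical block: the -15 line ends in 1., the NEXT line ends in -1.
my $mixed = join "\n", '<event>', ' -15 1 4 4 0. 1.', ' 16 1 4 4 0. -1.', '</event>';

# Hypothetical block where the 15 line itself ends in -1.
my $good  = join "\n", '<event>', ' 15 1 4 4 0. -1.', ' -16 1 4 4 0. 1.', '</event>';

# \s in the class lets the match run across the line break.
my $spanning  = qr/^\s*[-+]?1[135][-\d\sE+.]+-1\.?\s*$/ms;

# Only blanks and tabs allowed between columns, so the match stays on one line.
my $same_line = qr/^[ \t]*[-+]?1[135][ \t][-\dE+. \t]*[ \t]-1\.?[ \t]*$/m;

my %result = (
    spanning_mixed  => ($mixed =~ $spanning)  ? 1 : 0,
    same_line_mixed => ($mixed =~ $same_line) ? 1 : 0,
    same_line_good  => ($good  =~ $same_line) ? 1 : 0,
);
print "$_: $result{$_}\n" for sort keys %result;
```

The spanning pattern accepts the mixed block, while the same-line pattern rejects it and still accepts the block where 15 and -1 share a line.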

Good Luck,
Bill

 
 

