Parsing

 



kapab07
Novice

Mar 26, 2013, 3:02 PM

Post #1 of 9 (831 views)
Parsing

Hi,
Can someone tell me what I am doing wrong here?
I am writing a small script to read and output all the N1 atoms and their coordinates from my PDB file. The script below yields no output. What is wrong with it?
Here is the script:

#!/usr/bin/perl

open(FILE,$ARGV[0]);

$file = <FILE>;

foreach my $l ($file){

if ($l=~/^ATOM/ && $atname=~/N1/){
my $atname = substr($l, 66, 1);
my $X = substr($l, 27, 6);
my $Y = substr($l, 33, 6);
my $Z = substr($l, 42, 6);
}
}
print ($atname, $X, $Y, $Z);

exit;

When I run it, it gives no output.


(This post was edited by kapab07 on Mar 26, 2013, 3:16 PM)


FishMonger
Veteran / Moderator

Mar 26, 2013, 5:23 PM

Post #2 of 9 (822 views)
Re: [kapab07] Parsing

Try putting your print statement in the if {..} block, just after the var assignments.


lightspd
Novice

Mar 26, 2013, 5:49 PM

Post #3 of 9 (818 views)
Re: [kapab07] Parsing

Another hint: you are using a variable before declaring it.


Code
if ($l=~/^ATOM/ && $atname=~/N1/){  
my $atname = substr($l, 66, 1);


Change that to:

Code
my $atname = substr($l, 66, 1); 
if ($l=~/^ATOM/ && $atname=~/N1/){



kapab07
Novice

Mar 26, 2013, 5:59 PM

Post #4 of 9 (817 views)
Re: [FishMonger] Parsing

Thanks a lot


Kenosis
User

Mar 26, 2013, 6:01 PM

Post #5 of 9 (814 views)
Re: [kapab07] Parsing

Your script is problematic in a number of areas. The most immediate problem, which guarantees that nothing is printed, is the following:


Code
if ($l=~/^ATOM/ && $atname=~/N1/){ ...


The variable $atname has not been initialized anywhere, so the overall expression will always evaluate to false.
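To make the failure concrete, here is a minimal sketch. The sub names `check_before_assign` and `check_after_assign` and the sample record are made up for illustration, and the name is taken with `split` on the assumption that the record is whitespace-separated like the data posted later in this thread. An undefined value never matches /N1/, so a test placed before the assignment is always false. Note also that with `use strict` the original line would not even compile, because $atname is used before its `my` declaration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical sample record in the whitespace-separated layout from this thread.
my $sample = 'ATOM 102 N1 DG A 4 0.906 -1.749 17.502 1.00 8.27 N';

# Mimics the original script: the name is tested before it has a value.
sub check_before_assign {
    my ($line) = @_;
    my $atname;                                   # still undef here
    no warnings 'uninitialized';                  # silence the expected warning
    my $ok = ( $line =~ /^ATOM/ && $atname =~ /N1/ ) ? 1 : 0;
    $atname = ( split ' ', $line )[2];            # assigned too late to matter
    return $ok;
}

# Corrected order: extract the name first, then test it.
sub check_after_assign {
    my ($line) = @_;
    my $atname = ( split ' ', $line )[2];
    return ( $line =~ /^ATOM/ && defined $atname && $atname =~ /N1/ ) ? 1 : 0;
}

print 'before: ', check_before_assign($sample), "\n";   # prints "before: 0"
print 'after:  ', check_after_assign($sample),  "\n";   # prints "after:  1"
```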

Here are some other items to keep in mind:

Always:

Code
use strict; 
use warnings;


at the top of your script. Use the three-argument form of open, i.e.:


Code
open my $fh, '<', $ARGV[0] or die $!;


The above also demonstrates using lexical file handles and handling open failures.
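As a slightly fuller sketch of the same idea, the helper below wraps the three-argument open in a sub. The name `read_lines` and the wording of the error messages are my own choices, not something from the original script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Read every line of the file at $path and return the lines without newlines.
# read_lines is a made-up helper name used only for this illustration.
sub read_lines {
    my ($path) = @_;

    # Three-argument open: the mode ('<') is separate from the file name,
    # so an odd file name cannot be mistaken for a mode or a pipe.
    open my $fh, '<', $path
        or die "Cannot open '$path' for reading: $!\n";

    my @lines = <$fh>;      # list context: read all lines at once
    chomp @lines;

    close $fh or die "Cannot close '$path': $!\n";
    return @lines;
}

# Only run when a file name is given on the command line.
if (@ARGV) {
    print "$_\n" for read_lines( $ARGV[0] );
}
```

Because $fh is a lexical variable, the handle is also closed automatically when it goes out of scope; the explicit close is there only so that a close failure gets reported.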

The following:


Code
$file = <FILE>;


will only read the first line of the file into the variable $file. You then use $file as the list in your foreach loop, but you should have the file handle there.
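The difference comes down to context. Here is a small sketch, using an in-memory file handle so that it runs without any input file; `context_demo` is a made-up name.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Do one scalar-context read and one list-context read on the same text.
# Returns (1 or 0 for whether the scalar read got a line, number of lines
# the list read got afterwards). context_demo is a made-up name.
sub context_demo {
    my ($text) = @_;

    # Opening a reference to a scalar gives an in-memory file handle.
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!\n";
    my $first = <$fh>;          # scalar context: exactly one line
    my @rest  = <$fh>;          # list context: every remaining line
    close $fh;

    return ( defined $first ? 1 : 0, scalar @rest );
}

my ( $scalar_count, $list_count ) = context_demo("line 1\nline 2\nline 3\n");
print "scalar read got $scalar_count line, list read got $list_count more\n";
```

With a three-line input, the scalar read consumes one line and the list read picks up the other two.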

Finally, you don't need exit at the end of your script, as it will automatically exit when finished.

Since you're sending the script a file from the command line, you can do the following:


Code
#!/usr/bin/env perl 
use strict;
use warnings;

while (<>) {
my $atname = substr( $_, 66, 1 );

if ( /^ATOM/ and $atname =~ /N1/ ) {
my $X = substr( $_, 27, 6 );
my $Y = substr( $_, 33, 6 );
my $Z = substr( $_, 42, 6 );
print "$atname, $X, $Y, $Z\n";
}
}


This lets Perl open and close the file. Each of the file's lines is assigned to $_, the default scalar.

I don't have your dataset, but I believe you intended to say the above.
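As a side note on the `while (<>)` loop in the script above: `<>` reads, in order, the files named in @ARGV, and falls back to STDIN when @ARGV is empty. The sketch below shows that behaviour by setting @ARGV locally inside a sub; `count_atom_records` is a made-up name and counting ATOM lines is just a stand-in for real processing.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Count the lines starting with ATOM in the given files, using <>.
# count_atom_records is a made-up name used only for this illustration.
sub count_atom_records {
    my (@paths) = @_;
    return 0 unless @paths;     # with an empty @ARGV, <> would read STDIN

    local @ARGV = @paths;       # <> takes its file names from @ARGV
    my $count = 0;
    while ( my $line = <> ) {
        $count++ if $line =~ /^ATOM/;
    }
    return $count;
}

# Only run when file names are given on the command line.
print count_atom_records(@ARGV), " ATOM records\n" if @ARGV;
```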

Hope this helps!


(This post was edited by Kenosis on Mar 26, 2013, 6:04 PM)


kapab07
Novice

Mar 26, 2013, 6:13 PM

Post #6 of 9 (808 views)
Re: [lightspd] Parsing

Thanks a lot


kapab07
Novice

Mar 27, 2013, 4:31 AM

Post #7 of 9 (799 views)
Re: [Kenosis] Parsing

I really appreciate your help. Thanks a lot.
Here is part of my PDB file:
ATOM 71 N2 DG A 3 -0.228 2.346 17.999 1.00 2.92 N
ATOM 72 N3 DG A 3 0.745 3.531 16.259 1.00 7.08 N
ATOM 73 C4 DG A 3 0.788 3.626 14.894 1.00 12.36 C
ATOM 74 C5 DG A 3 0.093 2.891 13.972 1.00 9.28 C
ATOM 75 C6 DG A 3 -0.792 1.917 14.408 1.00 6.67 C
ATOM 76 O6 DG A 3 -1.528 1.253 13.693 1.00 11.90 O
ATOM 77 N7 DG A 3 0.474 3.204 12.690 1.00 12.03 N
ATOM 78 C8 DG A 3 1.389 4.110 12.865 1.00 10.83 C
ATOM 79 N9 DG A 3 1.615 4.437 14.168 1.00 11.63 N
ATOM 80 OP1 DG A 3 5.469 7.004 9.830 1.00 21.52 O
ATOM 81 OP2 DG A 3 3.681 5.183 9.686 1.00 14.50 O1-
ATOM 82 H01 DG A 3 1.930 4.570 12.038 1.00 10.83 H
ATOM 83 H02 DG A 3 5.731 7.322 12.859 1.00 17.61 H
ATOM 84 H03 DG A 3 4.183 7.713 14.723 1.00 17.62 H
ATOM 85 H04 DG A 3 4.873 4.830 13.864 1.00 16.31 H
ATOM 86 H05 DG A 3 3.554 5.876 16.434 1.00 17.71 H
ATOM 87 H06 DG A 3 1.461 6.040 15.377 1.00 16.27 H
ATOM 88 H07 DG A 3 -1.457 1.134 16.205 1.00 7.48 H
ATOM 89 H08 DG A 3 -0.859 1.627 18.322 1.00 2.92 H
ATOM 90 H09 DG A 3 0.305 2.887 18.665 1.00 2.92 H
ATOM 91 H10 DG A 3 4.261 8.151 12.338 1.00 17.61 H
ATOM 92 H11 DG A 3 3.491 4.236 15.859 1.00 17.71 H
ATOM 93 P DG A 4 6.647 4.633 15.976 1.00 19.10 P
ATOM 94 C5' DG A 4 5.832 4.777 18.613 1.00 14.70 C
ATOM 95 O5' DG A 4 5.837 4.160 17.304 1.00 19.69 O
ATOM 96 C4' DG A 4 5.349 3.746 19.615 1.00 15.74 C
ATOM 97 O4' DG A 4 4.014 3.339 19.320 1.00 15.71 O
ATOM 98 C3' DG A 4 6.144 2.446 19.549 1.00 14.86 C
ATOM 99 O3' DG A 4 7.442 2.553 20.185 1.00 20.22 O
ATOM 100 C2' DG A 4 5.194 1.467 20.191 1.00 13.42 C
ATOM 101 C1' DG A 4 3.886 1.904 19.582 1.00 13.10 C
ATOM 102 N1 DG A 4 0.906 -1.749 17.502 1.00 8.27 N
ATOM 103 C2 DG A 4 1.340 -1.630 18.818 1.00 8.31 C
ATOM 104 N2 DG A 4 0.852 -2.494 19.717 1.00 6.42 N
ATOM 105 N3 DG A 4 2.218 -0.681 19.223 1.00 9.10 N
ATOM 106 C4 DG A 4 2.629 0.144 18.209 1.00 10.34 C
ATOM 107 C5 DG A 4 2.252 0.091 16.894 1.00 11.16 C
ATOM 108 C6 DG A 4 1.310 -0.920 16.456 1.00 9.75 C
ATOM 109 O6 DG A 4 0.888 -1.151 15.321 1.00 9.97 O
ATOM 110 N7 DG A 4 2.902 1.118 16.167 1.00 11.07 N
ATOM 111 C8 DG A 4 3.617 1.742 17.054 1.00 8.09 C
ATOM 112 N9 DG A 4 3.496 1.232 18.319 1.00 11.82 N
ATOM 113 OP1 DG A 4 7.834 5.387 16.410 1.00 21.78 O
ATOM 114 OP2 DG A 4 6.811 3.466 15.083 1.00 20.41 O1-
ATOM 115 H01 DG A 4 4.254 2.593 16.814 1.00 8.09 H
ATOM 116 H02 DG A 4 6.839 5.103 18.873 1.00 14.70 H

I need to get N1, N2, ..., N10 and their coordinates. I tried your script and got these messages:

substr outside of string at ../readfile.pl line 8, <FILE> line 1079.
substr outside of string at ../readfile.pl line 8, <FILE> line 1080.
substr outside of string at ../readfile.pl line 8, <FILE> line 1081.
substr outside of string at ../readfile.pl line 8, <FILE> line 1082.
substr outside of string at ../readfile.pl line 8, <FILE> line 1083.
substr outside of string at ../readfile.pl line 8, <FILE> line 1084.
substr outside of string at ../readfile.pl line 8, <FILE> line 1085.
substr outside of string at ../readfile.pl line 8, <FILE> line 1086.
substr outside of string at ../readfile.pl line 8, <FILE> line 1087.
substr outside of string at ../readfile.pl line 8, <FILE> line 1088.
substr outside of string at ../readfile.pl line 8, <FILE> line 1089.
substr outside of string at ../readfile.pl line 8, <FILE> line 1090.
substr outside of string at ../readfile.pl line 8, <FILE> line 1091.
substr outside of string at ../readfile.pl line 8, <FILE> line 1092.
substr outside of string at ../readfile.pl line 8, <FILE> line 1093.
substr outside of string at ../readfile.pl line 8, <FILE> line 1094.
substr outside of string at ../readfile.pl line 8, <FILE> line 1095.
substr outside of string at ../readfile.pl line 8, <FILE> line 1096.
substr outside of string at ../readfile.pl line 8, <FILE> line 1097.
substr outside of string at ../readfile.pl line 8, <FILE> line 1098.
substr outside of string at ../readfile.pl line 8, <FILE> line 1099.
substr outside of string at ../readfile.pl line 8, <FILE> line 1100

It does not output the N atoms. Can you please help again? I tried everything last night, but without success.
I appreciate your kindness.


Kenosis
User

Mar 27, 2013, 8:47 AM

Post #8 of 9 (790 views)
Re: [kapab07] Parsing

PDB ATOM fields and records have fixed lengths, so substr works well for parsing them. However, an ATOM record is 80 characters long and the data you've shared falls short of this, so the "substr outside of string" warnings are produced. Your ATOM records should look like this:


Code
ATOM    145  N   VAL A  25      32.433  16.336  57.540  1.00 11.92      A1   N 
ATOM    146  CA  VAL A  25      31.132  16.439  58.160  1.00 11.85      A1   C
ATOM    147  C   VAL A  25      30.447  15.105  58.363  1.00 12.34      A1   C
ATOM    148  O   VAL A  25      29.520  15.059  59.174  1.00 15.65      A1   O
ATOM    149  CB AVAL A  25      30.385  17.437  57.230  0.28 13.88      A1   C
ATOM    150  CB BVAL A  25      30.166  17.399  57.373  0.72 15.41      A1   C
ATOM    151  CG1AVAL A  25      28.870  17.401  57.336  0.28 12.64      A1   C
ATOM    152  CG1BVAL A  25      30.805  18.788  57.449  0.72 15.11      A1   C
ATOM    153  CG2AVAL A  25      30.835  18.826  57.661  0.28 13.58      A1   C
ATOM    154  CG2BVAL A  25      29.909  16.996  55.922  0.72 13.25      A1   C


Source: Coordinate File Description (PDB Format)
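Putting this together, here is a hedged sketch of two ways to pull out the atom name and coordinates. The sub names are made up. `parse_atom_fixed` assumes the standard fixed columns (atom name in columns 13-16, x in 31-38, y in 39-46, z in 47-54) and skips lines that are too short instead of triggering the substr warning. `parse_atom_split` assumes whitespace-separated fields in the order shown in the data posted in this thread; it is fragile on real PDB files, where fields can run together (for example CG1AVAL above).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Parse one ATOM record laid out in the fixed PDB columns.
# Returns (name, x, y, z), or an empty list if the line is not an ATOM
# record or is too short to contain the z coordinate (column 54).
sub parse_atom_fixed {
    my ($line) = @_;
    return unless $line =~ /^ATOM/ && length($line) >= 54;

    # substr offsets are 0-based, PDB columns are 1-based:
    # name = columns 13-16, x = 31-38, y = 39-46, z = 47-54.
    my @fields = map { substr $line, $_->[0], $_->[1] }
                 ( [ 12, 4 ], [ 30, 8 ], [ 38, 8 ], [ 46, 8 ] );
    s/^\s+|\s+$//g for @fields;     # trim the column padding
    return @fields;
}

# Fallback for whitespace-separated records like the ones in this thread:
# fields are ATOM, serial, name, residue, chain, resSeq, x, y, z, ...
sub parse_atom_split {
    my ($line) = @_;
    my @f = split ' ', $line;
    return unless @f >= 9 && $f[0] eq 'ATOM';
    return @f[ 2, 6, 7, 8 ];
}

# True for atom names N1, N2, ..., N10 only.
sub wanted_nitrogen {
    my ($name) = @_;
    return $name =~ /^N(?:10|[1-9])$/ ? 1 : 0;
}

# Three records copied from the data posted earlier in the thread.
my @sample = (
    'ATOM 102 N1 DG A 4 0.906 -1.749 17.502 1.00 8.27 N',
    'ATOM 103 C2 DG A 4 1.340 -1.630 18.818 1.00 8.31 C',
    'ATOM 104 N2 DG A 4 0.852 -2.494 19.717 1.00 6.42 N',
);

for my $line (@sample) {
    my ( $name, $x, $y, $z ) = parse_atom_split($line) or next;
    print "$name, $x, $y, $z\n" if wanted_nitrogen($name);
}
```

For the three sample records this prints the N1 and N2 lines and skips C2. If the file really is in fixed-column format, swapping `parse_atom_split` for `parse_atom_fixed` in the loop is the more robust choice.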


kapab07
Novice

Mar 27, 2013, 12:07 PM

Post #9 of 9 (784 views)
Re: [Kenosis] Parsing

Great, I appreciate your help.

 
 

