Reading in Contents, then Formatting out: Importan

 



BrightNail
Novice

Jul 26, 2001, 8:34 PM

Post #1 of 9 (1408 views)
Reading in Contents, then Formatting out: Importan

Hey all, (repost from advanced sorry)

I have a need to facilitate something. I get a list of entries every day and I need a way to paste these entries into a textbox or something and then have a Perl script format it all, rather than me having to code HTML all over it. There are about 20 entries per list; here is an example of two entries from the list.

_____________________________
John Doe

Silver Hair
Wednesday, September 19 at 8 PM
{a link is here}
______________________________

Mary Bonham

Blue House
Saturday, September 23 at 7:30 PM
{a link is here}


Now, the line is the separator. I don't need it in the output, but I need the script to recognize it as the start of a new entry. How do I grab the whole list of 20 or so entries and then output the entries in the proper format, like so:

{a link here}John Doe{/close link} {br}
{a link here}Wednesday, September 19 at 8 PM{/close link}{br} with Silver Hair
{br}{br}
{a link here}Mary Bonham{/close link} {br}
{a link here}Saturday, September 23 at 7:30 PM{/close link}{br} with Blue House{br}{br}

I used the incorrect brackets so you could see how the HTML would output. The bracket with the {a link is here} is some URL that is given, which needs to be used as shown.

Can anyone help me out with this? I am assuming I should create a temp file, use the line as a separator, write each entry on a line (with each attribute separated by a tab or something), close the temp file, open the temp file, then read the contents out... but HOW? ahhh,

thanks,










mhx
Enthusiast / Moderator

Jul 26, 2001, 9:44 PM

Post #2 of 9 (1402 views)
Re: Reading in Contents, then Formatting out: Importan

The following code does what you want (at least I hope so, but I tested it ;-).
For demonstration, I read from the DATA filehandle that accesses the part of the script located after __DATA__. You can use any other file you like, just open it and change the filehandle that is read from.

Code
#!/bin/perl -w 
use strict;

print map "$_->[3]$_->[0](/a)(br)\n".
"$_->[3]$_->[2](/a)(br)with $_->[1](br)(br)\n",
map [grep !/^\s*$/, split /(?:\r?\n)+/],
grep !/^\s*$/, split /_+/, do {local $/=undef; <DATA>};

__DATA__

_____________________________
John Doe

Silver Hair
Wednesday, September 19 at 8 PM
(a link is here)
______________________________

Mary Bonham

Blue House
Saturday, September 23 at 7:30 PM
(a link is here)

Hope this helps.

-- Marcus


Code
s$$ab21b8d15c3d97bd6317286d$;$"=547269736;split'i',join$,,map{chr(($*+= 
($">>=1)&1?-hex:hex)+0140)}/./g;$"=chr$";s;.;\u$&;for@_[0,2];print"@_,"



Jasmine
Administrator

Jul 26, 2001, 10:09 PM

Post #3 of 9 (1402 views)
Re: Reading in Contents, then Formatting out: Importan

Here's my go at it.

Code
use CGI qw /:standard/; 
use strict;

my $file = <<EOF;
_____________________________
John Doe

Silver Hair
Wednesday, September 19 at 8 PM
{a link is here}
______________________________
Mary Bonham

Blue House
Saturday, September 23 at 7:30 PM
{a link is here}

EOF


print
map {
a( { -href => $_->[5] }, $_->[1] ), "\n",
a( { -href => $_->[5] }, $_->[4] ), "\n",
" with $_->[3]\n \n",
br(), br()
if $_->[1]
}

map { [ split /\n/m ] }
map { split /^_+$/m }
$file;

Now, for the explanation.


Code
use CGI qw /:standard/;

CGI nut am I. Never leave home without it.


Code
use strict;

Always.


Code
my $file = <<EOF; 
_____________________________
John Doe

Silver Hair
Wednesday, September 19 at 8 PM
{a link is here}
______________________________
Mary Bonham

Blue House
Saturday, September 23 at 7:30 PM
{a link is here}

EOF

Your data, modified so there's no extra line before Mary (I'm hoping that was a copy/paste error).

The only way to explain the next part is taking it line by line, backwards:


Code
    $file;

This is what the line before works on.



Code
    map { split /^_+$/m  }

This takes $file and splits it on lines that consist of nothing but _ characters. Leading or trailing spaces on those lines will break this piece of code.

Each result of this piece of code, which contains one complete section of your data (from name to link), is then passed to the preceding line, which is...



Code
    map { [ split /\n/m ]  }

This splits the section of your data into an array (splitting on new lines, which is why it's important to keep the vertical spacing consistent). It creates a temporary, anonymous array to store the split section.
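
To see why the print step uses the indices 1, 3, 4 and 5, here is a small sketch that splits one section the same way and prints each element next to its index. The names $section and @fields are made up for this illustration and do not appear in the script above.

```perl
use strict;
use warnings;

# One section as it looks after the split on the underscore line.
# ($section and @fields are illustrative names only.)
my $section = "\nJohn Doe\n\nSilver Hair\nWednesday, September 19 at 8 PM\n{a link is here}\n";

my @fields = split /\n/, $section;

# The leading newline and the blank line after the name produce empty
# elements at indices 0 and 2, so the real data sits at 1, 3, 4 and 5.
for my $i (0 .. $#fields) {
    print "$i: '$fields[$i]'\n";
}
```

If an entry has a different number of blank lines, the indices shift, which is the stricter format requirement of this approach.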



Code
print 
map {
a( { -href => $_->[5] }, $_->[1] ), "\n",
a( { -href => $_->[5] }, $_->[4] ), "\n",
" with $_->[3]\n \n",
br(), br()
if $_->[1]
}

This section formats and then prints the data in the format you noted. The if $_->[1] is there in case there's a blank entry, so you don't output blank data (complete with HTML code).


I'm sure there's a better way to do it, but it sure beats creating lots of temp files as you were considering :)

Hope this helps!



Jasmine
Administrator

Jul 26, 2001, 10:21 PM

Post #4 of 9 (1402 views)
Re: Reading in Contents, then Formatting out: Importan

*pout* mhx always beats me to it :P

I write, test, test, then document. By that time, mhx is done (so I'm slow... what of it? ;). He never lets me have any fun anymore ;) I gotta learn to see if he's online before I even start trying to post ;)

Of course his is more flexible than mine (which has stricter data format requirements), but you'll probably want to change


Code
print map "$_->[3]$_->[0](/a)(br)\n". 
"$_->[3]$_->[2](/a)(br)with $_->[1](br)(br)\n",

to

Code
print map <<EOF, 
<A HREF="$_->[3]">$_->[0]</A><BR>
<A HREF="$_->[3]">$_->[2]</A><BR>
with $_->[1]<BR><BR> \n
EOF

so the links and HTML code work.
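
Putting the two ideas together, the following is a rough sketch of a formatter that returns real HTML. It is not the exact code from either post: format_entries is a hypothetical helper name, the separator regex is a swapped-in variant that only matches lines made of three or more underscores (so a URL containing an underscore is not cut in half), and the input is assumed to be trusted text that needs no HTML escaping.

```perl
use strict;
use warnings;

# format_entries is a hypothetical helper: it takes the pasted list as
# one string and returns the formatted HTML as one string.
sub format_entries {
    my ($text) = @_;
    return join '',
        # fields are 0 = name, 1 = venue, 2 = date, 3 = link
        map  { sprintf qq{<a href="%s">%s</a><br>\n<a href="%s">%s</a><br>\nwith %s<br><br>\n},
                   @$_[3, 0, 3, 2, 1] }
        grep { @$_ == 4 }                                   # skip entries without exactly 4 fields
        map  { [ grep { !/^\s*$/ } split /(?:\r?\n)+/ ] }   # non-blank lines of each entry
        grep { !/^\s*$/ }                                   # drop empty chunks
        split /^[ \t]*_{3,}[ \t]*$/m, $text;                # cut at the underscore lines
}

my $sample = "____\nJohn Doe\n\nSilver Hair\nWednesday, September 19 at 8 PM\nhttp://example.com/a_b\n____\n";
print format_entries($sample);
```

Because the function returns a string instead of printing, the same code can be called from a command-line script or from a CGI script.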



BrightNail
Novice

Jul 27, 2001, 10:23 AM

Post #5 of 9 (1393 views)
Re: Reading in Contents, then Formatting out: Importan

Hey,
I tried what everyone posted and I get internal server errors.
It just doesn't work. I can't figure this out.
The code you guys are using is A LOT more streamlined than what I would have written, and much of it I don't recognize. Can you email me the exact Perl script that was used? I am cutting and pasting what is here, but it's bombing out....

venicesrfr@aol.com

I really appreciate everyone's input...thanks,



mhx
Enthusiast / Moderator

Jul 27, 2001, 1:00 PM

Post #6 of 9 (1388 views)
Re: Reading in Contents, then Formatting out: Importan

Hi,

first, the scripts we posted were our actual Perl scripts, and they worked fine. (I hope Jasmine won't mind that I'm talking in her name here, but I'm quite sure this is her tested script ;-) Of course you cannot upload my script and expect it to work, since it isn't laid out as a CGI script. It's laid out to run from the shell or command line.

Since this is the intermediate forum, I was a bit sparing with explanations :) Actually, your problem doesn't seem to be that our code isn't working, but how to embed our code in your script. That should be quite easy with some basic Perl knowledge, even if you don't understand my code. (I hope you understood Jasmine's code, because her explanation was very good and accurate.)
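
For what "laid out as a CGI script" means in practice, here is a minimal skeleton. It is only a sketch under some assumptions: the CGI module is installed (it is no longer bundled with Perl from 5.22 on), the textarea field is called 'entries' (a made-up name), and the formatting step is left as a placeholder that simply echoes the pasted text back.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(:standard);   # provides param(), header() and escapeHTML()

# 'entries' is a made-up name for the textarea field.
my $text = param('entries');

# A CGI script has to send a header before any other output.
my $page = header('text/html');

if (defined $text && length $text) {
    # Placeholder: the formatting code from the earlier posts would go here.
    # For now the pasted text is only echoed back, escaped so it displays as typed.
    $page .= '<pre>' . escapeHTML($text) . "</pre>\n";
}
else {
    # No input yet: show a form with a textbox to paste the list into.
    $page .= qq{<form method="post">\n}
           . qq{<textarea name="entries" rows="20" cols="60"></textarea>\n}
           . qq{<input type="submit" value="Format">\n}
           . qq{</form>\n};
}

print $page;
```

A missing header is a common cause of "Internal Server Error" with scripts that run fine from the command line; the web server's error log usually shows the actual reason.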

Anyway, just to explain to you what my code does (so you know what it does not and can figure out how to extend the script), here comes the full script again, just a bit reformatted.

Code
#!/bin/perl -w 
use strict;

print map "$_->[3]$_->[0](/a)(br)\n".
"$_->[3]$_->[2](/a)(br)with $_->[1](br)(br)\n",
map [
grep !/^\s*$/,
split /(?:\r?\n)+/
],
grep !/^\s*$/,
split /_+/,
do {
local $/=undef;
<DATA>
};

__DATA__

_____________________________
John Doe

Silver Hair
Wednesday, September 19 at 8 PM
(a link is here)

______________________________

Mary Bonham

Blue House
Saturday, September 23 at 7:30 PM
(a link is here)

The header should be clear from Jasmine's explanation:

Code
#!/bin/perl -w 
use strict;

It just specifies the path to your perl interpreter executable, turns Perl's warnings on (which is a must) and turns Perl's strict mode on (which is a must).
The important part, as you might just have figured out, is

Code
print map "$_->[3]$_->[0](/a)(br)\n". 
"$_->[3]$_->[2](/a)(br)with $_->[1](br)(br)\n",
map [
grep !/^\s*$/,
split /(?:\r?\n)+/
],
grep !/^\s*$/,
split /_+/,
do {
local $/=undef;
<DATA>
};

To understand this, let's go from the bottom up, because that's also the way Perl evaluates it. If you see lots of map's and grep's and split's you almost always have to start at the end to understand ;-)

Code
      do { 
local $/=undef;
<DATA>
};

This block will read the whole content from DATA. As I pointed out, DATA is a special filehandle that allows you to read directly from a special section (everything following __DATA__) in the script. You can use any other filehandle instead of DATA. If you wanted to read from a file called 'address.txt', you would open that file

Code
open ADDRESS, 'address.txt' or die "cannot open address.txt: $!\n";

and replace DATA by ADDRESS

Code
      do { 
local $/=undef;
<ADDRESS>
};

Now, why does this read the whole file? $/ is the input record separator, which is normally set to the newline sequence. If you undefine it, as I do in the block above, the readline operator <> will gobble the whole file.
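
The effect of $/ can be seen in a small sketch. To keep it self-contained, it reads from an in-memory filehandle opened on a string instead of a real file; $text, @lines and $whole are made-up names for this illustration.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\nthird line\n";   # sample data

# Open a read handle on the string itself, so no file is needed.
open my $fh, '<', \$text or die "cannot open in-memory handle: $!\n";
my @lines = <$fh>;    # default $/ is "\n": one element per line
close $fh;

open $fh, '<', \$text or die "cannot open in-memory handle: $!\n";
my $whole = do { local $/ = undef; <$fh> };   # $/ undefined: one read returns everything
close $fh;

print scalar(@lines), " lines read one by one, ",
      length($whole), " characters read in one go\n";
```

The local keeps the change to $/ inside the do block, so reads elsewhere in the script still work line by line.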
So the contents of the whole file are passed into the previous line of our script:

Code
      split /_+/,

which will just split it by the long lines of underscores. /_+/ is the regular expression for one or more underscores, and that regex is used to separate the fields in the string that we just read in. The split function returns a list of all records. Since there was a blank line before the first record, that list would contain three elements, the first of which contains only whitespace characters. Since we don't want 'empty' records, the previous line filters these out:

Code
      grep !/^\s*$/,

This returns only those elements of the incoming list that contain something other than whitespace. The grep function evaluates !/^\s*$/ for each element of the list and 'greps' only those for which the expression is true. /^\s*$/ is a regex that checks if a string is empty or contains only whitespace characters. Since we want all elements for which this is not the case, we negate the regex matching result with a '!'.
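
As a quick illustration of that filter on its own, the sketch below runs it over a made-up list (@chunks) that mixes whitespace-only strings with real records.

```perl
use strict;
use warnings;

# Made-up sample: two whitespace-only chunks and two real records.
my @chunks = (
    "\n",
    "\nJohn Doe\nSilver Hair\n",
    "   ",
    "\n\nMary Bonham\nBlue House\n",
);

# Keep only the chunks that are not empty or whitespace-only.
my @records = grep !/^\s*$/, @chunks;

print scalar(@records), " of ", scalar(@chunks), " chunks kept\n";
```

A record that merely starts or ends with a newline is kept, because the regex has to match the entire string to count as blank.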
Now, we want the fields in each record. This is done in the following block that is evaluated for each of our two remaining records due to the map function:

Code
      map [ 
grep !/^\s*$/,
split /(?:\r?\n)+/
],

map is quite similar to grep, only that it returns the result of the given expression for each element of the list that we feed in. The result is encapsulated in square brackets, which means we return an array reference. So, when the map is done, we will have a list of array references. But what do the referenced arrays contain? The following two lines are very similar to what I have explained above for the split and grep. Each record is now split into its lines and the empty lines are filtered using the grep function. The regex /(?:\r?\n)+/ means one or more newline sequences, the reason I used \r?\n was to support Windows (\r\n) and Unix (\n) newline sequences.

Code
            grep !/^\s*$/, 
split /(?:\r?\n)+/

The array that is returned by grep will now have 4 elements for each record. The list returned by map will actually look like this:

Code
  ['John Doe', 'Silver Hair', 'Wednesday, September 19 at 8 PM', '(a link is here)'], 
['Mary Bonham', 'Blue House', 'Saturday, September 23 at 7:30 PM', '(a link is here)']
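
If the nested structure is unfamiliar, this sketch shows two ways of getting at the fields of such a list of array references. The variable names ($name, $with, $when, $link) are only illustrative.

```perl
use strict;
use warnings;

# The list of array references, written out by hand.
my @records = (
    ['John Doe', 'Silver Hair', 'Wednesday, September 19 at 8 PM', '(a link is here)'],
    ['Mary Bonham', 'Blue House', 'Saturday, September 23 at 7:30 PM', '(a link is here)'],
);

# Unpack each referenced array into named variables.
for my $record (@records) {
    my ($name, $with, $when, $link) = @$record;
    print "$name with $with, $when\n";
}

# Or pick a single field directly: second record, first field.
my $second_name = $records[1]->[0];
print "$second_name\n";
```

Inside a map, $_ holds one of these references, which is why the print statement uses $_->[0], $_->[1] and so on.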

We have extracted all the data from our file successfully!
Now all that's left is to print that stuff out, and this is done by mapping each list element into a string and printing the list of strings returned by map:

Code
print map "$_->[3]$_->[0](/a)(br)\n". 
"$_->[3]$_->[2](/a)(br)with $_->[1](br)(br)\n",

The code may look a bit like magic to a beginner, but you quickly get used to code like this. I hope the explanations above make clear what my script does, and what it does not do. I hope you can add the necessary code to make this suit your needs.
BTW, your other post is just the same problem. No difference! You just have to replace the expression in the map function:

Code
print map join('|', @$_)."\n",

This will join the array elements for each record by pipes, append a newline to the string and print the resulting list of strings. You could also put this into only one map and have:

Code
print map join('|', 
grep !/^\s*$/,
split /(?:\r?\n)+/
)."\n",
grep !/^\s*$/,
split /_+/,
do {
local $/=undef;
<DATA>
};

If you're not familiar with map, grep and split, I recommend reading the manual pages for these functions, with perldoc -f grep, for example.
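
If the nested form is hard to follow, the same steps can also be unrolled into separate statements with named intermediate results. This is only a sketch of an equivalent layout; the variable names are made up, and the data is given as a string instead of being read from DATA.

```perl
use strict;
use warnings;

my $text = "\n____\nJohn Doe\n\nSilver Hair\nWednesday, September 19 at 8 PM\n(a link is here)\n"
         . "____\n\nMary Bonham\n\nBlue House\nSaturday, September 23 at 7:30 PM\n(a link is here)\n";

my @chunks  = split /_+/, $text;            # cut at the underscore lines
my @records = grep { !/^\s*$/ } @chunks;    # drop whitespace-only chunks

my @output;
for my $record (@records) {
    my @lines  = split /(?:\r?\n)+/, $record;   # one element per line
    my @fields = grep { !/^\s*$/ } @lines;      # drop blank lines
    push @output, join('|', @fields) . "\n";    # one pipe-separated line per record
}

print @output;
```

Each statement here corresponds to one line of the nested version, read from the bottom up.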

I hope all this helps.

-- Marcus





BrightNail
Novice

Jul 27, 2001, 1:32 PM

Post #7 of 9 (1383 views)
Re: Reading in Contents, then Formatting out: Importan

okay,
I'll look this over THOROUGHLY,
yes, I did try to upload it etc. I did CGI it, but I was still getting some errors, most definitely on my part.

I appreciate your help. I am sure I can totally understand it, though the map/grep functions can get a little confusing.

thanks again, I totally appreciate it. You're a trooper!!!!!!!!!



BrightNail
Novice

Jul 27, 2001, 3:43 PM

Post #8 of 9 (1376 views)
Re: Reading in Contents, then Formatting out: Importan

MHX,

I do have a question

When you use MAP.....

You use it in brackets, yet when I looked at the docs, its not used like that.....

map [
grep...stuff
split...stufff
]

Also, above you use map again in your print statement. You use a PERIOD between the two lines.
I thought each statement must end in a " ; "..........ahh, my head hurts.....



mhx
Enthusiast / Moderator

Jul 27, 2001, 4:50 PM

Post #9 of 9 (1371 views)
Re: Reading in Contents, then Formatting out: Importan

Hi again,

In Reply To
When you use MAP.....
You use it in brackets, yet when I looked at the docs, its not used like that.....

Code
map [ 
grep...stuff
split...stuff
],


You can use map and grep in two ways:

map BLOCK LIST
map EXPR, LIST

As you can tell from the comma (which is missing in your quote), I'm using the second way. My EXPR is an array reference created by the square brackets [ .. ]. I could have also written

Code
map { [ 
grep...stuff
split...stuff
] }

(this time without a comma, since I'm using the BLOCK version!), but too many curly braces hurt my eyes. See perldoc perlref for some examples on references.
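
The two forms can be compared side by side in a small sketch. The sample data (@phrases) is made up; both statements build the same list of array references.

```perl
use strict;
use warnings;

my @phrases = ('one two', 'three four five');   # made-up sample data

# map EXPR, LIST: the expression is the array reference, followed by a comma.
my @expr_form = map [ split ' ', $_ ], @phrases;

# map BLOCK LIST: the same expression wrapped in curly braces, no comma.
my @block_form = map { [ split ' ', $_ ] } @phrases;

# Both forms give array references with the same contents.
print scalar(@{ $expr_form[1] }), " words, ",
      scalar(@{ $block_form[1] }), " words\n";
```

Which form to use is mostly a matter of taste; the BLOCK form is handy when the body needs more than one statement.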

In Reply To
You use a PERIOD between the two lines.
I thought each statement must end in a " ; "

The period is the string concatenation operator, so if you take

Code
print map join('|', 
grep !/^\s*$/,
split /(?:\r?\n)+/
)."\n",
grep !/^\s*$/,

I just append "\n" to the string returned by the join function. And I end all my statements with a ";". However, there's only one statement, but it's very long. (And since it's the last, you could even write it without the ";".) See perldoc perlop for a list of all of Perl's operators.
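
A tiny sketch of the same point: one statement spread over several lines, with the period joining the pieces. The variable names and sample value are made up.

```perl
use strict;
use warnings;

my $name = 'John Doe';   # sample value

# One statement spread over three lines: the periods join the pieces,
# and only the semicolon on the last line ends the statement.
my $line = '(a link here)' .
           $name .
           "(/a)(br)\n";

print $line;
```

Perl does not care about line breaks inside a statement, so a long expression can be laid out however it reads best.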

In Reply To
..........ahh, my head hurts.....

You should have seen me looking at the first Perl script I ever saw! I didn't understand a thing. But you'll get used to stuff like this very quickly.

Hope this helps.

-- Marcus



 
 

