Editing an nth match

 



ad65
Deleted

May 15, 2001, 6:57 PM

Post #1 of 13 (3894 views)
Editing an nth match

Can anyone tell me how to do this? I've tried everything:

I want to open a file, look for the 11th matching 'mailto:'
and delete the whole line it's on.

Surely there is a way?



mhx
Enthusiast / Moderator

May 15, 2001, 9:55 PM

Post #2 of 13 (3891 views)
Re: Editing an nth match

I guess the following code should work.


Code
my $count = 0; 
while( <> ) {
next if /^\s*mailto:/ && ++$count == 11;
print;
}

It assumes that your file has no more than one mailto: per line and nothing but whitespace before the mailto:. Blank lines are allowed. The code reads from STDIN and writes to STDOUT. The eleventh line that starts with a mailto: will be discarded.
If this is not what you need, try to be more precise in your description of what you need.
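
If the mailto: can sit anywhere on the line rather than only at the start, dropping the anchor from the pattern should be enough. A minimal sketch, still assuming at most one mailto: per line; the sub name drop_nth_match and the sample data are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the code above): return the lines with the
# $n-th line matching $pattern removed. All other lines are kept as they are.
sub drop_nth_match {
    my ($pattern, $n, @lines) = @_;
    my $count = 0;
    return grep { !( /$pattern/ && ++$count == $n ) } @lines;
}

# Twelve sample lines with mailto: in the middle of the line.
my @sample = map { "entry $_ mailto:someone\n" } 1 .. 12;
my @kept   = drop_nth_match(qr/mailto:/, 11, @sample);
print @kept;    # everything except "entry 11"
```

Because the counter only advances on matching lines, lines without mailto: pass through untouched.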

-- Marcus



Jasmine
Administrator / Moderator

May 15, 2001, 9:55 PM

Post #3 of 13 (3891 views)
Re: Editing an nth match

Check out this post in the FAQ: How do I change the Nth occurrence of something?



ad65
Deleted

May 16, 2001, 5:32 AM

Post #4 of 13 (3886 views)
Re: Editing an nth match

Nope, this didn't work, and Jasmine's FAQ doesn't have the answer.
I'm using a script similar to a guestbook script to update a page with news. I only want a maximum of 10 news results, and each time one is added it goes to the top, so I need to delete one from the bottom. Also, I need it to first check if there is an 11th occurrence of mailto:, and if so, delete that line. The mailto: is not at the start of the line, however, although I could put it at the start if that's needed.



ad65
Deleted

May 16, 2001, 11:59 AM

Post #5 of 13 (3875 views)
Re: Editing an nth match

Someone? Please?



ad65
Deleted

May 16, 2001, 8:55 PM

Post #6 of 13 (3862 views)
Re: Editing an nth match

PLEASE! Someone must know how to do this... I've searched every bloody tutorial and FAQ available and none of them tells me how to do it, although there must be a way.



mhx
Enthusiast / Moderator

May 16, 2001, 10:58 PM

Post #7 of 13 (3859 views)
Re: Editing an nth match

I've checked the posted script on a dummy file -- it works perfectly, just as I expected.
Could you please give a more detailed description of what the file looks like? (Could you post the file or put it anywhere for download?) It is essential to have some information on the data if you want to parse it.
What I'd mainly like to know:
- Are there any lines without 'mailto:' in the file?
- If there are, what's the purpose of these lines?

-- Marcus



Mortimer
journeyman

May 16, 2001, 11:39 PM

Post #8 of 13 (3855 views)
Re: Editing an nth match

>I only want a maximum of
>10 news results, and each
>time one is added it goes
>to the top, so i need to delete
>one from the bottom.

This will add a new line to the top and delete the last...


Code
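use Fcntl qw(:flock);   # provides LOCK_EX

my $data_file = "c:/path/to/your/file.txt";
my $new_line  = "new_line1\n";

# Open for reading and writing so the lock covers the whole update.
open(my $fh, '+<', $data_file) or die("Unable to open $data_file: $!");
flock($fh, LOCK_EX) or die("Unable to lock $data_file: $!");
my @lines = <$fh>;
unshift(@lines, $new_line);   # new entry goes on top
pop(@lines);                  # oldest entry comes off the bottom
seek($fh, 0, 0);
truncate($fh, 0);
print $fh @lines;
close($fh);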

There must be a couple of million other ways to do this too.
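
One of those other ways, as a rough sketch: the pop in the code above removes the last line even when the file holds fewer than ten entries, so this version only trims once the list grows past a limit. The sub name add_entry and the limit of 10 are assumptions for illustration, and it works on an array rather than on the file itself:

```perl
use strict;
use warnings;

# Hypothetical helper: put $new_line on top of @lines and keep at most
# $max lines, dropping the oldest ones from the bottom.
sub add_entry {
    my ($new_line, $max, @lines) = @_;
    unshift @lines, $new_line;
    splice @lines, $max if @lines > $max;   # remove everything past $max
    return @lines;
}

my @news = map { "news item $_\n" } 1 .. 10;
@news = add_entry("brand new item\n", 10, @news);
print @news;    # "brand new item" comes first, "news item 10" has been dropped
```

The same sub could sit between the read and the write in the file-locking code.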

Sorry, I'm not too clear on what you want to do with the MAILTO string.
Hope this helps.

Dave.



ad65
Deleted

May 17, 2001, 7:58 AM

Post #9 of 13 (3843 views)
Re: Editing an nth match

Ah, damn, I didn't explain well enough. I appreciate that, but I don't think it works. There are other lines in the file; it's an HTML file. In order to match a pattern and recognise all the lines that are the news updates, I've put a tag , IF there is one, and delete the whole line that it's on.
4. Ignore all other lines, in other words, I don't mess with the rest of the html code.
5. Close the file.

Please, please, please, if you or anyone else knows how to do this, or wants to have a go at it, then please tell me. Like I said, I've looked everywhere; this is a subject that's very hard to find information on.



Mortimer
journeyman

May 17, 2001, 2:30 PM

Post #10 of 13 (3834 views)
Re: Editing an nth match

I don't think you're going to get anybody to have a go at this without taking the advice that mhx gave you in his last post. Show your data.
I believe mhx when he says his code works. He's tested it.
Guessing around can soon become frustrating even when it's for one's self, but here's one last suggestion.
Are you trying to parse the file or the page that this html file outputs? Perhaps you're treating a < br > as a \n. Just another stab in the dark. Please don't feel insulted if I'm wrong.

Cheers,
Dave.



ad65
Deleted

May 17, 2001, 6:46 PM

Post #11 of 13 (3826 views)
Re: Editing an nth match

Yeah, I believe his code works too. The point is that I think he got the wrong idea about what exactly I'm trying to do, which was my fault for not explaining well enough. Here's the code:

#
### THE FOLLOWING UPDATES THE INDEX PAGE WITH THE LATEST ENTRY
#

sub update_index {

open(MAIN, "$index_file") || die "Can't open MAIN: $index_file\n";
@main = <MAIN>;
close(MAIN);

open(MAIN, ">$index_file") || die "Can't open MAIN: $index_file\n";
foreach $line (@main) {
if ($line =~ /<TR><TD BGCOLOR=DFDFDF WIDTH=50%><FONT FACE=TAHOMA SIZE=2 COLOR=aa0000><A HREF=\"$filename\">$news</A></FONT></TD><TD ALIGN=CENTER BGCOLOR=DFDFDF><FONT FACE=TAHOMA SIZE=2 COLOR=aa0000><A HREF=\"mailto:$man\@soandsosite.com\">$man</A></FONT></TD><TD ALIGN=CENTER BGCOLOR=DFDFDF><FONT FACE=TAHOMA SIZE=2>$mon[$Month] $EntryDate</FONT></TD></TR>\n";
}

else {
print MAIN "$line";
}

}

close(MAIN);
&alphabet;

}

The rest goes on to create other pages and stuff. I'm not saying the script doesn't work; I simply have no idea how to do what I've described in my other posts, despite many attempts. And there's nothing wrong with the output of the page, so I don't need to parse it, but thanx for the guess.

I don't think my previous post came out right, so again, here's what I need to know how to do:

Open the file MAIN. Find all lines that begin with !--mailto:--. Match the 11th line that begins with !--mailto:--, IF there is one, and delete it. Ignore all the other lines, not changing any of them. Close the file.

Alternatively, if it's possible to delete all occurrences after the 10th of lines that begin with !--mailto:--, then that would be fine too. Maybe this is too hard a question for a beginner's forum, come and have a go if you think ur ard enough. lol. No but really, please do help if you can!

(This post was edited by ad65 on May 17, 2001, 5:56 PM)


mhx
Enthusiast / Moderator

May 17, 2001, 10:51 PM

Post #12 of 13 (3819 views)
Re: Editing an nth match

I surely hope this is going to be the last post in this thread. Here's the solution to your problem. (I'm still guessing what your mysterious 'MAIN' file looks like. Why don't you just post an excerpt? Is it so top secret [anyone can view the source with a browser] or is it a candidate for the next 'obfuscated HTML contest'?)
Anyway, this code does work:


Code
#!/bin/perl -w
use strict;

my $index_file = 'yyy';

# Read the whole file into memory.
open MAIN, $index_file or die "can't open `$index_file': $!\n";
my @main = <MAIN>;
close MAIN;

# Write it back: keep every line that does not start with the marker,
# plus the first ten lines that do.
open MAIN, ">$index_file" or die "can't open `$index_file': $!\n";
my $count = 0;
print MAIN grep { !/^!--mailto:--/ || $count++ < 10 } @main;
close MAIN;

Here's the proof. I've run it over the following file, which should look mostly like your main file:


Code
<some> <dummy> <code> 

!--mailto:-- (1)
<some> <dummy> <code>
!--mailto:-- (2)
!--mailto:-- (3)
<some> <dummy> <code>

<some> <dummy> <code>
!--mailto:-- (4)
<some> <dummy> <code>
!--mailto:-- (5)
!--mailto:-- (6)
<some> <dummy> <code>

!--mailto:-- (7)
!--mailto:-- (8)
<some> <dummy> <code>
!--mailto:-- (9)
!--mailto:-- (10)


!--mailto:-- (11)

<no> !--mailto:-- (keep this)
!--mailto:-- (12)

<end> <of> <file>

Running a diff over the original and the modified file shows that only the 11th and 12th lines beginning with a !--mailto:-- have been taken out:


Code
mholland@bmdke3 $ diff yyy.orig yyy 
23d22
< !--mailto:-- (11)
26d24
< !--mailto:-- (12)
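
For reuse inside a sub such as update_index, the filter could be pulled out into a function of its own and checked without touching a file. A minimal sketch; the name keep_first_marked, its parameters, and the sample page are assumptions for illustration, not part of the script above:

```perl
use strict;
use warnings;

# Hypothetical helper: keep every line that does not start with the marker,
# plus the first $max lines that do. Later marker lines are dropped.
sub keep_first_marked {
    my ($max, @lines) = @_;
    my $count = 0;
    return grep { !/^!--mailto:--/ || $count++ < $max } @lines;
}

# Two plain lines around twelve marker lines.
my @page = ("<table>\n", (map { "!--mailto:-- ($_)\n" } 1 .. 12), "</table>\n");
my @trimmed = keep_first_marked(10, @page);
print scalar(@trimmed), "\n";    # 12: two plain lines plus ten marker lines
```

The counter is only incremented for lines that start with the marker, which is why a line like "<no> !--mailto:-- (keep this)" survives.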

-- Marcus



ad65
Deleted

May 18, 2001, 8:58 AM

Post #13 of 13 (3805 views)
Re: Editing an nth match

That worked brill, thanx and sorry to be a nuisance.


 
 

