Comparing text areas

 



jambo1986
Novice

Nov 2, 2006, 10:15 AM

Post #1 of 6 (613 views)
Comparing text areas

Hi all, I am very new here. I have looked around the forum and it looks really good. I did lots of searches to make sure there weren't any posts answering my question. There were some, but none exactly what I'm looking for.

Basically, what I have is an HTML page with two text areas, among other things, and when I hit submit I want a Perl CGI script to do a number of things.

First and most importantly, I want to compare them line by line and then print out all the lines which are identical (ignoring case, number of spaces, etc.).

Originally I had lots of ideas: putting them into arrays, splitting/joining them into one big string, etc., but I just can't get it working whatsoever. I found something on here that I thought could help, where I put the two texts into files using filehandles, then converted them into arrays and used the code below:


use CGI qw(param);

my $otext = param("original");   # input from my text areas
my $ptext = param("plag");

# send the text to files ("or die" so a failed open is actually reported)
open(NEW, ">plag.txt") or die "Unable to write plag.txt\n $! \n";
print NEW $ptext;
close NEW;

open(EXISTING, ">original2.txt") or die "Unable to write original2.txt\n $!\n";
print EXISTING $otext;
close EXISTING;

# read the files back in, one line per array element
open(NEW, "<plag.txt") or die "Unable to open plag.txt\n $! \n";
my @new = <NEW>;
close NEW;

open(EXISTING, "<original2.txt") or die "Unable to open original2.txt\n $!\n";
my @existing = <EXISTING>;
close EXISTING;

print "Content-type: text/html\n\n";

# count how often each line occurs across both texts
my (@union, @intersection, @difference);
my %count;
foreach my $element (@new, @existing) { $count{$element}++ }
foreach my $element (keys %count) {
    push @union, $element;
    push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element;
}

print "Intersection:<br>\n";
print "lines in common<br>\n";
print "$_<br>\n" for @intersection;
print "<br>Difference:<br>\n";
print "$_<br>\n" for @difference;


This can give me the lines that are in common, however it doesn't ignore case or spacing. To be honest, I don't want to do it this way. Ideally I would just like to use strings/arrays.

I hope this makes some sense, and to be honest you would be best ignoring my code; this was a final desperate attempt and isn't really doing what I want!

Thanks.


(This post was edited by jambo1986 on Nov 2, 2006, 10:17 AM)


KevinR
Veteran


Nov 2, 2006, 11:06 AM

Post #2 of 6 (610 views)
Re: [jambo1986] Comparing text areas

There is the List::Compare module that has many list comparison methods.

But maybe this is all you need or want:


Code
my $otext = param('original');   # field names as used in the first post
my $ptext = param('plag');

# one entry per line, with all whitespace removed and lowercased
my @existing = map { s/\s+//g; lc($_) } split(/\r?\n/, $otext);
my @new      = map { s/\s+//g; lc($_) } split(/\r?\n/, $ptext);

my (@union, @intersection, @difference);
my %count;
foreach my $element (@new, @existing) { $count{$element}++ }
foreach my $element (keys %count) {
    push @union, $element;
    push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element;
}

print "Intersection:<br>\n";
print "lines in common<br>\n";
print "$_<br>\n" for @intersection;
print "<br>\n";
print "Difference:<br>\n";
print "$_<br>\n" for @difference;
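One thing to watch with the counting approach above: a line that appears twice in the same text area also ends up with a count above 1, so it would be listed as common even if the other text never contains it. Below is a minimal sketch of one way around that, assuming the two texts are already held in strings. The helper names normalize_lines and common_lines are made up for illustration and are not from any module.

```perl
use strict;
use warnings;

# Hypothetical helper: split a text into lines, strip all whitespace,
# lowercase, and keep each distinct non-empty line only once.
sub normalize_lines {
    my ($text) = @_;
    my %seen;
    my @lines;
    for my $line (split /\r?\n/, $text) {
        (my $key = lc $line) =~ s/\s+//g;
        next if $key eq '' or $seen{$key}++;
        push @lines, $key;
    }
    return @lines;
}

# Hypothetical helper: normalised lines present in both texts,
# in the order they appear in the first text.
sub common_lines {
    my ($first, $second) = @_;
    my %in_second = map { $_ => 1 } normalize_lines($second);
    return grep { $in_second{$_} } normalize_lines($first);
}

# "The cat sat" is repeated in the first text only, so it is not reported.
my $otext = "The cat sat\nThe cat sat\nHello  World\n";
my $ptext = "hello world\nSomething else\n";
print "$_\n" for common_lines($otext, $ptext);   # prints "helloworld"
```

Because each text is de-duplicated before the lookup, only lines that really occur in both texts come back.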

-------------------------------------------------


jambo1986
Novice

Nov 2, 2006, 11:19 AM

Post #3 of 6 (608 views)
Re: [KevinR] Comparing text areas

Hi Kevin,

Thank you very much for the fast reply, it is much appreciated!

I'm very new to Perl. I've done other languages like Java to some extent, but wanted to try my hand at something a little different. It was a bit of a toss-up between Perl and PHP, so I thought I'd have a bit of a play about with Perl. I think I've got the hang of the basics and am just trying out regular expressions.

Next, using the same two texts, I want to see if I can compare the space and punctuation distribution:

e.g.

original text

a dog ran away, but his owner found him!

new text

this is differnt but, has same spacing and puncuation!

So all the words are different, but the spacing and punctuation are still the same as in the original text. I'm not sure if doing this is possible? I'm finding it all quite interesting but really quite challenging. I've done lots of practice pages, but it's difficult to teach yourself unless you set yourself exercises to do.

Thanks.


KevinR
Veteran


Nov 2, 2006, 3:13 PM

Post #4 of 6 (605 views)
Re: [jambo1986] Comparing text areas

There are quite a lot of modules for examining text, running all sorts of analysis on it and spitting out results. I'm not as familiar with those modules as I would like to be. But for simple sentences like the ones you posted, simple code should work:


Code
my $var1 = 'a dog ran away, but his owner found him!';
my $var2 = 'this is differnt but, has same spacing and puncuation!';
my ($r1, $r2) = convert($var1, $var2);
print qq~Comparing:<br>
\$var1 - '$var1'<br>
\$var2 - '$var2'<br><br>
~;

print "Results:<br>\n";
if ($r1 eq $r2) {
    print "Space and punctuation distribution appears the same.";
}
else {
    print "Space and punctuation distribution appears different.";
}
print qq~<br>
\$var1 - $r1<br>
\$var2 - $r2<br>
<p>Legend:<br>
W - word<br>
S - space<br>
P - punctuation
</p>~;

# Reduce each string to a pattern of W (word), S (space) and P (punctuation).
sub convert {
    my ($var1, $var2) = @_;
    for ($var1, $var2) {
        s/\w+/W/g;      # every run of word characters
        s/\s+/S/g;      # every run of whitespace
        s/[^WS]/P/g;    # whatever is left is punctuation
    }
    return ($var1, $var2);
}
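The code above treats every punctuation mark as the same P, so "him!" and "him?" would count as matching. If that matters, here is a small variation as a sketch only: it does the word and space replacement in one pass and leaves the punctuation characters as they are. The function name shape is invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a sentence to its "shape".
# Each run of word characters becomes W, each run of whitespace becomes S,
# and every other character (the punctuation) is left untouched.
sub shape {
    my ($text) = @_;
    $text =~ s/(\w+)|(\s+)/defined $1 ? 'W' : 'S'/ge;
    return $text;
}

my $var1 = 'a dog ran away, but his owner found him!';
my $var2 = 'this is differnt but, has same spacing and puncuation!';

print shape($var1), "\n";
print shape($var1) eq shape($var2) ? "same shape\n" : "different shape\n";
```

Since the substitution works on a copy of the argument, the original strings stay intact for printing afterwards.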

-------------------------------------------------


jambo1986
Novice

Nov 4, 2006, 4:33 PM

Post #5 of 6 (590 views)
Re: [jambo1986] Comparing text areas

Hi Kevin,

Thanks for all your help. After some testing, I was wondering if you could give me some further advice regarding the original question of comparing the text areas. There are a few problems: if a user doesn't hit return at any point in the text area and just types, letting the text wrap onto a new line by itself, the identical text is printed as one long string, e.g.

becausemaintainingacentrallistofdomainname/ipaddresscorrespondenceswouldbeimpractical,thelistsofdomainnamesandipaddressesaredistributedthroughouttheinternetinahierarchyofauthority.thereisprobablyadnsserverwithinclosegeographicproximitytoyouraccessproviderthatmapsthedomainnamesinyourinternetrequestsorforwardsthemtootherserversintheinternet.


Is there a way for it to be printed out the same way as it appears in the text area? In this case it is 80 cols. And the next question: once the comparison has been done ignoring case and spacing, how can I make the lines that are printed out include the original case and spacing, so that they make sense when reading them?

thanks.


KevinR
Veteran


Nov 4, 2006, 5:08 PM

Post #6 of 6 (589 views)
Re: [jambo1986] Comparing text areas

You can use the Text::Wrap module to wrap text. Or you could use substr() or a regexp to break long strings into columns. If you want to print the text as it was received from the textarea, save a copy of the text in its original condition and use that copy to print and the other copy to run the comparisons.
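A rough sketch of how the two suggestions could fit together, assuming the texts are already in strings: the stripped, lowercased version of each line is used only as a lookup key, and the untouched line is what gets printed, wrapped with Text::Wrap. The helper names key_for and common_original_lines are invented for illustration. Text::Wrap and its $columns and $unexpand settings are the real module.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

$Text::Wrap::columns  = 80;   # Text::Wrap keeps lines shorter than this
$Text::Wrap::unexpand = 0;    # do not turn runs of spaces back into tabs

# Hypothetical helper: comparison key for a line (lowercase, no whitespace).
sub key_for {
    my ($line) = @_;
    (my $key = lc $line) =~ s/\s+//g;
    return $key;
}

# Hypothetical helper: return the lines of $ptext, in their original form,
# whose comparison keys also occur in $otext.
sub common_original_lines {
    my ($otext, $ptext) = @_;
    my %in_original;
    $in_original{ key_for($_) } = 1 for split /\r?\n/, $otext;
    delete $in_original{''};   # blank lines never count as a match
    return grep { $in_original{ key_for($_) } } split /\r?\n/, $ptext;
}

my $otext = "Because maintaining a central list would be impractical,\n"
          . "the lists are distributed.";
my $ptext = "because  maintaining a CENTRAL list would be impractical,\n"
          . "something unrelated";

# Print each matching line as it was typed, wrapped for display.
for my $line (common_original_lines($otext, $ptext)) {
    print wrap('', '', $line), "\n";
}
```

Wrapping the original line rather than the stripped one matters here, because Text::Wrap looks for whitespace to break on, and the stripped version has none left.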
-------------------------------------------------

 
 

