Home: Perl Programming Help: Beginner:
extract data...

 



AWHF
Novice

Jun 25, 2007, 6:34 PM

Post #26 of 44 (3577 views)
Re: [KevinR] extract data...

Sorry!
What I mean is that I want to open all the files in a directory whose names end in .doc.
Say the files are called "file1.doc", "file2.doc", "file3.doc", ... and the directory is called "document". I tried this script, but it does not work: nothing is written to "result".

Code
#!/usr/bin/perl
my $dirname = "/eng/png/home/awhf/document";
chdir ($dirname) || die "can't chdir to $dirname: $!";
opendir(DIR, $dirname) || die "can't open directory: $!";
my @files = grep {/\.doc$/} readdir DIR;
close DIR;
open(OUT, ">>/eng/png/home/awhf/result") || die "can't open: $!";
while (<@files>)
{
s/\s*(\w+).*/$1/;
if (/^::start/../^end/)
{
print OUT;
}
}
close OUT



(This post was edited by AWHF on Jun 25, 2007, 6:40 PM)


KevinR
Veteran


Jun 25, 2007, 8:18 PM

Post #27 of 44 (3569 views)
Re: [AWHF] extract data...

I thought I already showed you how to do this earlier in the thread. If you want to read the files listed in the @files array, you have to open each one explicitly:


Code
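#!/usr/bin/perl
use strict;
use warnings;

my $dirname = "/eng/png/home/awhf/document";
chdir($dirname) || die "can't chdir to $dirname: $!";
opendir(DIR, $dirname) || die "can't open directory: $!";
my @files = grep {/\.doc$/} readdir DIR;
closedir DIR;    # directory handles are closed with closedir, not close
open(OUT, ">>/eng/png/home/awhf/result") || die "can't open: $!";
foreach my $file (@files) {
    # readdir returns bare file names; they resolve because of the chdir above
    open(my $FH, '<', $file) or die "can't open $file: $!";
    while (<$FH>) {
        s/\s*(\w+).*/$1/;
        # these patterns are case-sensitive: they expect "::start" and "end"
        if (/^::start/../^end/) {
            print OUT;
        }
    }
    close $FH;
}
close OUT;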


I don't know if that's going to print anything to OUT, but that's the way to cycle through and read the files in @files.
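
For readers following along, here is a minimal sketch of the same loop wrapped in a function, under a few assumptions that are not from the thread: the markers are matched case-insensitively (the sample files later in the thread use "::Start" and "::End"), only the first word of each line is wanted, and the helper name first_words_in_dir is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the first word of every line found between
# the "::Start" and "::End" markers in each .doc file under $dir.
sub first_words_in_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "can't open directory $dir: $!";
    my @files = sort grep { /\.doc$/ } readdir $dh;
    closedir $dh;

    my @words;
    for my $file (@files) {
        # readdir returns bare names, so prepend the directory
        open(my $fh, '<', "$dir/$file") or die "can't open $dir/$file: $!";
        while (my $line = <$fh>) {
            # the range operator is true from the start marker through the end marker
            next unless $line =~ /^::start/i .. $line =~ /^::end/i;
            next if $line =~ /^::(?:start|end)/i;      # skip the markers themselves
            push @words, $1 if $line =~ /^\s*(\w+)/;   # first word; blank lines are skipped
        }
        close $fh;
    }
    return @words;
}

# Assumed usage: pass the directory on the command line.
if (@ARGV) {
    print "$_\n" for first_words_in_dir($ARGV[0]);
}
```

Returning a list instead of printing straight to OUT is only a design choice for this sketch; it makes the extraction easy to check before anything is appended to the result file.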
-------------------------------------------------


AWHF
Novice

Jun 29, 2007, 12:27 AM

Post #28 of 44 (3558 views)
Re: [KevinR] extract data...


I tried the code, but it still does not work. There is an error: "result" is full of numbers instead of the information that is in the .doc files.


Code
#!/usr/bin/perl
use strict;
use warnings;
my $dirname = "/eng/png/home/awhf/document";
chdir ($dirname) or die "Can't chdir to $dirname: $!";
opendir(DIR, '.') or die "Cannot open directory: $!";
my @files = grep {/\.cas$/} readdir DIR;
close DIR;

open (OUT,">>/eng/png/home/awhf/results") or die "Can't open results: $!";
{
local @ARGV = @files;
local $/ = undef;
while(<>) {
if(/::START(.*?)::END/is) {
my $match = $1;
$match =~ s/(\w+)/$1/gm;

print OUT $match;
}
}
}
close OUT;



The content of "result" is as follows:
123543678432435476452454769875735
123543579879067563427346582134490
432135468424130978967581341341657

instead of:
Ali
Zarul

Alec

Caster








As we discussed previously, there is an error in the output. I noticed that the code in red has an error. How can I correct it?

The example files I am working on are as follows:

File1.doc

::Start
Ali Baba
Zarul
Bahrul
Alec
Foong
Caster
Lee
::End

File2.doc

::Start
Calvin Lee
Ester Foong
Alex Wong

Lee Kim
::End

File3.doc

::Start
Jacob Lim
Irene Chai
::End



KevinR
Veteran


Jun 29, 2007, 12:56 AM

Post #29 of 44 (3552 views)
Re: [AWHF] extract data...


Quote
"result" is full of numbers instead of the information that is in the .doc files.


I don't know why you are getting those numbers. When I run the code with the sample data, I get the correct results.
-------------------------------------------------


AWHF
Novice

Jun 29, 2007, 1:12 AM

Post #30 of 44 (3551 views)
Re: [KevinR] extract data...

Is there other code that could replace the code in red?


KevinR
Veteran


Jun 29, 2007, 10:36 AM

Post #31 of 44 (3541 views)
Re: [AWHF] extract data...

There are probably a few ways to do what you are trying.
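
One of those ways, as a rough sketch only: keep the slurp-and-match approach from the script above, but pull the first word out of each line of the captured block. Note that a substitution of the form s/(\w+)/$1/gm replaces every word with itself, so it leaves the text unchanged. The helper name first_words_in_block and the sample string are assumptions made for this example, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: given the whole text of one file, return the first
# word of each non-blank line between "::Start" and "::End".
sub first_words_in_block {
    my ($text) = @_;
    return () unless $text =~ /::START(.*?)::END/is;
    my $block = $1;
    # /m lets ^ match at the start of every line inside the block;
    # in list context /g returns every captured first word
    return $block =~ /^[ \t]*(\w+)/mg;
}

my $sample = "::Start\nAli Baba\nZarul\nBahrul\n::End\n";
print join("\n", first_words_in_block($sample)), "\n";
```

With the sample string this prints Ali, Zarul and Bahrul on separate lines.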
-------------------------------------------------


AWHF
Novice

Jul 1, 2007, 5:11 PM

Post #32 of 44 (3528 views)
Post deleted by AWHF

 


KevinR
Veteran


Jul 1, 2007, 8:31 PM

Post #33 of 44 (3525 views)
Re: [AWHF] extract data...

I might help you, but I will not do it for you.
-------------------------------------------------


AWHF
Novice

Jul 2, 2007, 1:42 AM

Post #34 of 44 (3521 views)
Re: [KevinR] extract data...

Actually, following the previous discussion, I need to do an assignment. It is as follows: I have 2 directories, "DIR_A" and "DIR_B".
In "DIR_A" I have many files whose names end in ".doc1" (e.g. file1.doc1, file2.doc1, ...).
The content of the files is as follows (example):



file1.doc1
::Start
ali Baba
zarul Bahrul
alec Foong
caster Lee
::End

file2.doc1
::Start
Calvin Lee
Ester Foong
Alex Wong
Lee Kim
::End

file3.doc1
::Start
jacob Lim
irene Chai
::End


In "DIR_B" I also have many files, whose names end in "_fast.txt" and "_slow.txt". The content of the files is as follows (example):

file1_fast.txt
this is the 1st line
this is the 2nd line
name(ali)
this is the 3rd line
name(zarul)
this is the 4th line
name(alec)
this is the 5th line
this is the 6th line

file2_fast.txt
this is the 1st line
this is the 2nd line
name(calvin)
this is the 3rd line
name(alex)
this is the 4rd line

file3_fast.txt

this is the 1st line
this is the 2nd line
name(jacob)
this is the 3rd line
this is the 4th line
name(irene)
this is the 5th line
this is the 6th line


file1_slow.txt
this is 1st sentence
name(ali)
this is 2st sentence
name(zarul)
this is 3st sentence
name(alec)
this is 4st sentence


file2_slow.txt
this is 1st sentence
this is 2st sentence
name(calvin)
this is 3st sentence
name(alex)
this is 4st sentence



file3_slow.txt

this is 1st sentence
name(jacob)
this is 2st sentence
name(irene)
this is 3st sentence

The "_fast.txt" and "_slow.txt" files share the same base names. As you can see above:
"file1_fast.txt" = "file1_slow.txt"
"file2_fast.txt" = "file2_slow.txt"
"file3_fast.txt" = "file3_slow.txt"
The content of the "_fast.txt" and "_slow.txt" files is not the same, but the information that needs to be extracted is the same.









All the information to extract is in blue. My home directory is /eng/home/assignment, where "DIR_A" and "DIR_B" are located.
I need to compare the information in all the files between "DIR_A" and "DIR_B". The comparison must be made between files with the same name (e.g. file1.doc with file1_fast.txt with file1_slow.txt).

The output of the comparison will look like this:
file1
caster

file2
ester
lee

file3
none

It only shows the differences found by the comparison.




So what is the concept or methodology, or what would the flowchart look like?


KevinR
Veteran


Jul 2, 2007, 9:08 AM

Post #35 of 44 (3514 views)
Re: [AWHF] extract data...

hint: use a hash
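
To illustrate the hint, here is a minimal sketch of a hash used as a lookup set to find names that appear in one list but not in another. The helper name missing_names is hypothetical, the lists are typed in by hand from the file1 example, and the case-insensitive comparison is an assumption based on the lowercase expected output.

```perl
use strict;
use warnings;

# Hypothetical helper: names in @$wanted that are missing from @$seen,
# compared case-insensitively and returned in lowercase.
sub missing_names {
    my ($wanted, $seen) = @_;
    my %seen;
    $seen{ lc $_ } = 1 for @$seen;    # hash used as a lookup set
    return grep { !$seen{$_} } map { lc $_ } @$wanted;
}

my @doc  = qw(ali zarul alec caster);   # first words from file1.doc1
my @fast = qw(ali zarul alec);          # name(...) entries from file1_fast.txt
my @diff = missing_names(\@doc, \@fast);
print @diff ? "@diff\n" : "none\n";     # prints "caster"
```

A hash lookup avoids comparing every name against every other name, and sorting is not needed for the comparison itself.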
-------------------------------------------------


AWHF
Novice

Jul 2, 2007, 5:11 PM

Post #36 of 44 (3509 views)
Re: [KevinR] extract data...

DIR_A
1) Go into DIR_A
2) Sort the files so that the ".doc", "_fast.txt" and "_slow.txt" files with the same name line up (e.g. file1.doc = file1_fast.txt = file1_slow.txt)
3) Open the files one by one
4) Extract the info
5) Sort the info
6) Compare
7) Print out the result (differences and matches)

DIR_A -> open DIR_A -> sort files -> open the same name file -> extract info -> sort info -> compare -> print result

DIR_B
1) Go into DIR_B
2) Choose the same type of file (e.g. _fast.txt and _slow.txt)
3) Sort the files so that the ".doc", "_fast.txt" and "_slow.txt" files with the same name line up (e.g. file1.doc = file1_fast.txt = file1_slow.txt)
4) Open the files one by one
5) Extract the info
6) Sort the info
7) Compare
8) Print out the result (differences and matches)

DIR_B -> open DIR_B -> choose type of file -> sort files -> open the same name file -> extract info -> sort info -> compare -> print result



Is my concept of the whole script correct?
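
The "same name" step in the plan above can be sketched by grouping file names under their shared base name, which removes the need to sort two listings into the same order. This is only an illustration: the helper name group_by_base is made up, and accepting both ".doc" and ".doc1" is an assumption, since the thread uses both spellings.

```perl
use strict;
use warnings;

# Hypothetical helper: group file names by base name, so that
# "file1.doc1", "file1_fast.txt" and "file1_slow.txt" land under "file1".
sub group_by_base {
    my @names = @_;
    my %group;
    for my $name (@names) {
        if ($name =~ /^(.+)\.doc1?$/) {
            $group{$1}{doc} = $name;
        }
        elsif ($name =~ /^(.+)_(fast|slow)\.txt$/) {
            $group{$1}{$2} = $name;    # key is "fast" or "slow"
        }
    }
    return %group;
}

my %g = group_by_base(qw(file2_slow.txt file1.doc1 file1_fast.txt file1_slow.txt file2.doc1));
for my $base (sort keys %g) {
    my @kinds = sort keys %{ $g{$base} };
    print "$base: @kinds\n";
}
```

For the names listed in the example, this prints "file1: doc fast slow" and "file2: doc slow", which also shows at a glance when one of the three files is missing.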


(This post was edited by AWHF on Jul 2, 2007, 7:27 PM)


KevinR
Veteran


Jul 2, 2007, 8:35 PM

Post #37 of 44 (3504 views)
Re: [AWHF] extract data...

sounds good
-------------------------------------------------


AWHF
Novice

Jul 2, 2007, 8:55 PM

Post #38 of 44 (3503 views)
Re: [KevinR] extract data...

So, this is the layout of my script:



open DIR_A
sort files
{
    open the same name file
    {
        extract info
        {
            sort info
        }
    }
}
Output A


open DIR_B
{
    choose type of file (e.g: whether is _fast.txt or _slow.txt)
    {
        sort files
        {
            open the same name file
            {
                extract info
                {
                    sort info
                }
            }
        }
    }
}
Output B
Compare Output A with Output B
{
    print result
}





What do you think?
Does it loop correctly?



(This post was edited by AWHF on Jul 2, 2007, 10:21 PM)


KevinR
Veteran


Jul 2, 2007, 9:32 PM

Post #39 of 44 (3501 views)
Re: [AWHF] extract data...

Start writing real code and test as you go. You will get quick feedback about your programming logic that way. ;)
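
As an illustration of testing as you go, one small piece of the task can be written as a function and checked against data typed in by hand before any real files are opened. The helper name names_from_lines is hypothetical, and lowercasing the names is an assumption based on the expected output shown earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the names out of lines such as "name(ali)".
sub names_from_lines {
    my @lines = @_;
    my @names;
    for my $line (@lines) {
        push @names, lc $1 if $line =~ /name\((\w+)\)/;
    }
    return @names;
}

# Quick check against hand-typed data before touching real files.
my @got = names_from_lines(
    "this is the 1st line\n",
    "name(ali)\n",
    "this is the 3rd line\n",
    "name(zarul)\n",
);
print "@got" eq "ali zarul" ? "ok\n" : "not ok: @got\n";
```

Once a piece like this prints "ok", the next piece can be added and checked the same way.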
-------------------------------------------------


AWHF
Novice

Jul 3, 2007, 2:18 AM

Post #40 of 44 (3495 views)
Re: [KevinR] extract data...


Code
#!/usr/bin/perl 
my $dir_A = "/eng/home/assignment/dir_A";
chdir($dir_A) || die "Can't chdir to $dirname: $!";
opendir(DIRA, $dir_A) || die "Can't open directory: $1";
my @docfiles = grep{/\.doc$/} readdir DIRA;
close DIRA;
@sortdoc = (sort @docfiles);
open(FILE, $sortdoc);
while (<FILE>)
{
s/(\w+)/$1/;
if (/^::start/../^::end/)
{
push (@array, $1);
}
@sortdoc = (sort, @array);
}
my $dir_B = "/eng/home/assignment/dir_B";
chdir($dir_B) || die "Can't chdir to $dirname: $!";
opendir(DIRB, $dir_B) || die "Can't open directory: $1";
my @txtfilesfast = grep{/\_fast.txt$/} readdir DIRB;
my @txtfilesslow = grep{/\_slow.txt$/} readdir DIRB;
close DIRB;
@sortfiletxtfast = (sort @txtfilesfast);

open (FILE1, $txtfilesfast);
while ($line1 =<FILE1>)
{
if($line1 =~/name\((\w+\)/)
{
push (@array1, $line1);
}
@sorttxtfast = (sort, @array1);
}
open (FILE2, $txtfilesslow);
while ($line2 =<FILE2>)
{
if($line2 =~/name\((\w+\)/)
{
push (@array2, $line2);
}
@sorttxtslow = (sort, @array2);
}


This is part of my script, but I haven't tested it because I don't know how to continue the script to do the comparison. Can you help me? By the way, is my script correct?
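
A few details in a draft like the one above would stop it from running: a second readdir on the same handle returns nothing because the first call has already consumed the listing, $1 in the die messages should be $!, the pattern /name\((\w+\)/ has an unclosed group, "sort, @array" is a syntax error, and open is given a scalar ($sortdoc, $txtfilesfast) that was never assigned rather than an element of the array. The sketch below shows only the directory-reading part, under the assumption that one listing is read and then split; the helper name fast_and_slow is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: read a directory once and split the listing into
# the "_fast.txt" and "_slow.txt" files, each sorted by name.
sub fast_and_slow {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!";
    my @all = readdir $dh;    # a second readdir here would return nothing
    closedir $dh;
    my @fast = sort grep { /_fast\.txt$/ } @all;
    my @slow = sort grep { /_slow\.txt$/ } @all;
    return (\@fast, \@slow);
}

# Assumed usage: pass the directory on the command line.
if (@ARGV) {
    my ($fast, $slow) = fast_and_slow($ARGV[0]);
    print "fast: @$fast\nslow: @$slow\n";
}
```

Running the script under "use strict" and "use warnings" would also report the undeclared variables in the draft.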


KevinR
Veteran


Jul 3, 2007, 10:19 AM

Post #41 of 44 (3486 views)
Re: [AWHF] extract data...

How come you are not asking your teacher/professor/tutor these questions?
-------------------------------------------------


AWHF
Novice

Jul 3, 2007, 4:34 PM

Post #42 of 44 (3484 views)
Re: [KevinR] extract data...

Actually, I am learning Perl by myself. None of my friends know Perl. I feel that Perl is very useful, especially for extracting data. So I hope this forum will help me, and I hope you will guide me through my learning process. :)


KevinR
Veteran


Jul 3, 2007, 4:53 PM

Post #43 of 44 (3482 views)
Re: [AWHF] extract data...

OK, well, I'm really not a very good teacher. I am a self-taught Perl coder myself and my education has some large gaps in it, but I'll try to assist you.
-------------------------------------------------


AWHF
Novice

Jul 3, 2007, 4:57 PM

Post #44 of 44 (3481 views)
Re: [KevinR] extract data...

Ok! ;)
So, I'm looking forward to your help in completing my first script.
