Duplicates in column 2 - printing number from column 1

 



regex2012
User

Apr 28, 2016, 7:27 AM

Post #1 of 2 (1168 views)

I am trying to print a number from column 1 if column 2 contains duplicate entries. Here is an example of the data in the file:
128623,/vol/filecase
128624,/vol/filecase
128622,/vol/production
128621,/vol/education
128620,/vol/learningcenter

Here /vol/filecase appears twice in the 2nd column, so it is a duplicate entry, and I would like either of its numbers (128623 or 128624) from column 1 printed out.

Here is what I have:


Code
#!/usr/bin/perl 
use strict;
use warnings;

open my $FH2, '<', '/tmp/list_file.txt' or die "unable to open '/tmp/list_file.txt' for reading : $!";
open my $FH6, '>', '/tmp/test.txt' or die "unable to open '/tmp/test.txt' for writing : $!";
my %duplicates;
while (<$FH2>) {
    # The hash key is the whole line, i.e. both columns together.
    print {$FH6} $_ if defined $duplicates{$_};
    $duplicates{$_}++;
}
close $FH2;
close $FH6;

open my $FH3, '<', '/tmp/test.txt' or die "unable to open '/tmp/test.txt' for reading : $!";
while (<$FH3>) {
    my @fields = split /,/, $_;
    print "$fields[0]\n";
}
close $FH3;


This works only if the whole lines are identical, like this:
128623,/vol/filecase
128623,/vol/filecase

If the numbers in column 1 are different, it doesn't work.

I don't know how to identify the field for column 1 in the script, so that it checks for the duplicate entry in the second column and prints the number from the first column.

Anyone have ideas?


BillKSmith
Veteran

Apr 28, 2016, 8:16 AM

Post #2 of 2 (1165 views)
Re: [regex2012] Duplicates in column 2 - printing number from column 1


Code
#!/usr/bin/perl  
use strict;
use warnings;

# Sample data from the question, read through an in-memory filehandle.
my $example
    = "128623,/vol/filecase\n"
    . "128624,/vol/filecase\n"
    . "128622,/vol/production\n"
    . "128621,/vol/education\n"
    . "128620,/vol/learningcenter\n"
    ;

#open my $FH2, '<', '/tmp/list_file.txt' or die "unable to open '/tmp/list_file.txt' for reading : $!";
open my $FH2, '<', \$example or die "unable to open 'example' for reading : $!";
#open my $FH6, '>', '/tmp/test.txt' or die "unable to open '/tmp/test.txt' for writing : $!";
my $FH6 = \*STDOUT;

my %duplicates;
while (<$FH2>) {
    chomp;
    my ($column_1, $column_2) = split /,/;
    # Key the hash on column 2 only, so rows with different numbers still match.
    print {$FH6} "$column_1\n" if defined $duplicates{$column_2};
    $duplicates{$column_2}++;
}
close $FH6;
close $FH2;


OUTPUT:

Code
128624


Or use the shorter version:

Code
my %duplicates = ();
# chomp first so a last line without a trailing newline still matches.
print {$FH6}
    map { chomp; my ( $c1, $c2 ) = split /,/; $duplicates{$c2}++ ? "$c1\n" : () } <$FH2>;
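
Both versions above print the number from the second and any later occurrence of a repeated column 2 value. If you would rather see every number that belongs to a duplicated path, one possible variation is to collect the column 1 values per column 2 value first and report afterwards. This is only a sketch under that assumption: the helper name find_duplicates and the "path: numbers" output format are made up for illustration and are not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data copied from the question.
my $example
    = "128623,/vol/filecase\n"
    . "128624,/vol/filecase\n"
    . "128622,/vol/production\n"
    . "128621,/vol/education\n"
    . "128620,/vol/learningcenter\n";

# find_duplicates is a hypothetical helper (not from the thread).
# It returns one "path: n1,n2,..." string for each column 2 value
# that occurs more than once, in order of first appearance.
sub find_duplicates {
    my ($fh) = @_;
    my ( %numbers_for, @order );
    while ( my $line = <$fh> ) {
        chomp $line;
        my ( $column_1, $column_2 ) = split /,/, $line, 2;
        next unless defined $column_2;    # skip blank or malformed lines
        push @order, $column_2 unless exists $numbers_for{$column_2};
        push @{ $numbers_for{$column_2} }, $column_1;
    }
    my @report;
    for my $path (@order) {
        my @numbers = @{ $numbers_for{$path} };
        push @report, "$path: " . join( ',', @numbers ) if @numbers > 1;
    }
    return @report;
}

open my $fh, '<', \$example or die "unable to open 'example' for reading : $!";
my @report = find_duplicates($fh);
close $fh;

print "$_\n" for @report;
```

With the sample data this should print /vol/filecase: 128623,128624. To run it against the real file, open '/tmp/list_file.txt' instead of \$example.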

Good Luck,
Bill

(This post was edited by BillKSmith on Apr 29, 2016, 1:24 PM)

 
 

