
BillKSmith
Veteran
Aug 28, 2012, 6:04 AM
Post #10 of 10
(21252 views)
Re: [Chupo_cro] Regexp to remove a matched pattern but...
I am sure that Laurent is on the right track. This will take more than one regex. I recommend that you implement the solution as a subroutine. For testing purposes, it may be worth the extra effort to package that subroutine as a module. The subroutine should be tested with one of Perl's test modules (see http://search.cpan.org/~rgarcia/perl-5.10.0/lib/Test/Tutorial.pod). Your specification is already in exactly the form that you need.

EDITS: I have replaced the code in this post. The failures reported by the original were correct. I have improved the output of the test and replaced the regular expressions with a new set. The new subroutine passes all but the blue test case. That test case appears to have too many spaces between 'one' and 'two', so I have taken the liberty of changing it in my test. The code posted below now passes all tests (including the extra one I describe below).
    use strict;
    use warnings;
    use Test::More qw( no_plan );

    # Keys are the input lines; values are the expected results.
    my %fixed_line = (
        'one two some_pattern three four'    => 'one two three four',
        'one two some_pattern three four'    => 'one two three four',
        'one twosome_pattern three four'     => 'one two three four',
        'one two some_patternthree four'     => 'one two three four',
        'one two three four some_pattern'    => 'one two three four',
        "one two three four some_pattern\n"  => "one two three four\n",
        'some_pattern one two three four'    => 'one two three four',
        'onetwosome_patternthreefour'        => 'onetwothreefour',
        'one two some_pattern three four'    => 'one two three four',
    );

    foreach my $line (keys %fixed_line) {
        my $expected = $fixed_line{$line};
        my $computed = fix($line);
        is( $computed, $expected, "'$line' => '$computed'" );
    }

    sub fix {
        my ($line) = @_;
        # Pattern at the start or end of the line: remove it and the adjacent whitespace.
        $line =~ s/(?:^\s*some_pattern\s*)|(?:\s*some_pattern$)//;
        # Pattern with non-whitespace on both sides: remove it without adding a space.
        $line =~ s/(?<=\S)some_pattern(?=\S)//;
        # Anything else: replace the pattern and surrounding whitespace with one space.
        $line =~ s/\s*some_pattern\s*/ /;
        return $line;
    }

There are other issues that should be tested. Regexps treat tabs and newlines as whitespace, which may not be what you want (e.g. your fifth case would be much different if there were a newline at the end). I suspect that you will continue to discover special cases for some time. This kind of test will assure you that proposed fixes do not break the old code.

Good Luck,
Bill
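To illustrate the point about tabs and newlines, here is a minimal sketch comparing \s with a character class that matches literal spaces only. The subroutine names strip_any_ws, strip_spaces_only and show are hypothetical (they are not part of the code above), and only the last of the three substitutions is shown.

```perl
use strict;
use warnings;

# Hypothetical variant: \s matches spaces, tabs and newlines alike.
sub strip_any_ws {
    my ($line) = @_;
    $line =~ s/\s*some_pattern\s*/ /;
    return $line;
}

# Hypothetical variant: [ ] matches literal space characters only.
sub strip_spaces_only {
    my ($line) = @_;
    $line =~ s/[ ]*some_pattern[ ]*/ /;
    return $line;
}

# Helper (also hypothetical) that makes tabs and newlines visible.
sub show {
    my ($s) = @_;
    $s =~ s/\t/\\t/g;
    $s =~ s/\n/\\n/g;
    return $s;
}

my $input = "one\tsome_pattern\ntwo";
print show( strip_any_ws($input) ),      "\n";   # prints: one two
print show( strip_spaces_only($input) ), "\n";   # prints: one\t \ntwo
```

With \s the tab and the newline are swallowed along with the pattern; with [ ] they survive. Which behaviour is right depends on your specification, so each choice deserves its own test case. If tabs should count but newlines should not, \h (horizontal whitespace, available from Perl 5.10) is another option.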
(This post was edited by BillKSmith on Aug 28, 2012, 9:25 AM)