
7stud
Enthusiast
Jan 28, 2013, 5:24 PM
Post #9 of 11
(4862 views)
Re: [panicz] Wrapper around the UNIX find | xargs grep
1) Using a grammar parser is a common way to solve problems like yours. I've been working on a solution using Perl's Parse::RecDescent parser, but I find it very difficult to use, and the solution feels brittle. I've used Python's PyParsing recursive descent parser, and it is much easier to use and the solutions feel more robust.

2) I think you have problems with your grammar because the bash shell drops the quotes around the strings that you want to specify as search terms, so any program will be handed a jumble of words for the search strings with no way to separate them. You can escape the quotes on the command line, but that makes the command very unwieldy to type.

In addition, I don't know whether your Scheme program handles it, but the shell expands globs (file patterns), so if you use an unquoted

*.pl or *.txt

in your command, the shell is going to feed your program something similar to this:

prog1.pl prog2.pl or data1.txt data2.txt

Unfortunately, I'm finding that to be problematic because Perl's Parse::RecDescent doesn't back up like a regex engine. So if you try to match 'or' followed by several words, the Perl parser will gladly gobble up all the remaining words in the command string and terminate.

In any case, here is (an improved) start:

use strict;
use warnings;
use 5.012;

use Parse::RecDescent;

$::RD_ERRORS = 1; #Report fatal errors
$::RD_WARN   = 1; #Also report non-fatal problems (unused rules, etc.)
$::RD_HINT   = 1; #Also give out hints to help fix problems.
#$::RD_TRACE = 1; #Trace parsers' behaviour

our %RESULTS;

my $grammar = <<'END_OF_GRAMMAR';

#Start up action(executed in parser namespace):
{
    use 5.012; #So I can use say()
    use Data::Dumper;
}

#The array @item contains the rule name followed by
#the matches for that rule, e.g.:
# @item = ('clause', 'from', ['./some/dir', 'a/b '])

startrule : clause(s)

clause: 'from' word(s /or/) #word(s) with the specified separator
    {
        #say Dumper(\@item);
        $main::RESULTS{start_dir} = $item[-1];
    }
  | 'in' word(s /or/)
    {
        #say Dumper(\@item);
        $main::RESULTS{filenames} = $item[-1];
    }
  | 'for' word(s /or/)
    {
        #say Dumper(\@item);
        $main::RESULTS{search_terms} = $item[-1];
    }

#\S+ rather than \S*, so a word can never match the empty string
word : m{ \S+ }xms

END_OF_GRAMMAR

my $text = q{from ./a in prog1.pl or data.txt for hello};

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar!\n";

defined $parser->startrule($text)
    or die "Can't match text";

use Data::Dumper;
say Dumper(\%RESULTS);

--output:--
$VAR1 = {
          'start_dir' => [
                           './a'
                         ],
          'search_terms' => [
                              'hello'
                            ],
          'filenames' => [
                           'prog1.pl',
                           'data.txt'
                         ]
        };

Then you can use File::Find to recursively search the start directory (if that is what you want to do) for the given filenames and search terms. As you can see, your Scheme solution is much more elegant.
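
As a rough sketch of that File::Find step (not tested against your real setup): the sub name search_files is made up for illustration, and it assumes it is handed a hash shaped like %RESULTS above. It also assumes the filenames are exact names (the shell has already expanded any globs) and that the search terms are literal strings rather than regexes.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper, not part of the parser code above.
# Takes a hash ref shaped like %RESULTS and returns a list of
# [path, line_number, line] triples, one per matching line.
sub search_files {
    my ($results) = @_;

    # Assumption: filenames are exact names, so a hash lookup is enough.
    my %wanted_name = map { $_ => 1 } @{ $results->{filenames} };

    # Assumption: search terms are literal strings; quotemeta() escapes
    # any regex metacharacters before the terms are joined with '|'.
    my $alternation = join '|', map { quotemeta } @{ $results->{search_terms} };
    my $term_regex  = qr/$alternation/;

    my @matches;
    find(
        sub {
            # Inside the callback, $_ is the bare file name and
            # $File::Find::name is the path including the start directory.
            return unless -f $_ && $wanted_name{$_};

            open my $fh, '<', $_ or return;   # skip unreadable files
            while (my $line = <$fh>) {
                chomp $line;
                # $. is the current line number of the file being read.
                push @matches, [$File::Find::name, $., $line]
                    if $line =~ $term_regex;
            }
            close $fh;
        },
        @{ $results->{start_dir} },
    );

    return @matches;
}

# Possible usage with the parser results from above:
#   for my $match ( search_files(\%RESULTS) ) {
#       my ($path, $line_number, $line) = @$match;
#       print "$path:$line_number: $line\n";
#   }
```

File::Find changes into each directory as it descends (its default behaviour), which is why the callback can open the bare name in $_ instead of the full path.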
(This post was edited by 7stud on Jan 30, 2013, 4:58 PM)