regex newbie question

 



spudfan
New User

Apr 13, 2011, 5:04 AM

Post #1 of 8 (4957 views)
regex newbie question

Hi,

I can't sort this out.
I want to write a regex that matches allowed filenames.

find . | perl -ne 'print if /[^A-Za-z0-9_+-.]\n/'

For example, the above
matches (filename
but not file(name

What is missing?


BillKSmith
Veteran

Apr 13, 2011, 11:23 AM

Post #2 of 8 (4944 views)
Re: [spudfan] regex newbie question

Your regular expression is matched against the hidden default variable $_, which your code never sets explicitly. The -n switch wraps your code in a while (<>) loop that reads one input line at a time into $_.

Your RE looks for an invalid filename character followed by a newline. Your input is the list of paths that find writes to the pipe, one per line, so only the last character of each line is ever tested. That is why file(name is not reported.

You probably want to test every character of every filename. The loop is already there, so remove the trailing \n from the RE and keep the newline itself out of the set of invalid characters.

I recommend that you do not use Perl one-liners until you are more comfortable with Perl. In your Perl scripts, you should always use 'use strict;' and 'use warnings;'. Never use the hidden default variable $_.
Good Luck,
Bill
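
For readers following along, here is a minimal sketch of what the one-liner does when written out as a script with a named variable. The sample lines and the helper name last_char_is_bad are made up for illustration; they only imitate the shape of find output.

```perl
use strict;
use warnings;

# Returns 1 when the original RE matches: one disallowed character
# immediately before the trailing newline. Returns 0 otherwise.
sub last_char_is_bad {
    my ($line) = @_;
    return $line =~ /[^A-Za-z0-9_+-.]\n/ ? 1 : 0;
}

# Hypothetical sample lines, shaped like the output of `find .`.
my @lines = ( "./file(name\n", "./filename(\n" );

# What -n does implicitly, written with a named variable instead of $_.
for my $line (@lines) {
    print $line if last_char_is_bad($line);
}
```

Only the line whose final character is disallowed is printed, because the \n in the RE pins the match to the end of the line.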


Zhris
Enthusiast

Apr 13, 2011, 1:00 PM

Post #3 of 8 (4943 views)
Re: [BillKSmith] regex newbie question

Out of interest, Bill, you say never to use the hidden default variable $_. I use it a lot to keep my code shorter and more "slick". It only ever becomes an issue for me when using nested loops. In theory, it improves efficiency and compilation time, however slightly. From a beginner's point of view, if they don't have a clear understanding of $_, they are likely to bump into problems they can't make sense of. Also, it's better practice to use a named variable. I totally see the advantage of not using $_, but is there another important reason that I should know about?

Chris


BillKSmith
Veteran

Apr 14, 2011, 3:50 AM

Post #4 of 8 (4914 views)
Re: [Zhris] regex newbie question

Chris,

I do the same thing myself, for the same reasons. I was making a recommendation to a new perl programmer. It should prevent the type of error that he had made.

BTW, the book "Perl Best Practices" makes the same recommendation to all of us, but for a different reason. A variable named according to the book's conventions provides useful documentation. It will also prevent a future maintenance programmer from breaking your code by assuming that some built-in such as printf also defaults to $_.
Good Luck,
Bill
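
The nested-loop problem mentioned earlier in the thread can be sketched as follows. This is only an illustration: the array contents, the in-memory "file" and both sub names are invented. It relies on two documented behaviours, that foreach aliases $_ to each array element and that while (<$fh>) assigns to $_ without localizing it.

```perl
use strict;
use warnings;

# Stand-in for a file on disk, so the sketch is self-contained.
my $text = "first line\nsecond line\n";

# Both loops use the implicit $_. The foreach aliases $_ to each array
# element, and while (<$fh>) then assigns every line (and finally undef)
# to that same $_, overwriting the element.
sub count_surviving_names_default {
    my @names = ( 'alpha', 'beta' );
    for (@names) {
        open my $fh, '<', \$text or die "open failed: $!";
        while (<$fh>) { }    # each read lands in $_, i.e. in the array
        close $fh;
    }
    return scalar( grep { defined $_ } @names );
}

# Same loops with named lexical variables: nothing is shared.
sub count_surviving_names_named {
    my @names = ( 'alpha', 'beta' );
    for my $name (@names) {
        open my $fh, '<', \$text or die "open failed: $!";
        while ( my $line = <$fh> ) { }
        close $fh;
    }
    return scalar( grep { defined $_ } @names );
}

print "default \$_: ", count_surviving_names_default(), " of 2 names left\n";
print "named vars: ",  count_surviving_names_named(),   " of 2 names left\n";
```

With the implicit variable, the outer loop's data is silently wiped by the inner read loop; with named variables the array is untouched.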


Zhris
Enthusiast

Apr 14, 2011, 5:51 AM

Post #5 of 8 (4905 views)
Re: [BillKSmith] regex newbie question

Thanks Bill,

I never thought about designing code that could be more easily maintained by another person in the future. I've never worked as part of a team or handed down any code.

I keep hearing about the book "Perl Best Practices". I will have to get a copy at some point.

Chris


spudfan
New User

Apr 14, 2011, 6:46 AM

Post #6 of 8 (4900 views)
Re: [BillKSmith] regex newbie question

OK, the one-liner was just an example.
The actual question is about the RE
/[^A-Za-z0-9_+-.]\n/

As you said, it looks for an invalid filename character followed by a newline. What I want is:

match [^A-Za-z0-9_+-.(+whitespace)] but not newline.


BillKSmith
Veteran

Apr 14, 2011, 8:08 AM

Post #7 of 8 (4890 views)
Re: [spudfan] regex newbie question

Use the predefined escape sequences for the other whitespace characters. There is no short escape for a single space character, so use an actual space (or \x20). Escape the minus sign as well; left unescaped between + and . it defines a range that also lets a comma through.

/[^\w+.\- \t\r\f]/

This will match any character except a letter, a number, an underscore, a plus sign, a minus sign, a period, a space, a tab, a return, or a formfeed. (Think of it as matching any punctuation character or a newline.) Note: You may have to remove the \r on some operating systems.
Good Luck,
Bill
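
A hyphen placed between two characters inside a character class forms a range, so a class containing +-. covers every character from + to . in ASCII order: plus, comma, minus and period. The sketch below compares that form with one where the hyphen is escaped. The file names and the helper is_flagged are invented for the comparison.

```perl
use strict;
use warnings;

my $range_class   = qr/[^\w+-. \t\r\f]/;     # "+-." is a range: + , - .
my $literal_class = qr/[^\w+.\- \t\r\f]/;    # hyphen escaped, so it is literal

# Returns 1 if the name contains a character the class treats as invalid.
sub is_flagged {
    my ( $re, $name ) = @_;
    return $name =~ $re ? 1 : 0;
}

# Hypothetical names, without trailing newlines.
for my $name ( 'a,b.txt', 'a-b.txt', 'a(b.txt' ) {
    printf "%-8s range=%d literal=%d\n",
        $name,
        is_flagged( $range_class,   $name ),
        is_flagged( $literal_class, $name );
}
```

The two classes disagree only on the comma: the range form accepts it, the escaped form flags it.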


spudfan
New User

Apr 14, 2011, 1:48 PM

Post #8 of 8 (4853 views)
Re: [BillKSmith] regex newbie question

OK, thanks. /[^\w+-.\/\n]/ did the trick.
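
As a rough check of that final RE, the sketch below runs it over a few made-up paths shaped like find output. The sample paths and the helper has_bad_char are assumptions for illustration. Because / and \n are inside the negated class, path separators and the line ending are no longer reported, and a disallowed character is caught anywhere in the path.

```perl
use strict;
use warnings;

# The final class from the thread: word characters, the range "+" to "."
# (which also covers "," and "-"), "/" for path separators, and "\n".
my $bad_char = qr/[^\w+-.\/\n]/;

# Returns 1 if the path contains a character outside the allowed set.
sub has_bad_char {
    my ($path) = @_;
    return $path =~ $bad_char ? 1 : 0;
}

# Hypothetical input lines, each ending in a newline as find would emit.
my @paths = (
    "./file(name\n",
    "./(filename\n",
    "./ok-name_1.txt\n",
    "./two words\n",
);

# Print only the paths that contain a disallowed character.
for my $path (@paths) {
    print $path if has_bad_char($path);
}
```

Note that a space is not in the allowed set here, so a name such as "two words" is reported too; add a literal space to the class if spaces should be accepted.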

 
 

