Remove any amount of char from front and back of a string in a file

 



tc
New User

Jun 7, 2018, 4:35 PM

Post #1 of 10 (175 views)
Remove any amount of char from front and back of a string in a file

I learned Assembler for Windows by example back in the 90s. These days I am interested in high-level languages for Linux and BSD. PERL is my first (5 days strong), and I like it. I google PERL all day, every day. I have learned a lot, yet I have found no solution for my code. Above all, various PERL gurus more than suggest that we should use strict mode, and I refuse to settle for less. My example should be self-explanatory. I don’t care what it takes to make it work (I can’t figure out the other ways, anyway), yet I would really like to know what chomp is missing. It has been blowing my mind, all day and night. If you place it in a separate file it will work without strict, but that is not what I am after. I have not even got to removing unwanted characters from the front, because I am still dealing with the end.


Code
#!/usr/local/bin/perl 
# use strict; # strict-mode is prefered
# use warnings;
# ........................ perl /workspace/GetString.pl
# ........................
my $find = "tbody";
my $back = "tbody";
# ........................
# ........................
# --- open needed files
open (SUB, ">", "/workspace/1-empty.txt" ) or die "could not open:$!";
open (BACK, ">", "/workspace/2-empty.txt" ) or die "could not open:$!";
open (FRONT, ">", "/workspace/3-empty.txt" ) or die "could not open:$!";

open (MY_FILE, "<", "/workspace/html.txt") or die "could not open:$!";
# .......................................................
# ....................................................... Strip/save full tag
while (<MY_FILE>) {
print SUB if (/$find/); }
# ....................................................... Issues with strict.
# ....................................................... Works without it.
chomp($_str1 = <MY_FILE>);
$back = substr($_str1, 0, -126);
close (MY_FILE);
print MY_FILE $back;

# .......................................................
# ....................................................... Clean up
close (MY_FILE);
close (SUB);
close (BACK);
close (FRONT);


# These are the ERRORS I get:
# (~)s1$ perl /workspace/GetString.pl
# Global symbol "$_str1" requires explicit package name at /workspace/GetString.pl line 32.
# Global symbol "$_str1" requires explicit package name at /workspace/GetString.pl line 33.
# Execution of /workspace/GetString.pl aborted due to compilation errors.
# (~)s1$



html.txt for example:
<skip this --- !DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict">
<meta name=" skip this to ">
<tbody id="target"><tr><td>remove this...Return only this sentence. ...remove-the-rest</tbody>
<p id="gone."</p>
<p id="gone."</p>
<p id="gone!"</p>
.


(This post was edited by FishMonger on Jun 10, 2018, 8:28 AM)


Chris Charley
User

Jun 7, 2018, 6:57 PM

Post #2 of 10 (170 views)
Re: [tc] Remove any amount of char from front and back of a string in a file

I hardly work with HTML and it seems you are doing so here. So I'm sorry, but I can't recommend a good parser.

Some things I noticed in your code.

print SUB if (/$find/); is clearer as print SUB $_ if (/$find/);

chomp($_str1 = <MY_FILE>); should be chomp(my $_str1 = <MY_FILE>);

You open MY_FILE for reading and later try (unsuccessfully) to write to it with print MY_FILE $back;

The errors you are getting are 'strict' telling you that $_str1 hasn't been declared with 'my'.


(This post was edited by Chris Charley on Jun 7, 2018, 8:14 PM)
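
A minimal sketch of the fixes listed above, under a few assumptions: the sub name first_matching_line is hypothetical, lexical filehandles (my $in) replace the bareword MY_FILE, and an in-memory string stands in for /workspace/html.txt so the example runs on its own. It also sidesteps a problem in the posted script that strict does not report: the while loop reads MY_FILE to the end, so the later <MY_FILE> has nothing left to return.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first line from $fh that contains $pattern, without its newline.
# Returns undef when no line matches. (Hypothetical helper for illustration.)
sub first_matching_line {
    my ($fh, $pattern) = @_;
    while (my $line = <$fh>) {
        next unless $line =~ /\Q$pattern\E/;
        chomp $line;
        return $line;
    }
    return;
}

# In-memory stand-in for /workspace/html.txt (an assumption for this demo).
my $html = <<'END_HTML';
<meta name=" skip this to ">
<tbody id="target"><tr><td>remove this...Return only this sentence. ...remove-the-rest</tbody>
<p id="gone."</p>
END_HTML

open(my $in, '<', \$html) or die "could not open: $!";
my $tag_line = first_matching_line($in, 'tbody');
close($in);

print "$tag_line\n" if defined $tag_line;
```

Because the line is captured while the file is still being read, there is no second read from an exhausted handle, and every variable is declared with my, which is what strict asks for.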


BillKSmith
Veteran

Jun 7, 2018, 9:03 PM

Post #3 of 10 (160 views)
Re: [tc] Remove any amount of char from front and back of a string in a file

The pragma 'strict' disallows three things which are otherwise allowed.

  • Undeclared Variables

  • Symbolic References

  • Bare words


Removing strict to eliminate error messages is ALWAYS a bad idea.

I will not describe symbolic references, because using them is almost always a bad idea. You are unlikely to use one by accident. Strict tells a reader of your code that you definitely are not using them.

Without strict, undeclared variables are considered 'global'. With strict, all variables must be declared as either global (with our) or lexical (with my).

You will be amazed at how many spelling and typing errors this catches. Good practice dictates that we declare all variables with 'my' unless we have a very good reason to use 'our'. We can prevent a number of common errors by declaring each variable in the smallest possible scope. Strict does not require this, but it does help to find missing or misplaced my's.

Error messages about bare words are almost always telling us that we forgot the sigil ($, @, or %) before a variable name.
Good Luck,
Bill
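
The three categories can be seen by feeding small snippets to a string eval and looking at the error that comes back. This is only an illustration: strict_complaint and the snippet labels are made-up names, not part of strict itself.

```perl
use strict;
use warnings;

# Compile and run a snippet under strict.
# Returns the error text, or '' if strict accepts the snippet.
sub strict_complaint {
    my ($snippet) = @_;
    my $ok = eval "use strict; no warnings; $snippet; 1";
    return '' if $ok;
    my $error = $@;
    $error =~ s/\s+\z//;    # drop the trailing newline
    return $error;
}

my %examples = (
    'undeclared variable' => '$total = 1',
    'symbolic reference'  => 'my $name = "total"; my $value = $$name',
    'bare word'           => 'my $value = total',
    'declared with my'    => 'my $total = 1',
);

for my $label (sort keys %examples) {
    my $error = strict_complaint($examples{$label});
    my $verdict = $error eq '' ? 'accepted' : "rejected: $error";
    print "$label => $verdict\n";
}
```

The first three snippets are rejected (two at compile time, the symbolic reference at run time), while the one declared with my is accepted.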


    Laurent_R
    Veteran / Moderator

    Jun 7, 2018, 11:26 PM

    Post #4 of 10 (159 views)
    Re: [tc] Remove any amount of char from front and back of a string in a file

    Chris and Bill have said it all about strict, so I won't add anything on that.

    I would strongly suggest that you also uncomment the use warnings; pragma. Warnings catch in a split second a lot of errors that you would otherwise have to find through long hours of debugging. In brief, always enable strict and warnings (except possibly for one-liners).


    (This post was edited by Laurent_R on Jun 10, 2018, 11:22 PM)
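
As a small illustration of the kind of mistake warnings report, the sketch below reads from an empty input and then uses the result. The helper collect_warnings is a hypothetical name; it only gathers the messages in an array so they can be inspected, where normally they would go to STDERR.

```perl
use strict;
use warnings;

# Run $code and return every warning it produced instead of printing it.
sub collect_warnings {
    my ($code) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    $code->();
    return @warnings;
}

my @caught = collect_warnings(sub {
    my $empty = '';                                   # stands in for an empty file
    open(my $fh, '<', \$empty) or die "could not open: $!";
    my $line   = <$fh>;                               # undef: nothing to read
    my $report = 'first line: ' . $line;              # warns about the undef
    close($fh);
});

print "warning: $_" for @caught;
```

Without use warnings the same code runs silently and carries an empty value forward, which is the sort of thing that otherwise turns up only much later.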


    tc
    New User

    Jun 8, 2018, 3:05 PM

    Post #5 of 10 (148 views)
    Re: [Chris Charley] Remove any amount of char from front and back of a string in a file

    Quote by Laurent_R:
    Warnings catch in a split second a lot of errors that you would otherwise have to find through long hours of debugging.
    [end quote]

    Got it Laurent! My question is: once you have your code running perfectly with ‘strict’ and ‘warnings’ enabled, would it be OK to disable ‘strict’ and ‘warnings’ for all future runs? Also, in your reply you did mean 'enable', not 'unable'?



    ....................
    ....................
    Quote by BillKSmith:
    The pragma 'strict' disallows three things …
    [end quote]

    Thanks Bill. This tells me we must enable ‘strict’ and ‘warnings’ during development, where we fix as many errors as possible. From that point, depending on the final "type of warning", your code may be safe to run. I’ll be reading your reply until I can recite it.



    ....................
    ....................
    Quote by Chris Charley :
    I hardly work with HTML and it seems you are doing so here. So I'm sorry, but I can't recommend a good parser.
    [end quote]

    Chris, I am doing this offline to prepare my HTML files and other configuration files for use on my desktop and VPS, so a parser might be handy after I complete this file.

    Anyway, that was a perfect evaluation! I followed your reply to the letter and it ran with ‘strict’ and ‘warnings’ enabled :) However, I had moved on minutes after posting and created a new problem. I constructed a procedure for the section that removes the back characters (chomp) and it works. Just like in the old days with assembler, I aim for near-complete separation to make updating and adding other procs easier. It’s like running two or more PERL scripts in one, if I get it right and provided it makes sense.

    In the meantime I'm working on a rm_frontchar proc to complete this file. I found one that works in its own script, but now it should work in a proc without issues. It's fun to be coding again... Actually, it is nothing more than deep searching, copy, paste, then some uncommon ways of testing why. That was how I learned to use the registers. But in my case it did not work after the next OS update. After the third rewrite I caved in. Most high-level code will still work, especially PERL and C!

    Those are just a few additional reasons why I chose PERL.

    Time to go back to work.

    Thanks a ton Chris!




    Code
    #!/usr/local/bin/perl 
    # use strict;
    # use warnings;
    # ---------------------------------- perl -pi.bak -e's/...//' /workspace/_1.txt
    # ---------------------------------- perl /workspace/GetString_Proc.pl
    my $find = "tbody";
    my $file1 = "/workspace/_1.txt";
    my $file2 = "/workspace/_2.txt";
    my $file3 = "/workspace/_3.txt";
    # .....................................................
    # ..................................................... Make Empty!
    open DD,">","/workspace/_1.txt" or die "Cannot overwrite file: $!"; print DD ""; close DD;
    open DD,">","/workspace/_2.txt" or die "Cannot overwrite file: $!"; print DD ""; close DD;
    # .....................................................
    # ..................................................... Open Files
    open (SUB, ">", "/workspace/_1.txt" ) or die "could not open:$!";
    open (BACK, ">", "/workspace/_2.txt" ) or die "could not open:$!";
    open (FRONT, ">", "/workspace/_3.txt" ) or die "could not open:$!";

    open (MY_FILE, "<", "/workspace/html.txt") or die "could not open:$!";
    # .....................................................
    # ..................................................... Strip out 1 tag
    while (<MY_FILE>) {
    print SUB $_ if (/$find/); }

    close (MY_FILE);
    # .....................................................
    # ..................................................... DONE: Clean up
    close (MY_FILE);
    close (SUB);
    close (BACK);
    close (FRONT);

    #########################
    my $proc_1 = [];
    $proc_1 = rm_backchar();
    eval join ("", @$proc_1);
    # ....................................
    # ....................................
    sub rm_backchar($)
    {
    open (OUT2, "> $file2");
    open (OUT3, "> $file3");
    open (HANDLE1, "< $file1");

    chomp(my $_str1 = <HANDLE1>);
    my $back = substr($_str1, 0, -28);
    close (HANDLE1);
    print OUT2 $back;
    close (OUT2);
    close (OUT3);
    }
    #########################

    print " Process exited with value " . ($? >> 8) . "\n";




    [NEW ERRORS]
    (~)s1$ perl /workspace/GetString_Proc.pl
    main::rm_backchar() called too early to check prototype at /workspace/GetString_Proc.pl line 36.
    Can't use string ("1") as an ARRAY ref while "strict refs" in use at /workspace/GetString_Proc.pl line 37.
    (~)s1$



    html.txt for example:
    <skip this --- !DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict">
    <meta name=" skip this to ">
    <tbody id="target"><tr><td>remove this...Return only this sentence. ...remove-the-rest</tbody>
    <p id="gone."</p>
    <p id="gone."</p>
    <p id="gone!"</p>


    (This post was edited by FishMonger on Jun 10, 2018, 8:34 AM)
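
For the front-and-back removal the thread title asks about, one possible shape is a single sub that takes both counts. This is a sketch, not the poster's method: trim_ends is a hypothetical name, and the counts are taken from the lengths of the literal text in the html.txt sample so that nothing has to be counted by hand.

```perl
use strict;
use warnings;

# Remove $front characters from the start of $text and $back from the end.
# Returns '' when the two counts would consume the whole string.
sub trim_ends {
    my ($text, $front, $back) = @_;
    my $keep = length($text) - $front - $back;
    return '' if $keep <= 0;
    return substr($text, $front, $keep);
}

# The tbody line from the html.txt sample, already chomped.
my $line = '<tbody id="target"><tr><td>remove this...Return only this sentence. ...remove-the-rest</tbody>';

# The unwanted text at each end; only the lengths are used.
my $front_junk = '<tbody id="target"><tr><td>remove this...';
my $back_junk  = ' ...remove-the-rest</tbody>';

my $sentence = trim_ends($line, length($front_junk), length($back_junk));
print "$sentence\n";    # prints "Return only this sentence."
```

Fixed character counts only work while the input keeps exactly this layout; for real HTML a parser, as mentioned earlier in the thread, is the more robust route.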


    BillKSmith
    Veteran

    Jun 8, 2018, 8:49 PM

    Post #6 of 10 (133 views)
    Re: [tc] Remove any amount of char from front and back of a string in a file

    It is true that removing strict and warnings from a correct program would not introduce any new errors. But why would you want to? They do not add any noticeable runtime penalty, and they remain on duty for every edit you make in the future. There are also runtime conditions that produce warnings, and those go unreported if warnings are not enabled. If you really intend to ignore some warning, disable that warning in a limited scope, not in the whole program.

    On another topic, you should not be using function prototypes. They appear to be similar to those in "C" and other languages, but they have a very different meaning! Their purpose is not to catch errors, but rather to let you write subs that are called the way certain built-in functions are, which would otherwise be impossible.
    Good Luck,
    Bill
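
A short sketch of what disabling one warning in a limited scope can look like. The sub names label_quiet and label_loud are hypothetical; the only difference between them is the no warnings line, which stays in effect just until the end of the enclosing block.

```perl
use strict;
use warnings;

# Builds a label even when $value is undef, without a warning.
sub label_quiet {
    my ($value) = @_;
    no warnings 'uninitialized';    # applies only inside this sub
    return 'value: ' . $value;
}

# Same code with warnings left on: warns when $value is undef.
sub label_loud {
    my ($value) = @_;
    return 'value: ' . $value;
}

my @seen;
{
    local $SIG{__WARN__} = sub { push @seen, @_ };    # capture instead of printing
    label_quiet(undef);
    label_loud(undef);
}

print scalar(@seen), " warning(s) reported\n";
```

Only the call to label_loud is reported, so the rest of the program keeps the full protection of use warnings.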

    (This post was edited by BillKSmith on Jun 8, 2018, 9:07 PM)
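
To show how different a Perl prototype is from a C one, here is a minimal sketch with two hypothetical subs that have the same body. The ($) prototype does not check the argument; it changes how the argument is evaluated.

```perl
use strict;
use warnings;

# ($) puts the argument in scalar context, so an array becomes its element count.
sub count_with_prototype($) { return $_[0] }

# No prototype: the array is flattened into @_, so $_[0] is its first element.
sub first_without_prototype { return $_[0] }

my @items = (10, 20, 30);

print count_with_prototype(@items), "\n";       # prints 3
print first_without_prototype(@items), "\n";    # prints 10
```

A prototype only takes effect when the sub is already declared at the point of the call, which is why calling a prototyped sub defined further down the file produces the "called too early to check prototype" message.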


    BillKSmith
    Veteran

    Jun 9, 2018, 7:33 AM

    Post #7 of 10 (127 views)
    Re: [tc] Remove any amount of char from front and back of a string in a file

    I finally got a chance to study your code. I am not sure what the function rm_backchar() is intended to do. I can tell you what it does do and why that is a problem.

    Please remove the prototype '($)'. Even if you make it work, it does not do what you think.

    This function opens several files.
    Reads the first line of your input file.
    Removes the newline.
    Removes the last 28 of the remaining characters.
    Stores the result in the lexical variable $back.
    Closes the input file.
    Prints $back to the file _2.txt.
    Closes both output files.
    Returns the result of the last statement, i.e. the status (1) of the last close.

    Back in the main program, you store that in $proc_1 and then attempt to use it as an array reference. Perl interprets that as a symbolic reference, which is not allowed under strict.

    Without strict, this will return an empty array. Join returns an empty string. Eval is expecting Perl code. I do not know what you expect.
    Good Luck,
    Bill
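
The return-value point can be sketched with two hypothetical subs: one shaped like the posted rm_backchar, whose last statement is a close, and one that returns the trimmed text explicitly so the caller needs neither an array reference nor an eval. An in-memory buffer stands in for the output file so the example is self-contained.

```perl
use strict;
use warnings;

# Same shape as the posted sub: the last statement is a close().
sub returns_close_status {
    my $buffer = '';
    open(my $out, '>', \$buffer) or die "could not open: $!";
    print {$out} 'trimmed text';
    close($out);    # its status becomes the sub's return value
}

# Returns the data itself: $text without its last $count characters.
sub returns_text {
    my ($text, $count) = @_;
    return $text if $count <= 0;
    return substr($text, 0, -$count);
}

my $tail   = ' ...remove-the-rest';
my $status = returns_close_status();
my $text   = returns_text("Return only this sentence.$tail", length($tail));

print "status: $status\n";
print "text:   $text\n";
```

The first sub hands back a plain true value, which is what ended up in $proc_1 in the posted script; the second hands back a string that can be printed or written to a file directly.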


    tc
    New User

    Jun 9, 2018, 5:27 PM

    Post #8 of 10 (123 views)
    Re: [BillKSmith] Remove any amount of char from front and back of a string in a file

    I can’t find the site that had the original sequence of the code. I made changes to suit my needs, as I did with 15 other sites for other things. Here are some sites that talk about PERL procedures. I now have three instances of Opera with tab bars fully packed with websites about PERL. To me this is the best forum of them all, and the only one I am now a member of, because it’s nice, quiet, easy, and well designed like no other I have ever seen.


    https://en.wikibooks.org/wiki/Perl_Programming/Functions
    “For instance, a procedure may read and write to files.”


    https://www.tutorialspoint.com/perl/perl_subroutines.htm

    https://www.perlmonks.org/?node_id=826635



    HEY – HEY – HEY I found it!
    https://stackoverflow.com/questions/364842/how-do-i-run-a-perl-script-from-within-a-perl-script

    It did not work, but it did something that I needed anyway :)


    As the title says: run-a-perl-script-from-within-a-perl-script
    I originally wanted to do it this way too -> run-a-perl-script-from-within-a-perl-script
    But I also want to do it the way that I ended up with -> as a subroutine or procedure.

    Both ways would be Great!



    I thought rm_backchar (remove/delete back characters) would be a good name for the procedure, but evidently it is not being called correctly. I think something is out of place.

    prototype '($)' is GONE! I actually forgot to delete it.

    Code
    my $proc_1 = []; 
    $proc_1 = rm_backchar();
    eval join ("", @$proc_1);
    # ................................
    # ................................
    sub rm_backchar($) {
    open (OUT2, "> $file2");
    open (OUT3, "> $file3");
    open (HANDLE1, "< $file1");

    chomp(my $_str1 = <HANDLE1>);
    my $back = substr($_str1, 0, -28);
    close (HANDLE1);
    print OUT2 $back;
    close (OUT2);
    close (OUT3); }



    (This post was edited by FishMonger on Jun 10, 2018, 8:35 AM)
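
For the other route mentioned here, running a Perl script from within a Perl script, a common approach is system with the current interpreter, $^X. The sketch below is an assumption about what was wanted, not the code from the linked page: run_perl is a hypothetical helper, and a one-line -e program stands in for a real script path such as /workspace/GetString.pl.

```perl
use strict;
use warnings;

# Run the current perl interpreter ($^X) with @args and return its exit code.
sub run_perl {
    my (@args) = @_;
    my $status = system($^X, @args);
    die "could not start perl: $!" if $status == -1;
    return $? >> 8;
}

# A tiny inline program stands in for a separate script file.
my $exit_code = run_perl('-e', 'exit 3');
print "Process exited with value $exit_code\n";
```

The $? >> 8 expression only means something after a child process has run, so it belongs right after the system call, as it is here.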


    BillKSmith
    Veteran

    Jun 9, 2018, 7:19 PM

    Post #9 of 10 (110 views)
    Re: [tc] Remove any amount of char from front and back of a string in a file

    Minor point first: There is no such thing as 'PERL'. The language is called 'Perl'. The interpreter that implements it is called 'perl'.

    You have not done anything to fix the problem that perl thinks is a symbolic reference.

    I think you are making a big mistake in trying to learn Perl by modifying examples which you do not understand. Spend the time and money to read an introductory Perl book. I recommend "Learning Perl". It is badly out of date, but everything in it still works. (You would be missing many cool new tricks, but you can learn them later.) Do not ignore perl's own documentation. Learn to use its documentation reading tool by typing:

    Code
    perldoc perldoc


    I still have no idea at all what you expect your eval to execute. What you have is not allowed under strict. That is good, because it does not make any sense at all!
    Good Luck,
    Bill


    Laurent_R
    Veteran / Moderator

    Jun 10, 2018, 11:28 PM

    Post #10 of 10 (58 views)
    Re: [tc] Remove any amount of char from front and back of a string in a file


    In Reply To
    Quote by Laurent_R:
    Warnings catch in a split second a lot of errors that you would otherwise have to find through long hours of debugging.
    [end quote]

    Got it Laurent! My question is: once you have your code running perfectly with ‘strict’ and ‘warnings’ enabled, would it be OK to disable ‘strict’ and ‘warnings’ for all future runs? Also, in your reply you did mean 'enable', not 'unable'?

    Yes, I meant enable, not unable (I fixed the typo now).

    It really does no harm to keep strict and warnings enabled, so why would you want to disable them? You may actually catch things that would otherwise be harder to notice (such as the input file having a wrong format).

     
     

