how to get captcha image from HTML-code

 



mirvam
Novice

Aug 4, 2018, 1:43 AM

Post #1 of 8 (1899 views)
how to get captcha image from HTML-code

How can I get the captcha image from HTML code like this:
.....<img src='pic/captcha.png'>


mirvam
Novice

Aug 4, 2018, 2:02 AM

Post #2 of 8 (1896 views)
Re: [mirvam] how to get captcha image from HTML-code

Do I need the Tesseract OCR tool?


BillKSmith
Veteran

Aug 4, 2018, 9:03 AM

Post #3 of 8 (1884 views)
Re: [mirvam] how to get captcha image from HTML-code

I assume that you have the URL of an HTML document. You want to display the image that the document specifies.

This requires several steps, some of which you probably have already done.


  • Download the document using any of several modules.


  • Extract the image field from the HTML. You could do this with a regex, but a module for parsing HTML is much better for production software.


  • Extract the image URL from the src attribute of the field.


  • This is a 'relative' URL. You need its base. The base may be specified in the document (in a base tag); finding it there is trivial if you have already parsed the HTML. If it is not specified, you must use the same base as the main document, that is, the URL the document was downloaded from.

  • Form the image URL by resolving the relative URL against the base. Download the resulting image using the same module you used to download the document.


  • Use the 'system' function to execute your favorite image viewer.


  • Post a more specific question if you need help with any of these steps.
    Good Luck,
    Bill
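
The steps above can be sketched roughly as follows. This is only a minimal sketch, not something tested against the site in question. It assumes the LWP, HTML-Parser and URI distributions are installed, and that the captcha can be recognised by the word "captcha" in its src attribute. The function name find_image_url, the output file name captcha.png and the script name in the comment are made up for the illustration.

```perl
use strict;
use warnings;
use HTML::TokeParser;   # part of the HTML-Parser distribution
use URI;

# Return the absolute URL of the first <img> whose src matches $pattern,
# or nothing if there is none. $page_url is where $html was downloaded from.
sub find_image_url {
    my ($html, $page_url, $pattern) = @_;
    my $base   = $page_url;
    my $parser = HTML::TokeParser->new(\$html) or die "cannot parse HTML";
    while (my $tag = $parser->get_tag('base', 'img')) {
        my ($name, $attr) = @$tag;
        if ($name eq 'base') {
            # A <base href="..."> in the document overrides the page URL.
            $base = URI->new_abs($attr->{href}, $page_url) if defined $attr->{href};
        }
        elsif (defined $attr->{src} && $attr->{src} =~ $pattern) {
            return URI->new_abs($attr->{src}, $base)->as_string;
        }
    }
    return;
}

# Run as: perl get_captcha.pl http://website.com/some/page.html
if (@ARGV) {
    require LWP::UserAgent;
    my $ua   = LWP::UserAgent->new;
    my $page = $ua->get($ARGV[0]);
    die "page: ", $page->status_line, "\n" unless $page->is_success;

    my $img_url = find_image_url($page->decoded_content, $page->base, qr/captcha/i)
        or die "no captcha image found\n";

    # Same module, second download; the image is written to captcha.png.
    my $img = $ua->get($img_url, ':content_file' => 'captcha.png');
    die "image: ", $img->status_line, "\n" unless $img->is_success;
    print "saved $img_url as captcha.png\n";
}
```

The saved file can then be handed to an image viewer with 'system', as in the last step. Which viewer command to use depends on the machine, so it is left out here.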


    mirvam
    Novice

    Aug 6, 2018, 1:45 AM

    Post #4 of 8 (1866 views)
    Re: [BillKSmith] how to get captcha image from HTML-code

    Thank you.
    Now I have the path to the captcha, and my task is to read the code displayed on it. My code is:

    Code
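    use strict;
    use warnings;
    use LWP::UserAgent;
    use Image::OCR::Tesseract 'get_ocr';
    my $ua = LWP::UserAgent->new;
    # get_ocr() takes the path of an image file, not an HTTP response, so save the image first.
    my $res = $ua->get("http://website.com/pic/captcha.png", ':content_file' => 'captcha.png');
    die $res->status_line unless $res->is_success;
    my $code = get_ocr('captcha.png');
    print $code;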

    Output of the program is:

    Code
    Can't locate Image/OCR/Tesseract.pm in @INC (you may need to 
    install the Image::OCR::Tesseract module) (@INC contains:
    C:/Programs/Perl/perl/site/lib C:/Programs/Perl/perl/vendor/lib
    C:/Programs/Perl/perl/lib) at rrr.pl line 4.

    I tried to read the tutorials
    https://metacpan.org/pod/distribution/Image-OCR-Tesseract/lib/Image/OCR/Tesseract.pod#TESSERACT-NOTES
    https://stackoverflow.com/questions/19431336/perl-imageocrtesseract-module-on-windows
    and to install the Image::OCR::Tesseract module, but I don't understand the procedure. Can you explain in detail how to install it?


    (This post was edited by mirvam on Aug 6, 2018, 7:07 AM)


    mirvam
    Novice

    Aug 6, 2018, 7:06 AM

    Post #5 of 8 (1842 views)
    Re: [mirvam] how to get captcha image from HTML-code

    I ran:

    Code
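    use strict;
    use LWP::UserAgent;
    use Image::OCR::Tesseract 'get_ocr';
    my $ua  = LWP::UserAgent->new;
    my $url = "http://website.com/path/picture.png";
    # Save the image to a file, because get_ocr() expects a file path.
    $ua->get($url, ':content_file' => 'picture.png');
    print get_ocr('picture.png');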

    Output is:

    Code
    Can't locate Image/OCR/Tesseract.pm in @INC (@INC contains: /usr/local/lib64/Perl5 user/local/share/Perl5/vend /usr/share/perl5/vend .) at file.pl line 3.



    (This post was edited by mirvam on Aug 6, 2018, 7:15 AM)


    BillKSmith
    Veteran

    Aug 6, 2018, 9:10 AM

    Post #6 of 8 (1832 views)
    Re: [mirvam] how to get captcha image from HTML-code

    It looks like you have now done enough research to ask the right questions. This package appears to meet your needs and is thus worth the effort to install. The author admits that it can be difficult to install. (And this is on UNIX! It probably is not much harder on Windows, but it is harder to get directions.) Your links explain that even after you succeed in installing it on Windows, it still will not work without minor changes.

    The @INC data in your latest post suggests that you are not using Windows. In that case, I would recommend that you follow the installation notes in the module's own documentation.

    No one can give you detailed step-by-step directions. The details depend not only on your OS, but also on what tools and prerequisites you already have installed.

    I doubt that you will succeed until you have a basic understanding of Perl's build system (and possibly the UNIX 'make' utility).

    The book "Intermediate Perl" has a chapter called "Creating Your Own Distribution". This could be a big help in understanding the files in your distribution and how they are used in the installation process. Be aware that it is heavily biased toward the UNIX user and that it omits all mention of XS.
    Good Luck,
    Bill
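
Before starting an installation, it can help to see what is already present. The message "Can't locate Image/OCR/Tesseract.pm in @INC" only means that no file of that name exists in any of the directories Perl searches. The sketch below lists those directories and checks for the module. It also looks on PATH for the two external programs that, as far as I understand the module's documentation, it relies on: tesseract and ImageMagick's convert. Treat that prerequisite list as an assumption and verify it against the installation notes. The helper names module_available and find_program are invented for this example.

```perl
use strict;
use warnings;
use File::Spec;

# True if "use $module" would succeed with the current @INC.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Full path of an executable found on PATH, or nothing.
sub find_program {
    my ($name) = @_;
    for my $dir (File::Spec->path) {
        for my $ext ('', '.exe') {
            my $candidate = File::Spec->catfile($dir, $name . $ext);
            return $candidate if -f $candidate && -x $candidate;
        }
    }
    return;
}

print "Perl searches these directories for modules:\n";
print "  $_\n" for @INC;

printf "%-24s %s\n", 'Image::OCR::Tesseract',
    module_available('Image::OCR::Tesseract') ? 'installed' : 'NOT installed';

for my $program (qw(tesseract convert)) {
    my $path = find_program($program);
    printf "%-24s %s\n", $program, defined $path ? $path : 'not found on PATH';
}
```

If the module is reported as not installed, the usual route is a CPAN client (for example "cpan Image::OCR::Tesseract"), run after the external programs are in place. Whether that works without further steps depends on the system, as said above.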


    mirvam
    Novice

    Aug 6, 2018, 10:27 AM

    Post #7 of 8 (1829 views)
    Re: [BillKSmith] how to get captcha image from HTML-code

    Thank you. I got a lot of useful information.


    mirvam
    Novice

    Aug 7, 2018, 10:15 AM

    Post #8 of 8 (1807 views)
    Re: [BillKSmith] how to get captcha image from HTML-code

    Can you help me? I am trying to fill in this form on a website:

    Code
    <form method="post" action="?u=user&p=password">Username: 
    <input name="u" />
    <br/>password:
    <input name="p" />
    <br/>
    <input type="hidden" name="file" value="154.png" />Text:
    <input name="text">
    <br/>
    <input type="submit">
    </form>

    Perl code:

    Code
    use HTTP::Request 6.07;  
    $login= 'user';
    $pass = 'password';
    $text = 'text';
    my $form = $ua->put( $url, "u" => $login, "p" => $pass, "text" => $text );

    Is it correct?
    How can I get the HTML of the new page that is displayed after the form is submitted?


    (This post was edited by mirvam on Aug 7, 2018, 10:24 AM)
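
On the last question: the form declares method="post", so the site most likely expects a POST request rather than a PUT, and the hidden "file" field is normally sent along with the visible ones. The HTML of the resulting page is the body of the response that comes back. The following is a minimal sketch under those assumptions, using LWP::UserAgent and HTTP::Request::Common. The function name build_form_request, the script name and the example URL are hypothetical, and the field values are the placeholders from the post.

```perl
use strict;
use warnings;
use URI;
use HTTP::Request::Common qw(POST);

# Build the POST request a browser would send for the form.
# $page_url is the (hypothetical) URL of the page that contains the form;
# $fields is an array reference of name => value pairs.
sub build_form_request {
    my ($page_url, $fields) = @_;
    # The form's action is relative, so resolve it against the page URL.
    my $action = URI->new_abs('?u=user&p=password', $page_url);
    return POST($action, $fields);
}

# Run as: perl submit.pl http://website.com/page-with-form
if (@ARGV) {
    require LWP::UserAgent;
    my $ua = LWP::UserAgent->new;
    # LWP does not follow redirects after a POST unless told to.
    push @{ $ua->requests_redirectable }, 'POST';

    my $response = $ua->request(build_form_request($ARGV[0], [
        u    => 'user',
        p    => 'password',
        file => '154.png',   # the hidden field is part of the form too
        text => 'text',      # the code read from the captcha
    ]));
    die $response->status_line, "\n" unless $response->is_success;

    # HTML of the page the server returns after the form is submitted.
    print $response->decoded_content;
}
```

If the site sets a session cookie when it serves the captcha, the same $ua object (with a cookie jar enabled) would have to be used for both the captcha download and the form submission. Whether that applies here cannot be told from the HTML alone.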

     
     

