extract content from table problem

 



johny_bravo
New User

Sep 24, 2011, 1:31 PM

Post #1 of 2 (3465 views)

Hey everybody :)

I have a regular expression problem. Sorry for my bad English.

I want to parse the columns of an HTML table, put the results into an array, and then iterate over that array.

Here is my HTML code:



Code
 
<table border="0" cellspacing="0" cellpadding="0" bgcolor="yellow" width="100%">
<tbody><tr>
<td colspan="9"><img src="./some_file/img.gif" alt="" width="1" height="5" border="0"></td>
</tr>
<tr>
<td background="./some_file/img.gif" class="columnTextBig" colspan="9" height="18" bgcolor="red">&nbsp;Some Table</td>
</tr>
<tr>
<td colspan="9"><img src="./some_file/img.gif" alt="" width="1" height="1" border="0"></td>
</tr>
<tr>
<td background="./some_file/img.gif" valign="middle" width="10">
<img src="./some_file/img.gif" alt="" width="10" height="1" border="0"></td>
<td background="./some_file/img.gif" class="columnText" valign="middle" width="20%" height="18">Column
1</td>
<td background="./some_file/img.gif" valign="middle" width="5">
<img src="./some_file/img.gif" alt="" width="5" height="1" border="0"></td>
<td background="./some_file/img.gif" class="columnText" valign="middle" align="right">Column 2</td>
<td background="./some_file/img.gif" valign="middle" width="5">
<img src="./some_file/img.gif" alt="" width="5" height="1" border="0"></td>
<td background="./some_file/img.gif" class="columnText" valign="middle" align="right">Column 3</td>
<td background="./some_file/img.gif" valign="middle" width="5">
<img src="./some_file/img.gif" alt="" width="5" height="1" border="0"></td>
</tr>

<tr>
<td colspan="9"><img src="./some_file/img.gif" alt="" width="1" height="1" border="0"></td>
</tr>

<tr valign="top">
<td width="10"><img src="./some_file/img.gif" alt="" width="10" height="1" border="0"></td>
<td class="columnText" align="left">3,
3,
4,
5,
{"1000 1001 1000 1002"}</td>
<td width="5"><img src="./some_file/img.gif" alt="" width="5" height="1" border="0"></td>
<td class="columnText" align="right">3</td>
<td width="5"><img src="./some_file/img.gif" alt="" width="5" height="1" border="0"></td>
<td class="columnText" align="right">Ingore This</td>
<td width="5"><img src="./some_file/img.gif" alt="" width="5" height="1" border="0"></td>
</tr>
<tr>
<td colspan="9"><img src="./some_file/img.gif" alt="" width="1" height="1" border="0"></td>
</tr>


<tr>
<td colspan="9"><img src="./some_file/img.gif" alt="" width="1" height="1" border="0"></td>
</tr>

<tr valign="top">
<td background="./some_file/img.gif" width="10">
<img src="./some_file/img.gif" alt="" width="10" height="1" border="0"></td>
<td background="./some_file/img.gif" class="columnText" align="left">0,
{"1000 1005 1000 1006"}</td>
<td background="./some_file/img.gif" width="5">
<img src="./some_file/img.gif" alt="" width="5" height="1" border="0"></td>
<td background="./some_file/img.gif" class="columnText" align="right">14</td>
<td background="./some_file/img.gif" width="5">
<img src="./some_file/img.gif" alt="" width="5" height="1" border="0"></td>
<td background="./some_file/img.gif" class="columnText" align="right">Ingore This</td>
<td background="./some_file/img.gif" width="5">
<img src="./some_file/img.gif" alt="" width="5" height="1" border="0"></td>
</tr>
<tr>
<td background="./some_file/img.gif" colspan="9">
<img src="./some_file/img.gif" alt="" width="1" height="1" border="0"></td>
</tr>


<tr>
<td colspan="9"><img src="./some_file/img.gif" alt="" width="1" height="1" border="0"></td>
</tr>

<tr valign="top">
<td width="10"><img src="./some_file/img.gif" alt="" width="10" height="1" border="0"></td>
<td class="columnText" align="left">
1,2,3,4,5,6
</td>
<td width="5"><img src="./some_file/img.gif" alt="" width="5" height="1" border="0"></td>
<td class="columnText" align="right">14</td>
<td width="5"><img src="./some_file/img.gif" alt="" width="5" height="1" border="0"></td>
<td class="columnText" align="right">Ingore This</td>
<td width="5"><img src="./some_file/img.gif" alt="" width="5" height="1" border="0"></td>
</tr>
<tr>
<td colspan="9"><img src="./some_file/img.gif" alt="" width="1" height="1" border="0"></td>
</tr>


<tr>
<td colspan="9"><img src="./some_file/img.gif" alt="" width="1" height="1" border="0"></td>
</tr>

</tbody></table>



I want to extract the values from Column 1 and Column 2 only, something like this:

initialize $cnt = 0
foreach (extractedText) {

    for $cnt = 0
    => print $valueLeft;  // should be 3, 3, 4, 5, {"1000 1001 1000 1002"}
    => print $valueRight; // should be 3

    for $cnt = 1
    => print $valueLeft;  // should be 0, {"1000 1005 1000 1006"}
    => print $valueRight; // should be 14

    for $cnt = 2
    => print $valueLeft;  // should be 1,2,3,4,5,6
    => print $valueRight; // should be 14

    increment $cnt;
}


Thanks in advance!


FishMonger
Veteran / Moderator

Sep 24, 2011, 2:25 PM

Post #2 of 2 (3458 views)
Re: [johny_bravo] extract content from table problem

Don't use a regex to parse HTML. Use one of the HTML parsers on CPAN.

Here are a few choices:
http://search.cpan.org/~gaas/HTML-Parser-3.68/Parser.pm
http://search.cpan.org/~msisk/HTML-TableExtract-2.11/lib/HTML/TableExtract.pm
http://search.cpan.org/~jfearn/HTML-Tree-4.2/lib/HTML/Tree.pm
http://search.cpan.org/~jfearn/HTML-Tree-4.2/lib/HTML/TreeBuilder.pm
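
As a rough sketch of the parser approach (not something from the original reply), here is how it might look with HTML::TreeBuilder from the list above. It assumes, going only by the posted table, that the data rows are the <tr valign="top"> rows and that the wanted cells carry class="columnText". The helper name extract_columns and the shortened sample HTML are made up for illustration; adjust the criteria if the real page differs.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: returns a list of [left, right] array refs,
# one per data row of the table.
sub extract_columns {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my @pairs;

    # Assumption: only data rows have valign="top" (true for the posted HTML).
    for my $tr ( $tree->look_down( _tag => 'tr', valign => 'top' ) ) {
        # Spacer cells hold only an <img> and have no class, so they are skipped.
        my @cells = $tr->look_down( _tag => 'td', class => 'columnText' );
        next if @cells < 2;

        # as_trimmed_text collapses newlines and runs of spaces into one space.
        push @pairs, [ map { $_->as_trimmed_text } @cells[ 0, 1 ] ];
    }

    $tree->delete;    # free the parse tree
    return @pairs;
}

# Shortened version of the posted table, for demonstration only.
my $html = <<'HTML';
<table>
<tr>
<td class="columnText">Column 1</td>
<td class="columnText" align="right">Column 2</td>
<td class="columnText" align="right">Column 3</td>
</tr>
<tr valign="top">
<td width="10"><img src="./some_file/img.gif" alt=""></td>
<td class="columnText" align="left">3,
3,
4,
5,
{"1000 1001 1000 1002"}</td>
<td class="columnText" align="right">3</td>
<td class="columnText" align="right">Ingore This</td>
</tr>
<tr valign="top">
<td class="columnText" align="left">0,
{"1000 1005 1000 1006"}</td>
<td class="columnText" align="right">14</td>
<td class="columnText" align="right">Ingore This</td>
</tr>
</table>
HTML

my $cnt = 0;
for my $pair ( extract_columns($html) ) {
    my ( $valueLeft, $valueRight ) = @$pair;
    print "row $cnt: left=[$valueLeft] right=[$valueRight]\n";
    $cnt++;
}
```

Selecting cells by tag and attribute keeps the code independent of attribute order, line breaks inside cells, and the spacer rows, which is where a regex tends to break. HTML::TableExtract from the same list is another option if you would rather select columns by their header text.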

 
 

