newb: BeautifulSoup
Stefan Behnel
stefan.behnel-n05pAM at web.de
Fri Sep 21 01:31:35 EDT 2007
More information about the Python-list mailing list
TheFlyingDutchman wrote:
> On Sep 20, 8:04 pm, crybaby <joemystery... at gmail.com> wrote:
>> I need to traverse a html page with big table that has many row and
>> columns. For example, how to go 35th td tag and do regex to retrieve
>> the content. After that is done, you move down to 15th td tag from
>> 35th tag (35+15) and do regex to retrieve the content?
>
> Make the file an xhtml file (valid xml) if it isn't already and then
> you can use software written to process XML files:
>
> http://pyxml.sourceforge.net/topics/

... or just use software that can process XML and HTML the same way *and*
that supports XPath and tree iteration so that you can easily select the
content you want.

http://codespeak.net/lxml/

Stefan
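A minimal sketch of the tree-iteration approach for the question above, not a definitive implementation. The helper names (build_sample_table, cell_values), the sample table, and the "value=" pattern are made up for illustration. It assumes lxml is installed; if it is not, it falls back to the standard library's ElementTree, which offers the same iter()/itertext() calls but only accepts well-formed markup.

```python
import re

try:
    # lxml.html copes with real-world HTML, including markup that is not valid XML.
    from lxml import html as _lxml_html
    _parse = _lxml_html.fromstring
except ImportError:
    # Assumption for this sketch: fall back to the stdlib parser,
    # which requires well-formed (XHTML-like) input.
    import xml.etree.ElementTree as _ET
    _parse = _ET.fromstring


def build_sample_table(n_cells=60, per_row=10):
    """Build a hypothetical table whose Nth cell reads 'item N: value=N*10'."""
    rows = []
    for start in range(1, n_cells + 1, per_row):
        stop = min(start + per_row, n_cells + 1)
        cells = "".join(
            "<td>item %d: value=%d</td>" % (i, i * 10) for i in range(start, stop)
        )
        rows.append("<tr>%s</tr>" % cells)
    return "<table>%s</table>" % "".join(rows)


def cell_values(markup, positions, pattern=r"value=(\d+)"):
    """Return the regex capture from the td cells at the given 1-based positions."""
    root = _parse(markup)
    # iter() walks the tree in document order, so the list index is the
    # cell's position on the page. With lxml, root.xpath("(//td)[35]")
    # would select the 35th cell directly.
    cells = list(root.iter("td"))
    results = []
    for pos in positions:
        text = "".join(cells[pos - 1].itertext())
        match = re.search(pattern, text)
        results.append(match.group(1) if match else None)
    return results


if __name__ == "__main__":
    # 35th cell, then 15 cells further down (35 + 15 = 50).
    print(cell_values(build_sample_table(), [35, 35 + 15]))
```

Collecting the cells once and indexing into the list keeps the "move down 15 cells" step a plain addition, so no second pass over the document is needed.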