Get directory from http web site
Kent Johnson
kent37 at tds.net
Sat Aug 6 16:45:55 EDT 2005
More information about the Python-list mailing list
rock69 wrote:
> Hi all :)
>
> I was wondering if there's some neat and easy way to get the entire
> contents of a directory at a specific web url address.
>
> I have the following link:
>
> http://www.infomedia.it/immagini/riviste/covers/cp
>
> and as you can see it's just a list containing all the files (images)
> that I need. Is it possible to retrieve this list (not the physical
> files) and have it stored in a variable of type list or something?

BeautifulSoup and urllib do this easily:

  >>> from BeautifulSoup import BeautifulSoup
  >>> import urllib
  >>> data = urllib.urlopen('http://www.infomedia.it/immagini/riviste/covers/cp/').read()
  >>> soup = BeautifulSoup(data)
  >>> anchors = soup.fetch('a')
  >>> len(anchors)
  164
  >>> for a in anchors[:10]:
  ...     print a['href'], a.string
  ...
  ?N=D Name
  ?M=A Last modified
  ?S=A Size
  ?D=A Description
  /immagini/riviste/covers/ Parent Directory
  cp100.jpg cp100.jpg
  cp100sm.jpg cp100sm.jpg
  cp101.jpg cp101.jpg
  cp101sm.jpg cp101sm.jpg
  cp102.jpg cp102.jpg

http://www.crummy.com/software/BeautifulSoup/

Kent
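The session above is Python 2 with an early BeautifulSoup release. For anyone on Python 3, here is a rough sketch of the same idea using only the standard library (html.parser in place of BeautifulSoup). The names LinkCollector, list_files, fetch_listing and the SAMPLE markup are made up for illustration, and the filter assumes an Apache-style index page where sort links start with "?" and directory links end with "/". Other servers may format their listings differently.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag seen while parsing (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def list_files(html):
    """Return hrefs that look like files in an Apache-style index page."""
    parser = LinkCollector()
    parser.feed(html)
    # Assumption: column-sort links start with "?" and directory links
    # (including the parent directory) end with "/".
    return [h for h in parser.hrefs
            if not h.startswith("?") and not h.endswith("/")]


def fetch_listing(url):
    """Download an index page and return its file names (needs network access)."""
    with urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return list_files(html)


# Offline sample modeled on the output shown above (the markup itself is assumed).
SAMPLE = """
<a href="?N=D">Name</a> <a href="?M=A">Last modified</a>
<a href="/immagini/riviste/covers/">Parent Directory</a>
<a href="cp100.jpg">cp100.jpg</a>
<a href="cp100sm.jpg">cp100sm.jpg</a>
"""

print(list_files(SAMPLE))  # prints ['cp100.jpg', 'cp100sm.jpg']
```

Calling fetch_listing() with the directory URL (trailing slash included) should give the list of file names as a plain Python list, provided the server still returns an index page of that shape.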