dirwalk.py generator version of os.path.walk
Jim Dennis
jimd at vega.starshine.org
Thu Feb 28 05:00:41 EST 2002
More information about the Python-list mailing list
In article <ac677656.0202271621.5a134d44 at posting.google.com>, Tom Good wrote:
>jimd at vega.starshine.org (Jim Dennis) wrote in message news:<a5d36e$1daf$1 at news.idiom.com>...
>> This function could probably use a bit of polishing,
>> and it certainly could use some enhancement (some options to
>> control if and how we follow symlinks, how to handle
>> exceptions on listdir(), whether to be depth first, and an
>> option to avoid crossing mount boundaries with os.path.ismount(),
>> etc).
>>
>> However, it seems to work.
>>
>> dirwalk() simply takes an optional top level directory/path name
>> as an argument and instantiates a generator which will walk down
>> that tree and return every filename that it can access.
>>
>> It's late and I need sleep. So I'm just going to post this in
>> its rough (and probably buggy) form and let y'all thrash on it
>> a bit.
>>
>> I guess there's some sort of statcache module that might let me
>> cache the stat() tuples. I guess I'm implicitly incurring a stat()
>> system call for each node by checking islink() and isdir() on it
>> so it seems like I ought to cache that and make it available to
>> my caller (without forcing them to make an additional stat system
>> call).
>>
>> I hope that something like this (a simple dirwalk() or other
>> greatly simplified alternative to os.path.walk()) makes it into
>> Python 2.3 or later.
>>
>> #!/usr/bin/env python2.2
>> from __future__ import generators
>> import os
>>
>> def dirwalk(startdir=None):
>>     if not startdir:
>>         startdir="."
>>     if not os.path.isdir(startdir):
>>         raise ValueError ## Is this the right exception?
>>     stack = [startdir]
>>     while stack:
>>         cwd = stack.pop(0)
>>         try:
>>             current = os.listdir(cwd)
>>         except (OSError):
>>             continue # Skip it if we don't have access
>>         for each in current:
>>             each = os.path.join(cwd,each)
>>             if os.path.islink(each):
>>                 pass
>>             elif os.path.isdir(each):
>>                 stack.append(each)
>>             yield(each)
>>
>> if __name__ == "__main__":
>>     # import unittest?
>>     # test suite should consist of:
>>     #    dirwalk() vs. os.listdir()
>>     #    dirwalk("/") vs. os.path.walk()
>>     #    dirwalk("/etc/passwd") (should raise exception)
>>     import sys
>>     for i in sys.argv[1:]:
>>         for j in dirwalk(i):
>>             print j
>>     # should compare this to os.popen("find ....") and
>>     # or to os.path.walk(...)
>
>Hi,
>
> I wrote a different implementation of this general concept at:
>
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/105873
>
> You don't really need to keep a stack of directories and push/pop
> things, because with generators you can recurse instead.
>
>Tom

 But recursion is likely to cost more. The only state I need to
 keep is my current "todo" list of directories. Recursion would
 store function state for every level it descends (unless Python
 optimized tail recursion). So the append/pop (total cost, 3 lines
 of code) seems like the lightest-weight way to do this.
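
 To make the comparison concrete, here is a minimal sketch of both
 approaches side by side. It is restated for Python 3, which is an
 assumption (the posted code targets Python 2.2), and the names
 dirwalk_iterative, dirwalk_recursive and demo are hypothetical, chosen
 only for this illustration. The recursive variant is a generic rendering
 of the idea and is not taken from the Cookbook recipe linked above.

```python
import os
import tempfile
from collections import deque


def dirwalk_iterative(startdir="."):
    """Yield every path under startdir, keeping only a to-do queue."""
    if not os.path.isdir(startdir):
        raise ValueError("not a directory: %r" % (startdir,))
    todo = deque([startdir])  # the only state: directories still to visit
    while todo:
        cwd = todo.popleft()  # FIFO order, like stack.pop(0) in the post
        try:
            names = os.listdir(cwd)
        except OSError:
            continue  # skip directories we cannot read
        for name in names:
            path = os.path.join(cwd, name)
            # Descend into real directories only, never through symlinks.
            if os.path.isdir(path) and not os.path.islink(path):
                todo.append(path)
            yield path


def _walk_recursive(cwd):
    """Helper: one suspended generator frame exists per directory level."""
    try:
        names = os.listdir(cwd)
    except OSError:
        return  # skip directories we cannot read
    for name in names:
        path = os.path.join(cwd, name)
        yield path
        if os.path.isdir(path) and not os.path.islink(path):
            yield from _walk_recursive(path)


def dirwalk_recursive(startdir="."):
    """Yield every path under startdir by recursing into subdirectories."""
    if not os.path.isdir(startdir):
        raise ValueError("not a directory: %r" % (startdir,))
    yield from _walk_recursive(startdir)


def demo():
    """Build a small temporary tree and compare the two walkers."""
    with tempfile.TemporaryDirectory() as top:
        os.makedirs(os.path.join(top, "a", "b"))
        for rel in ("f1", os.path.join("a", "f2"), os.path.join("a", "b", "f3")):
            with open(os.path.join(top, rel), "w"):
                pass  # create an empty file
        iterative = sorted(dirwalk_iterative(top))
        recursive = sorted(dirwalk_recursive(top))
        return iterative == recursive, len(iterative)


if __name__ == "__main__":
    print(demo())
```

 Both walkers visit the same paths but in a different order: the queue
 version goes level by level, while the recursive one goes depth first.
 The recursive version holds one suspended generator per directory level
 it is currently inside, whereas the queue version holds only the pending
 directory names. The sketch uses deque.popleft() because list.pop(0) has
 to shift every remaining element on each call.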