Optimizing code
Harald Hanche-Olsen
hanche at math.ntnu.no
Thu Feb 24 17:09:53 EST 2000
+ Gerrit Holl <gerrit.holl at pobox.com>:

| class DiskUsage:
|     __size = 0
|     def add(self, filename):
|         self.__size = self.__size + os.path.getsize(filename)
|     def __call__(self, arg, d, files):
|         for file in files:
|             filename = os.path.join(d, file)
|             if os.path.isfile(filename): self.add(filename)
|
|     def __len__(self):
|         return self.__size
|
| def du(dir):
|     disk = DiskUsage()
|     os.path.walk(dir, disk, ())
|     return len(disk)
[...]
| Timing turns out that the 'os.path.walk' part takes about 2.7
| seconds, for a 400 MB dir with 1096 dirs and 9082 files. 'du -s ~'
| takes 0.2 seconds. What makes this slow? The special methods? The
| redefinition of an integer? os.path.walk? With longs, it even takes
| 12 seconds...

One thing that slows your code down is that it calls stat() three
times on every regular file in the tree: first in os.path.isfile,
second in os.path.getsize, and third in os.path.walk, which needs to
find out whether a filename corresponds to a directory or not.

| Can I optimize it? If so how?

Here is my best effort so far.  It is nearly three times as fast as
yours (but perhaps less portable).  Well, actually yours didn't work
at all on my system, because the length of a file is a long integer:

  File "du.py", line 21, in du
    return len(disk)
TypeError: __len__() should return an int

#! /usr/bin/env python

import sys
import os
import stat

class DiskUsage:
    def __init__(self):
        self.__size = 0
    def __call__(self, dir):
        # Importing these names is possibly a useless optimization:
        from stat import S_ISDIR, S_ISREG, ST_MODE, ST_SIZE
        files = os.listdir(dir)
        dirs = []
        for file in files:
            filename = os.path.join(dir, file)
            # A single lstat() per entry gives both its type and its size.
            s = os.lstat(filename)
            mode = s[ST_MODE]
            if S_ISDIR(mode):
                dirs.append(filename)
            elif S_ISREG(mode):
                self.__size = self.__size + s[ST_SIZE]
        for dir in dirs:
            self(dir)
    def len(self):
        return self.__size

def du(dir):
    disk = DiskUsage()
    disk(dir)
    return disk.len()

def main():
    if len(sys.argv) != 2:
        sys.stderr.write("usage: %s <directory>\n" % sys.argv[0])
        sys.exit(1)
    print du(sys.argv[1])

if __name__ == '__main__':
    main()

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- "There arises from a bad and unapt formation of words
   a wonderful obstruction to the mind."  - Francis Bacon
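For readers on a current interpreter, the same one-lstat-per-entry idea can be sketched in Python 3. This is not part of the original message, only an illustration under stated assumptions: the explicit stack named pending replaces the recursive call, and the function returns a plain int, since Python 3 integers are unbounded and the long-integer problem with __len__ does not arise. Like the code above, it sums apparent file sizes (st_size) and does not follow symbolic links, so its total need not match what 'du -s' reports in disk blocks.

```python
import os
import stat


def du(root):
    """Return the total size in bytes of the regular files under root.

    Every directory entry is examined with exactly one lstat() call,
    and symbolic links are not followed.
    """
    total = 0
    pending = [root]  # directories still to be scanned (hypothetical name)
    while pending:
        directory = pending.pop()
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            info = os.lstat(path)  # the only stat call made for this entry
            if stat.S_ISDIR(info.st_mode):
                pending.append(path)
            elif stat.S_ISREG(info.st_mode):
                total += info.st_size
    return total
```

Calling du(".") returns the byte total for the current directory tree. An unreadable directory raises OSError here, just as os.listdir does in the original script.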