Simple thread pools
Josiah Carlson
jcarlson at uci.edu
Mon Nov 8 14:53:18 EST 2004
Steve Holden <steve at holdenweb.com> wrote:
> 
> Josiah Carlson wrote:
> 
> > Jacob Friis <lists at debpro.webcom.dk> wrote:
> > 
> >>I have built a script inspired by a post on Speno's Pythonic Avocado:
> >>http://www.pycs.net/users/0000231/weblog/2004/01/04.html#P10
> >>
> >>I'm setting NUM_FEEDERS to 1000.
> >>Is that crazy?
> > 
> > 
> > Not crazy, but foolish. Thread scheduling in Python reduces performance
> > beyond a few dozen threads. If you are doing system calls (socket.recv,
> > file.read, etc.), your performance will be poor.
> > 
> Is this speculative, or do you have some hard evidence to support it? I
> recently rewrote a billing program that delivers statements by email.
> The number of threads it uses is a parameter to the program, and we are
> currently running at 200 with every evidence of satisfaction - this
> month's live run sent something over 10,000 emails an hour.

There is a slowdown (perhaps 'poor' was a bad description).

>>> i = 1
>>> import os
>>> while i < 256:
...     t = os.system('test_thread1.py %i'%i)
...     i *= 2
... 
0.0 8.45300006866 204800000
0.0 7.625 204800000
0.0 9.65600013733 204800000
0.0150001049042 11.2969999313 204800000
0.0159997940063 15.8280000687 204800000
0.0780000686646 16.6719999313 204800000
0.172000169754 17.2029998302 204734464
0.125 18.7189998627 204734464
>>>

Back in the days of Python 2.0, I had written what would now be called a
P2P framework. I initially used blocking threads for communication, and
observed that as my number of connections and threads increased, I saw a
marked reduction in throughput, and an increase in latency (even on a
local machine). In switching to an asynchronous framework (heavily
derived from asyncore), I ended up with a system that had nearly
constant throughput regardless of the number of connections.

> 
> >>Are there a better solution?
> > 
> > 
> > Fewer threads. Try running at 10-30. If you are finding that you
> > aren't able to handle the load with those threads, then your
> > processor/disk/etc isn't fast enough to handle the load.
> > 
> I'm tempted to say "rubbish", but that would be rude, so instead I'll
> just ask for some evidence :-). Don't forget that in network-based tasks
> the time spent waiting for connection turnarounds can dominate the
> elapsed time for execution - did you perhaps overlook that?

Evidence has been provided.

 - Josiah


#test_thread1.py
import socket
import time
import threading
import sys
import os

paircount = int(sys.argv[1])

c = threading.Condition()
l = threading.Lock()
ds = 0L

def reader(n, p):
    o_r = os.read
    global ds
    c.acquire()
    c.wait()
    c.release()
    ld = 0
    for i in xrange(n):
        ld += len(o_r(p, 1024))
    l.acquire()
    ds += ld
    l.release()

s = 1024*'\0'

def writer(n, p):
    o_w = os.write
    global ds
    c.acquire()
    c.wait()
    c.release()
    ld = 0
    for i in xrange(n):
        ld += o_w(p, s)
    l.acquire()
    ds += ld
    l.release()

count = 100000
blks = count/paircount

for i in xrange(paircount):
    r, w = os.pipe()
    threading.Thread(target=reader, args=(blks, r)).start()
    threading.Thread(target=writer, args=(blks, w)).start()

time.sleep(1)

t = time.time()
c.acquire()
c.notifyAll()
c.release()
print time.time()-t,

t = time.time()
while len(threading.enumerate()) > 1:
    time.sleep(.05)
print time.time()-t, ds
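The "fewer threads, 10-30" advice amounts to a bounded thread pool: a fixed set of workers pulling jobs from a shared queue, so the job count can grow without the thread count growing with it. A minimal sketch in modern Python (the code in this thread is 2.x-era; `run_pool` and its parameter names are illustrative, not from the original post):

```python
import queue
import threading

def run_pool(items, handler, num_workers=10):
    """Process items with a fixed pool of worker threads."""
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is None:            # sentinel: no more work for this worker
                return
            r = handler(item)
            with results_lock:          # results list is shared; guard appends
                results.append(r)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)                 # one sentinel per worker
    for t in threads:
        t.join()
    return results

if __name__ == '__main__':
    out = run_pool(range(100), lambda x: x * x, num_workers=10)
    print(sum(out))                     # sum of 0**2 .. 99**2 = 328350
```

Whatever `NUM_FEEDERS` would have been, it now only sets the queue depth, not the thread count, which stays at `num_workers`.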
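The asynchronous alternative described above (one event loop multiplexing all connections, as asyncore did) survives in modern Python as the `selectors` module. Here is an illustrative sketch, not the original framework: a single thread serving any number of socket pairs by echoing whatever arrives, with no per-connection thread at all.

```python
import selectors
import socket

def echo_server_loop(pairs, payload=b'x' * 64):
    """One thread serves `pairs` connections: a selector reports which
    sockets are readable, and each is echoed back as it arrives."""
    sel = selectors.DefaultSelector()
    clients = []
    for _ in range(pairs):
        a, b = socket.socketpair()
        a.setblocking(False)            # server side is non-blocking
        sel.register(a, selectors.EVENT_READ)
        clients.append(b)

    for c in clients:                   # each client sends one payload...
        c.sendall(payload)

    echoed = 0
    while echoed < pairs:
        for key, _ in sel.select():     # block until some socket is readable
            data = key.fileobj.recv(4096)
            key.fileobj.sendall(data)   # echo it straight back
            echoed += 1

    replies = [c.recv(4096) for c in clients]   # ...and reads the echo
    sel.close()
    for c in clients:
        c.close()
    return replies
```

The number of connections only grows the selector's registration table, which is why throughput stays nearly flat as connections are added, where one blocking thread per connection does not.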