Threading advantages (was Re: Python's biggest compromises)

enoch enoch at gmx.net
Thu Aug 7 10:01:37 EDT 2003
aahz at pythoncraft.com (Aahz) wrote in message news:<bgs4ud$h3g$1 at panix3.panix.com>...
> <snip>
> Since, as you say, you've done some research, that's why I flamed you.
> There's just no call for making such an overstated claim -- it is *NOT*
> "a little bit exaggerated".

Well, I based this phrase on the fact that while under some
circumstances (e.g. your web spider) python does scale somewhat, under
others (e.g. zope) it may perform even worse on a SMP system. If you
sum these two facts up ...


> <snip IPC>
> >Here are some sources which show that I'm not alone with my assessment
> >that python has deficiencies w.r.t. SMP systems:
> 
> That I won't argue.  But Python's approach also has some benefits even
> on SMP systems.  And if you choose a multi-process approach, the same
> advantages that accrue to Python's approach on a single-CPU box apply
> just as much to an SMP system.

Yes, and as far as I understand it, these advantages also include a
simpler threading model on every system. It's a compromise; that's why
I posted in this thread.
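
To make the multi-process approach concrete, here is a minimal sketch of
my own (not taken from your spider or from zope), assuming a current
Python 3. Each worker is a separate interpreter with its own GIL, and the
results come back over a pipe. WORKER and sum_squares_in_processes are
names I made up for illustration.

```python
import subprocess
import sys

# Hypothetical worker: a tiny CPU-bound job run in its own interpreter.
WORKER = "import sys; n = int(sys.argv[1]); print(sum(i * i for i in range(n)))"


def sum_squares_in_processes(sizes):
    """Run one worker process per size and collect the results via pipes."""
    # Start every worker before reading any output, so they run concurrently
    # and the OS is free to schedule them on different CPUs.
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER, str(n)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for n in sizes
    ]
    results = []
    for proc in procs:
        out, _ = proc.communicate()  # waits for the worker to exit
        results.append(int(out))
    return results


if __name__ == "__main__":
    print(sum_squares_in_processes([10, 100]))
```

The price is the cost of starting processes and of passing data through
pipes, so this only pays off when each job does a fair amount of work.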

> 
> >http://www.python.org/pycon/papers/deferex/
> >"""
> >It is optimal, however, to avoid requiring threads for any part of a
> >framework. Threading has a significant cost, especially in Python. The
> >global interpreter lock destroys any performance benefit that
> >threading may yield on SMP systems, [...]
> >"""
> 
> Just because it's a published PyCon paper doesn't mean that it's correct.
> The multi-threaded spider that I use as my example is a toy version of a
> spider that was used on an SMP box.  (That's why I became a threading
> expert in the first place -- Tim Peters probably remembers me pestering
> him with questions four years ago. ;-)  I guarantee you that SMP made
> that spider much faster.

But how significant is software with the same characteristics as your
web spider example, compared to application servers?
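
To be clear about which kind of workload I mean: in the spider case the
threads spend most of their time blocked on I/O, and blocking calls
release the GIL. A minimal sketch of that pattern, assuming a current
Python 3. Here time.sleep stands in for a network read, and fetch and
crawl are names I made up, not anything from your spider.

```python
import threading
import time


def fetch(url, results, delay=0.2):
    """Pretend to download a page; the sleep stands in for blocking I/O."""
    time.sleep(delay)  # the GIL is released while the thread is blocked
    results[url] = len(url)  # a single dict store is safe under CPython's GIL


def crawl(urls):
    """Fetch all urls in parallel threads; return (results, elapsed seconds)."""
    results = {}
    threads = [threading.Thread(target=fetch, args=(u, results)) for u in urls]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    pages, elapsed = crawl(["a", "bb", "ccc", "dddd"])
    print(pages, "in %.2f s" % elapsed)
```

The waits overlap, so the total time should stay close to one delay
rather than the sum of all of them. An application server that spends
its time in Python code does not get that benefit.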

> >So, although python is capable of taking advantage of SMP systems
> >under certain circumstances (I/O bound systems etc. etc.), there are
> >real world situations where python's performance is _hurt_ by running
> >on a SMP system.
> 
> Absolutely.  But that's true of any system with threading that isn't
> designed and tuned for the needs of a specific application.  Python
> trades performance in some situations for a clean and simple model of
> threading.

Again, the compromise we were talking about. I'm not in a position to
weigh the pros and cons of it against each other, but I think I can
point out some cons of the current approach. I'm not doing that to
spread FUD, but to give an outsider's perspective on what I think might
hurt python in the future, and I want python to thrive because I like
using it a lot.
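
To illustrate the con I mean: for pure-Python CPU-bound work only one
thread can hold the GIL at a time, so on a standard CPython build a
second thread adds contention rather than throughput. A rough sketch,
assuming a current Python 3. count_down and timed are hypothetical
names, and I'm not quoting timings because they depend on the machine
and the interpreter build.

```python
import threading
import time


def count_down(n):
    """Pure-Python busy loop; it holds the GIL while it runs."""
    while n > 0:
        n -= 1


def timed(num_threads, n):
    """Run count_down(n) in num_threads threads; return elapsed seconds."""
    threads = [
        threading.Thread(target=count_down, args=(n,))
        for _ in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000
    print("1 thread : %.3f s" % timed(1, n))
    print("2 threads: %.3f s" % timed(2, n))
```

If the threads ran in parallel, the second line would take about as long
as the first, since each thread does the same amount of work. With the
GIL I would expect it to take roughly twice as long, or worse.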
 
> >Btw. I think even IPC might not help you there, because the different
> >processes might bounce between CPUs, so only processor binding might
> >help.
> 
> My understanding is that most OSes are designed to avoid this; I'd be
> interested in seeing some information if I'm wrong.  In any event, I do
> know that IPC speeds things up in real-world applications on SMP boxes.

For example, there are always lots of discussions about CPU affinity
on linux-kernel, and it seems to be a hard problem. Hyperthreading and
other non-symmetric architectures make this problem even harder.
Add to that the problem of the GIL getting shuffled around and you
have a system whose performance characteristics are hard to predict.
Admins don't like that. It's not as if there are no problems without
the GIL, though; it just adds to the complication.
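
For what it's worth, this is the kind of processor binding I have in
mind. A minimal sketch, assuming a current Python 3 on Linux, where the
os module exposes sched_getaffinity and sched_setaffinity; on other
platforms these calls don't exist, so the helper just gives up.
pin_to_cpu is a hypothetical name.

```python
import os


def pin_to_cpu(cpu):
    """Restrict the current process to one CPU.

    Returns the resulting CPU set, or None if binding is not possible here.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # platform without this API (e.g. not Linux)
    allowed = os.sched_getaffinity(0)  # 0 means "the calling process"
    if cpu not in allowed:
        return None  # we may only pick from the CPUs we are allowed to use
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print(pin_to_cpu(0))
```

One such call per worker process keeps each process on its own CPU, but
it is exactly the sort of special task the admins then have to know
about.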

> >I did quite a bit of googling on this problem - several times -
> >because I'm selling zope solutions. Sometimes, the client wants to run
> >the solution on an existing SMP system, and worse, the system has to
> >fulfill some performance requirements. Then I have the problem of
> >explaining to him that his admins need to undertake some special tasks
> >in order for zope to be able to exploit the multiple procs in his
> >system.
> 
> Even if Zope is the 800-pound gorilla of the Python world, Python isn't
> going to change just for Zope.  If you want to talk about ways of
> improving Zope's performance on SMP boxes, I'll be glad to contribute
> what I can.  But spreading false information isn't the way to get me
> interested.

I wasn't even aware that zope is the "800-pound gorilla" of the python
world. I used it just as an example for a typical larger server app,
because, well, I know it.
Incidentally, the pycon paper above, which you seem to dismiss as
false, is also from a guy who is working on a larger server app.
Maybe there's a pattern?

> Keep in mind that one reason IPC has gained popularity is because it
> scales more than threading does, in the end.  Blade servers are cheaper
> than big SMP boxes, and IPC works across multiple computers.

Allow me a comment on the nature of this discussion (python and SMP
in general, not just this thread). I've seen it before and the
ingredients are:

- a major open source project 
- developers who love this project
- an "outsider" who points out some perceived deficiency of said
project
- said developers pointing out (rightly or wrongly) reasons why this
deficiency doesn't matter, or that there are other (better) ways for
the "outsider" to achieve what he wants

In most cases this discussion then develops into a big fat flamewar
;).

Two examples are linux and its threading capabilities, and mysql and
ACID compliance.
A nice quote from the linux discussion btw. was from Alan Cox:

"A Computer is a state machine. Threads are for people who can't
program state machines."

But today, linux's thread support is orders of magnitude better than it was.

You wrote in another message in this thread:
> Well, that's a good question.  *Does* Java have better threading
> performance than Python?  If it does, to what extent is that performance
> bought at the cost of complexity for the programmer?  

While I can't comment on the second question, here's an article which
sheds some light on the SMP scalability of an older java JDK. The meat
is on the third page:
http://www.javaworld.com/javaworld/jw-08-2000/jw-0811-threadscale.html

It seems that java does indeed have better threading performance than
python, at least on SMP systems.



