Multi-processing map() for Python

I think that Python should use multi-processing and/or multi-threading to take advantage of as many opportunities for parallel execution as possible. To this end, I've written a drop-in replacement for map() that runs across as many processes as requested. It should otherwise be identical in every way to the built-in version (and if it's not, please let me know!).

I also wrote a version based on Parallel Python that is a lot simpler but not quite identical to the original. In particular, it returns a generator instead of a list of values so that program execution doesn't block until the results are fetched.
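To illustrate the generator idea, here is a minimal sketch. It does not use Parallel Python; as an assumption for the example, the standard library's ThreadPoolExecutor stands in for it, and the names lazy_map, workers, and square are hypothetical. All the calls are started up front, and the caller only waits when it actually pulls a result out of the generator.

```python
from concurrent.futures import ThreadPoolExecutor

def lazy_map(function, sequence, workers=4):
    """Start every call right away and hand back a generator of results."""
    pool = ThreadPoolExecutor(max_workers=workers)
    # submit() returns immediately; the work proceeds in the background.
    futures = [pool.submit(function, item) for item in sequence]

    def results():
        try:
            for future in futures:
                # Blocks only here, one result at a time, in input order.
                yield future.result()
        finally:
            pool.shutdown()

    return results()

def square(x):
    return x * x

pending = lazy_map(square, range(5))  # returns without waiting for results
values = list(pending)                # results are fetched here
```

Note that lazy_map is a plain function wrapping an inner generator. If lazy_map itself were a generator function, nothing would be submitted until the first result was requested.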

Drop me a line if you find this interesting or useful or just plain dumb.

Attachment: forkmap.py (4.39 KB)

Optimized forkmap

Your forkmap is pretty neat. I made an optimized forkmap, which runs with less overhead, and I modified the API so that the number of processors is a keyword arg to forkmap.map(..., n=nprocessors), defaulting to the number of processors on the box, because this seemed to be slightly more terse and less confusing than the decorator (to me, at least). Feel free to recycle any of that code into your forkmap. Anyway, neat idea. - Connelly

Thanks!

Very cool, and thanks! By the way, you can use something like this to get the number of CPUs on a BSD system, including OS X:

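import os

def ncpus():
    try:
        # On BSD and OS X, sysctl prints the CPU count followed by '\n'.
        return int(os.popen('sysctl 2>/dev/null -n hw.ncpu').read()[:-1])
    except ValueError:
        return 1  # sysctl is missing, or this is not a BSD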

The popen().read() should return an integer + '\n'. If it returns anything else, either sysctl is missing or you're not on a BSD.
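The sysctl call only covers BSD-style systems. As a hedged alternative, on Python 2.6 and later the standard library's multiprocessing.cpu_count() reports the CPU count on most platforms and raises NotImplementedError when it cannot tell. The wrapper name cpu_count_or_one below is hypothetical; the sketch keeps the same "fall back to 1" behaviour.

```python
import multiprocessing

def cpu_count_or_one():
    """Return the number of CPUs, or 1 if it cannot be determined."""
    try:
        return multiprocessing.cpu_count()
    except NotImplementedError:
        return 1

ncpus = cpu_count_or_one()
```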

Optimization...

I should've noted that my forkmap pre-allocates each part of the work to each of the processors, and doesn't try to actively reschedule work if a few processors finish early; instead those processors just idle. This can be an optimization if the list being mapped is long, as the communication overhead of scheduling which processor should do which work can become significant. But it can also slow things down, when processors are idling in the "endgame." Smarter code could probably use the advantages of both of these methods, by doing communication only in the endgame, and only if worthwhile. - Connelly
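The pre-allocation described here can be pictured with a small sketch. This is not the code from the optimized forkmap; the function name preallocate and its parameters are assumptions made for illustration. The list is cut into one contiguous slice per worker before any work starts, so no scheduling messages are needed afterwards.

```python
def preallocate(items, nworkers):
    """Split items into nworkers contiguous slices of near-equal length."""
    items = list(items)
    base, extra = divmod(len(items), nworkers)
    slices = []
    start = 0
    for worker in range(nworkers):
        # The first `extra` workers take one additional item each.
        stop = start + base + (1 if worker < extra else 0)
        slices.append(items[start:stop])
        start = stop
    return slices

parts = preallocate(range(10), 4)
# parts is [[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]
```

The trade-off is visible in the slices: if the items in one slice happen to be slow, that worker finishes last while the others sit idle.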

Threadmap

And here's a multithreaded map:
http://www.connellybarnes.com/code/python/threadmap
I coded it slightly differently from Andrey Nordin's thread-map ( http://abstracthack.wordpress.com/2007/09/05/multi-threaded-map-for-pyth... ): I'm calling CPU-bound C programs from Python, so by default I set the code to map across as many threads as there are cores.
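For readers who want the general shape of a threaded map, here is a minimal sketch. It is not the threadmap code linked above; the name thread_map and the parameter n are assumptions for the example. Each thread repeatedly claims the next unprocessed index, so results come back in input order. Threads pay off mainly when the mapped function spends its time outside the interpreter, for example waiting on an external program.

```python
import threading

def thread_map(function, sequence, n=4):
    """Apply function to each item using n threads; results keep input order."""
    items = list(sequence)
    results = [None] * len(items)
    errors = []
    lock = threading.Lock()
    next_index = [0]  # shared counter, boxed in a list so workers can update it

    def worker():
        while True:
            with lock:
                i = next_index[0]
                next_index[0] += 1
            if i >= len(items):
                return
            try:
                results[i] = function(items[i])
            except Exception as exc:
                errors.append(exc)
                return

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    if errors:
        raise errors[0]  # re-raise the first failure in the calling thread
    return results

lengths = thread_map(len, ["a", "bb", "ccc"], n=2)
```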

Can't get it working under my environment (Windows Vista + cygwin)

Hi Kirk,

thank you for the forkmap.
I can't get it working under my environment (Windows Vista + cygwin + cygwin's python 2.5.1).
Please find the error below:

D:\Temp\d>python forkmap.py
[16, 20, 24, 28]
Traceback (most recent call last):
  File "forkmap.py", line 194, in <module>
    print map(busybeaver, range(27))
  File "forkmap.py", line 137, in map
    sendmessage(toparent, (childnum, index, excvalue))
UnboundLocalError: local variable 'index' referenced before assignment

and hangs here :(

In my application the error is different though. It is similar to this:

>>> forkmap.map(lambda x:x*10, [1,2,3,4])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "forkmap.py", line 81, in map
    return __builtins__.map(function, *sequence)
AttributeError: 'dict' object has no attribute 'map'

You need to add @parallelizable() to your function def

like so:

@parallelizable(4)
def descramble(scrambled):
    ...

I was getting the AttributeError: 'dict' object has no attribute 'map' error as well until I did that.

Also, I think the underlying problem is a bug in the code. Adding "import __builtin__" and changing __builtins__ to __builtin__ made it work correctly when parallelization is not in use.
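The reason this fix works: in CPython, __builtins__ is an implementation detail. It is the built-in module itself inside the main script, but in an imported module it is normally that module's dictionary, which has no .map attribute. Importing the module by name avoids the ambiguity. The sketch below is not the forkmap source; fallback_map and builtins_module are hypothetical names, and the try/except is only there so the example also runs on Python 3, where the module is called builtins.

```python
try:
    import __builtin__ as builtins_module  # Python 2 name
except ImportError:
    import builtins as builtins_module     # Python 3 name

def fallback_map(function, *sequences):
    """Call the real built-in map, even if `map` is shadowed in this module."""
    # list() keeps the result a list on Python 3, where map is lazy.
    return list(builtins_module.map(function, *sequences))

tens = fallback_map(lambda x: x * 10, [1, 2, 3, 4])
```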

Might be a Windows thing

The forkmap module uses two Unix-native functions: fork() and pipe(). I haven't used Python on Windows enough to know whether those are implemented there or if they work the same way.
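To show how those two calls fit together, here is a minimal, Unix-only sketch. It is not the forkmap implementation: the name fork_map is hypothetical, it forks one child per item rather than a fixed number of workers, and its error handling is deliberately crude. Each child computes one result and sends it back to the parent, pickled, through its own pipe.

```python
import os
import pickle

def fork_map(function, sequence):
    """Apply function to each item in its own forked child process."""
    children = []
    for item in sequence:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute one result, pickle it into the pipe, and exit.
            os.close(read_fd)
            status = 0
            try:
                with os.fdopen(write_fd, "wb") as pipe_out:
                    pickle.dump(function(item), pipe_out)
            except Exception:
                status = 1
            os._exit(status)  # skip the parent's cleanup handlers
        # Parent: keep the read end and go on to fork the next child.
        os.close(write_fd)
        children.append((pid, read_fd))

    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as pipe_in:
            data = pipe_in.read()  # returns once the child closes its end
        os.waitpid(pid, 0)         # reap the child so it is not left a zombie
        if not data:
            raise RuntimeError("child %d produced no result" % pid)
        results.append(pickle.loads(data))
    return results

tens = fork_map(lambda x: x * 10, [1, 2, 3, 4])
```

Because the function is inherited by the child through fork() rather than pickled, a lambda works here; only the results have to be picklable.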

It might also be a Cygwin glitch, because that AttributeError exception seems really odd to me. Have you tried installing the official Python for Windows from http://www.python.org/download/ ?

Multi-threaded map()?

The idea of a multi-processing map() is quite nice. But what about a multi-threaded one? Threads usually cause less overhead than processes. If the mapping function is essentially free of side effects (even if it does some HTTP GETs, which are idempotent), the result doesn't depend on which parallel execution model you've selected. And when it isn't, any such approach is error-prone. See also my blog entry.

Big locks

From my reply on your blog:

The reason I wrote that using processes and not threads is that Python uses a global lock around object access, so the current implementation might be a bit lacking in performance.

For even better results, consider using something like NetWorkSpaces to farm out requests to machines on your network. I wrote "servers" that accept an image filename and a list of operations to perform on it, pull that image from the fileserver, run the operations, and return the result (as a string) via a NWS variable. Performance improvements scaled linearly with the number of cores running servers on our network. Need it to run faster? Launch a few more instances. I have big dreams of farming out certain processes to the mostly-idle desktop machines sitting throughout the office.

GIL info

You might be interested in learning more about the GIL, especially in connection with Guido's recent post. Here is one of the latest resources on this subject.

Although I see Guido's point...

I understand the problems Guido describes, and I can see why it might be hard (even impossibly so) to remove the GIL from CPython. Still, that doesn't change my mind that the GIL is going to massively hamper it on big-SMP machines. Other natively threaded languages can do great things on that hardware, while a threaded Python app has to slog along on a single core.

Here's to hoping that he changes his mind.
