Posted by Danny Tarlow

I'm at the stage in one research project where I've done all of the brainstorming, derivations, and prototyping, and most indicators suggest that we might have a good idea. So the next step is to figure out how it works on a lot of real-world data. For me, this usually means running the program on lots of different inputs and/or with many combinations of parameters (e.g., for cross validation). Further, right now I'm working a lot with images and doing computations at the individual pixel level -- which can get pretty expensive pretty quickly if you're not careful.
I started this phase (as I usually do, actually) by re-writing all of my code in C++. I like to do this first to speed things up, but it also is useful to me as a debugging and sanity check stage, because I have to go through everything I've done in great detail, and I get to think about whether it's still the right idea.
Anyhow, when the task just involves lots of executions of an executable with different inputs, there's no real need to do any fancy parallelization. Instead, I use a dead simple version, where a script spawns off N threads (each one runs a single command), waits for all of them to finish, then spawns off N more threads. It keeps doing this until it has exhausted the list of commands.
Now I know there are plenty of other ways to do this, but for my purposes, the following Python code works great.
    import os
    from threading import Thread

    NUM_THREADS = 7


    class ThreadedCommand(Thread):
        """Thread that runs a single shell command."""

        def __init__(self, shell_call):
            Thread.__init__(self)
            self.shell_call = shell_call

        def run(self):
            # Run this thread's own command, not the global `command` variable.
            os.system(self.shell_call)


    # Build one command per parameter combination.
    commands = []
    for parameter_1 in [5, 25, 100]:
        for parameter_2 in [5, 25, 100]:
            for parameter_3 in [.1, .25, .35]:
                command = "./command do_stuff --parameter_1 %s --parameter_2 %s --parameter_3 %s" % (
                    parameter_1, parameter_2, parameter_3)
                commands.append(command)

    # Start up to NUM_THREADS commands, wait for all of them, then repeat.
    while len(commands) > 0:
        running_threads = []
        for i in range(NUM_THREADS):
            if not commands:
                break  # fewer than NUM_THREADS commands left in the last batch
            command = commands.pop(0)
            new_thread = ThreadedCommand(command)
            running_threads.append(new_thread)
            new_thread.start()

        for thread in running_threads:
            thread.join()
            print("Joined %s" % thread.shell_call)
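One limitation of the batch approach is that every batch waits for its slowest command before the next batch starts. As one of the "other ways" mentioned above, here is a minimal sketch that keeps a fixed number of workers busy at all times. It is not the code I ran; it assumes Python 3 (for `concurrent.futures` and `subprocess.run`), and `run_command` and `run_all` are names made up for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 7  # assumption: plays the same role as NUM_THREADS above


def run_command(command):
    """Run one shell command and return (command, exit code)."""
    completed = subprocess.run(command, shell=True)
    return command, completed.returncode


def run_all(commands, num_workers=NUM_WORKERS):
    """Run every command, with at most num_workers running at once."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # map() yields results in input order. As soon as a worker is free
        # it picks up the next command, so no one waits on a whole batch.
        return list(pool.map(run_command, commands))


if __name__ == "__main__":
    for command, code in run_all(["echo one", "echo two"]):
        print("Finished %r with exit code %d" % (command, code))
```

Collecting the exit codes also makes it easy to spot runs that failed, which `os.system` inside a thread silently discards in the version above.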