Monday, November 1, 2010

Optimizing Loading Images in C++

Posted by Danny Tarlow
I'm working on a project where I need to load a lot of images into a C++ program, and the loading takes an annoyingly long time relative to the rest of the program (loading the data takes more time than running my algorithms), so I put in a couple of hours to see if I could optimize it.

The basic setup is that for each example and iteration, I need to load around 100 images and iterate over all pixels. Each image is around 150x200 pixels. A typical full run of the algorithm does around twenty iterations on a couple hundred examples (say 200). To produce the results I need, it will take 20 or 30 full runs. I can parallelize a lot, but I figured it was worth taking a pass at optimizing my input/output code first.

My initial implementation used the CImg library to load the images. It is a simple loop, iterating over filenames, loading the images, then iterating over the pixels to construct the model. For a single example, it takes about 6.7 seconds: 2.2 seconds to load the 100 images, and 4.5 seconds to iterate over them and construct the model. For a full run, that amounts to roughly 20 * 200 * 6.7 = 26800 seconds, or 7.4 hours. In reality, I usually split the work over 4 cores, so I can get results in ~2 hours.
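The back-of-the-envelope estimate above can be written as a tiny helper. This is only a sketch: the function names are made up for illustration, and it assumes every example costs the same and that the work divides evenly across cores, which is optimistic in practice.

```cpp
#include <cassert>

// Estimated wall-clock seconds for one full run.
// Assumes a constant cost per example and perfect scaling across cores.
double full_run_seconds(int iterations, int examples,
                        double seconds_per_example, int cores = 1) {
    assert(iterations > 0 && examples > 0 && cores > 0);
    return iterations * examples * seconds_per_example / cores;
}

// Convert seconds to hours.
double seconds_to_hours(double seconds) { return seconds / 3600.0; }
```

With the numbers from this post, `full_run_seconds(20, 200, 6.7)` comes out to about 26800 seconds (roughly 7.4 hours), and passing `cores = 4` gives a bit under 2 hours.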

The new version I am playing around with uses Google Protocol Buffers. Rather than loading each image separately, I write all of the pixel values of the 100 images into a single protocol buffer file, then load that one file in place of the 100 separate ones. For a single example, this cuts the time down to about 2.3 seconds: 0.3 seconds to load the data, and 2.0 seconds to iterate over the values and construct the model. It's not earth-shattering, but it's still a nice speedup, cutting the loading and model construction time for a full run down to about 20 * 200 * 2.3 = 9200 seconds, or 2.6 hours (2.6/7.4 = 35% of the original).
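To illustrate the "one file in place of 100" idea without depending on generated protocol buffer code, here is a minimal sketch that uses a plain binary file as a stand-in. It is not the protocol buffer version described above. The `ImagePack` struct, `save_pack`, and `load_pack` are hypothetical names, and the sketch assumes 8-bit single-channel images of identical size, a trusted file, and the same machine byte order for writing and reading.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical container: `count` grayscale images of identical size,
// stored back to back in one contiguous buffer.
struct ImagePack {
    std::uint32_t count = 0, width = 0, height = 0;
    std::vector<std::uint8_t> pixels;  // count * width * height bytes
};

// Write a small header followed by all pixel data to a single file.
bool save_pack(const ImagePack& pack, const std::string& path) {
    assert(pack.pixels.size() ==
           static_cast<std::size_t>(pack.count) * pack.width * pack.height);
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    const std::uint32_t header[3] = {pack.count, pack.width, pack.height};
    out.write(reinterpret_cast<const char*>(header), sizeof(header));
    out.write(reinterpret_cast<const char*>(pack.pixels.data()),
              static_cast<std::streamsize>(pack.pixels.size()));
    return static_cast<bool>(out);
}

// Read everything back with one file open and two reads.
// Assumes the file is trusted; the header sizes are not sanity-checked.
bool load_pack(const std::string& path, ImagePack& pack) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::uint32_t header[3];
    if (!in.read(reinterpret_cast<char*>(header), sizeof(header))) return false;
    pack.count = header[0];
    pack.width = header[1];
    pack.height = header[2];
    pack.pixels.resize(static_cast<std::size_t>(pack.count) *
                       pack.width * pack.height);
    in.read(reinterpret_cast<char*>(pack.pixels.data()),
            static_cast<std::streamsize>(pack.pixels.size()));
    return static_cast<bool>(in);  // false if the file was shorter than expected
}
```

The images would be packed once with `save_pack`, and each example would then call `load_pack` on its single file. Whether this matches the protocol buffer timings is untested here; the point is only that one sequential read replaces 100 separate file opens and image decodes.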


David Tschumperlé said...

Why not put all your images into a CImgList structure that you can save in a single file (with extension .cimg)?
That would be equivalent to having a Protocol buffer, I guess.

Danny Tarlow said...

Hi David,

Interesting. I didn't realize CImg had that feature. As you say, it'd probably give much of the same savings as the protocol buffers. Thanks for the pointer.

FuzzyLogic. said...

If you are loading the same set of images for each example, then you only need to load it once, during the initialization of the program, and you can reuse the data for each example.

Also, if you know the access patterns on each of those images, it will be easier to construct a composite image which can make the looping faster. For example, if your access pattern is row-wise (process the 1st row, then the 2nd, ...), you should construct one large 150x20000 image and just go over it in one pass.
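A rough sketch of the composite-image suggestion, assuming 8-bit single-channel images stored row-major and all of the same size. The function names are hypothetical, and summing the pixels is only a placeholder for whatever the model construction actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append same-sized row-major images into one contiguous buffer.
// For row-major data this is the same as stacking the images vertically
// into one tall image (e.g. 100 images of 150x200 become 150x20000).
std::vector<std::uint8_t> make_composite(
        const std::vector<std::vector<std::uint8_t>>& images,
        std::size_t width, std::size_t height) {
    std::vector<std::uint8_t> composite;
    composite.reserve(images.size() * width * height);
    for (const auto& img : images) {
        assert(img.size() == width * height);
        composite.insert(composite.end(), img.begin(), img.end());
    }
    return composite;
}

// One linear pass over every pixel of every image.
// Placeholder for model construction: just sums the pixel values.
std::uint64_t sum_pixels(const std::vector<std::uint8_t>& composite) {
    std::uint64_t total = 0;
    for (std::uint8_t p : composite) total += p;
    return total;
}
```

If the same images are reused across examples, the composite could be built once at startup and passed to every iteration, which covers the first suggestion as well.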