Parallel processing batches of images with ImageMagick

29 04 2011

From my recent holiday trip to Lanzarote I brought back nearly 600 photographs. When I reviewed them, it became quite clear that my picture-taking skills are not that impressive and most of the images, if not all, need some kind of retouching, most notably, adjusting the white-balance.

The Gimp is becoming more and more of a friend to me, but processing all of the photographs manually would quickly wear out my mouse arm and index finger. Some kind of scripting would be nice here, since scripts can also be kept for later reference on what I did.

The Gimp lets you write scripts in some Scheme dialect, in Perl and in Python, of which the last would be my choice. But (afaik) these _fu_-scripts can only be run inside of Gimp. More recent versions of Gimp already contain a plug-in for batch processing, but that one does not offer all the options I wanted.

A more than decent alternative is ImageMagick. At least for some steps, like resizing, it seems to be the right tool. Whether it can also adjust the white balance is something I still have to try out.

Some tutorial about adjusting the white balance manually suggested measuring the values of the RGB channels. It would be nice if the scripts had these values handy. IM offers a nice tool to gather information about images:

for a in all/*.JPG; do identify -verbose "$a" > "info/$(basename "$a").txt"; done

Hm, even that took quite some time for all photographs, and I began to wonder how long I would have to wait once the actual jobs were running.

IM provides several ways to feed multiple files into its tools: one by one, when the shell expands the glob pattern, or all at once, when the glob pattern is quoted and IM expands it itself ("inline" style).

The latter way is convenient as it scales well on multi-core CPUs. But wait, why is gkrellm's memory meter running higher and higher, and now the swap meter...??

Obviously, IM reads all files at once and processes them, before it frees any memory again. This might be useful for a fistful of dollars, ehm images, but 580 hi-res photographs are far too much for my RAM.
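
For reference, the difference between the two styles boils down to who expands the pattern. The following small sketch shows just that and needs no ImageMagick at all: count_args is a hypothetical helper standing in for an IM tool, and the file names are made up.

```shell
# Work in a scratch directory with three dummy "photos".
demo_dir=$(mktemp -d)
touch "$demo_dir/a.JPG" "$demo_dir/b.JPG" "$demo_dir/c.JPG"

# Hypothetical stand-in for an IM tool: prints how many arguments it got.
count_args() { echo $#; }

# Unquoted: the shell expands the glob, the tool gets one argument per file.
shell_expanded=$(cd "$demo_dir" && count_args *.JPG)

# Quoted: the tool gets the pattern as one single argument and has to
# expand it itself, which is what IM does in "inline" style.
tool_expanded=$(cd "$demo_dir" && count_args '*.JPG')

echo "shell expands: $shell_expanded argument(s)"
echo "quoted:        $tool_expanded argument(s)"

rm -r "$demo_dir"
```

With the three dummy files, the first call reports 3 arguments and the second one reports 1.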

A custom solution is required, and the following bunch of shell scripts is what I came up with:

The topmost script contains parameters like where the source files are, where to put the processed files (aka target directory) and, of course, the arguments which are passed to IM's "convert" command.


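SRC="all/*.JPG"      # example value: glob pattern of the source files
DIR_OUT="resized"    # example value: target directory
ARGS="-resize 1024x1024"

./batch_convert "$SRC" "$ARGS" "$DIR_OUT"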

As you can see, this script calls "batch_convert" whose task it is to distribute all source files over several background tasks. The call to "build_file_lists" expands the file glob and distributes the file names into seven temporary files, which serve as input for the background jobs. For each single file list, "runconvert" is called, which does the actual processing. The rather big chunk at the end of the script just waits for all jobs to terminate before it removes the temporary files.

At first I thought the functions that distribute the source files would be needed in several scripts, so I put them into their own file ("").


#!/bin/bash
# Runs batch of background processes
# $1 is glob pattern of files to process
# $2 is file with convert's args or string (quoted)
# $3 is output directory

SRC="$1"
ARGS="$2"
DIR_OUT="$3"

# Load build_file_lists and unlink_file_lists
. ./

build_file_lists "$SRC"
for list in $FILE_LISTS
do
	./runconvert "$list" "$ARGS" "$DIR_OUT" &
done

# Wait for all jobs to finish, so we can safely unlink the temporary
# file lists afterwards.
# CAVEAT: For some reason, "jobs" always returns the last PID even if that
# job finished. Therefore we go the long road and look into /proc for our
# processes.
pids=$(jobs -p)
echo Waiting for jobs to finish...
echo I am PID: $$
echo PIDs spawned: $pids
while [ ! -z "$pids"  ]
do
	prevpids=$pids
	sleep 1
	pids=$(jobs -p)
	if [ "$prevpids" != "$pids" ]
	then
		for a in $prevpids
		do
			if [ ! -e "/proc/$a" ]
			then
				echo PID $a finished
				# \b keeps us from cutting this PID out of a longer one
				pids=$(echo $pids | sed -re "s/\b$a\b//g")
			fi
		done
		echo PIDs remaining: $pids
	fi
done

unlink_file_lists $FILE_LISTS

echo Finished in $SECONDS seconds.
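
As an aside, the shell's builtin "wait" can do the waiting without looking into /proc. The following is only a minimal sketch of that alternative, not what the script above does: fake_job is a hypothetical stand-in for ./runconvert, and the sleep merely simulates work.

```shell
# Hypothetical stand-in for ./runconvert: pretends to work, then leaves
# a marker file so we can see that it really finished.
fake_job() {
	sleep 1
	touch "$1"
}

marker_dir=$(mktemp -d)

# Start three jobs in the background and remember their PIDs.
pids=""
for n in 1 2 3
do
	fake_job "$marker_dir/done-$n" &
	pids="$pids $!"
done
echo "PIDs spawned:$pids"

# "wait PID" blocks until that particular job has terminated.
for pid in $pids
do
	wait "$pid"
	echo "PID $pid finished"
done

# Count the marker files to confirm that every job ran to its end.
finished=$(ls "$marker_dir" | wc -l | tr -d ' ')
echo "Jobs finished: $finished"
rm -r "$marker_dir"
```

The price of this simpler loop is that the PIDs are reported in the order in which they were spawned, not in the order in which the jobs actually finish.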

# Distribute list of all image files into 7 lists.
# We can then use those smaller lists in batch operations
# and start 7 background processes in parallel.
# (We have 8 cores, keep 1 core free for ourselves ;-)
# $1 is glob pattern for source files
# Returns $FILE_LISTS as a list of the generated files.
build_file_lists() {
	local SRC="$1"

	local TMPFILE="/tmp/$RANDOM-files"
	FILE_LISTS=$(echo $TMPFILE-{1..7})

	local n=1
	local MAX=7
	# $SRC is unquoted on purpose, so that the shell expands the glob
	for a in $SRC
	do
		echo $a >> "$TMPFILE-$n"
		let "++n"
		if [ $n -gt $MAX ]
		then
			n=1
		fi
	done
}

# Removes the temporary file lists given as arguments
unlink_file_lists() {
	for a in $@
	do
		unlink $a
	done
}
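
To see how the round-robin distribution behaves, here is a scaled-down sketch with three lists and made-up file names instead of seven lists and real photographs. It also uses mktemp for the temporary directory instead of $RANDOM; that is an assumption about what would be safer against name clashes, not what the functions above do.

```shell
# Scratch directory for the lists; mktemp avoids name clashes in /tmp.
list_dir=$(mktemp -d)

max=3   # number of lists (the real script uses 7)
n=1
for name in img1.JPG img2.JPG img3.JPG img4.JPG img5.JPG img6.JPG img7.JPG
do
	# Append the name to the current list, then move on to the next list.
	echo "$name" >> "$list_dir/files-$n"
	n=$((n + 1))
	if [ "$n" -gt "$max" ]
	then
		n=1
	fi
done

# Join the lines of a list with blanks for a compact display.
first_list=$(paste -sd ' ' "$list_dir/files-1")
second_list=$(paste -sd ' ' "$list_dir/files-2")
echo "list 1: $first_list"
echo "list 2: $second_list"

rm -r "$list_dir"
```

List 1 receives images 1, 4 and 7, while lists 2 and 3 get two images each, so the lists never differ by more than one file.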

Lastly, here is the script that does the actual processing. It reads the source files from the given temporary list and feeds them one by one to IM's "convert". You may give the arguments for "convert" either literally or as a file name. The script makes sure that you do not accidentally overwrite the source file.


#!/bin/bash
# This script runs convert from ImageMagick
# $1 file that lists files to process
# $2 file with convert's arguments or all args as string (quoted)
# $3 is directory for output files

CONVERT=convert
LIST="$1"
ARGS="$2"
# Fall back to the current directory if no output directory is given
DIR_OUT="${3:-.}"

if [ -f "$ARGS" ]
then
	# Read convert's arguments from given file and skip comment lines
	# which start with `#'
	ARGS=$(grep -v '^\s*#' "$ARGS")
fi

for f_in in $(cat "$LIST")
do
	f=$(basename "$f_in")
	f_out="$DIR_OUT/$f"
	if [ "X$f_in" == "X$f_out" ]
	then
		echo "W: Skipping $f_in. Not overwriting input file"
	else
		#echo $CONVERT $f_in $ARGS $f_out
		# $ARGS is unquoted on purpose: it holds several arguments
		$CONVERT "$f_in" $ARGS "$f_out"
	fi
done
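
The arguments file is easiest to understand with an example. The sketch below writes a small file and filters it in the same spirit as the script does. The file content, including the -quality setting, is just an illustration, and the pattern uses the portable [[:space:]] class in place of \s, which is a GNU grep extension.

```shell
# Hypothetical arguments file with two comment lines.
args_file=$(mktemp)
cat > "$args_file" <<'EOF'
# Scale down to fit into 1024x1024
-resize 1024x1024
  # JPEG quality of the result
-quality 85
EOF

# Drop all lines that start with `#', optionally preceded by whitespace.
ARGS=$(grep -v '^[[:space:]]*#' "$args_file")

# Unquoted on purpose: the shell splits $ARGS into separate words,
# so convert would see four arguments here.
set -- $ARGS
echo "convert would receive $# arguments: $*"

rm "$args_file"
```

This is also the reason why $ARGS must stay unquoted in the call to convert: quoted, all options would arrive as one single argument.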

Put all scripts into the same directory. If you then start the topmost script, you will see something like this:

$ time ./resize_to_1024 
Waiting for jobs to finish...
I am PID: 27667
PIDs spawned: 27670 27671 27672 27673 27674 27675 27676
PID 27671 finished
PID 27676 finished
PIDs remaining: 27670 27672 27673 27674 27675
PID 27672 finished
PID 27674 finished
PID 27675 finished
PIDs remaining: 27670 27673 27676
PID 27670 finished
PID 27673 finished
PID 27676 finished
PIDs remaining:
Finished in 102 seconds.

real	1m41.540s
user	11m57.101s
sys	0m25.478s

In case you need to abort the script while the background processes are still running, take them out either one by one by their PID, which is shown in the output, or just run "killall runconvert". You then need to unlink the temporary files with the image lists yourself. They are here:

ls /tmp/*-files-?
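
The manual clean-up could probably be automated with a trap. This is only a sketch of the idea, the scripts above do not do this: the temporary list is a hypothetical one created with mktemp, the sleep stands in for a long-running ./runconvert, and the abort is simulated with a plain exit.

```shell
# Hypothetical temporary list, similar to what batch_convert creates.
tmp_list=$(mktemp)
echo "all/example.JPG" > "$tmp_list"

# Run the "script" in a subshell so that the trap fires when it ends.
(
	job_pid=""
	# Stop the background job (if any) and remove the temporary list.
	cleanup() {
		kill "$job_pid" 2>/dev/null || true
		rm -f "$tmp_list"
	}
	trap cleanup EXIT
	# Turn Ctrl-C and kill into a normal exit, which runs cleanup.
	trap 'exit 130' INT TERM

	sleep 30 &        # stand-in for a long-running ./runconvert
	job_pid=$!
	exit 1            # simulate an aborted run
) || echo "Run aborted."

if [ -e "$tmp_list" ]; then leftover=yes; else leftover=no; fi
echo "Temporary list left behind: $leftover"
```

After the simulated abort, the temporary list is gone and the background job has been stopped, so nothing is left to remove by hand.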

Have fun. (Hmm, this sentence is so SuSe, maybe I should better say: these scripts have no super cow powers ;-)

