Wand is a wonderful library for working with images in Python: it gives you all the power of ImageMagick with an idiomatic Python API.

But unfortunately it doesn't come with a video API, and there is no nice Python library for generating videos from images, so everybody seems to shell out to ffmpeg. But what is the best way to do that?

ffmpeg is very versatile, and it certainly has options to read a bunch of image files and convert them to an mp4, mkv, or whatever container you want.

from glob import glob  
import json  
import base64  
import os  
from subprocess import call  
from wand.image import Image

cursor = Image(filename="img/cursor.png")

files = glob("*.dat")  
files = [(f, open(f, "r").read()) for f in files]  
counter = 0

for f, data in files:
    data = json.loads(data)
    images = data["images"]
    images = [base64.b64decode(i) for i in images]
    for ii, i in enumerate(images):
        # JPEG blobs are binary data, so the file must be opened in binary mode
        out = open("/Volumes/ramdisk/out/%03d.jpg" % counter, "wb")
        out.write(i)
        out.close()
        img = Image(filename="/Volumes/ramdisk/out/%03d.jpg" % counter)
        img.composite(cursor, left=10 * ii, top=10 * ii)
        img.save(filename="/Volumes/ramdisk/out/%03d.png" % counter)
        img.close()

        counter = counter + 1

try:
    os.remove("out.mp4")
except OSError:
    pass

ffmpeg_call = ("ffmpeg -r 10 -pattern_type glob -i /Volumes/ramdisk/out/*.png -vcodec mpeg4 -vf fps=10 out.mp4")  
call(ffmpeg_call.split(" "))  

This is representative of the examples you can find on Stack Overflow about creating videos from images: first we write the frames to a ramdisk (so there is no IO penalty), and then we execute ffmpeg with -pattern_type glob -i *.png.

But we can do better... ffmpeg supports pipes, so it can transcode from a pipe and be chained with other processes. So what if we send the images through a pipe and get rid of the ramdisk?
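The general shape of that pattern is worth seeing on its own. Here is a minimal, self-contained sketch of feeding bytes to a child process through a pipe; a small Python one-liner stands in for ffmpeg so it runs anywhere, but the Popen mechanics are identical:

```python
import sys
from subprocess import Popen, PIPE

# Stand-in for ffmpeg: a child that counts the bytes it receives on stdin
# and prints the total. With ffmpeg you would pass its argument list here.
child = [sys.executable, "-c",
         "import sys; print(len(sys.stdin.buffer.read()))"]

proc = Popen(child, stdin=PIPE, stdout=PIPE)
for chunk in (b"frame-one", b"frame-two"):
    proc.stdin.write(chunk)   # each write streams data straight to the child
proc.stdin.close()            # closing stdin signals EOF: "no more frames"
out = proc.stdout.read().decode().strip()
proc.wait()
print(out)  # total bytes the child saw: 18
```

Closing stdin and then waiting on the process is the important part: ffmpeg only finalizes the output file once it sees EOF on its input.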

from glob import glob  
import json  
import base64  
import os  
from subprocess import Popen, PIPE  
from wand.image import Image


cursor = Image(filename="img/cursor.png")

files = glob("*.dat")  
files = [(f, open(f, "r").read()) for f in files]  
counter = 0

ffmpeg_call = ("ffmpeg -threads 4 -r 10 -f image2pipe -s 1024x768 -vcodec mjpeg"  
               " -i -"
               " -vcodec mpeg4"
               " -vf fps=10 out.mp4")
ffmpeg_call = ffmpeg_call.split(" ")  
print(ffmpeg_call)
proc = Popen(ffmpeg_call, stdin=PIPE, bufsize=0)

mother_bg = Image(width=1024, height=768)

for f, data in files:  
    data = json.loads(data)
    images = data["images"]
    images = [base64.b64decode(i) for i in images]
    bg = Image(image=mother_bg)
    bg.depth = 8

    for ii, i in enumerate(images):
        if len(i) > 0:
            img = Image(blob=i)
            img.composite(cursor, left=10 * ii, top=10 * ii)
            bg.composite(img, left=0, top=0)
            blob = bg.make_blob("jpeg")
            proc.stdin.write(blob)
            img.close()

        counter = counter + 1
    bg.close()

proc.stdin.close()  
proc.wait()  

There are some considerations to take into account before going this way...

  • It is necessary to define the output image size (mine were not all equal, so I create a black 1024x768 image and composite each loaded image onto it)

  • It is necessary to tell ffmpeg to use the mjpeg codec for the input (which is a video codec, not an image one, but it works...)

This way is much cleaner, with no messy image files lying around, but it is still a bit slow. We can do better!

The performance issue comes from the fact that Python is compressing each frame to JPEG only for ffmpeg to decompress it again on the other side of the pipe...
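Skipping the JPEG round trip means pushing raw pixels through the pipe instead. A quick back-of-the-envelope calculation (assuming the 1024x768 frames and 10 fps used here) shows the bandwidth that trade costs:

```python
width, height = 1024, 768
bytes_per_pixel = 3            # rgb24: one byte each for R, G, B
fps = 10

frame_bytes = width * height * bytes_per_pixel
print(frame_bytes)             # 2359296 bytes, i.e. 2.25 MiB per raw frame
print(frame_bytes * fps)       # 23592960 bytes/s, ~22.5 MiB/s through the pipe
```

That rate is trivial for an in-memory pipe, so sending raw frames trades a little bandwidth for a lot of CPU time spent compressing and decompressing.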

from glob import glob  
import json  
import base64  
import os  
from subprocess import Popen, PIPE  
from wand.image import Image

cursor = Image(filename="img/cursor.png")  
cursor.depth = 8  
cursor.format = "rgb"

files = glob("*.dat")  
files = [(f, open(f, "r").read()) for f in files]  
counter = 0

ffmpeg_call = ("ffmpeg -r 10 -f rawvideo -s 1024x768 -pix_fmt rgb24"  
               " -i -"
               " -vcodec mpeg4 -q 0"
               " -vf fps=10 out.mp4")
ffmpeg_call = ffmpeg_call.split(" ")

proc = Popen(ffmpeg_call, stdin=PIPE)  
mother_bg = Image(width=1024, height=768)  
mother_bg.depth = 8  
mother_bg.format = "rgb"

for f, data in files:  
    data = json.loads(data)
    images = data["images"]
    images = [base64.b64decode(i) for i in images]
    bg = mother_bg.clone()

    for ii, i in enumerate(images):
        img = Image(blob=i)
        bg.composite(img, left=0, top=0)
        img.close()

        bg2 = bg.clone()
        bg2.composite(cursor, left=10 * ii, top=10 * ii)
        proc.stdin.write(bg2.make_blob())
        bg2.close()
    bg.close()

proc.stdin.close()  
proc.wait()  

The trick here is to flush the raw RGB data that Wand already holds in memory straight into the pipe. For that you need to make sure every image you compose is in RGB format (which I believe is not the default one); otherwise Wand will convert between image formats while compositing, and you pay a performance penalty.

Also, the ffmpeg input format is now rawvideo, with the pixel format set to rgb24.
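A side note on the split(" ") trick used in the snippets above: it works only as long as no argument contains a space. A more robust habit is to build the argument list directly; here is a sketch of the same command, with a hypothetical output path containing a space to show why it matters:

```python
width, height, fps = 1024, 768, 10

# Building the list directly means a path like "My Videos/out.mp4" stays a
# single argument, where str.split(" ") would break it in two.
ffmpeg_call = [
    "ffmpeg",
    "-r", str(fps),
    "-f", "rawvideo",
    "-s", "%dx%d" % (width, height),
    "-pix_fmt", "rgb24",
    "-i", "-",                  # read the video stream from stdin
    "-vcodec", "mpeg4", "-q", "0",
    "-vf", "fps=%d" % fps,
    "My Videos/out.mp4",        # hypothetical path with a space
]
print(ffmpeg_call[-1])  # still one argument: My Videos/out.mp4
```

This list can be passed to Popen exactly as before, with no splitting step.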

The performance boost is about 40%. Do you think there is a way to improve it further? Let me know in the comments.