jameshunt (.us)

Scanning Books

A while back, I stumbled across an intriguing DIY community: Book Scanners. They have lots of books. I have lots of books. They want to digitally archive their books. I …

I also want to digitally archive my books, even if I didn’t know it before a few months ago.

Fast-forward to last weekend, when I got my workshop all in order:

A long shot of the shiny new workbench in the basement.
It’s a mess, but it’s my mess.

You can actually see my first attempt at a book scanning rig, over there on the left. It’s little more than a cardboard box, cut diagonally and opened up, with two cheap (~$60) digital cameras on even cheaper (~$15) mounting arms. Here’s a closer view:

A close-up view of the book scanning rig.

The concept is pretty straightforward: the book lies flat on the cardboard box, at (roughly) a 100° angle, and each camera is pointed at the opposite page. To help the pages lie flat, we put two pieces of glass / lexan on top of them (I’m using the plates out of some 11×14″ picture frames I picked up on clearance at a craft hobby store). The cameras can see through the glass / lexan, and if we angle the light juuuust right, there’s no glare.

While you can run the cameras on battery and regularly recharge them, I opted for the hard-wired approach, with two of these:

The HQRP Ac Adapter, which provides power through a battery cartridge that essentially plugs into the wall.

These AC adapters plug into the wall and provide a “blank” battery cartridge that delivers electrons on the same contacts as a real battery. The camera has no idea we’ve tricked it into being on all the time!

For control, I’m using a combination of CHDK (the Canon Hack Development Kit) and a Raspberry Pi. If I’d had my druthers, I’d have used a Pi model with enough onboard USB ports (i.e. 2 or more), but all I had on hand was a Model A+ v1.1:

Picture of a Raspberry Pi Model A+, which powers the rig.

The Pi is hooked up to an external USB hub to get to port capacity. Here’s a rough block diagram:

Diagram of what connects to what, showing that the Pi has a USB hub, which in turn connects to each Canon ELPH160, which are themselves plugged into the wall.

The Software

With the hardware wired up and ready to go, it’s time to talk about the software bits that will make the whole thing scan. The trick with book scanning is to keep the “fiddling with the rig” to a minimum. For that, I’m using PTP – Picture Transfer Protocol, a means of remotely controlling a modern(-ish) digital camera over a USB link. The excellent (and aptly named!) software package chdkptp makes it easier to interact with PTP-enabled CHDK platforms.

For this to work, the cameras need to be running version 1.2 of the CHDK firmware. CHDK is a fascinating project in its own right: you burn an image onto an SD card, lock it, load it, and boot up the camera. The camera notices that the SD card is locked, and boots off of the card instead of its onboard ROM. It’s kind of like jailbreaking an iPhone, without all the stress.

The version of CHDK that I’m loading has PTP support built right in, so it can immediately take commands over the connected USB bus. Yes, that’s a bit redundant (universal serial bus bus), but what was I supposed to call it? The US bus?

A photograph of a microbus which has had an American flag poorly photoshopped over it as a "paint job"
America, Bus Yeah!!1

The chdkptp package, on the other hand, provides the client driver for the PTP conversation. With it, we can issue commands across the USB link (yeah, that works better) and make the cameras do our bidding. Here’s roughly how it works:

$ sudo ./chdkptp-r964/chdkptp.sh -r -elist
-1:Canon PowerShot ELPH 160 b=001 d=022 v=0x4a9 p=0x32aa s=...
-2:Canon PowerShot ELPH 160 b=001 d=021 v=0x4a9 p=0x32aa s=...

That, however, is way too much to type, and my fingers are itching to turn pages, not mash keys. Here’s a wrapper script:

$ cat ~/bin/ptp
#!/bin/sh
set -e
# the first argument is glued onto -e, so `ptp list` runs chdkptp with -elist
exec sudo "$HOME/chdkptp-r964/chdkptp.sh" -r -e"$@"

$ ptp list
-1:Canon PowerShot ELPH 160 b=001 d=022 v=0x4a9 p=0x32aa s=...
-2:Canon PowerShot ELPH 160 b=001 d=021 v=0x4a9 p=0x32aa s=...

Much better.

(By the way, those USB device numbers (d=021, d=022, etc.) will change every time a camera power cycles; as we build out the rest of the automation / control software, we’ll definitely need to keep that in mind.)
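Since the device numbers move around, one option is to scrape them out of the list output each time instead of hard-coding them. Here’s a minimal sketch, assuming the listing keeps the b=... d=... layout shown above; cam_devs is a hypothetical helper name, not something chdkptp ships with.

```shell
#!/bin/sh
# cam_devs (hypothetical helper): read `ptp list` output on stdin and
# print one USB device number per line.
# Assumes every camera line carries a field of the form d=NNN,
# as in the listing above.
cam_devs() {
  sed -n 's/.*[[:space:]]d=\([0-9][0-9]*\).*/\1/p'
}
```

Something like `ptp list | cam_devs` would then print the current device numbers, which can be handed to whatever script talks to the cameras.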

chdkptp provides a command called remote shoot (rs), which will let us take a picture without physically touching the camera body. That’s key, because those stand arms aren’t the most rock solid things in the world.

$ cat shoot
#!/usr/bin/env ptpx
connect -b=001 -d=021
rs test-image

I should mention that the chdkptp binary is a bit awkward to use at times, so I wrote this ptpx wrapper to make it behave more like a UNIX-y shell / command interpreter such as the Bourne shell:

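$ cat ~/bin/ptpx
#!/bin/sh
if [ "x$1" = "x" ]; then
  # no script given: start an interactive session
  exec sudo "$HOME/chdkptp-r964/chdkptp.sh" -r -i
else
  # run the commands in the file named by the first argument
  exec sudo "$HOME/chdkptp-r964/chdkptp.sh" -r -e"source $1"
fi
# only reached if exec itself fails
exit 7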

Without arguments, ptpx starts up an interactive (-i) shell. With arguments, it sources the first argument (ignoring the rest), in a sense executing it. Just like a real shell!

Time to shoot some photos!

$ ./shoot
connected: Canon PowerShot ELPH 160, max packet size 512
ERROR: not in rec mode
ERROR: error on line 3

Most cameras operate in one of two modes: playback and record. In playback mode, you can browse through the photos stored on the camera, muck with settings, change the time, and more. In record mode, the back-panel LCD turns into a pixelated viewfinder, and the camera can actually produce output image files from what the lens sees.

We want record mode, which chdkptp’s rec command switches on:

$ cat shoot
#!/usr/bin/env ptpx
connect -b=001 -d=021
rec
rs test-image

$ ./shoot
connected: Canon PowerShot ELPH 160, max packet size 512
(flash goes off, camera chimes, and boom! photograph)

$ ls -lah test-image.jpg
-rw-r--r-- 1 root root 4.2M Apr 21 02:08 test-image.jpg

Now we have a working point-and-shoot, which we can access over SSH. Let’s double the fun!

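$ cat capture
#!/bin/sh
set -eu

n=$1 ; shift
for cam in "$@"; do
  # one backgrounded chdkptp session per camera
  cat <<EOF | ptpi &
connect -b=001 -d=$cam
rs img-$n-from-$cam
EOF
done
# don't return until every camera has finished
wait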

This script is a bit more involved, but it builds on the same concepts. The ptpi script, however, is new:

#!/bin/sh
f=$(mktemp ptpXXXXXXXXX)
trap 'rm -f "$f"' QUIT TERM INT EXIT
cat >"$f"
ptpx "$f"
exit $?

It takes standard input (the “i” stands for “input”), stuffs it into a file on disk, and then calls the ptpx script on it. At the end of the execution, come hell or high water, the temporary file is removed. Back to capture!

$ capture 1 21 22
connected: Canon PowerShot ELPH 160, max packet size 512
connected: Canon PowerShot ELPH 160, max packet size 512

Now if we look in the current working directory, we can plainly see our two images:

$ ls -lah
-rw-r--r-- 1 root root 4267460 Apr 21 02:11 img-1-from-21.jpg
-rw-r--r-- 1 root root 5745788 Apr 21 02:11 img-1-from-22.jpg
Combined image of both photographs, one from camera 21 and the other from camera 22, before any image modification.
Camera 21 (left) and Camera 22 (right); before rotation and page boundary correction.
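To keep the fiddling down while actually scanning, the capture script could be driven from a simple page-turn loop. This is only a sketch under a few assumptions: scan_loop and the CAPTURE variable are hypothetical names, the capture script above is on the PATH, and the operator presses Enter after each page turn.

```shell
#!/bin/sh
# scan_loop (hypothetical helper): run one capture per page turn
# until the operator types "q" or input ends.
# CAPTURE names the command to run (defaults to the capture script above);
# the arguments are the camera device numbers to pass along.
scan_loop() {
  n=1
  while printf 'Turn the page, then press Enter (q to quit): ' >&2 && read -r answer; do
    [ "$answer" = "q" ] && break
    ${CAPTURE:-capture} "$n" "$@"
    n=$((n + 1))
  done
}
```

Calling `scan_loop 21 22` would run capture 1 21 22, then capture 2 21 22, and so on, so each pair of images carries the number of the page turn that produced it.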

Eventually, we’ll want to run these through some computer vision code to find page boundaries, correct for skew and warp, re-order the pages, and pop out a PDF. I’m still working on all of that, so check back soon!