# When in doubt, don’t blur it out

Yesterday, The Guardian published an article about a victim, with a photo of a letter that had been sent to them. To preserve the victim’s privacy, the address on the letter had been blurred. However, I was able to recover the address in full, right down to the superscript “th” in the street number! The Guardian doxxed the victim they were writing about.

Blurring is often used to redact sensitive content. There’s apparently even a phrase, “if in doubt, blur it out”. But, counterintuitively, blur can be completely inverted to recover the original image! I won’t show you The Guardian’s example; instead, here’s an example I created:

You see the original, then a blurred version, then a version recovered from the blurred one. I used a tool called SmartDeblur, and the author, Vladimir Yuzhikov, has a great blog post on how it works. But it’s complicated, so below I look at a simpler model for how deblurring can work.

Consider a blur function that works on one-dimensional images. Each pixel `b[i]` in the output blurred image is generated by taking the average of three pixels in the source image: the corresponding pixel `s[i]` in the source, and its two nearest neighbors `s[i-1]` and `s[i+1]`. This is a one-dimensional equivalent of a “bokeh” or lens blur, which averages all the pixels in a circle. This is the blur type that The Guardian used. To deal with the edges, we say that out-of-bounds pixels in the source are white.
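To make the blur function concrete, here is a minimal sketch of it. The name `blur3`, the example `source` array, and the convention that pixels range from 0 (black) to 1 (white) are my assumptions for illustration.

```javascript
// Three-pixel box blur of a 1-D image.
// Assumption: pixel values range from 0 (black) to 1 (white).
function blur3(source) {
  const WHITE = 1;
  // Out-of-bounds pixels read as white.
  const at = (i) => (i < 0 || i >= source.length ? WHITE : source[i]);
  // Each output pixel is the average of a source pixel and its two neighbors.
  return source.map((_, i) => (at(i - 1) + at(i) + at(i + 1)) / 3);
}

// A made-up eight-pixel "image".
const source = [1, 1, 0, 0, 1, 0, 1, 1];
console.log(blur3(source));
```

Every output value here is a multiple of one third, because each one averages three pixels that are either 0 or 1.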

Given the blurred output from this function, can we recover the original? Yes, provided we make a guess at the border of the source image. Let’s work from the left-hand side of the image. We know `s[-1]` is white, because it’s out of bounds. Let’s assume `s[0]` is white; this is our border guess. Since `b[0]` is the average of `s[-1]`, `s[0]` and `s[1]`, we can then recover `s[1] = 3*b[0] - s[0] - s[-1]`. We march left to right to recover the rest, using `s[i] = 3*b[i-1] - s[i-1] - s[i-2]`.
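Here is that left-to-right march as a small sketch for the three-pixel case only. The name `deblur3` and the hard-coded `blurred` array are assumptions of mine; the array is what a three-pixel average of the made-up source `[1, 1, 0, 0, 1, 0, 1, 1]` gives when out-of-bounds pixels are white (1).

```javascript
// Invert a three-pixel box blur by marching left to right.
// s0Guess is the border guess for s[0]; s[-1] is taken to be white (1).
function deblur3(blurred, s0Guess) {
  const WHITE = 1;
  const s = [s0Guess];
  for (let i = 1; i < blurred.length; i++) {
    // s[i-2] is out of bounds only when i is 1.
    const prev2 = i - 2 < 0 ? WHITE : s[i - 2];
    s[i] = 3 * blurred[i - 1] - s[i - 1] - prev2;
  }
  return s;
}

// Blurred form of the made-up source [1, 1, 0, 0, 1, 0, 1, 1].
const blurred = [1, 2 / 3, 1 / 3, 1 / 3, 1 / 3, 2 / 3, 2 / 3, 1];
console.log(deblur3(blurred, 1));
```

With the correct guess, the result should match the source up to floating-point rounding. A wrong guess does not fade out: each recovered pixel depends on the two before it, so the error is carried through the rest of the image.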

This model generalizes to any size of blur; we just have to guess more border. For example, if each blurred pixel comes from n=7 input pixels, we must guess at a 3-pixel border. Here’s the general algorithm in JavaScript:

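```javascript
function deblur(n, borderGuess, blurred) {
  // Blur radius: each blurred pixel averages m pixels on either side.
  const m = (n-1)/2;
  const out = [];
  // The first m source pixels come straight from the border guess.
  for (let i = 0; i < m; i++) out[i] = borderGuess[i];
  for (let i = m; i < blurred.length; i++) {
    // blurred[i-m] averages source pixels i-n+1 through i, so n times it
    // is their sum; subtracting the n-1 known pixels leaves source pixel i.
    out[i] = n*blurred[i-m];
    for (let j = (i-n)+1; j < i; j++) {
      out[i] -= (
        j < 0 ? 1 :  // out of bounds: white
        j < borderGuess.length ? borderGuess[j] :
        out[j]
      );
    }
  }
  return out;
}
```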
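As a quick sanity check, here is a round-trip sketch: blur a made-up source with n=7, then run the algorithm on the result. The `blur` helper and the `source` array are my assumptions, and `deblur` is repeated in compact form only so the snippet runs on its own.

```javascript
// Box blur over n pixels (n odd); out-of-bounds pixels are white (1).
function blur(n, source) {
  const m = (n - 1) / 2;
  const at = (i) => (i < 0 || i >= source.length ? 1 : source[i]);
  return source.map((_, i) => {
    let sum = 0;
    for (let k = -m; k <= m; k++) sum += at(i + k);
    return sum / n;
  });
}

// Same algorithm as deblur above, repeated so this sketch runs on its own.
function deblur(n, borderGuess, blurred) {
  const m = (n - 1) / 2;
  const out = borderGuess.slice(0, m);
  for (let i = m; i < blurred.length; i++) {
    out[i] = n * blurred[i - m];
    for (let j = i - n + 1; j < i; j++) out[i] -= j < 0 ? 1 : out[j];
  }
  return out;
}

// Made-up source whose first three pixels are white, matching the guess.
const source = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1];
const recovered = deblur(7, [1, 1, 1], blur(7, source));
// Largest per-pixel difference between the recovered and original images.
const maxError = Math.max(...recovered.map((x, i) => Math.abs(x - source[i])));
console.log(maxError);
```

The recovery only works this cleanly because the border guess happens to be right and the blurred values are exact; a real photo adds noise and quantization on top.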

The SmartDeblur tool is designed for real-world, arbitrary photos. But we can probably recover a much better image if we know that the source is text! Usually, blurred text is given as part of a larger unblurred image, from which we can make very strong assumptions about the blurred source. For instance, we can be confident that the border of the source is white. We can assume the source is black and white, rather than greyscale. In the extreme, we could assume that the text is 12-point Times New Roman, and recover the source text by generating characters that minimize error. A demo of this would be a fun future blogpost ...
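To give a flavor of the “minimize error” idea, here is a toy sketch in one dimension. It is not the demo mentioned above and it does not invert the blur at all: it blurs each candidate and picks the closest match. The `glyphs` table is invented for illustration and is not real font data, and `blur3`, `squaredError`, and `bestGlyph` are hypothetical names.

```javascript
// Three-pixel box blur; out-of-bounds pixels are white (1).
function blur3(source) {
  const at = (i) => (i < 0 || i >= source.length ? 1 : source[i]);
  return source.map((_, i) => (at(i - 1) + at(i) + at(i + 1)) / 3);
}

// Sum of squared differences between two equal-length images.
function squaredError(a, b) {
  return a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0);
}

// Hypothetical 1-D "glyphs": 0 is ink, 1 is paper. Not real font data.
const glyphs = {
  a: [1, 0, 1, 1, 0, 1],
  b: [1, 0, 0, 1, 1, 1],
  c: [1, 1, 0, 0, 0, 1],
};

// Pick the glyph whose blurred form is closest to the observed blur.
function bestGlyph(observed) {
  let best = null;
  let bestErr = Infinity;
  for (const [name, pixels] of Object.entries(glyphs)) {
    const err = squaredError(blur3(pixels), observed);
    if (err < bestErr) {
      best = name;
      bestErr = err;
    }
  }
  return best;
}

// Blur glyph "b", add a little noise, and see which glyph matches best.
const observed = blur3(glyphs.b).map((x, i) => x + (i % 2 ? 0.02 : -0.02));
console.log(bestGlyph(observed)); // prints "b"
```

Because the search only ever compares blurred candidates against the blurred observation, it tolerates noise that would throw off a pixel-by-pixel inversion.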

Tagged #programming, #security. All content copyright James Fisher 2020. This post is not associated with my employer.