BlazeFace hello world

BlazeFace is a neural network model that detects faces in images. It’s designed to be fast enough to run at 30fps on mobile GPUs. There is a TensorFlow.js library for BlazeFace, which downloads the model, runs it in WebGL using TensorFlow.js, and wraps the raw model input/output with a friendly, semantic API. The accompanying demo captures and displays your webcam, runs BlazeFace against frames as often as possible, and draws the detected face landmarks on top of your webcam stream.

Here’s what I get when I run it against my own face:

Basic usage of this API comes down to one function call, which acts as a pure function from input image to predicted faces:

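<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@2.4"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/blazeface@0.0.5"></script>
<script>
  (async () => {  // await is not allowed at the top level of a classic script
    const model = await blazeface.load();  // downloads the model
    // webcamVideoEl is a <video> element; false means "return plain arrays, not tensors"
    const predictions = await model.estimateFaces(webcamVideoEl, false);
    console.log(predictions);
  })();
</script>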

This will log an array with one object per detected face, looking something like:

[
  { // One detected face
    topLeft: [162.84341430664062,153.98446655273438],  // [x,y] coordinates
    bottomRight: [422.3966979980469,348.6485900878906],
    landmarks:[
      [238.96787643432617,204.8737621307373], // right eye
      [328.2145833969116,205.4714870452881],  // left eye
      [273.6716037988663,252.84512042999268], // nose
      [280.01041051000357,293.8540989160538], // mouth
      [206.6525173187256,226.03596210479736], // right ear
      [386.57989501953125,226.02698922157288] // left ear
    ],
    probability: [0.9997807145118713]  // Always a one-element array; a bit odd
  }
]
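Given predictions in this shape, drawing them over the webcam image takes only a few canvas calls. Here is a minimal sketch, not the demo's actual code: `drawPredictions` is a hypothetical helper name, and it assumes `ctx` is a 2D canvas context whose canvas has the same pixel dimensions as the image passed to `estimateFaces`.

```javascript
// Hypothetical helper: draw each predicted face onto a 2D canvas context.
// Assumes the canvas uses the same coordinate space as the input image.
function drawPredictions(ctx, predictions) {
  for (const face of predictions) {
    const [x1, y1] = face.topLeft;
    const [x2, y2] = face.bottomRight;
    // Bounding box: strokeRect takes x, y, width, height.
    ctx.strokeRect(x1, y1, x2 - x1, y2 - y1);
    // Landmarks: a small 4x4 square centred on each point.
    for (const [x, y] of face.landmarks) {
      ctx.fillRect(x - 2, y - 2, 4, 4);
    }
  }
}
```

In a page, this might be called as `drawPredictions(canvasEl.getContext("2d"), predictions)` after each call to `estimateFaces`, where `canvasEl` is an assumed `<canvas>` element layered over the video.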

The topLeft and bottomRight coordinates define a “bounding box”, but I don’t know exactly what it’s supposed to bound. Certainly, it seems to always contain the six detected landmarks, but not precisely. My eventual goal is to draw a boundary around the head; the default bounding box is not necessarily helpful for this.
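One way to test the "always contains the landmarks" observation is to compare the reported box with the tightest box around the landmarks. The sketch below does that with two hypothetical helpers, `landmarkBounds` and `boxContainsLandmarks`; neither is part of the BlazeFace library, and both assume a prediction object shaped like the one logged above.

```javascript
// Hypothetical helper: smallest axis-aligned box containing every landmark of one face.
function landmarkBounds(face) {
  const xs = face.landmarks.map(([x]) => x);
  const ys = face.landmarks.map(([, y]) => y);
  return {
    topLeft: [Math.min(...xs), Math.min(...ys)],
    bottomRight: [Math.max(...xs), Math.max(...ys)],
  };
}

// Hypothetical helper: true if every landmark lies inside the reported bounding box.
function boxContainsLandmarks(face) {
  const [x1, y1] = face.topLeft;
  const [x2, y2] = face.bottomRight;
  return face.landmarks.every(
    ([x, y]) => x >= x1 && x <= x2 && y >= y1 && y <= y2
  );
}
```

Logging `boxContainsLandmarks(face)` for each frame would show whether the containment ever fails, and `landmarkBounds(face)` gives a starting rectangle that could be padded outwards when trying to bound the whole head.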

Under certain conditions, BlazeFace consistently recognized a face in my forehead, and was extremely confident about it:

“Probably one of my weirder debugging sessions” pic.twitter.com/AubMIkM1kI (Jim Fisher, @MrJamesFisher, September 20, 2020)

The bug seems to only happen when I use my high-resolution webcam feed. BlazeFace performs much more reliably with a lower-resolution webcam feed. This is very strange, because I believe the library resizes the input to 128×128 pixels before analyzing it. I’ll do a future post on the internals of this library, and how to use TensorFlow.js. This should help understand the weird forehead bug.
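Until the cause is understood, one workaround is to ask the browser for a lower-resolution feed in the first place. This sketch uses the standard `getUserMedia` constraints syntax; the helper names and the 320×240 default are my own illustrative choices, not values taken from the library. Note that `ideal` is only a hint, so the browser may still deliver a different resolution.

```javascript
// Hypothetical helper: build getUserMedia constraints asking for a low-resolution feed.
// 320x240 is an arbitrary illustrative default, not a value from BlazeFace.
function lowResVideoConstraints(width = 320, height = 240) {
  return {
    audio: false,
    video: { width: { ideal: width }, height: { ideal: height } },
  };
}

// Browser-only: attach a low-resolution webcam stream to a <video> element.
async function startLowResWebcam(videoEl) {
  const stream = await navigator.mediaDevices.getUserMedia(
    lowResVideoConstraints()
  );
  videoEl.srcObject = stream;
  await videoEl.play();
  return videoEl;
}
```

The returned video element can then be passed to `model.estimateFaces` as before. This only avoids the conditions that triggered the bug for me; it doesn't explain it.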

Tagged #programming, #web, #tensorflow, #ml. All content copyright James Fisher 2020. This post is not associated with my employer.