BlazeFace hello world
BlazeFace is a neural network model that detects faces in images. It’s designed to be fast enough to run at 30fps on mobile GPUs. There is a TensorFlow.js library for BlazeFace, which downloads the model, runs it in WebGL using TensorFlow.js, and wraps the raw model input/output with a friendly, semantic API. The demo on this page captures and displays your webcam, runs BlazeFace against frames as often as possible, and draws the detected face landmarks on top of your webcam stream.
Here’s what I get when I run it against my own face:
Basic usage of this API comes down to one function call, a pure function from input image to predicted faces:
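<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@2.4"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/blazeface@0.0.5"></script>
<script>
  (async () => { // `await` needs an async function in a classic script
    const model = await blazeface.load();
    // webcamVideoEl: a <video> element showing the webcam; false = return plain arrays, not tensors
    const predictions = await model.estimateFaces(webcamVideoEl, false);
    console.log(predictions);
  })();
</script>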
This will log an array looking something like:
[
  { // One detected face
    topLeft: [162.84341430664062, 153.98446655273438], // [x,y] coordinates
    bottomRight: [422.3966979980469, 348.6485900878906],
    landmarks: [
      [238.96787643432617, 204.8737621307373], // right eye
      [328.2145833969116, 205.4714870452881], // left eye
      [273.6716037988663, 252.84512042999268], // nose
      [280.01041051000357, 293.8540989160538], // mouth
      [206.6525173187256, 226.03596210479736], // right ear
      [386.57989501953125, 226.02698922157288] // left ear
    ],
    probability: [0.9997807145118713] // Always a one-element array; a bit odd
  }
]
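Given predictions of this shape, drawing them on top of the webcam stream could look roughly like the sketch below. This is not the code behind the demo: `drawPredictions` is a name I've made up for illustration, and `ctx` is assumed to be a canvas 2D context (anything with `strokeRect` and `fillRect` will do) laid over the video at the same size.

```javascript
// Draw one frame's predictions onto a 2D canvas context.
// Hypothetical helper: `ctx` is assumed to be a CanvasRenderingContext2D,
// and `predictions` an array shaped like the logged output above.
function drawPredictions(ctx, predictions) {
  for (const face of predictions) {
    const [x1, y1] = face.topLeft;
    const [x2, y2] = face.bottomRight;

    // Outline the reported bounding box (x, y, width, height).
    ctx.strokeRect(x1, y1, x2 - x1, y2 - y1);

    // Mark each landmark with a 4x4 square centered on the point.
    for (const [x, y] of face.landmarks) {
      ctx.fillRect(x - 2, y - 2, 4, 4);
    }
  }
}
```

In a browser you would call this once per frame, after clearing the canvas, with the result of `model.estimateFaces(...)`.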
The topLeft and bottomRight coordinates define a “bounding box”, but I don’t know exactly what it’s supposed to bound. It seems to always contain the six detected landmarks, but it doesn’t fit them tightly. My eventual goal is to draw a boundary around the head; the default bounding box is not necessarily helpful for this.
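To check that observation against real predictions, here is a small sketch that computes the tightest box around the six landmarks and tests whether the reported box contains them all. The function names are my own, not part of the BlazeFace API, and the code assumes a prediction with plain-array coordinates (that is, `returnTensors` set to `false`).

```javascript
// Smallest axis-aligned box containing all of a face's landmarks.
// Hypothetical helper; returns the same topLeft/bottomRight shape as a prediction.
function landmarkBounds(face) {
  const xs = face.landmarks.map(([x]) => x);
  const ys = face.landmarks.map(([, y]) => y);
  return {
    topLeft: [Math.min(...xs), Math.min(...ys)],
    bottomRight: [Math.max(...xs), Math.max(...ys)],
  };
}

// True if the reported bounding box contains every landmark.
function boxContainsLandmarks(face) {
  const [x1, y1] = face.topLeft;
  const [x2, y2] = face.bottomRight;
  return face.landmarks.every(
    ([x, y]) => x >= x1 && x <= x2 && y >= y1 && y <= y2
  );
}
```

For the sample prediction above, the landmarks span roughly 207 to 387 horizontally and 205 to 294 vertically, while the reported box spans roughly 163 to 422 and 154 to 349. So the box contains the landmarks with a margin on every side, but neither box is obviously "the head".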
Under certain conditions, BlazeFace consistently recognized a face in my forehead, and was extremely confident about it:
Probably one of my weirder debugging sessions pic.twitter.com/AubMIkM1kI
— Jim Fisher (@MrJamesFisher) September 20, 2020
The bug seems to only happen when I use my high-resolution webcam feed. BlazeFace performs much more reliably with a lower-resolution webcam feed. This is very strange, because I believe the library resizes the input to 128×128 pixels before analyzing it. I’ll do a future post on the internals of this library, and how to use TensorFlow.js. This should help understand the weird forehead bug.
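In the meantime, since the lower-resolution feed behaves better for me, one workaround is to ask the browser for a smaller video in the first place. The sketch below uses the standard `getUserMedia` constraints; the helper names and the 640×480 default are arbitrary choices of mine for illustration, not anything BlazeFace requires, and I haven't confirmed that this avoids the forehead bug in general.

```javascript
// Build getUserMedia constraints asking for a low-resolution video feed.
// Hypothetical helper; 640x480 is an arbitrary illustrative default.
function lowResConstraints(width = 640, height = 480) {
  return {
    audio: false,
    video: { width: { ideal: width }, height: { ideal: height } },
  };
}

// Browser-only: open the webcam with those constraints and attach it
// to a <video> element, which can then be passed to estimateFaces.
async function openLowResWebcam(videoEl) {
  const stream = await navigator.mediaDevices.getUserMedia(lowResConstraints());
  videoEl.srcObject = stream;
  await videoEl.play();
  return stream;
}
```

Note that `ideal` is only a hint: the browser may still hand back a different resolution, so it's worth checking `videoEl.videoWidth` and `videoEl.videoHeight` once the video is playing.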
All content copyright James Fisher 2020. This post is not associated with my employer.