Using BodyPix segmentation in a WebGL shader

In the previous post I showed how to run BodyPix on a video stream, displaying the segmentation using the library’s convenience functions. But if you want to use the segmentation as part of your WebGL rendering pipeline, you need to access the segmentation from your shader. In this post, I demo a pixel shader that sets the alpha channel of a canvas based on a BodyPix segmentation. The demo shows your webcam feed in the bottom-right corner of this page with alpha-transparency taken from BodyPix.

A call to net.segmentPerson returns something like this:

{
  allPoses: [...],
  data: Uint8Array(307200) [...],
  height: 480,
  width: 640,
}

There is one byte for each pixel: note 640*480 == 307200. These are in row-major order, so pixel (x,y) is at y*640 + x, where (0,0) is the top-left of the image. For example, here’s a silly debugging function that renders the segmentation in the console:

// Print a coarse ASCII rendering of the mask: "X" for person pixels, " " for background.
function renderSegmentation(segmentation) {
  let s = "";
  const xStride = Math.max(1, Math.floor(segmentation.width/30));   // ~30 wide
  const yStride = xStride*2; // chars are ~twice as tall as they are wide
  for (let y = 0; y < segmentation.height; y += yStride) {
    for (let x = 0; x < segmentation.width; x += xStride) {
      // Row-major lookup: one sampled pixel per output character
      s += segmentation.data[segmentation.width*y + x] === 1 ? "X" : " ";
    }
    s += "\n";
  }
  console.log(s);
}

It will give you output like this if you wave at the camera:

            XXXXX    XX        
           XXXXXXX  XXXXX      
           XXXXXXX  XXXXXX     
           XXXXXXX   XXXXX     
            XXXXX     XXXXX    
           XXXXXX      XXXXX   
       XXXXXXXXXXXXXXX   XXXX  
   XXXXXXXXXXXXXXXXXXXXX XXXXX 
  XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
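
The same row-major lookup works for more than debugging output. As a minimal sketch, the two helpers below test a single pixel and find the bounding box of everything marked as a person. The names `isPerson` and `personBounds` are made up for illustration and are not part of BodyPix; the code assumes only the `data`, `width` and `height` fields shown above.

```javascript
// Hypothetical helpers; they rely only on the {data, width, height} fields.

// Returns true if pixel (x, y) is marked as part of a person.
function isPerson(segmentation, x, y) {
  return segmentation.data[y * segmentation.width + x] === 1;
}

// Returns the bounding box of all person pixels, or null if there are none.
function personBounds(segmentation) {
  let minX = Infinity, minY = Infinity, maxX = -1, maxY = -1;
  for (let y = 0; y < segmentation.height; y++) {
    for (let x = 0; x < segmentation.width; x++) {
      if (!isPerson(segmentation, x, y)) continue;
      if (x < minX) minX = x;
      if (x > maxX) maxX = x;
      if (y < minY) minY = y;
      if (y > maxY) maxY = y;
    }
  }
  return maxX < 0 ? null : { minX, minY, maxX, maxY };
}
```

A bounding box like this could, for instance, be used to crop the frame before further processing.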

To access this data in a WebGL shader, we need to get it into a texture using gl.texImage2D. When you pass an array to gl.texImage2D, you tell it which format to interpret it as. One possible format is gl.ALPHA, which has one byte per pixel -- the same as the format given to us by BodyPix. This byte is interpreted as the alpha channel when the texture is accessed by a shader. Here’s how to load the segmentation data into a texture:

gl.texImage2D(
  gl.TEXTURE_2D,        // target 
  0,                    // level
  gl.ALPHA,             // internalformat
  segmentation.width,   // width
  segmentation.height,  // height
  0,                    // border, "Must be 0"
  gl.ALPHA,             // format, "must be the same as internalformat"
  gl.UNSIGNED_BYTE,     // type of data below
  segmentation.data     // pixels
);
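
The call above assumes a texture is already bound to gl.TEXTURE_2D. Here is a rough sketch of the surrounding setup, wrapped in a hypothetical `uploadMask` helper (the name and parameters are mine, not from BodyPix or WebGL). It assumes a WebGL 1 context, where a non-power-of-two texture such as 640x480 is only usable with clamp-to-edge wrapping and a non-mipmap minification filter. It also sets the unpack alignment to 1, which makes no difference for a width of 640 but matters for widths that are not a multiple of 4.

```javascript
// Sketch: upload a BodyPix mask into an existing texture.
// `uploadMask` is a hypothetical helper; `gl` is a WebGLRenderingContext
// and `texture` comes from gl.createTexture().
function uploadMask(gl, texture, segmentation) {
  gl.bindTexture(gl.TEXTURE_2D, texture);

  // Rows are tightly packed at one byte per pixel; the default alignment is 4.
  gl.pixelStorei(gl.UNPACK_ALIGNMENT, 1);

  gl.texImage2D(
    gl.TEXTURE_2D, 0, gl.ALPHA,
    segmentation.width, segmentation.height, 0,
    gl.ALPHA, gl.UNSIGNED_BYTE, segmentation.data
  );

  // Required in WebGL 1 for non-power-of-two textures.
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
}
```

You would call this once per frame, after each new segmentation arrives. gl.NEAREST gives a hard-edged mask; gl.LINEAR is also allowed here if you prefer the edges interpolated.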

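Unfortunately, the byte values given by BodyPix are 0 and 1, rather than the ideal 0 and 255. WebGL normalizes each byte to the range [0, 1] by dividing by 255, so a mask byte of 1 reaches the shader as an alpha of 1/255. We can correct for this by multiplying by 255 in our fragment shader: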

precision mediump float;

uniform sampler2D frame;
uniform sampler2D mask;

uniform float texWidth;
uniform float texHeight;

void main(void) {
  vec2 texCoord = vec2(gl_FragCoord.x/texWidth, 1.0 - (gl_FragCoord.y/texHeight));
  gl_FragColor = vec4(texture2D(frame, texCoord).rgb, texture2D(mask, texCoord).a * 255.);
}
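
The shader reads two textures and two size uniforms, so the JavaScript side has to wire those up before drawing. The following is a sketch under some assumptions: `bindMaskUniforms` and its parameter names are hypothetical, the program has already been compiled and linked, and the viewport covers the whole canvas, so that gl_FragCoord divided by the drawing buffer size lands in [0, 1].

```javascript
// Sketch: point the shader's uniforms at the two textures and the canvas size.
// `bindMaskUniforms` and its parameters are hypothetical names.
function bindMaskUniforms(gl, program, frameTexture, maskTexture) {
  gl.useProgram(program);

  // Texture unit 0 holds the video frame.
  gl.activeTexture(gl.TEXTURE0);
  gl.bindTexture(gl.TEXTURE_2D, frameTexture);
  gl.uniform1i(gl.getUniformLocation(program, "frame"), 0);

  // Texture unit 1 holds the BodyPix mask.
  gl.activeTexture(gl.TEXTURE1);
  gl.bindTexture(gl.TEXTURE_2D, maskTexture);
  gl.uniform1i(gl.getUniformLocation(program, "mask"), 1);

  // gl_FragCoord is measured in canvas pixels, so pass the drawing buffer size.
  gl.uniform1f(gl.getUniformLocation(program, "texWidth"), gl.drawingBufferWidth);
  gl.uniform1f(gl.getUniformLocation(program, "texHeight"), gl.drawingBufferHeight);
}
```

Each sampler uniform is set to the index of a texture unit, not to the texture object itself, which is why the frame and the mask are bound to separate units.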

Here’s what I get when I run the demo against my own webcam feed:

As you can see, BodyPix still has a number of quality issues. In priority order:

  1. BodyPix doesn’t realize my body extends beyond the bottom of the image. It might be possible to improve this by fudging the input or output.
  2. It’s really bad at recognizing fingers. It might be possible to improve this by running Handpose on the detected palms.

Tagged #programming, #web, #webgl, #ml. All content copyright James Fisher 2020. This post is not associated with my employer.