Video Depth Maps Tutorial for iOS: Getting Started

In this iOS video depth maps tutorial, you’ll harness iOS 13’s video depth maps to apply real-time video filters and create a special effects masterpiece! By Owen L Brown.


Video Resolutions And Frame Rates

There are a couple of things you should know about the depth data you’re capturing. It’s a lot of work for your iPhone to correlate the pixels between the two cameras and calculate the disparity.

Note: Confused by that last sentence? Check out the Image Depth Maps Tutorial for iOS: Getting Started. It has a great explanation in the section, How Does The iPhone Do This?

To provide you with the best real-time data it can, the iPhone limits the resolutions and frame rates of the depth data it returns.

For instance, the maximum amount of depth data you can receive on an iPhone 7 Plus is 320 x 240 at 24 frames per second. The iPhone X is capable of delivering that data at 30 fps.
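If you're curious which combinations your own device supports, you can query them at runtime. Here's a hedged sketch: `logDepthFormats(for:)` is a hypothetical helper, not part of the starter project, and it assumes you pass in the same `AVCaptureDevice` the tutorial uses for the camera.

```swift
import AVFoundation
import CoreMedia

// Hypothetical helper (not in the starter project): print every depth
// format the current video format supports, with its maximum frame rate.
func logDepthFormats(for camera: AVCaptureDevice) {
  for format in camera.activeFormat.supportedDepthDataFormats {
    let dims = CMVideoFormatDescriptionGetDimensions(format.formatDescription)
    let maxFPS = format.videoSupportedFrameRateRanges.first?.maxFrameRate ?? 0
    print("\(dims.width) x \(dims.height) @ \(maxFPS) fps")
  }
}
```

On an iPhone 7 Plus, you'd expect to see 320 x 240 at 24 fps among the results.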

AVCaptureDevice doesn’t let you set the depth frame rate independently of the video frame rate. Depth data must be delivered at the same frame rate or at an even fraction of the video frame rate. Otherwise, you could end up with moments where you have depth data but no matching video frame, which would be strange.

Because of this, you need to:

  1. Set your video frame rate to ensure the maximum possible depth data frame rate.
  2. Determine the scale factor between your video data and your depth data. The scale factor is important when you start creating masks and filters.

Time to make your code better!

Again in DepthVideoViewController.swift, add the following to the bottom of configureCaptureSession:

// 1
let outputRect = CGRect(x: 0, y: 0, width: 1, height: 1)
let videoRect = videoOutput
  .outputRectConverted(fromMetadataOutputRect: outputRect)
let depthRect = depthOutput
  .outputRectConverted(fromMetadataOutputRect: outputRect)

// 2
scale =
  max(videoRect.width, videoRect.height) /
  max(depthRect.width, depthRect.height)

// 3
do {
  try camera.lockForConfiguration()

  // 4
  if let format = camera.activeDepthDataFormat,
    let range = format.videoSupportedFrameRateRanges.first {
    camera.activeVideoMinFrameDuration = range.minFrameDuration
  }

  // 5
  camera.unlockForConfiguration()
} catch {
  fatalError(error.localizedDescription)
}

Here’s the breakdown:

  1. First, calculate CGRects that define the video and depth outputs in pixels. The conversion methods map the full metadata output rect — a unit rectangle — to the full resolution of the video and depth outputs.
  2. Using the CGRect for both video and data output, you calculate the scaling factor between them. You take the maximum of the dimension because the depth data is actually delivered rotated by 90 degrees.
  3. While you’re changing the AVCaptureDevice configuration, you need to lock it. That can throw an error.
  4. Then, set the AVCaptureDevice‘s minimum frame duration, which is the inverse of the maximum frame rate, to be equal to the supported frame rate of the depth data.
  5. Finally, unlock the configuration you locked in step three.
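To make the relationship in step four concrete, here's a small sketch of how a frame duration maps to a frame rate. The numbers assume the 24 fps depth limit of the iPhone 7 Plus mentioned earlier:

```swift
import CoreMedia

// A frame duration is the reciprocal of a frame rate.
// 24 fps corresponds to a duration of 1/24 of a second:
let minFrameDuration = CMTime(value: 1, timescale: 24)

// Recovering the rate from the duration:
let maxFrameRate = Double(minFrameDuration.timescale)
  / Double(minFrameDuration.value)  // 24.0
```

So setting `activeVideoMinFrameDuration` to the depth format's `minFrameDuration` caps the video frame rate at the depth data's maximum frame rate.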

Build and run the project. Whether or not you see a difference, your code is now more robust and future-proof. :]

What Can You Do With This Depth Data?

Well, much like in Image Depth Maps Tutorial for iOS: Getting Started, you can use this depth data to create a mask. You can use the mask to filter the original video feed.

The mask is a black and white image with values from 0 to 1. To filter the video input, you’ll blend each video frame with a filtered CIImage, according to the mask. The blending works by multiplying each pixel of the filtered image with the mask pixel at the same location. If the mask’s pixel value is 0.0, the resulting pixel isn’t filtered. If it’s 1.0, that pixel is completely filtered.
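As a quick sanity check of that blend, here's the per-pixel arithmetic spelled out for a single value. This is purely illustrative — Core Image performs this across the whole image on the GPU:

```swift
// Per-pixel blend, conceptually what CIBlendWithMask computes:
// result = filtered * mask + original * (1 - mask)
let original: Float = 0.8  // pixel from the unfiltered video frame
let filtered: Float = 0.2  // the same pixel after the filter
let mask: Float = 0.75     // mask value at that location

let result = filtered * mask + original * (1 - mask)
// With a mask of 0.75, the result (0.35) leans toward the filtered value.
```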

You may have noticed a slider at the bottom of the screen for the Mask and Filtered segments. This slider controls the depth focus of the mask.

Currently, that slider seems to do nothing. That’s because there’s no visualization of the mask on the screen. You’re going to change that now!

Go back to depthDataOutput(_:didOutput:timestamp:connection:) in the AVCaptureDepthDataOutputDelegate extension. Just before DispatchQueue.main.async, add the following:

if previewMode == .mask || previewMode == .filtered {
  switch filter {
  default:
    mask = depthFilters.createHighPassMask(
      for: depthMap,
      withFocus: sliderValue,
      andScale: scale)
  }
}

First, only create a mask if the Mask or the Filtered segments are active. Then, switch on the type of filter selected. You’ll find those at the top of the iPhone screen. For now, create a high pass mask as the default case. You’ll fill out other cases soon.

Note: The starter project includes a high pass and a band-pass mask. These are similar to the ones created in Image Depth Maps Tutorial for iOS: Getting Started under the section Creating a Mask.
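If you want a mental model of what `createHighPassMask(for:withFocus:andScale:)` produces, here's a hedged, scalar sketch. The real implementation in the starter project operates on the whole depth map with Core Image; the function name and `slope` value below are made up for illustration:

```swift
// Illustrative only: computes one mask value from one depth value.
// Depths past the focus ramp toward 1 (fully filtered);
// depths before it ramp toward 0 (unfiltered).
func highPassMaskValue(depth: Float, focus: Float, slope: Float = 4.0) -> Float {
  let ramp = slope * (depth - focus)
  return min(max(ramp, 0), 1)  // clamp to [0, 1]
}
```

Moving the slider changes `focus`, which shifts where the black-to-white transition falls in the scene.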

You still need to hook the mask up to the image view to see it. Go back to the AVCaptureVideoDataOutputSampleBufferDelegate extension and look for the switch statement in captureOutput(_:didOutput:from:). Add the following case:

case (.mask, _, let mask?):
  previewImage = mask

Build and run the project. Tap the Mask segment.

[Screenshot: video depth maps high pass mask]

As you drag the slider to the left, more of the screen turns white. That’s because you implemented a high pass mask.

Good job! You laid the groundwork for the most exciting part of this tutorial: the filters!

Comic Background Effect

The iOS SDK comes bundled with a bunch of Core Image filters. One that particularly stands out is CIComicEffect. This filter gives an image a printed comic look.

[Screenshots: Core Image comic filter off and on]

You’re going to use this filter to turn the background of your video stream into a comic.

Open DepthImageFilters.swift. This class is where all your masks and filters go.

Add the following method to the DepthImageFilters class:

func comic(image: CIImage, mask: CIImage) -> CIImage {
  // 1
  let bg = image.applyingFilter("CIComicEffect")
  // 2
  let filtered = image.applyingFilter("CIBlendWithMask", parameters: [
    "inputBackgroundImage": bg,
    "inputMaskImage": mask
  ])
  // 3
  return filtered
}

To break it down:

  1. Apply the CIComicEffect to the input image.
  2. Then, blend the original image with the comic image, using the mask to decide which pixels come from which.
  3. Finally, return the filtered image.

Now, to use the filter, open DepthVideoViewController.swift, find captureOutput(_:didOutput:from:) and add the following case:

case (.filtered, .comic, let mask?):
  previewImage = depthFilters.comic(image: image, mask: mask)

Before you run the code, there’s one more thing you need to do to make adding future filters easier.

Find depthDataOutput(_:didOutput:timestamp:connection:) and add the following case to the switch filter statement:

case .comic:
  mask = depthFilters.createHighPassMask(
    for: depthMap,
    withFocus: sliderValue,
    andScale: scale)

Here, you create a high pass mask.

This looks exactly the same as the default case. You’ll remove the default case after you add the other filters, so it’s best to make sure the comic case is in there now.

Go ahead. I know you’re excited to run this. Build and run the project and tap the Filtered segment.

[Screenshot: build and run with the comic filter]

Fantastic work! Do you feel like a superhero in a comic book?