Vision Tutorial for iOS: What’s New With Face Detection?

Learn what’s new with Face Detection and how the latest additions to Vision framework can help you achieve better results in image segmentation and analysis. By Tom Elliott.

Building Better Backgrounds

The Hide Background button is above the debug button on the left-hand side of the camera controls. Toggling the button on does nothing — yet. :]

Open FaceDetector.swift. Find captureOutput(_:didOutput:from:). Underneath the declaration of detectCaptureQualityRequest, add a new person segmentation request:

// 1
let detectSegmentationRequest = VNGeneratePersonSegmentationRequest(completionHandler: detectedSegmentationRequest)
// 2
detectSegmentationRequest.qualityLevel = .balanced

Here, you:

  1. Create the segmentation request, which calls the detectedSegmentationRequest method upon completion.
  2. Set the quality level. A VNGeneratePersonSegmentationRequest offers three: accurate, balanced and fast. The faster the algorithm runs, the lower the quality of the mask it produces. Both fast and balanced run quickly enough to use with video data, while accurate is best suited to static images.
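As a rough illustration of that trade-off, the sketch below picks a quality level depending on whether the input is live video or a still image. The helper name makeSegmentationRequest(forLiveVideo:) is hypothetical and not part of the sample project. Only VNGeneratePersonSegmentationRequest and its qualityLevel property come from Vision.

```swift
import Vision

// Hypothetical helper, not part of the tutorial project.
@available(iOS 15.0, macOS 12.0, *)
func makeSegmentationRequest(forLiveVideo isLiveVideo: Bool) -> VNGeneratePersonSegmentationRequest {
  let request = VNGeneratePersonSegmentationRequest()
  // Assumption: trade mask quality for speed on video, favor quality on stills.
  request.qualityLevel = isLiveVideo ? .balanced : .accurate
  return request
}
```

If frames start to drop on an older device, switching the video case to .fast is the obvious next thing to try.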

Next, update the call to the sequenceHandler's perform(_:on:orientation:) method to include the new segmentation request:

[detectFaceRectanglesRequest, detectCaptureQualityRequest, detectSegmentationRequest],

Handling the Segmentation Request Result

Then add the following to detectedSegmentationRequest(request:error:):

// 1
guard
  let model = model,
  let results = request.results as? [VNPixelBufferObservation],
  let result = results.first,
  let currentFrameBuffer = currentFrameBuffer
else {
  return
}

// 2
if model.hideBackgroundModeEnabled {
  // 3
  let originalImage = CIImage(cvImageBuffer: currentFrameBuffer)
  let maskPixelBuffer = result.pixelBuffer
  let outputImage = removeBackgroundFrom(image: originalImage, using: maskPixelBuffer)
  viewDelegate?.draw(image: outputImage.oriented(.upMirrored))
} else {
  // 4
  let originalImage = CIImage(cvImageBuffer: currentFrameBuffer).oriented(.upMirrored)
  viewDelegate?.draw(image: originalImage)
}
In this code, you:

  1. Pull out the model, the first observation and the current frame buffer, or return early if any are nil.
  2. Query the model for the state of the background hiding mode.
  3. If hiding, create a Core Image representation of the original frame from the camera. Also, get the mask pixel buffer from the person segmentation result. Then, use those two to create an output image with the background removed.
  4. Otherwise, recreate the original image without change.

In either case, a delegate method on the view is then called to draw the image in the frame. This will use the Metal pipeline discussed in the previous section.

Removing the Background

Replace the implementation of removeBackgroundFrom(image:using:):

// 1
var maskImage = CIImage(cvPixelBuffer: maskPixelBuffer)

// 2
let originalImage = image.oriented(.right)

// 3
let scaleX = originalImage.extent.width / maskImage.extent.width
let scaleY = originalImage.extent.height / maskImage.extent.height
maskImage = maskImage.transformed(by: .init(scaleX: scaleX, y: scaleY)).oriented(.upMirrored)

// 4
let backgroundImage = CIImage(color: .white).clampedToExtent().cropped(to: originalImage.extent)

// 5
let blendFilter = CIFilter.blendWithRedMask()
blendFilter.inputImage = originalImage
blendFilter.backgroundImage = backgroundImage
blendFilter.maskImage = maskImage

// 6
if let outputImage = blendFilter.outputImage?.oriented(.left) {
  return outputImage
}

// 7
return originalImage

Here, you:

  1. Create a Core Image representation of the mask from the segmentation pixel buffer.
  2. Then, rotate the original image to the right. The segmentation mask is rotated by 90 degrees relative to the camera frame, so you need to align the image and the mask before blending.
  3. Similarly, the mask image isn't the same size as the video frame pulled straight from the camera. So, scale the mask image to fit.
  4. Next, create a pure-white image the same size as the original image. clampedToExtent() gives you an image with infinite width and height, which you then crop to the extent of the original image.
  5. Now comes the actual work. Create a Core Image filter that blends the original image with the all-white image. Use the segmentation mask image as the mask.
  6. Finally, rotate the output from the filter back to the left and return it.
  7. Or, return the original image if the blended image couldn't be created.
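To make the scaling in step 3 concrete, here is a small sketch of the arithmetic on its own, without any Core Image types. The helper maskScaleTransform(maskSize:frameSize:) and the sizes are assumptions chosen for illustration. Real mask and frame sizes depend on the device and the quality level.

```swift
import Foundation

// Hypothetical helper: builds the transform that stretches the mask to the frame size.
func maskScaleTransform(maskSize: CGSize, frameSize: CGSize) -> CGAffineTransform {
  let scaleX = frameSize.width / maskSize.width
  let scaleY = frameSize.height / maskSize.height
  return CGAffineTransform(scaleX: scaleX, y: scaleY)
}

// Assumed sizes for illustration only.
let frameSize = CGSize(width: 1080, height: 1920)
let maskSize = CGSize(width: 360, height: 480)
let transform = maskScaleTransform(maskSize: maskSize, frameSize: frameSize)
// transform.a holds the horizontal factor and transform.d the vertical one.
```

The two factors are computed independently because the mask doesn't necessarily share the frame's aspect ratio.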

Build and run. Toggle the Hide Background button on and off. Watch as the background around your body disappears.

Person segmentation

Saving the Picture

Your passport photo app is almost complete!

One more task remains: taking and saving a photo. Start by opening CameraViewModel.swift and adding a new published property underneath the isAcceptableQuality property declaration:

@Published private(set) var passportPhoto: UIImage?

passportPhoto is an optional UIImage that represents the last photo taken. It's nil before the first photo is taken.

Next, add two more actions as cases to the CameraViewModelAction enum:

case takePhoto
case savePhoto(UIImage)

The first action runs when the user presses the shutter button. The second runs after the image has been processed and is ready to save to the camera roll.

Next, add handlers for the new actions to the end of the switch statement in perform(action:):

case .takePhoto:
  takePhoto()
case .savePhoto(let image):
  savePhoto(image)

Then, add the implementation to the takePhoto() method. This one is very simple:

shutterReleased.send()

shutterReleased is a Combine PassthroughSubject that publishes a void value. Any part of the app holding a reference to the view model can subscribe to an event of the user releasing the shutter.
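The following self-contained sketch shows the same publish and subscribe pattern in isolation. ShutterModel and ShutterListener are hypothetical stand-ins for the view model and a subscriber, and the Error failure type is an assumption. Only the shutterReleased name mirrors the project.

```swift
import Combine

// Hypothetical stand-in for the view model; only shutterReleased mirrors the tutorial.
final class ShutterModel {
  // Assumption: the failure type is Error.
  let shutterReleased = PassthroughSubject<Void, Error>()

  func takePhoto() {
    // Output is Void, so send() takes no argument.
    shutterReleased.send()
  }
}

// Hypothetical subscriber that counts shutter releases.
final class ShutterListener {
  private(set) var releaseCount = 0
  private var subscriptions = Set<AnyCancellable>()

  func observe(_ model: ShutterModel) {
    model.shutterReleased
      .sink(
        receiveCompletion: { _ in },
        receiveValue: { [weak self] _ in self?.releaseCount += 1 }
      )
      .store(in: &subscriptions)
  }
}

let shutterModel = ShutterModel()
let listener = ShutterListener()
listener.observe(shutterModel)
shutterModel.takePhoto()
shutterModel.takePhoto()
// The listener's releaseCount now reflects both releases.
```

A PassthroughSubject doesn't replay past values, so a subscriber only sees releases that happen after it starts observing.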

Add the implementation of savePhoto(_:), which is nearly as simple:

// 1
UIImageWriteToSavedPhotosAlbum(photo, nil, nil, nil)
// 2
DispatchQueue.main.async { [self] in
  // 3
  passportPhoto = photo
}

Here, you:

  1. Write the provided UIImage to the photo album on your phone.
  2. Dispatch to the main thread, as required for any change that updates the UI.
  3. Set the current passport photo to the photo passed into the method.
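The call above passes nil for the completion target and selector, so a failed save goes unnoticed. If you want to know whether the write succeeded, one option is sketched below. PhotoSaver is a hypothetical helper, not part of the sample project. The callback signature is the one UIKit documents for UIImageWriteToSavedPhotosAlbum.

```swift
import UIKit

// Hypothetical helper, not part of the tutorial project.
final class PhotoSaver: NSObject {
  private(set) var lastError: Error?

  func save(_ photo: UIImage) {
    UIImageWriteToSavedPhotosAlbum(
      photo,
      self,
      #selector(image(_:didFinishSavingWithError:contextInfo:)),
      nil
    )
  }

  // UIKit calls this selector once the write finishes.
  @objc private func image(
    _ image: UIImage,
    didFinishSavingWithError error: Error?,
    contextInfo: UnsafeRawPointer
  ) {
    lastError = error
    if let error = error {
      print("Saving failed: \(error.localizedDescription)")
    }
  }
}
```

Writing to the photo library also requires an NSPhotoLibraryAddUsageDescription entry in the app's Info.plist.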

Next, open CameraControlsFooterView.swift and wire up the controls. Replace the print("TODO") in the ShutterButton action closure with the following:

model.perform(action: .takePhoto)

This tells the view model to perform the shutter release.

Then, update the ThumbnailView to show the passport photo by passing it from the model:

ThumbnailView(passportPhoto: model.passportPhoto)

Finally, open FaceDetector.swift and make the necessary changes to capture and process the photo data. First, add a new property to the class after defining the currentFrameBuffer property:

var isCapturingPhoto = false

This flag indicates that the next frame should capture a photo. You set this whenever the view model's shutterReleased property publishes a value.
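A minimal sketch of how such a flag might be consumed is shown here. FrameProcessor and its frame indices are hypothetical, since the real detector works on camera sample buffers. Resetting the flag after a single frame is an assumption about the intended behavior.

```swift
// Hypothetical sketch; the real FaceDetector works on camera sample buffers.
final class FrameProcessor {
  var isCapturingPhoto = false
  private(set) var capturedFrameIndices: [Int] = []

  func process(frameIndex: Int) {
    guard isCapturingPhoto else { return }
    // Reset first so only one frame is captured per shutter release.
    isCapturingPhoto = false
    capturedFrameIndices.append(frameIndex)
  }
}

let processor = FrameProcessor()
processor.process(frameIndex: 0)
processor.isCapturingPhoto = true
processor.process(frameIndex: 1)
processor.process(frameIndex: 2)
// Only the first frame processed after the flag was set is captured.
```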

Find the weak var model: CameraViewModel? property and update it like so:

weak var model: CameraViewModel? {
  didSet {
    // 1
    model?.shutterReleased.sink { completion in
      switch completion {
      case .finished:
        return
      case .failure(let error):
        print("Received error: \(error)")
      }
    } receiveValue: { _ in
      // 2
      self.isCapturingPhoto = true
    }
    .store(in: &subscriptions)
  }
}

Here, you:

  1. Observe updates to the model's shutterReleased property after it's set.
  2. Set the isCapturingPhoto property to true when the shutter gets released.