Vision Tutorial for iOS: What’s New With Face Detection?

Learn what’s new with Face Detection and how the latest additions to Vision framework can help you achieve better results in image segmentation and analysis. By Tom Elliott.


Processing Faces

Next, update processUpdatedFaceGeometry() by replacing the faceFound case with the following:

case .faceFound(let faceGeometryModel):
  let roll = faceGeometryModel.roll.doubleValue
  let pitch = faceGeometryModel.pitch.doubleValue
  let yaw = faceGeometryModel.yaw.doubleValue
  updateAcceptableRollPitchYaw(using: roll, pitch: pitch, yaw: yaw)

Here you pull the roll, pitch and yaw values for the detected face as doubles from the faceGeometryModel. You then pass these values to updateAcceptableRollPitchYaw(using:pitch:yaw:).

Now add the following into the implementation stub of updateAcceptableRollPitchYaw(using:pitch:yaw:):

isAcceptableRoll = (roll > 1.2 && roll < 1.6)
isAcceptablePitch = abs(CGFloat(pitch)) < 0.2
isAcceptableYaw = abs(CGFloat(yaw)) < 0.15

Here, you set the state for acceptable roll, pitch and yaw based on the values pulled from the face.
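The thresholds above are all in radians, which can be hard to reason about at a glance. A quick, standalone conversion (not part of the sample project) shows why the roll range of 1.2 to 1.6 centers near an upright head: π/2 radians is 90 degrees, which is the expected roll for a face in a portrait-oriented camera feed.

```swift
import Foundation

// Convert radians to degrees to build intuition for the thresholds.
func degrees(fromRadians radians: Double) -> Double {
  radians * 180.0 / .pi
}

let minRoll = degrees(fromRadians: 1.2)  // ≈ 68.8°
let maxRoll = degrees(fromRadians: 1.6)  // ≈ 91.7°
let maxPitch = degrees(fromRadians: 0.2) // ≈ 11.5°
let maxYaw = degrees(fromRadians: 0.15)  // ≈ 8.6°
```

So a face passes only when the head is close to upright (roll near 90 degrees) and tilted or turned away from the camera by no more than roughly 9 to 12 degrees.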

Finally, replace calculateDetectedFaceValidity() to use the roll, pitch and yaw values to determine if the face is valid:

hasDetectedValidFace =
  isAcceptableRoll &&
  isAcceptablePitch &&
  isAcceptableYaw

Now, open FaceDetector.swift. In detectedFaceRectangles(request:error:), replace the definition of faceObservationModel with the following:

let faceObservationModel = FaceGeometryModel(
  boundingBox: convertedBoundingBox,
  roll: result.roll ?? 0,
  pitch: result.pitch ?? 0,
  yaw: result.yaw ?? 0
)

This simply adds the now-required roll, pitch and yaw parameters to the initialization of the FaceGeometryModel object.

Debug those Faces

It would be nice to add some information about the roll, pitch and yaw to the debug view so that you can see the values as you're using the app.

Open DebugView.swift and replace the DebugSection in the body declaration with:

DebugSection(observation: model.faceGeometryState) { geometryModel in
  DebugText("R: \(geometryModel.roll)")
    .debugTextStatus(status: model.isAcceptableRoll ? .passing : .failing)
  DebugText("P: \(geometryModel.pitch)")
    .debugTextStatus(status: model.isAcceptablePitch ? .passing : .failing)
  DebugText("Y: \(geometryModel.yaw)")
    .debugTextStatus(status: model.isAcceptableYaw ? .passing : .failing)
}

This updates the debug text to print the current values to the screen and sets the text color based on whether each value is acceptable.

Build and run.

Look straight at the camera and note how the oval is green. Now rotate your head from side to side and note how the oval turns red when you aren't looking directly at the camera. If you have debug mode turned on, notice how the yaw number changes both value and color as well.

The oval is green when the user is facing forward

The oval is red when the user is not facing forward

Selecting a Size

Next, you want the app to detect how big or small a face is within the frame of the photo. Open CameraViewModel.swift and add the following property under isAcceptableYaw declaration:

@Published private(set) var isAcceptableBounds: FaceBoundsState {
  didSet {
    calculateDetectedFaceValidity()
  }
}

Then, set the initial value for this property at the bottom of init():

isAcceptableBounds = .unknown

As before, add the following to the end of invalidateFaceGeometryState():

isAcceptableBounds = .unknown

Next, in processUpdatedFaceGeometry(), add the following to the end of the faceFound case:

let boundingBox = faceGeometryModel.boundingBox
updateAcceptableBounds(using: boundingBox)

Then fill in the stub of updateAcceptableBounds(using:) with the following code:

// 1
if boundingBox.width > 1.2 * faceLayoutGuideFrame.width {
  isAcceptableBounds = .detectedFaceTooLarge
} else if boundingBox.width * 1.2 < faceLayoutGuideFrame.width {
  isAcceptableBounds = .detectedFaceTooSmall
} else {
  // 2
  if abs(boundingBox.midX - faceLayoutGuideFrame.midX) > 50 {
    isAcceptableBounds = .detectedFaceOffCentre
  } else if abs(boundingBox.midY - faceLayoutGuideFrame.midY) > 50 {
    isAcceptableBounds = .detectedFaceOffCentre
  } else {
    isAcceptableBounds = .detectedFaceAppropriateSizeAndPosition
  }
}

With this code, you:

  1. First check to see if the bounding box of the face is roughly the same width as the layout guide.
  2. Then, check if the bounding box of the face is roughly centered in the frame.

If both these checks pass, isAcceptableBounds is set to FaceBoundsState.detectedFaceAppropriateSizeAndPosition. Otherwise, it is set to the corresponding error case.
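Note that this comparison assumes boundingBox is already in the same coordinate space as faceLayoutGuideFrame. Vision reports face bounding boxes normalized to the range 0 to 1 with a lower-left origin, so somewhere upstream (the convertedBoundingBox you saw in FaceDetector.swift) that rectangle has to be scaled and vertically flipped into view coordinates. A minimal sketch of such a conversion, assuming a helper of our own devising rather than the sample project's exact implementation:

```swift
import Foundation

// Hypothetical helper: maps a Vision-normalized bounding box (values in
// 0...1, lower-left origin) into a view's coordinate space (points,
// upper-left origin). The sample project may perform this differently,
// e.g. via the preview layer's conversion methods.
func convertNormalizedBoundingBox(_ box: CGRect,
                                  toViewOfSize size: CGSize) -> CGRect {
  CGRect(
    x: box.minX * size.width,
    y: (1 - box.maxY) * size.height, // flip the y-axis
    width: box.width * size.width,
    height: box.height * size.height)
}
```

For example, a normalized box of (0.25, 0.5, 0.5, 0.25) in a 400×800 view maps to (100, 200, 200, 200).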

Finally, update calculateDetectedFaceValidity() to look like this:

hasDetectedValidFace =
  isAcceptableBounds == .detectedFaceAppropriateSizeAndPosition &&
  isAcceptableRoll &&
  isAcceptablePitch &&
  isAcceptableYaw

This adds a check that the bounds are acceptable.

Build and run. Move the phone toward and away from your face and note how the oval changes color.

The user is too far away

The user is properly distanced

Detecting Differences

Currently, the FaceDetector is detecting face rectangles using VNDetectFaceRectanglesRequestRevision2. iOS 15 introduced a new revision, VNDetectFaceRectanglesRequestRevision3. So what's the difference?

Version 3 provides many useful updates for detecting face rectangles, including:

  1. The pitch of the detected face is now determined. You may not have noticed, but the value for the pitch so far was always 0 because it wasn't present in the face observation.
  2. Roll, pitch and yaw values are reported in continuous space. With VNDetectFaceRectanglesRequestRevision2, the roll and yaw were provided in discrete bins only. You can observe this yourself by using the app and rolling your head from side to side: the yaw always jumps between 0 and ±0.785 radians.
  3. When detecting face landmarks, the location of the pupils is accurately detected. Previously, the pupils would be set to the center of the eyes even when looking out to the side of your face.
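The pupil improvement applies to the landmarks request rather than the rectangles request the app uses. As a hedged sketch outside the tutorial project, reading pupil positions could look like this (the completion handler, result cast and landmark property names follow Vision's public API; the request isn't wired into the app):

```swift
import Vision

// Sketch only: detect face landmarks and read pupil locations.
let landmarksRequest = VNDetectFaceLandmarksRequest { request, error in
  guard error == nil,
        let faces = request.results as? [VNFaceObservation] else { return }
  for face in faces {
    // normalizedPoints are relative to the face's bounding box.
    if let leftPupil = face.landmarks?.leftPupil,
       let rightPupil = face.landmarks?.rightPupil {
      print("Left pupil:", leftPupil.normalizedPoints)
      print("Right pupil:", rightPupil.normalizedPoints)
    }
  }
}
```

You would perform this request with a VNImageRequestHandler, exactly as the app already does for its face rectangles request.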

Time to update the app to use VNDetectFaceRectanglesRequestRevision3. You'll make use of detected pitch and observe the continuous space updates.

Open FaceDetector.swift. In captureOutput(_:didOutput:from:), update the revision property of detectFaceRectanglesRequest to revision 3:

detectFaceRectanglesRequest.revision = VNDetectFaceRectanglesRequestRevision3

Build and run.

Hold your phone up to your face. Note how the values printed in the debug output update on every frame. Pitch your head (look up to the ceiling, and down with your chin on your chest). Note how the pitch numbers also update.

The user is looking down

Masking Mayhem

Unless you've been living under a rock, you must have noticed that more and more people are wearing masks. This is great for fighting COVID, but terrible for face recognition!

Luckily, Apple has your back. With VNDetectFaceRectanglesRequestRevision3, the Vision framework can now detect faces covered by masks. While this is nice for general-purpose face detection, it's a disaster for your passport photos app. Wearing a mask is absolutely not allowed in your passport photo! So how should you prevent people who are wearing masks from taking photos?

Luckily for you, Apple has also improved face capture quality. Face capture quality provides a score for a detected face. It takes into account attributes like lighting, occlusion (like masks!), blur, etc.

Note that quality detection compares the same subject against copies of themselves; it doesn't compare one person against another. Capture quality varies between 0 and 1. The latest revision in iOS 15 is VNDetectFaceCaptureQualityRequestRevision2.
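As a hedged sketch of how such a request could be run against a camera frame (this isn't the tutorial project's code, and the function name is our own; the Vision request, the optional faceCaptureQuality property and the revision constant are from Vision's public API):

```swift
import Vision

// Sketch only: score face capture quality for one camera frame.
// A mask or other heavy occlusion typically drags the score down.
func detectCaptureQuality(in pixelBuffer: CVPixelBuffer) {
  let request = VNDetectFaceCaptureQualityRequest { request, error in
    guard error == nil,
          let faces = request.results as? [VNFaceObservation] else { return }
    for face in faces {
      // faceCaptureQuality is an optional Float in the range 0...1.
      if let quality = face.faceCaptureQuality {
        print("Capture quality:", quality)
      }
    }
  }
  request.revision = VNDetectFaceCaptureQualityRequestRevision2

  let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
  try? handler.perform([request])
}
```

In the app, you could threshold this score and reject frames that fall below it, which catches masked faces without any mask-specific logic.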