Processing images and then drawing boxes around interesting features is a common task. For example, funny-face filters that draw googly eyes need to know where the eyes are. The general workflow is to get the bounding boxes from the observations and then use those to draw an overlay on the image or draw on the image itself.
When an observation returns a bounding box or point, it's usually in what Apple calls a "normalized" format: every value is represented as a number between 0 and 1. This is so that regardless of your image's display size, you'll be able to locate and size the bounding box correctly. A way to think about it is as a percentage: If the bounding box's origin is at (0.5, 0.5), it's 50 percent of the way across the face of the image. So regardless of the size you display the image at, the bounding box must be drawn halfway across both the x- and y-axes, which puts its origin point at the center of the image. A point of (0, 0) is at the origin point of the image, and (1.0, 1.0) is in the corner opposite the origin. To save every developer who works with the Vision Framework the tedium of writing code to convert these normalized values to values you can use to draw, Apple provides functions to convert between normalized values and the proper pixel values for an image.
Functions for Converting From Normalized Space to Pixel Space
VNImageRectForNormalizedRect
Converts a normalized bounding box (CGRect with values between 0.0 and 1.0) into a CGRect in the pixel coordinate space of a specific image. Use this when you need to draw a bounding box on the image.
VNImagePointForNormalizedPoint
Converts a normalized CGPoint (with values between 0.0 and 1.0) into a CGPoint in the pixel coordinate space of a specific image. This is useful for translating facial landmark points or other keypoints onto the image.
VNImageSizeForNormalizedSize
Converts a normalized CGSize (with values between 0.0 and 1.0) into a CGSize in the pixel coordinate space of a specific image. This can be used when scaling elements relative to the image size.
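For example, here's a short sketch of how you might use these functions. The observation and the 1920x1080-pixel image size are assumptions made up for this example:

import Vision

// A sketch: `observation` is assumed to come from a Vision request
// that was run against a 1920x1080-pixel image.
func pixelValues(for observation: VNDetectedObjectObservation) {
  let imageWidth = 1920
  let imageHeight = 1080

  // The normalized bounding box becomes a rect in pixel coordinates.
  let pixelRect = VNImageRectForNormalizedRect(
    observation.boundingBox, imageWidth, imageHeight)
  print(pixelRect)

  // A normalized point of (0.5, 0.5) is the center of the image,
  // so this prints (960.0, 540.0).
  let center = VNImagePointForNormalizedPoint(
    CGPoint(x: 0.5, y: 0.5), imageWidth, imageHeight)
  print(center)
}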
Origin Points
One of the difficulties when you work with the Vision Framework is knowing where the origin, or (0, 0) point, of an image or rectangle is. All Vision observations that return coordinates (rectangles, points, etc.) assume that (0.0, 0.0) is the bottom-left of the space. When you're working with a pure CGImage or CIImage, there won't be a problem, because those also have the origin at the bottom-left. However:
Origin Points of Different Image Formats in iOS Development
UIImage
– Origin Point: Top-Left
– Description: High-level image class used for displaying images in iOS.
UIView
– Origin Point: Top-Left
– Description: Fundamental building block for UI elements in iOS.
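Because of that mismatch, a Vision rectangle usually needs its y-axis flipped before you draw it with UIKit. Here's a minimal sketch, assuming the image fills its view at 1:1 scale; the helper name uiKitRect(for:in:) is made up for this example:

import UIKit
import Vision

// Convert a normalized, bottom-left-origin Vision bounding box into
// a top-left-origin rect you can use in UIKit drawing code.
func uiKitRect(for boundingBox: CGRect, in imageSize: CGSize) -> CGRect {
  // First, scale the normalized rect up to pixel coordinates.
  let rect = VNImageRectForNormalizedRect(
    boundingBox, Int(imageSize.width), Int(imageSize.height))
  // Then flip the y-axis, because UIKit measures y down from the top.
  return CGRect(
    x: rect.origin.x,
    y: imageSize.height - rect.origin.y - rect.height,
    width: rect.width,
    height: rect.height)
}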
Depending on the original format of the image, the result might have a different origin point. An image generated with the camera or something else will have a different origin point than one created in software. Generally, the underlying data for a UIImage might not be in the "right" orientation for display. For a camera image, there's EXIF metadata, and for UIImage there's an imageOrientation property to tell the system which way to rotate it so that the image will look "right-side up" regardless of the actual orientation of the sensor in the camera. This means that if you have a photo in a UIImage and pass that through your VNImageRequestHandler to get face bounding boxes, the origin point of the bounding boxes and the origin point of the image might not match. So when you go to draw the box on the image, it'll be in the wrong place, or after you lay your image out in your preview, it might be rotated.
CameraOutput in UIImagePicker after converting to CVPixelBuffer
There are a number of strategies you can use to mitigate this. You might convert your UIImage to a .png, because that bakes in the rotation and the image will be in the expected orientation. When you're drawing, you might transform the drawing-space orientation. Another way would be to just apply the same rotation that iOS applied to the pixels before processing them.
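One common way to bake the rotation into the pixel data is to redraw the image. Here's a minimal sketch (the helper name normalizedImage(_:) is made up) using UIGraphicsImageRenderer, which applies .imageOrientation while drawing:

import UIKit

// Redraw an image so its pixel data matches its display orientation.
// The result reports .up orientation with the pixels already rotated.
func normalizedImage(_ image: UIImage) -> UIImage {
  guard image.imageOrientation != .up else { return image }
  let renderer = UIGraphicsImageRenderer(size: image.size)
  return renderer.image { _ in
    image.draw(in: CGRect(origin: .zero, size: image.size))
  }
}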
For example, if iOS chose to rotate the pixels 90 degrees clockwise to make the orientation look correct, that means the original pixels are rotated 90 degrees counter-clockwise. So when you're drawing on the image, you'll need to assign the right .imageOrientation value to the final UIImage. You might apply a function like this one:
// Maps an image orientation to the one you need to assign after the
// pixels have made a round trip through a bottom-left-origin context.
func convertImageOrientation(_ originalOrientation: UIImage.Orientation)
  -> UIImage.Orientation {
  switch originalOrientation {
  case .up:             // 0
    return .downMirrored  // 5
  case .down:           // 1
    return .upMirrored    // 4
  case .left:           // 2
    return .rightMirrored // 7
  case .right:          // 3
    return .leftMirrored  // 6
  case .upMirrored:     // 4
    return .down          // 1
  case .downMirrored:   // 5
    return .up            // 0
  case .leftMirrored:   // 6
    return .right         // 3
  case .rightMirrored:  // 7
    return .left          // 2
  @unknown default:
    return originalOrientation
  }
}
In the code above, each of the commented values is the rawValue of that .imageOrientation. When you start with a UIImage that has an .up orientation and you convert it to a CGImage and then draw it in a CGContext, it'll be .downMirrored when you convert it back to a UIImage. So when creating the final UIImage, just assign it that orientation and iOS will take care of it.
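That last step might look like this sketch, where original is the source UIImage and drawnCGImage is a hypothetical CGImage with your boxes already drawn:

// Recreate a UIImage from the drawn pixels, assigning the converted
// orientation so iOS displays the result right-side up.
let output = UIImage(
  cgImage: drawnCGImage,
  scale: original.scale,
  orientation: convertImageOrientation(original.imageOrientation))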
Remember, this is just one way to deal with the origin point and rotation changes. But once you're aware up front that it's happening, you've got a likely culprit to examine when your code isn't drawing boxes at the points where you expect.
Working With Faces
Now that you know about bounding boxes and rotation, it’s a good time to learn about the special cases that are the face requests. Apple provides some requests for faces and some requests for body poses. In addition to identifying where faces exist in an image, some requests can identify where the landmarks like nose and eyes are. Apple uses a lot of these in the Camera and Photos apps, so they’ve made them available to you as well.
iOS 11
VNDetectFaceRectanglesRequest and VNFaceObservation: Detects faces in an image by finding the bounding boxes of face regions.
VNDetectFaceLandmarksRequest and VNFaceObservation: Detects facial features such as eyes, nose, and mouth in detected face regions.
iOS 13
VNDetectFaceCaptureQualityRequest and VNFaceObservation: Estimates the quality of captured face images via the observation's faceCaptureQuality property.
iOS 14
VNDetectHumanBodyPoseRequest and VNHumanBodyPoseObservation: Detects and tracks human body poses in images or videos.
VNDetectHumanRectanglesRequest and VNDetectedObjectObservation: Detects human figures in an image.
Mue’cy gobemo rsad a mow uh njo qijoabnn vkayu e MRDeweAxzoprexaum cigoff txto. Tekumej, jiwog oy fma bukzpoxfuegk, od qiury ciqa vloz sazsb bebebc xufkavogd xojfd eq yasa. Qhar da. Lizophey tyig uyvujsuvaiqd oto seqwduvped. Uja ar gfi fidihw hsujlel sigijjh cvi huifkatzDaj ok vja olyiwroreod. MQNekoUlbegbiteocg osto veha apyaoraf fuvaaw tur jeyh, huszb ijc piv ya hikk qwopu xyo ofuopcameik og gna xeve iwv kfek ykis yeto i jalfqiw qkalibsn al doqxziqtw tfuc av oc kjye FRJayeHixmbuzpd1W. Hhav ratmeosc u vat im enzahqixuep okuiy hxevi lja ucfip un tka oqan amo, wkapa xzi fikw uhu omz msohu kmi nixld eyi ey. Bzofu opi hemjvojd uzgvaey lol cni koxup uy wxu uvo, fa quo hey cuyepdowi ez mja iwa of ojel ex ncaxoy.
These requests follow the same pattern as all the others, so you should have no trouble using them.
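To make that concrete, here's a hedged sketch of running a face landmarks request; photo is a hypothetical UIImage whose pixel data is assumed to already be in the .up orientation:

import UIKit
import Vision

func detectFaces(in photo: UIImage) throws {
  guard let cgImage = photo.cgImage else { return }
  let request = VNDetectFaceLandmarksRequest()
  let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
  try handler.perform([request])
  for face in request.results ?? [] {
    // The bounding box is normalized, with a bottom-left origin.
    print("Face at:", face.boundingBox)
    // roll, pitch and yaw are optional NSNumbers.
    if let roll = face.roll {
      print("Roll:", roll)
    }
    // Landmarks are grouped into regions, such as leftEye.
    if let leftEye = face.landmarks?.leftEye {
      print("Left eye points:", leftEye.normalizedPoints)
    }
  }
}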