Image Processing in iOS Part 1: Raw Bitmap Modification

Learn the basics of image processing on iOS via raw bitmap modification, Core Graphics, Core Image, and GPUImage in this 2-part tutorial series. By Jack Wu.


Imagine you just took the best selfie of your life. It’s spectacular, it’s magnificent, and worthy of your upcoming feature in Wired. You’re going to get thousands of likes, up-votes, karma and re-tweets, because you’re absolutely fabulous. Now if only you could do something to this photo to shoot it through the stratosphere…

That’s what image processing is all about! With image processing, you can apply fancy effects to photos such as modifying colors, blending other images on top, and much more.

In this two-part tutorial series, you’re first going to get a basic understanding of image processing. Then, you’ll make a simple app that implements a “spooky image filter” and makes use of four popular image processing methods:

  1. Raw bitmap modification
  2. Using the Core Graphics Library
  3. Using the Core Image Library
  4. Using the 3rd-party GPUImage library

In this first segment of this image processing tutorial, you’ll focus on raw bitmap modification. Once you understand this basic process, you’ll be able to understand what happens with other frameworks. In the second part of the series, you’ll learn about three other methods to make your selfie, and other images, look remarkable.

This tutorial assumes you have basic knowledge of iOS and Objective-C, but you don’t need any previous image processing knowledge.

Getting Started

Before you start coding, it’s important to understand several concepts that relate to image processing. So, sit back, relax and soak up this brief and painless discussion about the inner workings of images.

First things first, meet your new friend who will join you throughout this tutorial… drumroll… Ghosty!



Now, don’t be afraid, Ghosty isn’t a real ghost. In fact, he’s an image. When you break him down, he’s really just a bunch of ones and zeroes. That’s far less frightening than working with an undead subject.

What’s an Image?

An image is a collection of pixels, and each one is assigned a single, specific color. Images are usually arranged as grids, so you can picture them as 2-dimensional arrays.

Here is a much smaller version of Ghosty, enlarged:


The little “squares” in the image are pixels, and each one shows only one color. When hundreds and thousands of pixels come together, they create a digital image.

How are Colors Represented in Bytes?

There are numerous ways to represent a color. The method that you’re going to use in this tutorial is probably the easiest to grasp: 32-bit RGBA.

As the name implies, 32-bit RGBA stores a color as 32 bits, or 4 bytes. Each byte stores a component, or channel. The four channels are:

  • R for red
  • G for green
  • B for blue
  • A for alpha.

As you probably already know, red, green and blue are a set of primary colors for digital formats. You can create almost any color you want by mixing them in the right proportions.

Since you’re using 8 bits for each channel, the total number of opaque colors you can create by using different RGB values in 32-bit RGBA is 256 * 256 * 256, which is approximately 17 million colors. Whoa man, that’s a lot of color!
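To make the byte layout concrete, here’s a minimal C sketch (which also compiles as Objective-C; the function names are mine, not from the starter project) that packs four 8-bit channels into one 32-bit value and pulls a channel back out with shifts and masks:

```c
#include <assert.h>
#include <stdint.h>

// Pack four 8-bit channels into one 32-bit RGBA value, with red in the
// lowest byte and alpha in the highest.
uint32_t pack_rgba(uint8_t r, uint8_t g, uint8_t b, uint8_t a) {
  return (uint32_t)r | ((uint32_t)g << 8) | ((uint32_t)b << 16) | ((uint32_t)a << 24);
}

// Recover one channel by shifting it down into the low byte and masking.
uint8_t channel(uint32_t color, unsigned shift) {
  return (uint8_t)((color >> shift) & 0xFF);
}
```

For example, `pack_rgba(255, 0, 0, 255)` is fully opaque red, stored as `0xFF0000FF`.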

The alpha channel is quite different from the others. You can think of it as transparency, just like the alpha property of UIView.

The alpha of a color doesn’t really mean anything unless there’s a color behind it; its main job is to tell the graphics processor how transparent the pixel is, and thus, how much of the color beneath it should show through.

You’ll dive into this in more depth when you work through the section on blending.

To conclude this section, an image is a collection of pixels, and each pixel is encoded to display a single color. For this lesson, you’ll work with 32-bit RGBA.

Note: Have you ever wondered where the term Bitmap originated? A bitmap is a 2D map of pixels, each one comprised of bits! It’s literally a map of bits. Ah-ha!

So, now you know the basics of representing colors in bytes. There are still three more concepts to cover before you dig in and start coding.

Color Spaces

The RGB method to represent colors is an example of a colorspace. It’s one of many methods that stores colors. Another colorspace is grayscale.

As the name implies, all images in the grayscale colorspace are black and white, and you only need to save one value per pixel to describe its color.

The downside of RGB is that it’s not very intuitive for humans to visualize.

Red: 0 Green:104 Blue:55


For example, what color do you think an RGB of [0, 104, 55] produces?

Taking an educated guess, you might say a teal or skyblue-ish color, which is completely wrong. Turns out it’s the dark green you see on this website!

Two other more popular color spaces are HSV and YUV.

HSV, which stands for Hue, Saturation and Value, is a much more intuitive way to describe colors. You can think of the parts this way:

  • Hue as “Color”
  • Saturation as “How full is this color”
  • Value, as the “Brightness”

In this color space, if you find yourself looking at unknown HSV values, it’s much easier to imagine what the color looks like.

The difference between RGB and HSV is pretty easy to understand, at least once you look at this image:

Image modified from work by SharkD under the Creative Commons Attribution-Share Alike 3.0 Unported license


YUV is another popular color space, because it’s what TVs use.

Television signals came into the world with a single channel: grayscale. Later, two more channels “came into the picture” when color television emerged, in a way that kept the signal backward-compatible with black-and-white sets. Since you’re not going to tinker with YUV in this tutorial, you might want to do some more research on YUV and other color spaces to round out your knowledge. :]

Note: For the same color space, you can still have different representations for colors. One example is 16-bit RGB, which optimizes memory use by using 5 bits for R, 6 bits for G, and 5 bits for B.

Why 6 for green, and 5 for red and blue? This is an interesting question and the answer comes from your eyeball. Human eyes are most sensitive to green and so an extra bit enables us to move more finely between different shades of green.
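To see how that 5-6-5 split works, here’s a small C sketch (illustrative only; the function name is mine, not from the starter project) that squeezes 8-bit channels down into a 16-bit RGB565 value by throwing away the low bits of each channel:

```c
#include <assert.h>
#include <stdint.h>

// Pack 8-bit channels into 16-bit RGB565: 5 bits of red, 6 bits of green,
// 5 bits of blue. The low bits of each channel are simply discarded.
uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b) {
  return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

Pure white (`255, 255, 255`) packs to `0xFFFF`, with every bit set.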

Coordinate Systems

Since an image is a 2D map of pixels, you need to specify where the origin is. Usually it’s the top-left corner of the image, with the y-axis pointing downwards, or the bottom-left corner, with the y-axis pointing upwards.

There’s no “correct” coordinate system, and Apple uses both in different places.

Currently, UIImage and UIView use the top-left corner as the origin and Core Image and Core Graphics use the bottom-left. This is important to remember so you know where to find the bug when Core Image returns an “upside down” image.
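Converting a point between the two conventions is just a flip of the y-axis. A quick sketch in plain C (the function name is mine):

```c
#include <assert.h>

// Convert a row index between a top-left-origin and a bottom-left-origin
// coordinate system for an image `height` pixels tall. Applying the flip
// twice gets you back where you started.
int flip_y(int y, int height) {
  return height - 1 - y;
}
```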

Image Compression

This is the last concept to discuss before coding! With raw images, each pixel is stored individually in memory.

If you do the math on an 8 megapixel image, it would take 8 * 10^6 pixels * 4 bytes/pixel = 32 Megabytes to store! Talk about a data hog!
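You can sanity-check that arithmetic in a couple of lines of C (3264 × 2448 is one common 8-megapixel resolution; the function is illustrative, not part of the project):

```c
#include <assert.h>
#include <stdint.h>

// Uncompressed size of a 32-bit RGBA image: one 4-byte pixel for every
// pixel in the width x height grid.
uint64_t raw_size_bytes(uint64_t width, uint64_t height) {
  return width * height * 4;
}
```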

This is where JPEG, PNG and other image formats come into play. These are compression formats for images.

When GPUs render images, they decompress images to their original size, which can take a lot of memory. If your app takes up too much memory, it could be terminated by the OS (which looks to the user like a crash). So be sure to test your app with large images!

I'm dying for some action

Looking at Pixels

Now that you have a basic understanding of the inner workings of images, you’re ready to dive into coding. Today you’re going to work through developing a selfie-revolutionizing app called SpookCam, the app that puts a little Ghosty in your selfie!

Download the starter kit, open the project in Xcode and build and run. On your phone, you should see tiny Ghosty:

screenshot2 - ghosty

In the console, you should see an output like this:

Screenshot1 - pixel output

Currently the app is loading the tiny version of Ghosty from the bundle, converting it into a pixel buffer and printing out the brightness of each pixel to the log.

What’s the brightness? It’s simply the average of the red, green and blue components.

Pretty neat. Notice how the outer pixels have a brightness of 0, which means they should be black. However, since their alpha value is 0, they are actually transparent. To verify this, try setting the image view’s background color to red, then build and run again.


Now take a quick glance through the code. You’ll notice ViewController.m uses UIImagePickerController to pick images from the album or to take pictures with the camera.

After it selects an image, it calls -setupWithImage:. In this case, it outputs the brightness of each pixel to the log. Locate logPixelsOfImage: inside of ViewController.m, and review the first part of the method:

// 1.
CGImageRef inputCGImage = [image CGImage];
NSUInteger width = CGImageGetWidth(inputCGImage);
NSUInteger height = CGImageGetHeight(inputCGImage);

// 2.
NSUInteger bytesPerPixel = 4;
NSUInteger bytesPerRow = bytesPerPixel * width;
NSUInteger bitsPerComponent = 8;

UInt32 * pixels;
pixels = (UInt32 *) calloc(height * width, sizeof(UInt32));

// 3.
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGContextRef context = CGBitmapContextCreate(pixels, width, height, bitsPerComponent, bytesPerRow, colorSpace, kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);

// 4.
CGContextDrawImage(context, CGRectMake(0, 0, width, height), inputCGImage);

// 5. Cleanup
CGColorSpaceRelease(colorSpace);
CGContextRelease(context);

Now, a section-by-section recap:

  1. Section 1: Convert the UIImage to a CGImage object, which is needed for the Core Graphics calls. Also, get the image’s width and height.
  2. Section 2: For the 32-bit RGBA color space you’re working in, you hardcode the parameters bytesPerPixel and bitsPerComponent, then calculate bytesPerRow of the image. Finally, you allocate an array pixels to store the pixel data.
  3. Section 3: Create an RGB CGColorSpace and a CGBitmapContext, passing in the pixels pointer as the buffer to store the pixel data this context holds. You’ll explore Core Graphics in more depth in a section below.
  4. Section 4: Draw the input image into the context. This populates pixels with the pixel data of image in the format you specified when creating context.
  5. Section 5: Cleanup colorSpace and context.

Note: When you display an image, the device’s GPU decodes the encoding to display it on the screen. To access the data locally, you need to obtain a copy of the pixels, just like you’re doing here.

At this point, pixels holds the raw pixel data of image. The next few lines iterate through pixels and print out the brightness:

// 1.
#define Mask8(x) ( (x) & 0xFF )
#define R(x) ( Mask8(x) )
#define G(x) ( Mask8(x >> 8 ) )
#define B(x) ( Mask8(x >> 16) )
NSLog(@"Brightness of image:");
// 2.
UInt32 * currentPixel = pixels;
for (NSUInteger j = 0; j < height; j++) {
  for (NSUInteger i = 0; i < width; i++) {
    // 3.
    UInt32 color = *currentPixel;
    printf("%3.0f ", (R(color)+G(color)+B(color))/3.0);
    // 4.
    currentPixel++;
  }
  printf("\n");
}
// Free the pixel buffer when you're done with it
free(pixels);

Here's what's going on:

  1. Define some macros to simplify the task of working with 32-bit pixels. To access the red component, you mask out the first 8 bits. To access the others, you perform a bit-shift and then a mask.
  2. Get a pointer to the first pixel and start two for loops to iterate through the pixels. This could also be done with a single for loop iterating from 0 to width * height, but it's easier to reason about an image that has two dimensions.
  3. Get the color of the current pixel by dereferencing currentPixel and log the brightness of the pixel.
  4. Increment currentPixel to move on to the next pixel. If you're rusty on pointer arithmetic, just remember this: Since currentPixel is a pointer to UInt32, when you add 1 to the pointer, it moves forward by 4 bytes (32-bits), to bring you to the next pixel.

Note: An alternative to this approach is to declare currentPixel as a pointer to an 8-bit type (i.e. UInt8). Each time you increment it, you move to the next component of the image, and dereferencing it gives the 8-bit value of that component.
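Here’s a sketch of that byte-pointer view (illustrative C; the function name is mine, and it assumes the little-endian R, G, B, A in-memory layout that the bitmap context above produces on iOS devices):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Read one 8-bit component out of a 32-bit pixel buffer by reinterpreting
// it as a byte buffer. With the context flags used above on a little-endian
// CPU, channel 0 is red, 1 is green, 2 is blue, 3 is alpha.
uint8_t component_at(const uint32_t *pixels, size_t pixelIndex, size_t channel) {
  const uint8_t *bytes = (const uint8_t *)pixels;
  return bytes[pixelIndex * 4 + channel];
}
```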

At this point, the starter project is simply logging raw image data, but not modifying anything yet. That's your job for the rest of the tutorial!

SpookCam - Raw Bitmap Modification

Of the four methods explored in this series, you'll spend the most time on this one because it covers the "first principles" of image processing. Mastering this method will allow you to understand what all the other libraries do.

In this method, you'll loop through each pixel, as the starter kit already does, but this time assign new values to each pixel.

The advantage of this method is that it's easy to implement and understand; the disadvantage is that scaling to larger images and more complicated effects is less than elegant.

As you see in the starter app, the ImageProcessor class already exists. Hook it up to the main ViewController by replacing -setupWithImage: with the following code in ViewController.m:

- (void)setupWithImage:(UIImage*)image {
  UIImage * fixedImage = [image imageWithFixedOrientation];
  self.workingImage = fixedImage;
  // Commence with processing!
  [ImageProcessor sharedProcessor].delegate = self;
  [[ImageProcessor sharedProcessor] processImage:fixedImage];
}

Also comment out the following line of code in -viewDidLoad:

// [self setupWithImage:[UIImage imageNamed:@"ghost_tiny.png"]];

Now take a look at ImageProcessor.m. As you can see, ImageProcessor is a singleton object that calls -processUsingPixels: on an input image, then returns the output through the ImageProcessorDelegate.

-processUsingPixels: is currently a copy of the code you looked at previously, which gives you access to the pixels of inputImage. Notice the two extra macros, A(x) and RGBAMake(r,g,b,a), that are defined for convenience.

Now build and run. Choose an image from your album (or take a photo) and you should see it appear in your view like this:


That looks way too relaxing, time to bring in Ghosty!

Before the return statement in processUsingPixels:, add the following code to get a CGImageRef of Ghosty:

UIImage * ghostImage = [UIImage imageNamed:@"ghost"];
CGImageRef ghostCGImage = [ghostImage CGImage];

Now, do some math to figure out the rect where you want to put Ghosty inside the input image.

CGFloat ghostImageAspectRatio = ghostImage.size.width / ghostImage.size.height;
NSInteger targetGhostWidth = inputWidth * 0.25;
CGSize ghostSize = CGSizeMake(targetGhostWidth, targetGhostWidth / ghostImageAspectRatio);
CGPoint ghostOrigin = CGPointMake(inputWidth * 0.5, inputHeight * 0.2);

This code resizes Ghosty to take up 25% of the input's width, and places his origin (top-left corner) at ghostOrigin.

The next step is to get the pixel buffer of Ghosty, this time with scaling:

NSUInteger ghostBytesPerRow = bytesPerPixel * ghostSize.width;
UInt32 * ghostPixels = (UInt32 *)calloc(ghostSize.width * ghostSize.height, sizeof(UInt32));
CGContextRef ghostContext = CGBitmapContextCreate(ghostPixels, ghostSize.width, ghostSize.height,
                                       bitsPerComponent, ghostBytesPerRow, colorSpace,
                                       kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
CGContextDrawImage(ghostContext, CGRectMake(0, 0, ghostSize.width, ghostSize.height),ghostCGImage);

This is similar to how you got pixels from inputImage. However, by drawing Ghosty into a context with a smaller width and height, he's scaled down to the target size.

Now you're ready to blend Ghosty into your image, which makes this the perfect time to go over blending.

Blending: As mentioned before, each color has an alpha value that indicates transparency. However, when you're creating an image, each pixel has exactly one color.

So how do you assign a pixel if it has a background color and a "semi-transparent" color on top of it?

The answer is alpha blending. The color on top uses a formula and its alpha value to blend with the color behind it. Here you treat alpha as a float between 0 and 1:

NewColor = TopColor * TopColor.Alpha + BottomColor * (1 - TopColor.Alpha)

This is the standard linear interpolation equation.

  • When the TopColor.Alpha is 1, NewColor is equal to TopColor.
  • When TopColor.Alpha is 0, NewColor is equal to BottomColor.
  • Finally, when TopColor.Alpha is between 0 and 1, NewColor is a blend of TopColor and BottomColor.

A popular optimization is to use premultiplied alpha. The idea is to premultiply TopColor by TopColor.alpha, thereby saving that multiplication in the formula above.

As trivial as that sounds, it offers a noticeable performance boost when iterating through millions of pixels to perform blending.
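To make the saving concrete, here’s a one-channel sketch in C (illustrative; alpha is treated as a float in [0, 1], and the function names are mine). The premultiplied version stores top * alpha once up front, so each per-pixel blend drops one multiplication:

```c
#include <assert.h>

// Straight alpha blend of a single channel: standard linear interpolation.
float blend(float top, float bottom, float topAlpha) {
  return top * topAlpha + bottom * (1.0f - topAlpha);
}

// With premultiplied alpha, the stored top value is already top * topAlpha,
// so the blend needs one fewer multiply per channel.
float blend_premultiplied(float premultipliedTop, float bottom, float topAlpha) {
  return premultipliedTop + bottom * (1.0f - topAlpha);
}
```

Both functions produce the same result; the premultiplied one just assumes the multiplication was done when the image was stored.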

Okay, back to Ghosty.

As with most bitmap image processing algorithms, you need some for loops to go through all the pixels. However, you only need to loop through the pixels you need to change.

Add this code to the bottom of processUsingPixels:, again right before the return statement:

NSUInteger offsetPixelCountForInput = ghostOrigin.y * inputWidth + ghostOrigin.x;
for (NSUInteger j = 0; j < ghostSize.height; j++) {
  for (NSUInteger i = 0; i < ghostSize.width; i++) {
    UInt32 * inputPixel = inputPixels + j * inputWidth + i + offsetPixelCountForInput;
    UInt32 inputColor = *inputPixel;
    UInt32 * ghostPixel = ghostPixels + j * (int)ghostSize.width + i;
    UInt32 ghostColor = *ghostPixel;
    // Do some processing here
  }
}

Notice how you only loop through the number of pixels in Ghosty's image, and offset the input image by offsetPixelCountForInput. Remember that although you're reasoning about images as 2-D arrays, in memory they are actually 1-D arrays.
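That 2-D-to-1-D mapping is worth spelling out (a trivial C sketch; the function name is mine):

```c
#include <assert.h>
#include <stddef.h>

// Index of the pixel at (column i, row j) in a buffer laid out row by row:
// skip j full rows of `width` pixels, then step i pixels into the row.
size_t pixel_index(size_t i, size_t j, size_t width) {
  return j * width + i;
}
```

This is exactly how offsetPixelCountForInput is computed above, using ghostOrigin as the pixel position.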

Next, fill in this code after the comment Do some processing here to do the actual blending:

// Blend the ghost with 50% alpha
CGFloat ghostAlpha = 0.5f * (A(ghostColor) / 255.0);
UInt32 newR = R(inputColor) * (1 - ghostAlpha) + R(ghostColor) * ghostAlpha;
UInt32 newG = G(inputColor) * (1 - ghostAlpha) + G(ghostColor) * ghostAlpha;
UInt32 newB = B(inputColor) * (1 - ghostAlpha) + B(ghostColor) * ghostAlpha;
// Clamp, not really useful here :p
newR = MAX(0,MIN(255, newR));
newG = MAX(0,MIN(255, newG));
newB = MAX(0,MIN(255, newB));
*inputPixel = RGBAMake(newR, newG, newB, A(inputColor));

There are two points to note in this part.

  1. You apply 50% alpha to Ghosty by multiplying the alpha of each pixel by 0.5. You then blend with the alpha blend formula previously discussed.
  2. The clamping of each color to [0,255] is not required here, since the value will never go out of bounds. However, most algorithms require this clamping to prevent colors from overflowing and giving unexpected outputs.

To test this code, add this code to the bottom of processUsingPixels:, replacing the current return statement:

// Create a new UIImage
CGImageRef newCGImage = CGBitmapContextCreateImage(context);
UIImage * processedImage = [UIImage imageWithCGImage:newCGImage];
return processedImage;

This creates a new UIImage from the context and returns it. You're going to ignore the potential memory leak here for now.

Build and run. You should see Ghosty floating in your image like, well, a ghost:


Good work so far; this app is going viral for sure!

Black and White

One last effect to go. Try implementing the black and white filter yourself. To do this, set each pixel's red, green and blue components to the average of the three channels in the original, just like how you printed out Ghosty's brightness in the beginning.

Write this code before the // Create a new UIImage comment you added in the previous step.

Think you’ve got it? Check your solution below.
[spoiler title="Solution"]

// Convert the image to black and white
for (NSUInteger j = 0; j < inputHeight; j++) {
  for (NSUInteger i = 0; i < inputWidth; i++) {
    UInt32 * currentPixel = inputPixels + (j * inputWidth) + i;
    UInt32 color = *currentPixel;
    // Average of RGB = greyscale
    UInt32 averageColor = (R(color) + G(color) + B(color)) / 3.0;
    *currentPixel = RGBAMake(averageColor, averageColor, averageColor, A(color));
  }
}

[/spoiler]
The very last step is to cleanup your memory. ARC cannot manage CGImageRefs and CGContexts for you. Add this to the end of the function before the return statement:

// Cleanup!
CGColorSpaceRelease(colorSpace);
CGContextRelease(context);
CGContextRelease(ghostContext);
free(inputPixels);
free(ghostPixels);

Build and run. Be prepared to be spooked out by the result:


Where To Go From Here?

Congratulations! You just finished your first image-processing application. You can download a working version of the project at this point here.

That wasn't too hard, right? You can play around with the code inside the for loops to create your own effects. See if you can implement these:

  • Swap the red and blue channels of the image
  • Increase the brightness of the image by 10%
  • As a further challenge, try scaling Ghosty using only pixel-based methods. Here are the steps:
    1. Create a new CGContext with the target size for Ghosty.
    2. For each pixel in this new Context, calculate which pixel you should copy from in the original image.
    3. For extra coolness, try interpolating between nearby pixels if your calculations for the original coordinate lands in-between pixels. If you interpolate between the four nearest pixels, you have just implemented Bilinear scaling all on your own! What a boss!
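If you’d like a hint for step 2 without the whole solution, the heart of nearest-neighbor scaling is mapping each target index back to a source index along each axis (a C sketch under my own naming):

```c
#include <assert.h>
#include <stddef.h>

// Nearest source index for a target pixel along one axis when scaling
// from srcLength to dstLength pixels. Integer division floors the result.
size_t source_index(size_t dst, size_t srcLength, size_t dstLength) {
  return dst * srcLength / dstLength;
}
```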

If you've completed the first project, you should have a pretty good grasp on the basic concepts of image processing. Now you can set out and explore simpler and faster ways to accomplish these same effects.

In the next part of the series, you'll replace -processUsingPixels: with three new functions that perform the same task using different libraries. Definitely check it out!

In the meantime, if you have any questions or comments about the series so far, please join the forum discussion below!
