Expert Swift

First Edition · iOS 14 · Swift 5.4 · Xcode 12.5

9. Unsafe
Written by Ehab Amer

Swift is an easy language to learn. It can take care of a lot of things for you and help you keep your code safe and clear to minimize bugs. If you were to compare it to C++, many people would say C++ is harder. Swift takes care of type checking, memory allocation, and many things on your behalf so you can focus on what you want to do in your code, and not how the machine will handle your code. But C++ gives you more power and more control. As we’re told in the Spider-Man comics and movies, “With great power comes great responsibility”.

By default, Swift is a memory-safe and type-safe language. This means you cannot access uninitialized memory and can only treat an instance as the type it was created as. You can’t treat a String as if it were an Int or a Numeric, or vice versa. But this doesn’t completely cover what the word safe means.

More generally, Swift checks whether any input is valid and behaves accordingly. So storing a number in a string property, for example, will fail. Forcing a value out of an optional that doesn’t have one is invalid, and so is storing a number that exceeds the maximum value its variable can hold. All of these are different cases related to safety.

In some cases, you might need your code to be extremely optimized, in which case the tiny overhead added by Swift’s safety checks might be too expensive. You might be dealing with a huge stream of real-time data, manipulating large files or performing other operations on large amounts of data. Or you might even be working with C++ code within your app. In such cases, you want full control over your objects, or in other words: pointers.

In this chapter, you’ll learn how you can gain this control. You’ll learn about:

  • The memory layout of types, and what size, alignment and stride are
  • How to use typed and untyped pointers
  • Binding memory to a type and the rules you must follow to rebind it to another type
  • Other unsafe operations in the standard library and overflow arithmetic operations

But before going into those points, you need to understand a few things first.

Definition of unsafe & undefined behaviors

As stated earlier, safety means that Swift checks whether any input or operation is valid and behaves accordingly. However, there is also a whole other world in Swift that has the keyword unsafe. This gives you more control and moves the responsibility of validation to you, the developer. Swift will trust that you know what you’re doing.

Before going deeper into what this keyword means, you must understand how Swift behaves when you violate any of the type safety rules. Some violations are checked at compile time, while others are checked during runtime — and those consistently cause a runtime crash. A rule to remember: Safe code doesn’t mean no crashes. It means that if your code received unexpected input, it will stop execution. One of the ways it can do that is to throw a fatal error. But with unsafe code, it will use the invalid input, work with it and eventually — maybe — provide an output. Such situations are hard to debug.

This is how the keyword unsafe works. The moment a rule is violated, the behavior of your code is completely unknown. Your code might crash, or it might resume. It might give you a wrong value or change the value of another property. How your application will proceed is undefined and can change from one execution to another. It’s extremely important to know how your code will behave and what to expect once you start using unsafe so you’re careful with it.

The Swift standard library provides unsafe pointers that are similar in concept to C++ pointers. There is no better way to learn how to use these pointers than to understand how they work.

What is a pointer?

Swift has a linear memory layout, so imagine your app’s address space is from 0x0000 to 0xFFFF. The actual address space is represented by 64 bits rather than 16. But to keep it simple here, this chapter will use smaller numbers.

Pointer vs. reference

In a way, pointers and references are quite similar, yet there is an important difference. Behind the scenes, a reference is a pointer, but pointer operations aren’t available to you. When you work with pointers, you take care of the life cycle of the pointer itself as well as the object it points at. Normally, you define a pointer and then allocate and initialize the object it points at. If you lose that pointer without first cleaning up the object, you can never reach the object again. And if you delete the object and keep the pointer, you’ll come across a variety of undefined behaviors if you try to use that pointer again.
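The life cycle described above can be sketched roughly as follows. This is only a minimal illustration, assuming a single Int stored on the heap; the name numberPointer is hypothetical and not part of the chapter’s own examples.

```swift
// Hypothetical name: numberPointer.
// 1. Allocate: reserve uninitialized memory for one Int.
let numberPointer = UnsafeMutablePointer<Int>.allocate(capacity: 1)

// 2. Initialize: the memory now holds a valid Int.
numberPointer.initialize(to: 42)

// 3. Use the value through the pointer.
numberPointer.pointee += 1
let currentValue = numberPointer.pointee // 43

// 4. Deinitialize, then deallocate. Both steps are your responsibility.
numberPointer.deinitialize(count: 1)
numberPointer.deallocate()

// Reading numberPointer.pointee after this point would be undefined behavior.
```

Skipping step 4 leaks the memory, while using the pointer after step 4 is the dangling-pointer case the paragraph warns about.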

Memory layout

To use pointers properly, you must understand how memory itself is organized. The memory layout of value types is quite different from the layout of reference types. This section will start by covering value types.

Layout for Swift types

You can determine the size, alignment and stride of a type directly through code. For example, for the type Int, you can use the enum MemoryLayout to see those values.

MemoryLayout<Int>.size          // returns 8 (on 64-bit)
MemoryLayout<Int>.alignment     // returns 8 (on 64-bit)
MemoryLayout<Int>.stride        // returns 8 (on 64-bit)
MemoryLayout<Int16>.size        // returns 2
MemoryLayout<Int16>.alignment   // returns 2
MemoryLayout<Int16>.stride      // returns 2

MemoryLayout<Bool>.size         // returns 1
MemoryLayout<Bool>.alignment    // returns 1
MemoryLayout<Bool>.stride       // returns 1

MemoryLayout<Float>.size        // returns 4
MemoryLayout<Float>.alignment   // returns 4
MemoryLayout<Float>.stride      // returns 4

MemoryLayout<Double>.size       // returns 8
MemoryLayout<Double>.alignment  // returns 8
MemoryLayout<Double>.stride     // returns 8
let zero = 0.0
MemoryLayout.size(ofValue: zero) // returns 8
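One place these numbers matter is when sizing a block of memory for several values in a row. Consecutive elements sit one stride apart, so stride rather than size is the multiplier. The helper below is a hypothetical sketch (byteCount(for:count:) is not a standard library function).

```swift
// Hypothetical helper: bytes needed to store `count` contiguous values of T.
func byteCount<T>(for type: T.Type, count: Int) -> Int {
  // Consecutive elements are one stride apart, so multiply by stride.
  return MemoryLayout<T>.stride * count
}

let int16Bytes = byteCount(for: Int16.self, count: 4)   // 8
let doubleBytes = byteCount(for: Double.self, count: 4) // 32
```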

Trivial types

You can copy a trivial type bit for bit with no indirection or reference-counting operations. Generally, native Swift types that don’t contain strong or weak references or other forms of indirection are trivial, as are imported C structs and enums.

struct IntBoolStruct {
  var intValue: Int
  var boolValue: Bool
}
MemoryLayout<IntBoolStruct>.size       // returns 9
MemoryLayout<IntBoolStruct>.alignment  // returns 8
MemoryLayout<IntBoolStruct>.stride     // returns 16

Ordering properties

Now, consider this other example:

struct BoolIntStruct {
  var boolValue: Bool
  var intValue: Int
}

MemoryLayout<BoolIntStruct>.size       // returns 16
MemoryLayout<BoolIntStruct>.alignment  // returns 8
MemoryLayout<BoolIntStruct>.stride     // returns 16
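To see where the padding ends up, you can ask MemoryLayout for the byte offset of each stored property. The sketch below uses renamed copies of the two structs (IntThenBool and BoolThenInt are hypothetical names chosen to avoid clashing with the types above), and the values in the comments assume a 64-bit platform.

```swift
// Hypothetical copies of the two structs, renamed so they don't clash.
struct IntThenBool {
  var intValue: Int
  var boolValue: Bool
}

struct BoolThenInt {
  var boolValue: Bool
  var intValue: Int
}

// offset(of:) returns an optional byte offset for a stored property.
let boolOffsetA = MemoryLayout<IntThenBool>.offset(of: \IntThenBool.boolValue)
let intOffsetB = MemoryLayout<BoolThenInt>.offset(of: \BoolThenInt.intValue)
// On a 64-bit platform, both offsets are expected to be 8.
```

In the first struct the Bool simply follows the Int. In the second, the Int has to wait for an 8-byte boundary, so the padding sits inside the struct and counts toward its size.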

Allocating for alignment

The two examples above don’t mean that you need to give extra consideration to the order of the properties. The padding is the same in both examples; only one of them includes it in the size property.

struct EmptyStruct {}

MemoryLayout<EmptyStruct>.size       // returns 0
MemoryLayout<EmptyStruct>.alignment  // returns 1
MemoryLayout<EmptyStruct>.stride     // returns 1

Reference types

Reference types have a quite different memory layout. When you have a pointer to such a type, you’re pointing to a reference to the value and not to the value itself. Think of it as a pointer to a pointer.

class IntBoolClass {
  var intValue: Int = 0
  var boolValue: Bool = false
}

MemoryLayout<IntBoolClass>.size       // returns 8
MemoryLayout<IntBoolClass>.alignment  // returns 8
MemoryLayout<IntBoolClass>.stride     // returns 8

class BoolIntClass {
  var boolValue: Bool = false
  var intValue: Int = 0
}

MemoryLayout<BoolIntClass>.size       // returns 8
MemoryLayout<BoolIntClass>.alignment  // returns 8
MemoryLayout<BoolIntClass>.stride     // returns 8

class EmptyClass {}

MemoryLayout<EmptyClass>.size       // returns 8
MemoryLayout<EmptyClass>.alignment  // returns 8
MemoryLayout<EmptyClass>.stride     // returns 8
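A quick way to convince yourself that these numbers describe the reference and not the instance is to compare them with the size of a plain pointer. The class name TwoInts below is hypothetical.

```swift
// Hypothetical class with two stored properties.
final class TwoInts {
  var first = 0
  var second = 0
}

let referenceSize = MemoryLayout<TwoInts>.size
let pointerSize = MemoryLayout<UnsafeRawPointer>.size

// The layout of a class type is the layout of one reference,
// regardless of how many properties the instance stores.
let sameSize = referenceSize == pointerSize // true
```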

Pointer types

Swift provides different pointer types. Each offers its own level of control and, with it, its own level of unsafety.

Raw pointers

To understand raw pointers, consider the following example. Create a playground and add the following code:

var int16Value: UInt16 = 0x1122 // 4386
MemoryLayout.size(ofValue: int16Value) // 2
MemoryLayout.stride(ofValue: int16Value) // 2
MemoryLayout.alignment(ofValue: int16Value) // 2
let int16bytesPointer = UnsafeMutableRawPointer.allocate(
  byteCount: 2,
  alignment: 2)
defer {
  int16bytesPointer.deallocate()
}
int16bytesPointer.storeBytes(of: 0x1122, as: UInt16.self)
let firstByte = int16bytesPointer.load(as: UInt8.self)  // 34 (0x22)
let offsetPointer = int16bytesPointer + 1
let secondByte = offsetPointer.load(as: UInt8.self)     // 17 (0x11)
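If you only want to inspect the bytes of an existing value, you don’t have to allocate memory yourself. As a rough sketch, the standard library’s withUnsafeBytes(of:_:) hands you a raw buffer over the value for the duration of a closure. The constant name sample is hypothetical, and .littleEndian is used here only so the byte order in the comment holds on any platform.

```swift
// Hypothetical value to inspect; littleEndian fixes the byte order in memory.
let sample = UInt16(0x1122).littleEndian

let bytes = withUnsafeBytes(of: sample) { rawBuffer in
  // Copy the bytes out; the buffer is only valid inside this closure.
  Array(rawBuffer)
}
// bytes is [0x22, 0x11], that is [34, 17]
```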

Unsafety of raw pointers

Now, nothing is stopping you from reading more addresses using int16bytesPointer. You can read the next address:

let offsetPointer2 = int16bytesPointer + 2
let thirdByte = offsetPointer2.load(as: UInt8.self)  // Undefined
offsetPointer2.storeBytes(of: 0x3344, as: UInt16.self)
let misalignedUInt16 = offsetPointer.load(as: UInt16.self)
Fatal error: load from misaligned raw pointer
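When you really do need a value that starts at a misaligned address, one option is to copy its bytes into a properly aligned variable instead of calling load(as:) on the misaligned pointer. The function readUnalignedUInt16(from:byteOffset:) below is a hypothetical helper written for this sketch, not a standard library API.

```swift
// Hypothetical helper: copies two bytes into an aligned UInt16.
func readUnalignedUInt16(from base: UnsafeRawPointer, byteOffset: Int) -> UInt16 {
  var result: UInt16 = 0
  withUnsafeMutableBytes(of: &result) { destination in
    let source = UnsafeRawBufferPointer(
      start: base + byteOffset,
      count: MemoryLayout<UInt16>.size)
    // A byte-wise copy has no alignment requirement.
    destination.copyMemory(from: source)
  }
  return result
}

let storage = UnsafeMutableRawPointer.allocate(byteCount: 4, alignment: 4)
// In memory (little-endian order): 0x44 0x33 0x22 0x11
storage.storeBytes(of: UInt32(0x11223344).littleEndian, as: UInt32.self)

// Read the two bytes at offsets 1 and 2, interpreted as little-endian.
let middle = UInt16(littleEndian: readUnalignedUInt16(from: storage, byteOffset: 1))
storage.deallocate()
// middle is 0x2233
```

The caller is still responsible for staying inside the allocated block; the copy only removes the alignment requirement.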

Raw buffer pointers

Raw buffers provide a way to go through a block of memory as if it were an array of UInt8.

let size = MemoryLayout<UInt>.size // 8
let alignment = MemoryLayout<UInt>.alignment // 8

let bytesPointer = UnsafeMutableRawPointer.allocate(
  byteCount: size,
  alignment: alignment)
defer {
  bytesPointer.deallocate()
}
bytesPointer.storeBytes(of: 0x0102030405060708, as: UInt.self)
let bufferPointer = UnsafeRawBufferPointer(
  start: bytesPointer,
  count: 8)
for (offset, byte) in bufferPointer.enumerated() {
  print("byte \(offset): \(byte)")
}
byte 0: 8
byte 1: 7
byte 2: 6
byte 3: 5
byte 4: 4
byte 5: 3
byte 6: 2
byte 7: 1
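A raw buffer can also own its memory and be written to byte by byte. The following is a small sketch using UnsafeMutableRawBufferPointer; the name byteBuffer is hypothetical.

```swift
// Allocate four bytes and treat them as a mutable collection of UInt8.
let byteBuffer = UnsafeMutableRawBufferPointer.allocate(byteCount: 4, alignment: 1)

for index in byteBuffer.indices {
  byteBuffer[index] = UInt8(index * 2)
}

// Copy the bytes out before releasing the memory.
let copied = Array(byteBuffer) // [0, 2, 4, 6]
byteBuffer.deallocate()
```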

Typed pointers

In the raw pointer examples above, you needed to tell the compiler a value’s type every time you read it. This can be very tedious if you’re using the same type over and over.

let count = 4

let pointer = UnsafeMutablePointer<Int>.allocate(capacity: count) // 1
pointer.initialize(repeating: 0, count: count) // 2
defer {
  pointer.deinitialize(count: count)
  pointer.deallocate()
}
// 3
pointer.pointee = 10001
pointer.advanced(by: 1).pointee = 10002
(pointer+2).pointee = 10003
pointer.advanced(by: 3).pointee = 10004

pointer.pointee // 10001
pointer.advanced(by: 1).pointee // 10002
(pointer+1).pointee // 10002
pointer.advanced(by: 2).pointee // 10003
(pointer+3).pointee // 10004
// 4
let bufferPointer = UnsafeBufferPointer(
  start: pointer,
  count: count)
for (offset, value) in bufferPointer.enumerated() {
  print("value \(offset): \(value)")
}
value 0: 10001
value 1: 10002
value 2: 10003
value 3: 10004
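You can also get a typed buffer pointer over memory that Swift already manages, which avoids the manual allocate and deallocate steps. This sketch, assuming an ordinary array named numbers (a hypothetical name), uses withUnsafeMutableBufferPointer to modify the elements in place.

```swift
var numbers = [10001, 10002, 10003, 10004]

numbers.withUnsafeMutableBufferPointer { buffer in
  // buffer is only valid inside this closure.
  for index in buffer.indices {
    buffer[index] += 1
  }
}
// numbers is now [10002, 10003, 10004, 10005]
```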

Memory binding

Memory binding means specifying an area in memory as a value of a specific type. For example, if you specify the four bytes between 0x0010 and 0x0013 as an Int32, this means you bound them to that type. If you just read from or write to them once as Int32, that doesn’t count as binding.

Punning

Type punning is when a part of memory is bound to a type, then you bind it to a different and unrelated type.

let rawPointer = UnsafeMutableRawPointer.allocate(byteCount: 2, alignment: 2)
defer {
  rawPointer.deallocate()
}

let float16Pointer = rawPointer.bindMemory(to: Float16.self, capacity: 1)
let uint8Pointer = rawPointer.bindMemory(to: UInt8.self, capacity: 2)
float16Pointer.pointee = 0xABC0 // 43968

uint8Pointer.pointee // 0x5E = 94
uint8Pointer.advanced(by: 1).pointee // 0x79 = 121
uint8Pointer.pointee -= 1

float16Pointer.pointee // 43936
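If all you need is the bit pattern of a floating-point value, you can usually get it without punning at all. As a sketch, and using Float rather than Float16 as an assumption to keep the example portable, the standard library exposes bitPattern and init(bitPattern:):

```swift
let one: Float = 1.0

// The raw IEEE 754 bits of the value, as an unsigned integer.
let bits = one.bitPattern // 0x3F800000

// Going back from bits to a Float needs no pointers either.
let roundTrip = Float(bitPattern: bits) // 1.0
```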

Related types

In the last example, you rebound the memory behind the Float16 pointer to UInt8 and read its bytes as UInt8 values. Those values were completely unrelated to the number stored as a Float16. Thus, the rebinding here was wrong. So when is the rebinding right?

Layout compatibility

Remember the memory layout explanation at the beginning of this chapter? To say two types are mutually layout compatible means they have the same size and alignment or contain the same number of layout compatible types.

Strict aliasing

If you have two pointers of value types or class types, they both must be related. This means that changing the value of one pointer changes the other pointer in the same way. In such cases, both pointers are aliases to each other.

Safe rebinding

Swift provides three different APIs to bind/rebind pointers: bindMemory(to:capacity:), withMemoryRebound(to:capacity:) and assumingMemoryBound(to:).

let count = 3
let size = MemoryLayout<Int16>.size
let stride = MemoryLayout<Int16>.stride
let alignment = MemoryLayout<Int16>.alignment
let byteCount = count * stride

let rawPointer = UnsafeMutableRawPointer.allocate(
  byteCount: byteCount,
  alignment: alignment)
defer {
  rawPointer.deallocate()
}

let typedPointer1 = rawPointer.bindMemory(
  to: UInt16.self,
  capacity: count)
typedPointer1.withMemoryRebound(
  to: Bool.self,
  capacity: count * size) {
  (boolPointer: UnsafeMutablePointer<Bool>) in
  print(boolPointer.pointee)
}
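For comparison, here is a sketch of a temporary rebind between two types of the same size and stride, UInt8 and Int8. The name bytePointer is hypothetical, and the example assumes the memory is only accessed as Int8 inside the closure.

```swift
let bytePointer = UnsafeMutablePointer<UInt8>.allocate(capacity: 2)
bytePointer.initialize(repeating: 0xFF, count: 2)

// Temporarily view the same two bytes as Int8. The memory is bound
// back to UInt8 when the closure returns.
let firstSigned = bytePointer.withMemoryRebound(to: Int8.self, capacity: 2) { signedPointer in
  signedPointer.pointee
}
// firstSigned is -1, because 0xFF is -1 in two's complement.

bytePointer.deinitialize(count: 2)
bytePointer.deallocate()
```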
func initRawAB() -> UnsafeMutableRawPointer {
  let rawPtr = UnsafeMutableRawPointer.allocate(
    byteCount: 2 * MemoryLayout<UInt16>.stride,
    alignment: MemoryLayout<UInt16>.alignment)

  let boundP1 = rawPtr.bindMemory(to: UInt16.self, capacity: 1)
  boundP1.pointee = 101

  let boundP2 = rawPtr.advanced(by: 2).bindMemory(to: Float16.self, capacity: 1)
  boundP2.pointee = 202.5

  return rawPtr
}
let rawPtr = initRawAB()

let assumedP1 = rawPtr
  .assumingMemoryBound(to: UInt16.self)
assumedP1.pointee // 101

let assumedP2 = rawPtr
  .advanced(by: 2)
  .assumingMemoryBound(to: Float16.self)
assumedP2.pointee // 202.5

Unsafe operations

As mentioned before, safe code isn’t code that doesn’t crash. It’s code that behaves consistently.

Unsafe unwrap

Consider the following code:

var safeString: String? = nil
print(safeString!)
var unsafeString: String? = nil
print(unsafeString.unsafelyUnwrapped)
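For contrast, the usual safe ways to deal with a missing value never reach the unwrap at all. A brief sketch, with maybeName and displayName as hypothetical names:

```swift
let maybeName: String? = nil

// Optional binding: the missing value is handled explicitly.
if let name = maybeName {
  print(name)
} else {
  print("No value")
}

// Nil-coalescing: fall back to a default instead of unwrapping.
let displayName = maybeName ?? "Unknown"
```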

Unsafe unowned

As you already know, marking an object as unowned in the capture list of a closure means the closure doesn’t increment the object’s reference count, yet you can use the object as a non-optional. In a way, saying it’s unowned is a promise that the object will still exist when the closure executes. If for any reason it no longer exists, your code will crash.
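The unsafe variant of this capture is spelled unowned(unsafe). The sketch below uses a hypothetical Counter class and runCounter() function; it behaves correctly only because counter is guaranteed to outlive every call to the closure.

```swift
// Hypothetical class used only for this illustration.
final class Counter {
  var value = 0
}

func runCounter() -> Int {
  let counter = Counter()

  // unowned(unsafe) skips the runtime check that a plain unowned capture performs.
  let increment = { [unowned(unsafe) counter] in
    counter.value += 1
  }

  increment()
  increment()

  // counter is still alive here, so both calls above were valid.
  return counter.value
}

let finalValue = runCounter() // 2
```

If the object were deallocated before the closure ran, a plain unowned capture would crash consistently, whereas unowned(unsafe) would lead to undefined behavior.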

Overflow operations

The last point relates to arithmetic operations. When you do any operation on a number, the compiler makes sure that the data type you’re using can store the result.

UInt8.max + 1
error: arithmetic operation '255 + 1' (on type 'UInt8') results in an overflow
UInt8.max + 1
UInt8.max &+ 1 // 0

Int8.max &+ 1 // -128
Int8.max &* 2 // -2
Int8.min &- 1 // 127
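If you want the wrapped result but also need to know whether wrapping happened, the standard library’s reporting-overflow methods return both. A short sketch, with hypothetical constant names:

```swift
// Each call returns the wrapped value plus a flag saying whether it overflowed.
let (sum, sumOverflowed) = UInt8.max.addingReportingOverflow(1)
// sum is 0, sumOverflowed is true

let (product, productOverflowed) = Int8.max.multipliedReportingOverflow(by: 2)
// product is -2, productOverflowed is true

let (small, smallOverflowed) = UInt8(1).addingReportingOverflow(1)
// small is 2, smallOverflowed is false
```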

Key points

  • Safe code means the behavior is always expected even if the input is unexpected. Crashing is considered a safe behavior if the input is not allowed as long as this crash is consistent.
  • References are pointers under the hood. But the standard library handles their allocation, initialization and entire life cycle.
  • Each type has size, alignment and stride values that control how its memory is allocated. Also, the order of the properties in each type affects those numbers.
  • The standard library has different types of unsafe pointers. Each gives a certain level of control, from pointers that access memory as raw bytes to those that know exactly the type of the bytes accessed.
  • There are several rules you must follow before you bind or rebind memory to a type.
  • Unsafe operations and overflow arithmetic can skip the safety validations in the standard library.

Where to go from here?

© 2024 Kodeco Inc.
