Building a Recommendation App With Create ML in SwiftUI

Learn how to train a model and how to give it prediction capability using Core ML and Create ML in SwiftUI. By Saeed Taheri.

Leave a rating/review
Download materials
Save for later
Share
You are currently viewing page 2 of 3 of this article. Click here to view the first page.

Using TabularData for Training

Apple introduced a new framework in iOS 15 called TabularData. By utilizing this framework, you can import, organize and prepare a table of data to train a machine learning model.

Add the following to the top of RecommendationStore.swift:

import TabularData

Now create a method inside RecommendationStore:

private func dataFrame(for data: [FavoriteWrapper<Shirt>]) -> DataFrame {
  // Coming soon
}

The return type is DataFrame, a collection that arranges data in rows and columns. It is the base structure for your entry point into the TabularData framework.

You have options for handling the training data. In the next step, you’ll import it. But you could also use a CSV or JSON file that includes the provided initializers on DataFrame.

Replace the comment inside the method you created with the following:

// 1
var dataFrame = DataFrame()

// 2
dataFrame.append(column: Column(
  name: "color", 
  contents: data.map(\.model.color.rawValue))
)

// 3
dataFrame.append(column: Column(
  name: "design", 
  contents: data.map(\.model.design.rawValue))
)

dataFrame.append(column: Column(
  name: "neck",
  contents: data.map(\.model.neck.rawValue))
)

dataFrame.append(column: Column(
  name: "sleeve", 
  contents: data.map(\.model.sleeve.rawValue))
)

// 4
dataFrame.append(column: Column<Int>(
    name: "favorite",
    contents: data.map {
      if let isFavorite = $0.isFavorite {
        return isFavorite ? 1 : -1
      } else {
        return 0
      }
    }
  )
)

// 5
return dataFrame

Here is a step-by-step description of the above code:

  1. Initialize an empty DataFrame.
  2. Arrange the data into columns and rows. Each column has a name. Create a column for the color then fill it with all the data that’s been reduced to only color using map and a keypath.
  3. Append other columns to the data frame that are suitable for prediction: design, neck and sleeve. Bear in mind that the item count inside each column needs to be the same; otherwise, you’ll have a runtime crash.
  4. Append another column to record favorite status of each item. If the value is not nil and it’s true then add a 1. But, if it’s false then add a -1. If the value is nil add a 0 to indicate the user hasn’t made a decision about it. This step uses numbers — not Booleans — so you can apply a regression algorithm later.
  5. Return the data frame.
Note: At the time of writing, Create ML methods don’t offer asynchronous implementations. It is possible, of course, to use the old and familiar Grand Central Dispatch, or GCD.

Now, add an instance property to the class to hold a reference to a DispatchQueue:

private let queue = DispatchQueue(
  label: "com.recommendation-service.queue",
  qos: .userInitiated)

Label it whatever you want. The qos parameter stands for Quality of Service. It determines the priority at which the system schedules the task for execution.

Now, it’s time to get back to computeRecommendations(basedOn:).

This function is an async method and needs to be converted to a GCD async task to work with Swift’s async functions.

Replace the return statement inside the method’s implementation with:

return try await withCheckedThrowingContinuation { continuation in
  // Coming soon
}

The withCheckedThrowingContinuation closure suspends the current task then calls the given closure with continuation. A continuation is a mechanism to interface between synchronous and asynchronous code.

Inside this closure, call async on the queue you defined earlier:

queue.async {
  // Don't be hasty
}

When your result is ready inside the closure of the GCD queue, you call resume(returning:) on the continuation parameter. If any error occurs inside this queue then you call resume(throwing:).

The system will convert those calls into the async throws signature of Swift’s concurrency system.

From now on, all the code you’ll write will be inside the GCD’s async method you wrote.

Add a target check to throw an error on the simulator.

#if targetEnvironment(simulator)
continuation.resume(
  throwing: NSError(
    domain: "Simulator Not Supported", 
    code: -1
  )
)
#else
// Write the next code snippets here
#endif

Add a variable to hold the training data inside the #else block:

let trainingData = items.filter {
  $0.isFavorite != nil
}

OK, so now you have a place to hold training data, but what exactly is this data? According to the definition you just created, the trainingData constant will include all the items where the user has taken an action.

  • Training Data: The sample of data you use to fit the model.
  • Validation Data: The sample of data held back from training your model. Its purpose is to give an estimate of model skill while tuning the model’s parameters.
  • Test Data: The sample of data you use to assess the created model.
Note: Get to know these three main terms related to data in training ML models:

Below your previous code, create a data frame using the trainingData constant and dataFrame(for:), which you created earlier.

let trainingDataFrame = self.dataFrame(for: trainingData)

Here you tell the recommendation system to infer the results based on all the items, whether the user acted on them or not.

Finally, add the following:

let testData = items
let testDataFrame = self.dataFrame(for: testData)

This creates the constants for your test data.

The training and test datasets are ready.

Predicting T-shirt Tastes

Now that your data is in order, you get to incorporate an algorithm to actually do the prediction. Say hello to MLLinearRegressor! :]

Implementing Regression

First, add the import directive to the top of the file as follows:

#if canImport(CreateML)
import CreateML
#endif

You conditionally import CreateML because this framework isn’t available on the simulator.

Next, immediately after your code to create the test data constants, create a regressor with the training data:

do {
  // 1
  let regressor = try MLLinearRegressor(
    trainingData: trainingDataFrame, 
    targetColumn: "favorite")
  
} catch {
  // 2
  continuation.resume(throwing: error)
}

Here’s what the code does:

  1. Create a regressor to estimate the favorite target column as a linear function of the properties in the trainingDataFrame.
  2. If any errors happen, you resume the continuation using the error. Don’t forget that you’re still inside the withCheckedThrowingContinuation(function:_:) closure.

You may ask what happened to the validation data.

If you jump to the definition of the MLLinearRegressor initializer, you’ll see this:

public init(
  trainingData: DataFrame, 
  targetColumn: String, 
  featureColumns: [String]? = nil, 
  parameters: MLLinearRegressor.ModelParameters =
    ModelParameters(
      validation: .split(strategy: .automatic)
    )
) throws

Two default parameters exist for featureColumns and parameters.

You set featureColumns to nil, so the regressor will use all columns apart from the specified targetColumn to create the model.

The default value for parameters implies the regressor splits the training data and uses some of it for verification purposes. You can tune this parameter based on your needs.

Beneath where you defined the regressor, add this:

let predictionsColumn = (try regressor.predictions(from: testDataFrame))
  .compactMap { value in
    value as? Double
  }

You first call predictions(from:) on testDataFrame, and the result is a type-erased AnyColumn. Since you specified the targetColumn — remember that is the favorite column — to be a numeric value you cast it to Double using compactMap(_:).

Good work! You’ve successful built the model and implemented the regression algorithm.