Protecting Core ML Models

Exploring various methods to protect your valuable on-device IP

The problem

If your in-house mobile machine learning model gives you an advantage over your competitors, you’ll probably want to keep it private. Unfortunately, at the moment, Apple doesn’t offer any approaches to keep Core ML models private.

Anyone can get your .mlmodel file (or its compiled version, .mlmodelc) and reuse it in third-party applications. In this article, you’ll learn a few strategies for protecting your Core ML models.

How other developers can access your models

As we know, the IPA file we get when we archive a project is just a ZIP archive, and there are numerous ways to get access to your app folder (e.g. using Apple Configurator 2, iTunes < 12.7, or a jailbreak).

True, an app bundle doesn’t store the familiar .mlmodel file. Instead, it stores a compiled version named .mlmodelc, which is just a folder that still contains all of your model’s information.

After a brief inspection of your .mlmodelc folder, other developers could integrate the model into their own app using the MLModel API.

How can we protect models

I see a few options:

Rename mlmodelc

If your model is not proprietary, but you want a minimum level of security, you can rename the .mlmodelc folder to e.g. .storyboardc and add it to your project. At runtime, restore the original name (on a copy, since the app bundle is read-only) and use it with the MLModel API.
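
The restore step could look roughly like the sketch below. It uses only Foundation; the helper name restoreModelFolder and the choice of the temporary directory are assumptions made for illustration, not part of any Apple API.

```swift
import Foundation

/// Copies a disguised model folder to a temporary location under its real
/// `.mlmodelc` extension and returns the URL of the copy.
/// Hypothetical helper: the name and the destination folder are assumptions.
func restoreModelFolder(from disguisedURL: URL) throws -> URL {
    let fileManager = FileManager.default
    let baseName = disguisedURL.deletingPathExtension().lastPathComponent
    let restoredURL = fileManager.temporaryDirectory
        .appendingPathComponent(baseName)
        .appendingPathExtension("mlmodelc")

    // Remove a stale copy left over from a previous launch, if any.
    if fileManager.fileExists(atPath: restoredURL.path) {
        try fileManager.removeItem(at: restoredURL)
    }

    // The bundle cannot be modified, so the folder is copied, not renamed in place.
    try fileManager.copyItem(at: disguisedURL, to: restoredURL)
    return restoredURL
}
```

In an app, you would look up the disguised folder in the bundle (for example with Bundle.main.url(forResource:withExtension:)), pass it to this helper, and hand the returned URL to MLModel(contentsOf:).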

Write custom layers

Another technique is to implement custom layers in your app. Leave the shipped model incomplete so that it requires additional layers implemented in code. Without that implementation, the model is useless.
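
As a rough illustration, a custom layer is a class that conforms to Core ML’s MLCustomLayer protocol. The sketch below assumes the model was converted with a custom layer whose class name is SwishLayer and which has no weights; both the name and the activation it computes are hypothetical choices for this example.

```swift
import CoreML
import Foundation

/// Hypothetical custom layer that applies the swish activation, x * sigmoid(x).
/// The Objective-C name must match the class name stored in the converted model.
@objc(SwishLayer)
final class SwishLayer: NSObject, MLCustomLayer {
    required init(parameters: [String: Any]) throws {
        // This sketch ignores any parameters stored in the model.
        super.init()
    }

    func setWeightData(_ weights: [Data]) throws {
        // This layer has no learned weights.
    }

    func outputShapes(forInputShapes inputShapes: [[NSNumber]]) throws -> [[NSNumber]] {
        // An element-wise activation keeps every input shape unchanged.
        return inputShapes
    }

    func evaluate(inputs: [MLMultiArray], outputs: [MLMultiArray]) throws {
        // CPU implementation: walk each array linearly and apply the activation.
        for (input, output) in zip(inputs, outputs) {
            for index in 0..<input.count {
                let x = input[index].floatValue
                output[index] = NSNumber(value: x / (1 + exp(-x)))
            }
        }
    }
}
```

Someone who copies only the .mlmodelc folder gets the graph but not this class, so the model cannot be evaluated until the missing layer is reimplemented.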

Download models on demand

You can always download models from a backend on demand at runtime and save them in the app’s Documents folder. On a device that is not jailbroken, that folder is much harder to reach than the app bundle. You can also remove the downloaded model from Documents before the application closes, to minimize the time it’s kept on-device.
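
A minimal sketch of this download-and-clean-up flow, using URLSession and FileManager, might look like the following. The ModelDownloader type and its method names are hypothetical, and the remote URL is whatever your backend exposes.

```swift
import Foundation

/// Hypothetical helper that downloads a model file into the app's Documents
/// folder and can delete it again when the model is no longer needed.
final class ModelDownloader {
    let remoteURL: URL
    let localURL: URL

    init(remoteURL: URL, fileName: String) throws {
        self.remoteURL = remoteURL
        let documents = try FileManager.default.url(
            for: .documentDirectory,
            in: .userDomainMask,
            appropriateFor: nil,
            create: true
        )
        self.localURL = documents.appendingPathComponent(fileName)
    }

    /// Downloads the model and reports the local file URL or an error.
    func download(completion: @escaping (Result<URL, Error>) -> Void) {
        let destination = localURL
        let task = URLSession.shared.downloadTask(with: remoteURL) { tempURL, _, error in
            if let error = error {
                completion(.failure(error))
                return
            }
            guard let tempURL = tempURL else {
                completion(.failure(URLError(.badServerResponse)))
                return
            }
            do {
                // Replace any copy left over from an earlier session.
                try? FileManager.default.removeItem(at: destination)
                try FileManager.default.moveItem(at: tempURL, to: destination)
                completion(.success(destination))
            } catch {
                completion(.failure(error))
            }
        }
        task.resume()
    }

    /// Deletes the downloaded model, e.g. when the app moves to the background.
    func removeDownloadedModel() {
        try? FileManager.default.removeItem(at: localURL)
    }
}
```

Calling removeDownloadedModel() from your app’s lifecycle callbacks keeps the window in which the file sits on disk short, at the cost of downloading it again on the next launch.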

Encrypt/Decrypt model

Last but not least, you can encrypt your model before adding it to a project, and decrypt it at runtime on demand.

Each of these options suits different conditions and requires a different level of development effort. For example, if you just need a minimum level of security, you can simply rename the model’s folder, which is a light form of obfuscation.

Alternatively, you might leave your model unfinished and write some of the network’s layers manually. Other developers can still inspect your model’s architecture, but they can’t gain full access to it.

Another choice would be to store your models on a backend (or any cloud service, e.g. AWS) and download them every time a user starts your application. But if your models are large and you think your users won’t like downloading them every time (e.g. over a cellular network), I recommend the last option: encrypting and decrypting your models. We’ll explore this technique below.

Model encryption/decryption

This option consists of two phases. First, we encrypt our .mlmodel file on a laptop and add the encrypted version to our project as a resource. Second, at runtime, we decrypt this file and build our model using the MLModel API.

All we need to do to encrypt our .mlmodel is write a command line tool named mlmodelencoder, which will take two arguments:

  • path to your .mlmodel;
  • secret key — your passphrase, used for decryption.

If you aren’t familiar with creating a CLT based on Swift, please read this article by John Sundell.

Our MLEncoderTool struct looks like this:

import Foundation

public struct MLEncoderTool {
  // The first argument is the file name
  // The second argument is key
  private let arguments: [String]
  
  private let cryptor: MLCryptor
  
  public init(
    arguments: [String] = CommandLine.arguments,
    cryptor: MLCryptor = MLCryptor.cryptoKit
  ) {
    self.arguments = arguments
    self.cryptor = cryptor
  }
  
  public func run() throws {
    // arguments[0] is the executable name, so a valid call has three entries.
    guard arguments.count > 1 else {
      throw Error.missingAllArguments
    }
    
    guard arguments.count > 2 else {
      throw Error.missingKey
    }
    
    try encryptFile()
  }
  
  func encryptFile() throws {
    let fileName = arguments[1]
    let key = arguments[2]
    
    let currentPath = FileManager.default.currentDirectoryPath
    let fullPath = currentPath + "/" + fileName
    
    guard let data = FileManager.default.contents(atPath: fullPath) else {
      throw Error.noFile
    }
    
    guard let encryptedData = try cryptor.encrypt(data: data, withPassword: key) else {
      throw Error.undefinedError
    }
    
    let destinationPath = currentPath + "/" + fileName + ".enc"
    let destinationURL = URL(fileURLWithPath: destinationPath)
    
    try encryptedData.write(to: destinationURL)
    
    print("The file was encrypted successfully.")
  }
  
  func decryptFile() throws {
    let fileName = arguments[1]
    let key = arguments[2]
    
    let currentPath = FileManager.default.currentDirectoryPath
    let fullPath = currentPath + "/" + fileName
    
    guard let data = FileManager.default.contents(atPath: fullPath) else {
      throw Error.noFile
    }
    
    guard let decryptedData = try cryptor.decrypt(data: data, withPassword: key) else {
      throw Error.undefinedError
    }
    
    // Strip the trailing ".enc" to recover the original file name.
    let destinationPath = currentPath + "/" + (fileName as NSString).deletingPathExtension
    let destinationURL = URL(fileURLWithPath: destinationPath)
    
    try decryptedData.write(to: destinationURL)
    
    print("The file was decrypted successfully.")
  }
}
 
public extension MLEncoderTool {
  enum Error: Swift.Error {
    case missingAllArguments
    case missingKey
    case noFile
    case undefinedError
  }
}

The logic is pretty straightforward. The encryption itself takes place in MLCryptor.

This is an enum that encapsulates two implementations. If your app only supports iOS 13 or later, you can use Apple’s CryptoKit. Otherwise, we’ll use RNCryptor (under the hood, it uses Apple’s CommonCrypto).

import Foundation
import RNCryptor
import CryptoKit
 
public enum MLCryptor {
  case cryptoKit
  case rnCryptor
  
  public func encrypt(data: Data, withPassword password: String) throws -> Data? {
    switch self {
    case .cryptoKit:
      let encryptionKey = try SymmetricKey(string: password)
      return try encryptByCryptoKit(data, withKey: encryptionKey)
    case .rnCryptor:
      return RNCryptor.encrypt(data: data, withPassword: password)
    }
  }
  
  public func decrypt(data: Data, withPassword password: String) throws -> Data? {
    switch self {
    case .cryptoKit:
      let decryptionKey = try SymmetricKey(string: password)
      return try decryptByCryptoKit(data, withKey: decryptionKey)
    case .rnCryptor:
      return try RNCryptor.decrypt(data: data, withPassword: password)
    }
  }
}
 
private extension MLCryptor {
  func encryptByCryptoKit(_ data: Data, withKey key: SymmetricKey) throws -> Data? {
    let sealedBox = try AES.GCM.seal(data, using: key)
    let encryptedData = sealedBox.combined
    return encryptedData
  }
  
  func decryptByCryptoKit(_ data: Data, withKey key: SymmetricKey) throws -> Data? {
    let sealedBox = try AES.GCM.SealedBox(combined: data)
    let decryptedData = try AES.GCM.open(sealedBox, using: key)
    return decryptedData
  }
}

extension SymmetricKey {
  // Builds a key from the first `size` bits of the passphrase's UTF-8 bytes.
  init(string keyString: String, size: SymmetricKeySize = .bits256) throws {
    guard let keyData = keyString.data(using: .utf8) else {
      print("Could not create UTF-8 encoded Data from String.")
      throw CryptoKitError.incorrectParameterSize
    }
    
    let keySizeBytes = size.bitCount / 8
    
    // Check the length before slicing; slicing a shorter passphrase would crash.
    guard keyData.count >= keySizeBytes else { throw CryptoKitError.incorrectKeySize }
    self.init(data: keyData.subdata(in: 0..<keySizeBytes))
  }
}
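
Because the initializer above needs a passphrase of at least 32 bytes, one possible alternative is to hash the passphrase first so that any length yields a 256-bit key. The sketch below is an assumption of mine rather than part of the tool: makeKey and roundTrip are hypothetical helpers, and a plain SHA-256 hash is not a substitute for a dedicated password-based key derivation function.

```swift
import CryptoKit
import Foundation

/// Derives a 256-bit key from a passphrase of any length by hashing it.
/// Hypothetical helper; not part of the tool above.
func makeKey(fromPassphrase passphrase: String) -> SymmetricKey {
    let digest = SHA256.hash(data: Data(passphrase.utf8))
    return SymmetricKey(data: Data(digest))
}

/// Encrypts and immediately decrypts `plaintext` to show that the key works
/// with the same AES-GCM calls that MLCryptor uses.
func roundTrip(_ plaintext: Data, passphrase: String) throws -> Data {
    let key = makeKey(fromPassphrase: passphrase)
    let sealedBox = try AES.GCM.seal(plaintext, using: key)

    // `combined` packs nonce, ciphertext, and tag into a single blob.
    guard let combined = sealedBox.combined else {
        throw CryptoKitError.incorrectParameterSize
    }

    let reopened = try AES.GCM.SealedBox(combined: combined)
    return try AES.GCM.open(reopened, using: key)
}
```

If you adopt something like this, the same derivation has to be used in both the command line tool and the app, otherwise the keys will not match.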

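After building your tool, you can call it from the terminal. Note that with the CryptoKit variant the passphrase must be at least 32 bytes long, because the SymmetricKey initializer above uses its first 256 bits.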

MBP-Georgij:~ georguy$ mlmodelencoder your.mlmodel your_secret_key
The file was encrypted successfully.

If the file is encrypted successfully, you’ll find a new encrypted version of your model, your.mlmodel.enc, in your current folder (where you called mlmodelencoder).

Great. We’re halfway there. All that’s left is to decrypt the file at runtime on iOS. The function for this also looks pretty simple:

func decryptModel(at path: String, decryptionKey: String) throws -> Data? {
  guard let contentData = FileManager.default.contents(atPath: path) else {
    throw Error.noFileAtPath(path)
  }

  return try cryptor.decrypt(data: contentData, withPassword: decryptionKey)
}

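Here, cryptor is an MLCryptor instance (use the same implementation in the tool and in the app). Because the decrypted file is an .mlmodel, it has to be compiled on-device with MLModel.compileModel(at:) before it can be loaded. You can then save the decrypted data and use the MLModel API to work with it.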
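
One way to go from decrypted bytes to a usable model is sketched below. It assumes the data is a valid .mlmodel file; the helper name loadModel(fromDecryptedData:) is hypothetical, and the synchronous MLModel.compileModel(at:) call is used for brevity (newer SDKs also offer an asynchronous variant).

```swift
import CoreML
import Foundation

/// Writes decrypted `.mlmodel` bytes to a temporary file, compiles them,
/// and loads the compiled model. Hypothetical helper for illustration.
func loadModel(fromDecryptedData modelData: Data) throws -> MLModel {
    let fileManager = FileManager.default
    let plainURL = fileManager.temporaryDirectory
        .appendingPathComponent(UUID().uuidString)
        .appendingPathExtension("mlmodel")

    try modelData.write(to: plainURL, options: .atomic)
    // Remove the plaintext model when this function returns or throws.
    defer { try? fileManager.removeItem(at: plainURL) }

    // Compilation produces an .mlmodelc folder in a temporary location.
    let compiledURL = try MLModel.compileModel(at: plainURL)

    return try MLModel(contentsOf: compiledURL)
}
```

The compiled folder stays on disk in this sketch, so consider deleting it once the model is no longer needed; otherwise the unencrypted model remains recoverable from the temporary directory.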

In the end

As I wrote above, each of the approaches described suits different conditions. Choose the one that best fits your goals, or combine them for extra layers of obfuscation.

If you know of other methods, please contact me at @eigeorguy. I hope that in the near future Apple provides us with a native way to keep our models private.
