From 4e06cadd1e3ba8c963a33ee9f2d3fcdc1c6ded00 Mon Sep 17 00:00:00 2001
From: "A. Unique TensorFlower"
Date: Thu, 2 May 2019 09:19:26 -0700
Subject: [PATCH] Updates the TensorFlow Lite iOS image classification docs to
 reflect the recent migration of the example app to the new
 `TensorFlowLiteSwift` CocoaPod.

PiperOrigin-RevId: 246330896
---
 tensorflow/lite/g3doc/guide/ios.md           | 141 +++++-----
 .../g3doc/models/image_classification/ios.md | 249 +++++++++---------
 2 files changed, 197 insertions(+), 193 deletions(-)

diff --git a/tensorflow/lite/g3doc/guide/ios.md b/tensorflow/lite/g3doc/guide/ios.md
index 7a0e1867fd0..77aa64ca6fd 100644
--- a/tensorflow/lite/g3doc/guide/ios.md
+++ b/tensorflow/lite/g3doc/guide/ios.md
@@ -1,7 +1,7 @@
 # iOS quickstart
 
 To get started with TensorFlow Lite on iOS, we recommend exploring the following
-example.
+example:
 
 iOS image classification example
 
@@ -11,88 +11,89 @@ For an explanation of the source code, you should also read
 
 This example app uses
 [image classification](https://www.tensorflow.org/lite/models/image_classification/overview)
-to continuously classify whatever it sees from the device's rear-facing camera.
-The application must be run on an iOS device.
-
-Inference is performed using the TensorFlow Lite C++ API. The demo app
-classifies frames in real-time, displaying the top most probable
-classifications. It allows the user to choose between a floating point or
+to continuously classify whatever it sees from the device's rear-facing camera,
+displaying the top most probable classifications. It allows the user to choose
+between a floating point or
 [quantized](https://www.tensorflow.org/lite/performance/post_training_quantization)
-model, select the thread count, and decide whether to run on CPU, GPU, or via
-[NNAPI](https://developer.android.com/ndk/guides/neuralnetworks).
+model and select the number of threads to perform inference on.
 
 Note: Additional iOS applications demonstrating TensorFlow Lite in a variety of
 use cases are available in [Examples](https://www.tensorflow.org/lite/examples).
 
-## Build in Xcode
-
-To build the example in Xcode, follow the instructions in
-[README.md](https://github.com/tensorflow/examples/blob/master/lite/examples/image_classification/ios/README.md).
-
-## Create your own iOS app
+## Add TensorFlow Lite to your Swift or Objective-C project
 
+TensorFlow Lite offers native iOS libraries written in
+[Swift](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/swift)
+and
+[Objective-C](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/objc).
 To get started quickly writing your own iOS code, we recommend using our
-[iOS image classification example](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios)
+[Swift image classification example](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios)
 as a starting point.
 
-The following sections contain some useful information for working with
-TensorFlow Lite on iOS.
+The sections below walk you through the steps for adding TensorFlow Lite Swift
+or Objective-C to your project:
 
-### Use TensorFlow Lite from Objective-C and Swift
+### CocoaPods developers
 
-The example app provides an Objective-C wrapper on top of the C++ Tensorflow
-Lite library. This wrapper is required because currently there is no
-interoperability between Swift and C++. 
The wrapper is exposed to Swift via -bridging so that the Tensorflow Lite methods can be called from Swift. +In your `Podfile`, add the TensorFlow Lite pod. Then, run `pod install`: -The wrapper is located in -[TensorflowLiteWrapper](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/ImageClassification/TensorflowLiteWrapper). -It is not tightly coupled with the example code, so you can use it in your own -iOS apps. It exposes the following interface: +#### Swift -```objectivec -@interface TfliteWrapper : NSObject - -/** - This method initializes the TfliteWrapper with the specified model file. - */ -- (instancetype)initWithModelFileName:(NSString *)fileName; - -/** - This method initializes the interpreter of TensorflowLite library with the specified model file - that performs the inference. - */ -- (BOOL)setUpModelAndInterpreter; - -/** - This method gets a reference to the input tensor at an index. - */ -- (uint8_t *)inputTensorAtIndex:(int)index; - -/** - This method performs the inference by invoking the interpreter. - */ -- (BOOL)invokeInterpreter; - -/** - This method gets the output tensor at a specified index. - */ -- (uint8_t *)outputTensorAtIndex:(int)index; - -/** - This method sets the number of threads used by the interpreter to perform inference. - */ -- (void)setNumberOfThreads:(int)threadCount; - -@end +```ruby +use_frameworks! +pod 'TensorFlowLiteSwift' ``` -To use these files in your own iOS app, copy them into your Xcode project. +#### Objective-C -Note: When you add an Objective-C file to an existing Swift app (or vice versa), -Xcode will prompt you to create a *bridging header* file to expose the files to -Swift. In the example project, this file is named -[`ImageClassification-Bridging-Header.h`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/ImageClassification/TensorflowLiteWrapper/ImageClassification-Bridging-Header.h). -For more information, see Apple's -[Importing Objective-C into Swift](https://developer.apple.com/documentation/swift/imported_c_and_objective-c_apis/importing_objective-c_into_swift){: .external} -documentation. +```ruby +pod 'TensorFlowLiteObjC' +``` + +### Bazel developers + +In your `BUILD` file, add the `TensorFlowLite` dependency. + +#### Swift + +```python +swift_library( + deps = [ + "//tensorflow/lite/experimental/swift:TensorFlowLite", + ], +) +``` + +#### Objective-C + +```python +objc_library( + deps = [ + "//tensorflow/lite/experimental/objc:TensorFlowLite", + ], +) +``` + +### Importing the library + +For Swift files, import the TensorFlow Lite module: + +```swift +import TensorFlowLite +``` + +For Objective-C files, import the umbrella header: + +```objectivec +#import "TFLTensorFlowLite.h" +``` + +Or, the TensorFlow Lite module: + +```objectivec +@import TFLTensorFlowLite; +``` + +Note: If importing the Objective-C TensorFlow Lite module, `CLANG_ENABLE_MODULES` +must be set to `YES`. Additionally, for CocoaPods developers, `use_frameworks!` +must be specified in your `Podfile`. diff --git a/tensorflow/lite/g3doc/models/image_classification/ios.md b/tensorflow/lite/g3doc/models/image_classification/ios.md index feb172d2d72..fde965beb92 100644 --- a/tensorflow/lite/g3doc/models/image_classification/ios.md +++ b/tensorflow/lite/g3doc/models/image_classification/ios.md @@ -12,98 +12,93 @@ application ## Explore the code -We're now going to walk through the most important parts of the sample code. 
+The app is written entirely in Swift and uses the TensorFlow Lite +[Swift library](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/swift) +for performing image classification. -This example is written in both Swift and Objective-C. All application -functionality, image processing, and results formatting is developed in Swift. -Objective-C is used via -[bridging](https://developer.apple.com/documentation/swift/imported_c_and_objective-c_apis/importing_objective-c_into_swift) -to make the TensorFlow Lite C++ framework calls. +Note: Objective-C developers should use the TensorFlow Lite +[Objective-C library](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/objc). + +We're now going to walk through the most important parts of the sample code. ### Get camera input -The main logic of this app is in the Swift source file -[`ViewController.swift`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/ImageClassification/ViewControllers/ViewController.swift). - -The app's main view is represented by the `ViewController` class, which we -extend with functionality from `CameraFeedManagerDelegate`, a class created to -handle a camera feed. To run inference on the feed, we implement the `didOutput` -method, which is called whenever a frame is available from the camera. +The app's main view is represented by the `ViewController` class in +[`ViewController.swift`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/ImageClassification/ViewControllers/ViewController.swift), +which we extend with functionality from the `CameraFeedManagerDelegate` protocol +to process frames from a camera feed. To run inference on a given frame, we +implement the `didOutput` method, which is called whenever a frame is available +from the camera. Our implementation of `didOutput` includes a call to the `runModel` method of a `ModelDataHandler` instance. As we will see below, this class gives us access to -the TensorFlow Lite interpreter and the image classification model we are using. +the TensorFlow Lite `Interpreter` class for performing image classification. ```swift extension ViewController: CameraFeedManagerDelegate { func didOutput(pixelBuffer: CVPixelBuffer) { - - // Run the live camera pixelBuffer through TensorFlow to get the result let currentTimeMs = Date().timeIntervalSince1970 * 1000 - - guard (currentTimeMs - previousInferenceTimeMs) >= delayBetweenInferencesMs else { - return - } - + guard (currentTimeMs - previousInferenceTimeMs) >= delayBetweenInferencesMs else { return } previousInferenceTimeMs = currentTimeMs + + // Pass the pixel buffer to TensorFlow Lite to perform inference. result = modelDataHandler?.runModel(onFrame: pixelBuffer) + // Display results by handing off to the InferenceViewController. DispatchQueue.main.async { - let resolution = CGSize(width: CVPixelBufferGetWidth(pixelBuffer), height: CVPixelBufferGetHeight(pixelBuffer)) - - // Display results by handing off to the InferenceViewController self.inferenceViewController?.inferenceResult = self.result self.inferenceViewController?.resolution = resolution self.inferenceViewController?.tableView.reloadData() - } } ... 
``` -### TensorFlow Lite wrapper - -The app uses TensorFlow Lite's C++ library via an Objective-C wrapper defined in -[`TfliteWrapper.h`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/ImageClassification/TensorFlowLiteWrapper/TfliteWrapper.h) -and -[`TfliteWrapper.mm`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/ImageClassification/TensorFlowLiteWrapper/TfliteWrapper.mm). - -This wrapper is required because currently there is no interoperability between -Swift and C++. The wrapper is exposed to Swift via bridging so that the -Tensorflow Lite methods can be called from Swift. - ### ModelDataHandler -The Swift class `ModelDataHandler`, defined by +The Swift class `ModelDataHandler`, defined in [`ModelDataHandler.swift`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/ImageClassification/ModelDataHandler/ModelDataHandler.swift), handles all data preprocessing and makes calls to run inference on a given frame -through the TfliteWrapper. It then formats the inferences obtained and returns -the top N results for a successful inference. +using the TensorFlow Lite [`Interpreter`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/experimental/swift/Sources/Interpreter.swift). +It then formats the inferences obtained from invoking the `Interpreter` and +returns the top N results for a successful inference. The following sections show how this works. #### Initialization -The method `init` instantiates a `TfliteWrapper` and loads the supplied model -and labels files from disk. +The `init` method creates a new instance of the `Interpreter` and loads the +specified model and labels files from the app's main bundle. ```swift -init?(modelFileName: String, labelsFileName: String, labelsFileExtension: String) { +init?(modelFileInfo: FileInfo, labelsFileInfo: FileInfo, threadCount: Int = 1) { + let modelFilename = modelFileInfo.name - // Initializes TFliteWrapper and based on the setup result of interpreter, initializes the object of this class - self.tfLiteWrapper = TfliteWrapper(modelFileName: modelFileName) - guard self.tfLiteWrapper.setUpModelAndInterpreter() else { + // Construct the path to the model file. + guard let modelPath = Bundle.main.path( + forResource: modelFilename, + ofType: modelFileInfo.extension + ) else { + print("Failed to load the model file with name: \(modelFilename).") return nil } - super.init() - - tfLiteWrapper.setNumberOfThreads(threadCount) - - // Opens and loads the classes listed in the labels file - loadLabels(fromFileName: labelsFileName, fileExtension: labelsFileExtension) + // Specify the options for the `Interpreter`. + self.threadCount = threadCount + var options = InterpreterOptions() + options.threadCount = threadCount + options.isErrorLoggingEnabled = true + do { + // Create the `Interpreter`. + interpreter = try Interpreter(modelPath: modelPath, options: options) + } catch let error { + print("Failed to create the interpreter with error: \(error.localizedDescription)") + return nil + } + // Load the classes listed in the labels file. + loadLabels(fileInfo: labelsFileInfo) } ``` @@ -112,110 +107,118 @@ init?(modelFileName: String, labelsFileName: String, labelsFileExtension: String The method `runModel` accepts a `CVPixelBuffer` of camera data, which can be obtained from the `didOutput` method defined in `ViewController`. 
-We crop the image, call `CVPixelBufferLockBaseAddress` to prepare the buffer to -be read by the CPU, and then create an input tensor using the TensorFlow Lite -wrapper: - -```swift -guard let tensorInputBaseAddress = tfLiteWrapper.inputTensor(at: 0) else { - return nil -} -``` +We crop the image to the size that the model was trained on. For example, +`224x224` for the MobileNet v1 model. The image buffer contains an encoded color for each pixel in `BGRA` format -(where `A` represents Alpha, or transparency), and our model expects it in `RGB` -format. We now step through the buffer four bytes at a time, copying the three -bytes we care about (`R`, `G`, and `B`) to the input tensor. - -Note: Since we are using a quantized model, we can directly use the `UInt8` -values from the buffer. If we were using a float model, we would have to convert -them to floating point by dividing by 255. +(where `A` represents Alpha, or transparency). Our model expects the format to +be `RGB`, so we use the following helper method to remove the alpha component +from the image buffer to get the `RGB` data representation: ```swift -let inputImageBaseAddress = sourceStartAddrss.assumingMemoryBound(to: UInt8.self) - -for y in 0...wantedInputHeight - 1 { - let tensorInputRow = tensorInputBaseAddress.advanced(by: (y * wantedInputWidth * wantedInputChannels)) - let inputImageRow = inputImageBaseAddress.advanced(by: y * wantedInputWidth * imageChannels) - - for x in 0...wantedInputWidth - 1 { - - let out_pixel = tensorInputRow.advanced(by: x * wantedInputChannels) - let in_pixel = inputImageRow.advanced(by: x * imageChannels) - - var b = 2 - for c in 0...(wantedInputChannels) - 1 { - - // We are reversing the order of pixels since the source pixel format is BGRA, but the model requires RGB format. - out_pixel[c] = in_pixel[b] - b = b - 1 - } +private let alphaComponent = (baseOffset: 4, moduloRemainder: 3) +private func rgbDataFromBuffer( + _ buffer: CVPixelBuffer, + byteCount: Int, + isModelQuantized: Bool +) -> Data? { + CVPixelBufferLockBaseAddress(buffer, .readOnly) + defer { CVPixelBufferUnlockBaseAddress(buffer, .readOnly) } + guard let mutableRawPointer = CVPixelBufferGetBaseAddress(buffer) else { + return nil } + let count = CVPixelBufferGetDataSize(buffer) + let bufferData = Data(bytesNoCopy: mutableRawPointer, count: count, deallocator: .none) + var rgbBytes = [UInt8](repeating: 0, count: byteCount) + var index = 0 + for component in bufferData.enumerated() { + let offset = component.offset + let isAlphaComponent = (offset % alphaComponent.baseOffset) == alphaComponent.moduloRemainder + guard !isAlphaComponent else { continue } + rgbBytes[index] = component.element + index += 1 + } + if isModelQuantized { return Data(bytes: rgbBytes) } + return Data(copyingBufferOf: rgbBytes.map { Float($0) / 255.0 }) } ``` #### Run inference -Running inference is a simple call to `tfLiteWrapper.invokeInterpreter()`. The -result of this synchronous call can be obtained by calling -`tfLiteWrapper.outputTensor()`. +Here's the code for getting the `RGB` data representation of the pixel buffer, +copying that data to the input +[`Tensor`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/experimental/swift/Sources/Tensor.swift), +and running inference by invoking the `Interpreter`: ```swift -guard tfLiteWrapper.invokeInterpreter() else { - return nil -} +let outputTensor: Tensor +do { + // Allocate memory for the model's input `Tensor`s. 
+  try interpreter.allocateTensors()
+  let inputTensor = try interpreter.input(at: 0)
 
-guard let outputTensor = tfLiteWrapper.outputTensor(at: 0) else {
-  return nil
+  // Remove the alpha component from the image buffer to get the RGB data.
+  guard let rgbData = rgbDataFromBuffer(
+    thumbnailPixelBuffer,
+    byteCount: batchSize * inputWidth * inputHeight * inputChannels,
+    isModelQuantized: inputTensor.dataType == .uInt8
+  ) else {
+    print("Failed to convert the image buffer to RGB data.")
+    return
+  }
+
+  // Copy the RGB data to the input `Tensor`.
+  try interpreter.copy(rgbData, toInputAt: 0)
+
+  // Run inference by invoking the `Interpreter`.
+  try interpreter.invoke()
+
+  // Get the output `Tensor` to process the inference results.
+  outputTensor = try interpreter.output(at: 0)
+} catch let error {
+  print("Failed to invoke the interpreter with error: \(error.localizedDescription)")
+  return
 }
 ```
 
 #### Process results
 
-The `getTopN` method, also declared in `ModelDataHandler.swift`, interprets the
-contents of the output tensor. It returns a list of the top N predictions,
-ordered by confidence.
-
-The output tensor contains one `UInt8` value per class label, with a value
-between 0 and 255 corresponding to a confidence of 0 to 100% that each label is
-present in the image.
-
-First, the results are mapped into an array of `Inference` instances, each with
-a `confidence` between 0 and 1 and a `className` representing the label.
+If the model is quantized, the output `Tensor` contains one `UInt8` value per
+class label. Dequantize the results so the values are floats, ranging from 0.0
+to 1.0, where each value represents the confidence that a label is present in
+the image:
 
 ```swift
-for i in 0...predictionSize - 1 {
-  let value = Double(prediction[i]) / 255.0
+guard let quantization = outputTensor.quantizationParameters else {
+  print("No results returned because the quantization values for the output tensor are nil.")
+  return
+}
 
-  guard i < labels.count else {
-    continue
-  }
+// Get the quantized results from the output tensor's `data` property.
+let quantizedResults = [UInt8](outputTensor.data)
 
-  let inference = Inference(confidence: value, className: labels[i])
-  resultsArray.append(inference)
+// Dequantize the results using the quantization values.
+let results = quantizedResults.map {
+  quantization.scale * Float(Int($0) - quantization.zeroPoint)
 }
 ```
 
-Next, the results are sorted, and we return the top `N` (where N is
-`resultCount`).
+Next, the results are sorted to get the top `N` results (where `N` is
+`resultCount`):
 
 ```swift
-resultsArray.sort { (first, second) -> Bool in
-  return first.confidence > second.confidence
-}
+// Create a zipped array of tuples [(labelIndex: Int, confidence: Float)].
+let zippedResults = zip(labels.indices, results)
 
-guard resultsArray.count > resultCount else {
-  return resultsArray
-}
-let finalArray = resultsArray[0..<resultCount]
-return Array(finalArray)
+// Sort the zipped results by confidence value in descending order.
+let sortedResults = zippedResults.sorted { $0.1 > $1.1 }.prefix(resultCount)
+
+// Get the top N `Inference` results.
+let topNInferences = sortedResults.map { result in Inference(confidence: result.1, label: labels[result.0]) }
 ```
 
 ### Display results
 
 The file
 [`InferenceViewController.swift`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/ImageClassification/ViewControllers/InferenceViewController.swift)
-defines the app's UI.
-
-A `UITableView` instance, `tableView`, is used to display the results.
+defines the app's UI. A `UITableView` is used to display the results.
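The snippets in the updated docs drive the `Interpreter` from inside the example app's `ModelDataHandler`. As a compact recap of the same Swift API, here is a minimal, self-contained sketch of the whole flow: create the interpreter, copy preprocessed `RGB` bytes into the input tensor, invoke it, and dequantize the output. The function name, the default model name, the thread count, and the float-model branch are illustrative assumptions rather than code from the example project; `rgbData` is assumed to come from a helper such as the `rgbDataFromBuffer` method shown above.

```swift
import TensorFlowLite

/// Runs a bundled classification model on preprocessed RGB input and returns
/// one confidence value per label. `modelName` and `threadCount` are placeholders.
func classify(rgbData: Data, modelName: String = "model", threadCount: Int = 2) -> [Float]? {
  // Locate the `.tflite` file in the app's main bundle.
  guard let modelPath = Bundle.main.path(forResource: modelName, ofType: "tflite") else {
    print("Model \(modelName).tflite not found in the main bundle.")
    return nil
  }

  do {
    // Configure and create the interpreter, then allocate its tensors.
    var options = InterpreterOptions()
    options.threadCount = threadCount
    let interpreter = try Interpreter(modelPath: modelPath, options: options)
    try interpreter.allocateTensors()

    // Copy the preprocessed RGB bytes into the input tensor and run inference.
    try interpreter.copy(rgbData, toInputAt: 0)
    try interpreter.invoke()

    // Read the output tensor.
    let outputTensor = try interpreter.output(at: 0)

    if outputTensor.dataType == .uInt8,
      let quantization = outputTensor.quantizationParameters {
      // Quantized model: dequantize each `UInt8` score to a `Float` confidence.
      let quantized = [UInt8](outputTensor.data)
      return quantized.map { quantization.scale * Float(Int($0) - quantization.zeroPoint) }
    }

    // Float model: reinterpret the raw output bytes as `Float` values.
    return outputTensor.data.withUnsafeBytes { Array($0.bindMemory(to: Float.self)) }
  } catch {
    print("TensorFlow Lite error: \(error.localizedDescription)")
    return nil
  }
}
```

The top-N post-processing then works exactly as in the `ModelDataHandler` snippets: zip the returned confidences with the label indices, sort in descending order, and keep the first `resultCount` entries.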