diff --git a/tensorflow/lite/g3doc/guide/android.md b/tensorflow/lite/g3doc/guide/android.md index 4b2f38a5d32..68f1eb5a387 100644 --- a/tensorflow/lite/g3doc/guide/android.md +++ b/tensorflow/lite/g3doc/guide/android.md @@ -1,146 +1,87 @@ # Android quickstart -An example Android application using TensorFLow Lite is available -[on GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/java/demo). -The demo is a sample camera app that classifies images continuously -using either a quantized Mobilenet model or a floating point Inception-v3 model. -To run the demo, a device running Android 5.0 ( API 21) or higher is required. +To get started with TensorFlow Lite on Android, we recommend exploring the +following example. -In the demo app, inference is done using the TensorFlow Lite Java API. The demo -app classifies frames in real-time, displaying the top most probable -classifications. It also displays the time taken to detect the object. +Android +image classification example -There are three ways to get the demo app to your device: +For an explanation of the source code, you should also read +[TensorFlow Lite Android image classification](https://www.tensorflow.org/lite/models/image_classification/android). -* Download the [prebuilt binary APK](http://download.tensorflow.org/deps/tflite/TfLiteCameraDemo.apk). -* Use Android Studio to build the application. -* Download the source code for TensorFlow Lite and the demo and build it using - bazel. +This example app uses +[image classification](https://www.tensorflow.org/lite/models/image_classification/overview) +to continuously classify whatever it sees from the device's rear-facing camera. +The application can run either on device or emulator. +Inference is performed using the TensorFlow Lite Java API. The demo app +classifies frames in real-time, displaying the top most probable +classifications. It allows the user to choose between a floating point or +[quantized](https://www.tensorflow.org/lite/performance/post_training_quantization) +model, select the thread count, and decide whether to run on CPU, GPU, or via +[NNAPI](https://developer.android.com/ndk/guides/neuralnetworks). -## Download the pre-built binary +Note: Additional Android applications demonstrating TensorFlow Lite in a variety +of use cases are available in +[Examples](https://www.tensorflow.org/lite/examples). -The easiest way to try the demo is to download the -[pre-built binary APK](https://storage.googleapis.com/download.tensorflow.org/deps/tflite/TfLiteCameraDemo.apk) +## Build in Android Studio -Once the APK is installed, click the app icon to start the program. The first -time the app is opened, it asks for runtime permissions to access the device -camera. The demo app opens the back-camera of the device and recognizes objects -in the camera's field of view. At the bottom of the image (or at the left -of the image if the device is in landscape mode), it displays top three objects -classified and the classification latency. +To build the example in Android Studio, follow the instructions in +[README.md](https://github.com/tensorflow/examples/blob/master/lite/examples/image_classification/android/README.md). +## Create your own Android app -## Build in Android Studio with TensorFlow Lite AAR from JCenter +To get started quickly writing your own Android code, we recommend using our +[Android image classification example](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android) +as a starting point. 
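If you prefer to start from an empty project instead, the sketch below shows a minimal way to run a bundled model with the TensorFlow Lite Java API: memory-map a `.tflite` file from the app's assets, create an `Interpreter`, and call `run()`. The asset name, input buffer, and output shape are placeholders for your own model and are not part of the example app.

```java
import android.app.Activity;
import android.content.res.AssetFileDescriptor;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import org.tensorflow.lite.Interpreter;

/** Minimal TensorFlow Lite wrapper; the model name and tensor shapes are placeholders. */
public class SimpleClassifier {
  private final Interpreter interpreter;

  public SimpleClassifier(Activity activity) throws IOException {
    // "model.tflite" must be bundled (uncompressed) in src/main/assets.
    interpreter = new Interpreter(loadModelFile(activity, "model.tflite"));
  }

  /** Runs inference; the buffer layout and output shape must match your model. */
  public void classify(ByteBuffer inputImage, byte[][] outputProbabilities) {
    interpreter.run(inputImage, outputProbabilities);
  }

  private static MappedByteBuffer loadModelFile(Activity activity, String assetName)
      throws IOException {
    // Memory-map the model so it does not have to be copied into the Java heap.
    AssetFileDescriptor fd = activity.getAssets().openFd(assetName);
    FileInputStream stream = new FileInputStream(fd.getFileDescriptor());
    FileChannel channel = stream.getChannel();
    return channel.map(FileChannel.MapMode.READ_ONLY, fd.getStartOffset(), fd.getDeclaredLength());
  }
}
```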
-Use Android Studio to try out changes in the project code and compile the demo -app: +The following sections contain some useful information for working with +TensorFlow Lite on Android. -* Install the latest version of - [Android Studio](https://developer.android.com/studio/index.html). -* Make sure the Android SDK version is greater than 26 and NDK version is greater - than 14 (in the Android Studio settings). -* Import the `tensorflow/lite/java/demo` directory as a new - Android Studio project. -* Install all the Gradle extensions it requests. +### Use the TensorFlow Lite AAR from JCenter -Now you can build and run the demo app. +To use TensorFlow Lite in your Android app, we recommend using the +[TensorFlow Lite AAR hosted at JCenter](https://bintray.com/google/tensorflow/tensorflow-lite). -The build process downloads the quantized [Mobilenet TensorFlow Lite model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip), and unzips it into the assets directory: `tensorflow/lite/java/demo/app/src/main/assets/`. +You can specify this in your `build.gradle` dependencies as follows: -Some additional details are available on the -[TF Lite Android App page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/java/demo/README.md). - -### Using other models - -To use a different model: -* Download the floating point [Inception-v3 model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/inception_v3_slim_2016_android_2017_11_10.zip). -* Unzip and copy `inceptionv3_non_slim_2015.tflite` to the assets directory. -* Change the chosen classifier in [Camera2BasicFragment.java](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/java/demo/app/src/main/java/com/example/android/tflitecamerademo/Camera2BasicFragment.java)
- from: `classifier = new ImageClassifierQuantizedMobileNet(getActivity());`
- to: `classifier = new ImageClassifierFloatInception(getActivity());`. - - -## Build TensorFlow Lite and the demo app from source - -### Clone the TensorFlow repo - -```sh -git clone https://github.com/tensorflow/tensorflow +```build +dependencies { + implementation 'org.tensorflow:tensorflow-lite:0.0.0-nightly' +} ``` -### Install Bazel +This AAR includes binaries for all of the +[Android ABIs](https://developer.android.com/ndk/guides/abis). You can reduce +the size of your application's binary by only including the ABIs you need to +support. -If `bazel` is not installed on your system, see -[Installing Bazel](https://bazel.build/versions/master/docs/install.html). +We recommend most developers omit the `x86`, `x86_64`, and `arm32` ABIs. This +can be achieved with the following Gradle configuration, which specifically +includes only `armeabi-v7a` and `arm64-v8a`, which should cover most modern +Android devices. -Note: Bazel does not currently support Android builds on Windows. Windows users -should download the -[prebuilt binary](https://storage.googleapis.com/download.tensorflow.org/deps/tflite/TfLiteCameraDemo.apk). - -### Install Android NDK and SDK - -The Android NDK is required to build the native (C/C++) TensorFlow Lite code. The -current recommended version is *14b* and can be found on the -[NDK Archives](https://developer.android.com/ndk/downloads/older_releases.html#ndk-14b-downloads) -page. - -The Android SDK and build tools can be -[downloaded separately](https://developer.android.com/tools/revisions/build-tools.html) -or used as part of -[Android Studio](https://developer.android.com/studio/index.html). To build the -TensorFlow Lite Android demo, build tools require API >= 23 (but it will run on -devices with API >= 21). - -In the root of the TensorFlow repository, update the `WORKSPACE` file with the -`api_level` and location of the SDK and NDK. If you installed it with -Android Studio, the SDK path can be found in the SDK manager. The default NDK -path is:`{SDK path}/ndk-bundle.` For example: - -``` -android_sdk_repository ( - name = "androidsdk", - api_level = 23, - build_tools_version = "23.0.2", - path = "/home/xxxx/android-sdk-linux/", -) - -android_ndk_repository( - name = "androidndk", - path = "/home/xxxx/android-ndk-r10e/", - api_level = 19, -) +```build +android { + defaultConfig { + ndk { + abiFilters 'armeabi-v7a', 'arm64-v8a' + } + } +} ``` -Some additional details are available on the -[TF Lite Android App page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/java/demo/README.md). +To learn more about `abiFilters`, see +[`NdkOptions`](https://google.github.io/android-gradle-dsl/current/com.android.build.gradle.internal.dsl.NdkOptions.html) +in the Android Gradle documentation. -### Build the source code +### Build TensorFlow Lite locally -To build the demo app, run `bazel`: +In some cases, you might wish to use a local build of TensorFlow Lite. For +example, you may be building a custom binary that includes +[operations selected from TensorFlow](https://www.tensorflow.org/lite/guide/ops_select). -``` -bazel build --cxxopt=--std=c++11 //tensorflow/lite/java/demo/app/src/main:TfLiteCameraDemo -``` - -Caution: Because of an bazel bug, we only support building the Android demo app -within a Python 2 environment. - - -## About the demo - -The demo app is resizing each camera image frame (224 width * 224 height) to -match the quantized MobileNets model (299 * 299 for Inception-v3). 
The resized -image is converted—row by row—into a -[ByteBuffer](https://developer.android.com/reference/java/nio/ByteBuffer.html). -Its size is 1 * 224 * 224 * 3 bytes, where 1 is the number of images in a batch. -224 * 224 (299 * 299) is the width and height of the image. 3 bytes represents -the 3 colors of a pixel. - -This demo uses the TensorFlow Lite Java inference API -for models which take a single input and provide a single output. This outputs a -two-dimensional array, with the first dimension being the category index and the -second dimension being the confidence of classification. Both models have 1001 -unique categories and the app sorts the probabilities of all the categories and -displays the top three. The model file must be downloaded and bundled within the -assets directory of the app. +In this case, follow the +[custom AAR build instructions](https://www.tensorflow.org/lite/guide/ops_select#android_aar) +to create your own AAR and include it in your app. diff --git a/tensorflow/lite/g3doc/guide/ios.md b/tensorflow/lite/g3doc/guide/ios.md index 3565ce71df3..ececb253dc7 100644 --- a/tensorflow/lite/g3doc/guide/ios.md +++ b/tensorflow/lite/g3doc/guide/ios.md @@ -1,229 +1,98 @@ # iOS quickstart -This tutorial provides a simple iOS mobile application to classify images using -the iOS device camera. In this tutorial, you will download the demo application -from the Tensorflow repository, build it on your computer, and install it on -your iOS Device. You will also learn how to customize the application to suit -your requirements. +To get started with TensorFlow Lite on iOS, we recommend exploring the following +example. -## Prerequisites +iOS +image classification example -* You must have [Xcode](https://developer.apple.com/xcode/) installed and have - a valid Apple Developer ID, and have an iOS device set up and linked to your - developer account with all of the appropriate certificates. For these - instructions, we assume that you have already been able to build and deploy - an app to an iOS device with your current developer environment. +For an explanation of the source code, you should also read +[TensorFlow Lite iOS image classification](https://www.tensorflow.org/lite/models/image_classification/ios). -* The demo app requires a camera and must be executed on a real iOS device. - You can build it and run with the iPhone Simulator but it won't have any - camera information to classify. +This example app uses +[image classification](https://www.tensorflow.org/lite/models/image_classification/overview) +to continuously classify whatever it sees from the device's rear-facing camera. +The application must be run on an iOS device. -* You don't need to build the entire TensorFlow library to run the demo, but - you will need to clone the TensorFlow repository if you haven't already: +Inference is performed using the TensorFlow Lite C++ API. The demo app +classifies frames in real-time, displaying the top most probable +classifications. It allows the user to choose between a floating point or +[quantized](https://www.tensorflow.org/lite/performance/post_training_quantization) +model, select the thread count, and decide whether to run on CPU, GPU, or via +[NNAPI](https://developer.android.com/ndk/guides/neuralnetworks). - git clone https://github.com/tensorflow/tensorflow - cd tensorflow +Note: Additional iOS applications demonstrating TensorFlow Lite in a variety of +use cases are available in [Examples](https://www.tensorflow.org/lite/examples). 
-* You'll also need the Xcode command-line tools: +## Build in Xcode - xcode-select --install +To build the example in Xcode, follow the instructions in +[README.md](https://github.com/tensorflow/examples/blob/master/lite/examples/image_classification/ios/README.md). - If this is a new install, you will need to run the Xcode application once to - agree to the license before continuing. +## Create your own Android app -* Install CocoaPods if you don't have it: +To get started quickly writing your own iOS code, we recommend using our +[iOS image classification example](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios) +as a starting point. - sudo gem install cocoapods +The following sections contain some useful information for working with +TensorFlow Lite on iOS. -### Step 1. Clone the TensorFlow source code +### Use TensorFlow Lite from Objective-C and Swift -First, we clone the GitHub repository on the computer in a folder to get the -demo application. +The example app provides an Objective-C wrapper on top of the C++ Tensorflow +Lite library. This wrapper is required because currently there is no +interoperability between Swift and C++. The wrapper is exposed to Swift via +bridging so that the Tensorflow Lite methods can be called from Swift. -``` -git clone https://github.com/tensorflow/tensorflow +The wrapper is located in +[TensorflowLiteWrapper](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/ImageClassification/TensorflowLiteWrapper). +It is not tightly coupled with the example code, so you can use it in your own +iOS apps. It exposes the following interface: + +```objectivec +@interface TfliteWrapper : NSObject + +/** + This method initializes the TfliteWrapper with the specified model file. + */ +- (instancetype)initWithModelFileName:(NSString *)fileName; + +/** + This method initializes the interpreter of TensorflowLite library with the specified model file + that performs the inference. + */ +- (BOOL)setUpModelAndInterpreter; + +/** + This method gets a reference to the input tensor at an index. + */ +- (uint8_t *)inputTensorAtIndex:(int)index; + +/** + This method performs the inference by invoking the interpreter. + */ +- (BOOL)invokeInterpreter; + +/** + This method gets the output tensor at a specified index. + */ +- (uint8_t *)outputTensorAtIndex:(int)index; + +/** + This method sets the number of threads used by the interpreter to perform inference. + */ +- (void)setNumberOfThreads:(int)threadCount; + +@end ``` -### Step 2. Download required dependencies +To use these files in your own iOS app, copy them into your Xcode project. -Execute the shell script to download the model files used by the demo app (this -is done from inside the cloned directory): - -``` - tensorflow/lite/examples/ios/download_models.sh -``` - -Run the following command to install TensorFlow Lite pod: - -``` - cd tensorflow/lite/examples/ios/camera - pod install -``` - -If you have installed this pod before and that command doesn't work, try - -``` - pod repo update -``` - -### Step 3. Build the XCode project - -Open the `tflite_camera_example.xcworkspace` project file generated in the last -step: - -``` - open tflite_camera_example.xcworkspace -``` - -Under `Project navigator -> tflite_camera_example -> Targets -> -tflite_camera_example -> General` change the bundle identifier by pre-pending -your name: - -![pre-pend your name to the bundle identifier](../images/ios/bundle_identifier.png) - -Plug in your iOS device. 
Note the app must be executed with a real device with -camera. Select the iOS device from the drop-down menu. - -![Device selection](../images/ios/device_selection.png) - -Click the "Run" button to build and run the app - -![Build and execute](../images/ios/build_and_execute.png) - -Note that as mentioned earlier, you must already have a device set up and linked -to your Apple Developer account in order to deploy the app on a device. - -You'll have to grant permissions for the app to use the device's camera. Point -the camera at various objects and enjoy seeing how the model classifies things! - -## Understanding iOS App Code - -### Get camera input - -The main logic of this app is in the Objective C++ source file -`tensorflow/lite/examples/ios/camera/CameraExampleViewController.mm`. - -The `setupAVCapture` method constructs a `AVCaptureSession` and set itself as a -delegate. The `captureOutput:didOutputSampleBuffer:fromConnection:` method is -called for every captured frame. It calls `runModelOnFrame` to run the model for -every frame. - -### Create an interpreter - -To create the interpreter, we need to load the model file. The following code -will load a model and create an interpreter. - -``` -model = tflite::FlatBufferModel::BuildFromFile([graph_path UTF8String]); -``` - -Behind the scenes, the model is loaded as a memory-mapped file. It offers faster -load times and reduce the dirty pages in memory. - -Construct a `BuiltinOpResolver` to use the TensorFlow Lite buildin ops. Then, -create the interpreter object using `InterpreterBuilder` that takes the model -file as argument as shown below. - -``` -tflite::ops::builtin::BuiltinOpResolver resolver; -tflite::InterpreterBuilder(*model, resolver)(&interpreter); -``` - -### Obtain the input buffer - -By default, the app uses quantized model since it's smaller and faster. The -buffer is a raw pointer to an array of 8 bit unsigned integers (`uint8_t`). The -following code obtains the input buffer from the interpreter: - -``` -// Get the index of first input tensor. -int input_tensor_index = interpreter->inputs()[0]; -// Get the pointer to the input buffer. -uint8_t* buffer = interpreter->typed_tensor(input_tensor_index); -``` - -Throughout this document, it's assumed a quantized model is used. - -### Pre-process of bitmap image - -The MobileNet model we're using takes 224x224x3 inputs, where the dimensions are -width, height, and colors (RGB). The images returned from `AVCaptureSession` is -bigger, and has 4 color channels (RGBA). - -Many image classification models (like MobileNet) take fixe-sized inputs. It's -required to scale or crop the image before feeding it into the model, and change -the channels from RGBA to RGB. - -The code to pre-process the images is in `ProcessInputWithQuantizedModel` -function in -`tensorflow/lite/examples/ios/camera/CameraExampleViewController.mm`. It's a -simple implementation for nearest neighbor color sampling, and it only copies -the first 3 bytes for each pixel. 
- -``` -void ProcessInputWithQuantizedModel( - uint8_t* input, uint8_t* output, int image_width, int image_height, int image_channels) { - for (int y = 0; y < wanted_input_height; ++y) { - uint8_t* out_row = output + (y * wanted_input_width * wanted_input_channels); - for (int x = 0; x < wanted_input_width; ++x) { - const int in_x = (y * image_width) / wanted_input_width; - const int in_y = (x * image_height) / wanted_input_height; - uint8_t* in_pixel = input + (in_y * image_width * image_channels) + (in_x * image_channels); - uint8_t* out_pixel = out_row + (x * wanted_input_channels); - for (int c = 0; c < wanted_input_channels; ++c) { - out_pixel[c] = in_pixel[c]; - } - } - } -} -``` - -Note the code is preprocessing and preparing the model input from the camera -data. Therefore the first parameter `input` should be the camera buffer. The -second parameter `output` should be the buffer of model input. - -### Run inference and obtain output buffer - -After preprocessing and filling the data into the input buffer of the -interpreter, it's really easy to run the interpreter: - -``` -if (interpreter->Invoke() != kTfLiteOk) { - NSLog("Failed to invoke!"); -} -``` - -The result is stored in the output tensor buffer of the interpreter. The -following code obtains the pointer to the buffer: - -``` -// Get the index of first output tensor. -const int output_tensor_index = interpreter->outputs()[0]; -// Get the pointer to the output buffer. -uint8_t* buffer = interpreter->typed_tensor(output_tensor_index); -``` - -### Post-process values - -The output buffer contains an array of `uint8_t`, and the value range is 0-255. -We need to convert the value to float to get the probabilities with value range -0.0-1.0. The formula of the quantization value mapping is: - - float_value = (quantized_value - zero_point) * scale - -The following code converts quantized values back to float values, using the -quantizaiton parameters in tensors: - -``` -uint8_t* quantized_output = interpreter->typed_output_tensor(0); -int32_t zero_point = input_tensor->params.zero_point; -float scale = input_tensor->params.scale; -float output[output_size]; -for (int i = 0; i < output_size; ++i) { - output[i] = (quantized_output[i] - zero_point) * scale; -} -``` - -Finally, we find the best set of classifications by storing them in a priority -queue based on their confidence scores. See the `GetTopN` function in -`tensorflow/lite/examples/ios/camera/CameraExampleViewController.mm`. +Note: When you add an Objective-C file to an existing Swift app (or vice versa), +Xcode will prompt you to create a *bridging header* file to expose the files to +Swift. In the example project, this file is named +[`ImageClassification-Bridging-Header.h`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/ImageClassification/TensorflowLiteWrapper/ImageClassification-Bridging-Header.h). +For more information, see Apple's +[Importing Objective-C into Swift](https://developer.apple.com/documentation/swift/imported_c_and_objective-c_apis/importing_objective-c_into_swift){: .external} +documentation. 
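Once the wrapper and bridging header are in place, calling it from Swift is straightforward. The sketch below is not part of the example app: the model name, image size, and label count are placeholders, and the Swift method names are simply the bridged forms of the Objective-C interface shown above.

```swift
// A minimal sketch assuming a quantized model named "mobilenet_quant_v1_224" is
// bundled with the app and the input is a 224 x 224 RGB image, one UInt8 per channel.
func classifyOnce(pixels: [UInt8], labelCount: Int) -> [UInt8]? {
  let wrapper = TfliteWrapper(modelFileName: "mobilenet_quant_v1_224")
  guard wrapper.setUpModelAndInterpreter() else { return nil }
  wrapper.setNumberOfThreads(2)

  // Copy the preprocessed image bytes into the interpreter's input tensor.
  guard let inputTensor = wrapper.inputTensor(at: 0) else { return nil }
  for (index, value) in pixels.enumerated() {
    inputTensor[index] = value
  }

  guard wrapper.invokeInterpreter(),
        let outputTensor = wrapper.outputTensor(at: 0) else { return nil }

  // Each output byte (0-255) is the model's confidence for one class label.
  return Array(UnsafeBufferPointer(start: outputTensor, count: labelCount))
}
```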
diff --git a/tensorflow/lite/g3doc/models/image_classification/android.md b/tensorflow/lite/g3doc/models/image_classification/android.md index 5cca2217008..51e354e1834 100644 --- a/tensorflow/lite/g3doc/models/image_classification/android.md +++ b/tensorflow/lite/g3doc/models/image_classification/android.md @@ -1,207 +1,317 @@ -# TensorFlow Lite Android Image Classifier App Example +# TensorFlow Lite Android image classification example -This tutorial provides a simple Android mobile application to classify images -using the Android device camera. In this tutorial, you will download the demo -application from the Tensorflow examples repository, build it on your computer, -and install it on your Android device. You will also learn how to customize the -application to suit your requirements. +This document walks through the code of a simple Android mobile application that +demonstrates [image classification](overview.md) using the device camera. -### Prerequisites +The application code is located in the +[Tensorflow examples](https://github.com/tensorflow/examples) repository, along +with instructions for building and deploying the app. -* Android Studio 3.2 (installed on a Linux, Mac or Windows machine) +Example +application -* Android device +## Explore the code -* USB cable (to connect Android device to your computer) - -### Step 1. Clone the TensorFlow source code - -Clone the TensorFlow examples GitHub repository to your computer to get the demo -application. - -``` - -git clone https://github.com/tensorflow/examples - -``` - -Open the TensorFlow source code in Android Studio. To do this, open Android -Studio and select `Open an existing project` setting the folder to -`examples/lite/examples/image_classification/android` - - - -This folder contains the demo application for image classification, object -detection, and speech hotword detection. - -### Step 2. Build the Android Studio project - -Select `Build -> Make Project` and check that the project builds -successfully. You will need Android SDK configured in the settings. You'll need -at least SDK version 23. The gradle file will prompt you to download any missing -libraries. - - - - - -#### TensorFlow Lite AAR from JCenter: - -Note that the `build.gradle` is configured to use TensorFlow Lite's nightly -build. - -If you see a build error related to compatibility with Tensorflow Lite's Java -API (example: method X is undefined for type Interpreter), there has likely been -a backwards compatible change to the API. You will need to pull new app code -that's compatible with the nightly build by running `git pull`. - -### Step 3. Install and run the app - -Connect the Android device to the computer and be sure to approve any ADB -permission prompts that appear on your phone. Select `Run -> Run app.` Select -the deployment target in the connected devices to the device on which the app will -be installed. This will install the app on the device. - - - - - - - - - -To test the app, open the app called `TFL Classify` on your device. When you run -the app the first time, the app will request permission to access the camera. -Re-installing the app may require you to uninstall the previous installations. - -## Understanding Android App Code +We're now going to walk through the most important parts of the sample code. 
### Get camera input This mobile application gets the camera input using the functions defined in the -file CameraActivity.java in the folder -`examples/lite/examples/image_classification/android/app/src/main/java/org/tensorflow/lite/examples/classification/CameraActivity.java.` -This file depends on `AndroidManifest.xml` in the folder -`examples/lite/examples/image_classification/android/app/src/main` to set the -camera orientation. +file +[`CameraActivity.java`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/app/src/main/java/org/tensorflow/lite/examples/classification/CameraActivity.java). +This file depends on +[`AndroidManifest.xml`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/app/src/main/AndroidManifest.xml) +to set the camera orientation. -### Pre-process bitmap image +`CameraActivity` also contains code to capture user preferences from the UI and +make them available to other classes via convenience methods. -The mobile application code that pre-processes the images and runs inference is -in -`examples/lite/examples/image_classification/android/app/src/main/java/org/tensorflow/lite/examples/classification/tflite/Classifier.java.` -Here, we take the input camera bitmap image and convert it to a Bytebuffer -format for efficient processing. We pre-allocate the memory for ByteBuffer -object based on the image dimensions because Bytebuffer objects can't infer the -object shape. - -``` -c.imgData = -ByteBuffer.allocateDirect( DIM_BATCH_SIZE * DIM_IMG_SIZE_X * DIM_IMG_SIZE_Y * -DIM_PIXEL_SIZE); -c.imgData.order(ByteOrder.nativeOrder()); +```java +model = Model.valueOf(modelSpinner.getSelectedItem().toString().toUpperCase()); +device = Device.valueOf(deviceSpinner.getSelectedItem().toString()); +numThreads = Integer.parseInt(threadsTextView.getText().toString().trim()); ``` -While running the application, we pre-process the incoming bitmap images from the -camera to a Bytebuffer. Since this model is quantized 8-bit, we will put a -single byte for each channel. `imgData` will contain an encoded `Color` for each -pixel in ARGB format, so we need to mask the least significant 8 bits to get -blue, and next 8 bits to get green and next 8 bits to get blue, and we have an -opaque image so alpha can be ignored. +### Classifier +The file +[`Classifier.java`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/app/src/main/java/org/tensorflow/lite/examples/classification/tflite/Classifier.java) +contains most of the complex logic for processing the camera input and running +inference. + +Two subclasses of the file exist, in +[`ClassifierFloatMobileNet.java`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/app/src/main/java/org/tensorflow/lite/examples/classification/tflite/ClassifierFloatMobileNet.java) +and +[`ClassifierQuantizedMobileNet.java`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/app/src/main/java/org/tensorflow/lite/examples/classification/tflite/ClassifierQuantizedMobileNet.java), +to demonstrate the use of both floating point and +[quantized](https://www.tensorflow.org/lite/performance/post_training_quantization) +models. + +The `Classifier` class implements a static method, `create`, which is used to +instantiate the appropriate subclass based on the supplied model type (quantized +vs floating point). 
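For example, the activity code shown later in this document obtains a classifier with a single call. In the sketch below, `activity` and `bitmap` stand in for your own references, and the enum values and thread count are illustrative:

```java
try {
  // Model and Device are enums defined in Classifier; four threads is an arbitrary choice.
  Classifier classifier = Classifier.create(activity, Model.QUANTIZED, Device.CPU, 4);
  List<Classifier.Recognition> results = classifier.recognizeImage(bitmap);
} catch (IOException e) {
  // The model or label files could not be read from the app's assets.
}
```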
+ +#### Load model and create interpreter + +To perform inference, we need to load a model file and instantiate an +`Interpreter`. This happens in the constructor of the `Classifier` class, along +with loading the list of class labels. Information about the device type and +number of threads is used to configure the `Interpreter` via the +`Interpreter.Options` instance passed into its constructor. Note how that in the +case of a GPU being available, a +[`Delegate`](https://www.tensorflow.org/lite/performance/gpu) is created using +`GpuDelegateHelper`. + +```java +protected Classifier(Activity activity, Device device, int numThreads) throws IOException { + tfliteModel = loadModelFile(activity); + switch (device) { + case NNAPI: + tfliteOptions.setUseNNAPI(true); + break; + case GPU: + gpuDelegate = GpuDelegateHelper.createGpuDelegate(); + tfliteOptions.addDelegate(gpuDelegate); + break; + case CPU: + break; + } + tfliteOptions.setNumThreads(numThreads); + tflite = new Interpreter(tfliteModel, tfliteOptions); + labels = loadLabelList(activity); +... ``` - imgData.rewind(); - bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); - // Convert the image to floating point. - int pixel = 0; - for (int i = 0; i < DIM_IMG_SIZE_X; ++i) { - for (int j = 0; j < DIM_IMG_SIZE_Y; ++j) { - final int val = intValues[pixel++]; - imgData.put((byte) ((val >> 16) & 0xFF)); - imgData.put((byte) ((val >> 8) & 0xFF)); - imgData.put((byte) (val & 0xFF)); - } + +For Android devices, we recommend pre-loading and memory mapping the model file +to offer faster load times and reduce the dirty pages in memory. The method +`loadModelFile` does this, returning a `MappedByteBuffer` containing the model. + +```java +private MappedByteBuffer loadModelFile(Activity activity) throws IOException { + AssetFileDescriptor fileDescriptor = activity.getAssets().openFd(getModelPath()); + FileInputStream inputStream = new FileInputStream(fileDescriptor.getFileDescriptor()); + FileChannel fileChannel = inputStream.getChannel(); + long startOffset = fileDescriptor.getStartOffset(); + long declaredLength = fileDescriptor.getDeclaredLength(); + return fileChannel.map(FileChannel.MapMode.READ_ONLY, startOffset, declaredLength); +} +``` + +Note: If your model file is compressed then you will have to load the model as a +`File`, as it cannot be directly mapped and used from memory. + +The `MappedByteBuffer` is passed into the `Interpreter` constructor, along with +an `Interpreter.Options` object. This object can be used to configure the +interpreter, for example by setting the number of threads (`.setNumThreads(1)`) +or enabling [NNAPI](https://developer.android.com/ndk/guides/neuralnetworks) +(`.setUseNNAPI(true)`). + +#### Pre-process bitmap image + +Next in the `Classifier` constructor, we take the input camera bitmap image and +convert it to a `ByteBuffer` format for efficient processing. We pre-allocate +the memory for the `ByteBuffer` object based on the image dimensions because +Bytebuffer objects can't infer the object shape. + +The `ByteBuffer` represents the image as a 1D array with three bytes per channel +(red, green, and blue). We call `order(ByteOrder.nativeOrder())` to ensure bits +are stored in the device's native order. 
+ +```java +imgData = + ByteBuffer.allocateDirect( + DIM_BATCH_SIZE + * getImageSizeX() + * getImageSizeY() + * DIM_PIXEL_SIZE + * getNumBytesPerChannel()); +imgData.order(ByteOrder.nativeOrder()); +``` + +The code in `convertBitmapToByteBuffer` pre-processes the incoming bitmap images +from the camera to this `ByteBuffer`. It calls the method `addPixelValue` to add +each set of pixel values to the `ByteBuffer` sequentially. + +```java +imgData.rewind(); +bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); +// Convert the image to floating point. +int pixel = 0; +for (int i = 0; i < getImageSizeX(); ++i) { + for (int j = 0; j < getImageSizeY(); ++j) { + final int val = intValues[pixel++]; + addPixelValue(val); + } +} +``` + +In `ClassifierQuantizedMobileNet`, `addPixelValue` is overridden to put a single +byte for each channel. The bitmap contains an encoded color for each pixel in +ARGB format, so we need to mask the least significant 8 bits to get blue, and +next 8 bits to get green and next 8 bits to get blue. Since we have an opaque +image, alpha can be ignored. + +```java +@Override +protected void addPixelValue(int pixelValue) { + imgData.put((byte) ((pixelValue >> 16) & 0xFF)); + imgData.put((byte) ((pixelValue >> 8) & 0xFF)); + imgData.put((byte) (pixelValue & 0xFF)); +} +``` + +For `ClassifierFloatMobileNet`, we must provide a floating point number for each +channel where the value is between `0` and `1`. To do this, we mask out each +color channel as before, but then divide each resulting value by `255.f`. + +```java +@Override +protected void addPixelValue(int pixelValue) { + imgData.putFloat(((pixelValue >> 16) & 0xFF) / 255.f); + imgData.putFloat(((pixelValue >> 8) & 0xFF) / 255.f); + imgData.putFloat((pixelValue & 0xFF) / 255.f); +} +``` + +#### Run inference + +The method that runs inference, `runInference`, is implemented by each subclass +of `Classifier`. In `ClassifierQuantizedMobileNet`, the method looks as follows: + +```java +protected void runInference() { + tflite.run(imgData, labelProbArray); +} +``` + +The output of the inference is stored in a byte array `labelProbArray`, which is +allocated in the subclass's constructor. It consists of a single outer element, +containing one innner element for each label in the classification model. + +To run inference, we call `run()` on the interpreter instance, passing the input +and output buffers as arguments. + +#### Recognize image + +Rather than call `runInference` directly, the method `recognizeImage` is used. +It accepts a bitmap, runs inference, and returns a sorted `List` of +`Recognition` instances, each corresponding to a label. The method will return a +number of results bounded by `MAX_RESULTS`, which is 3 by default. + +`Recognition` is a simple class that contains information about a specific +recognition result, including its `title` and `confidence`. + +A `PriorityQueue` is used for sorting. Each `Classifier` subclass has a +`getNormalizedProbability` method, which is expected to return a probability +between 0 and 1 of a given class being represented by the image. + +```java +PriorityQueue pq = + new PriorityQueue( + 3, + new Comparator() { + @Override + public int compare(Recognition lhs, Recognition rhs) { + // Intentionally reversed to put high confidence at the head of the queue. + return Float.compare(rhs.getConfidence(), lhs.getConfidence()); + } + }); +for (int i = 0; i < labels.size(); ++i) { + pq.add( + new Recognition( + "" + i, + labels.size() > i ? 
labels.get(i) : "unknown", + getNormalizedProbability(i), + null)); +} +``` + +### Display results + +The classifier is invoked and inference results are displayed by the +`processImage()` function in +[`ClassifierActivity.java`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/app/src/main/java/org/tensorflow/lite/examples/classification/ClassifierActivity.java). + +`ClassifierActivity` is a subclass of `CameraActivity` that contains method +implementations that render the camera image, run classification, and display +the results. The method `processImage()` runs classification on a background +thread as fast as possible, rendering information on the UI thread to avoid +blocking inference and creating latency. + +```java +protected void processImage() { + rgbFrameBitmap.setPixels(getRgbBytes(), 0, previewWidth, 0, 0, previewWidth, previewHeight); + final Canvas canvas = new Canvas(croppedBitmap); + canvas.drawBitmap(rgbFrameBitmap, frameToCropTransform, null); + + runInBackground( + new Runnable() { + @Override + public void run() { + if (classifier != null) { + final long startTime = SystemClock.uptimeMillis(); + final List results = classifier.recognizeImage(croppedBitmap); + lastProcessingTimeMs = SystemClock.uptimeMillis() - startTime; + LOGGER.v("Detect: %s", results); + cropCopyBitmap = Bitmap.createBitmap(croppedBitmap); + + runOnUiThread( + new Runnable() { + @Override + public void run() { + showResultsInBottomSheet(results); + showFrameInfo(previewWidth + "x" + previewHeight); + showCropInfo(cropCopyBitmap.getWidth() + "x" + cropCopyBitmap.getHeight()); + showCameraResolution(canvas.getWidth() + "x" + canvas.getHeight()); + showRotationInfo(String.valueOf(sensorOrientation)); + showInference(lastProcessingTimeMs + "ms"); + } + }); + } + readyForNextImage(); + } + }); +} +``` + +Another important role of `ClassifierActivity` is to determine user preferences +(by interrogating `CameraActivity`), and instantiate the appropriately +configured `Classifier` subclass. This happens when the video feed begins (via +`onPreviewSizeChosen()`) and when options are changed in the UI (via +`onInferenceConfigurationChanged()`). + +```java +private void recreateClassifier(Model model, Device device, int numThreads) { + if (classifier != null) { + LOGGER.d("Closing classifier."); + classifier.close(); + classifier = null; + } + if (device == Device.GPU) { + if (!GpuDelegateHelper.isGpuDelegateAvailable()) { + LOGGER.d("Not creating classifier: GPU support unavailable."); + runOnUiThread( + () -> { + Toast.makeText(this, "GPU acceleration unavailable.", Toast.LENGTH_LONG).show(); + }); + return; + } else if (model == Model.QUANTIZED && device == Device.GPU) { + LOGGER.d("Not creating classifier: GPU doesn't support quantized models."); + runOnUiThread( + () -> { + Toast.makeText( + this, "GPU does not yet supported quantized models.", Toast.LENGTH_LONG) + .show(); + }); + return; + } + } + try { + LOGGER.d( + "Creating classifier (model=%s, device=%s, numThreads=%d)", model, device, numThreads); + classifier = Classifier.create(this, model, device, numThreads); + } catch (IOException e) { + LOGGER.e(e, "Failed to create classifier."); + } } ``` - -### Create interpreter - -To create the interpreter, we need to load the model file. In Android devices, -we recommend pre-loading and memory mapping the model file as shown below to -offer faster load times and reduce the dirty pages in memory. 
If your model file -is compressed, then you will have to load the model as a `File`, as it cannot be -directly mapped and used from memory. - -``` -// Memory-map the model file -AssetFileDescriptor fileDescriptor = assets.openFd(modelFilename); -FileInputStream inputStream = new -FileInputStream(fileDescriptor.getFileDescriptor()); FileChannel fileChannel = -inputStream.getChannel(); long startOffset = fileDescriptor.getStartOffset(); -long declaredLength = fileDescriptor.getDeclaredLength(); return -fileChannel.map(FileChannel.MapMode.READ_ONLY, startOffset, declaredLength); -``` - -Then, create the interpreter object using `new Interpreter()` that takes the -model file as argument as shown below. - -``` -// Create Interpreter -c.tfLite = new Interpreter(loadModelFile(assetManager, modelFilename)); -``` - -### Run inference - -The output of the inference is stored in a byte array `labelprob.` We -pre-allocate the memory for the output buffer. Then, we run inference on the -interpreter object using function `run()` that takes input and output buffers as -arguments. - -``` -// Pre-allocate output buffers. -c.labelProb = new byte[1][c.labels.size()]; -// Run Inference -tfLite.run(imgData, labelProb); -``` - -### Post-process values - -Finally, we find the best set of classifications by storing them in a priority -queue based on their confidence scores. - -``` -// Find the best classifications -PriorityQueue pq = ... -for (int i = 0; i < labels.size(); ++i) -{ - pq.add( new Recognition( ' '+ i, - labels.size() > i ? labels.get(i) : unknown, - (float) labelProb[0][i], null)); -} -``` - -And we display up to MAX_RESULTS number of classifications in the application, -where Recognition is a generic class defined in `Classifier.java` that contains -the following information of the classified object: id, title, label, and its -location when the model is an object detection model. 
- -``` -// Display the best classifications -final ArrayList recognitions = - new ArrayList(); -int recognitionsSize = Math.min(pq.size(), MAX_RESULTS); -for (int i = 0; i < recognitionsSize; ++i) { - recognitions.add(pq.poll()); -} -``` - -### Load onto display - -We render the results on the Android device screen using the following lines in -`processImage()` function in `ClassifierActivity.java` which uses the UI defined -in `RecognitionScoreView.java.` - -``` -resultsView.setResults(results); -requestRender(); -``` diff --git a/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img1.png b/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img1.png deleted file mode 100644 index 916639c0670..00000000000 Binary files a/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img1.png and /dev/null differ diff --git a/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img2.png b/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img2.png deleted file mode 100644 index 366ec834a84..00000000000 Binary files a/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img2.png and /dev/null differ diff --git a/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img4.png b/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img4.png deleted file mode 100644 index 360b843c943..00000000000 Binary files a/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img4.png and /dev/null differ diff --git a/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img5.png b/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img5.png deleted file mode 100644 index d6192ae9a76..00000000000 Binary files a/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img5.png and /dev/null differ diff --git a/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img6.png b/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img6.png deleted file mode 100644 index 4216153d388..00000000000 Binary files a/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img6.png and /dev/null differ diff --git a/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img7.png b/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img7.png deleted file mode 100644 index 034eedbc1e5..00000000000 Binary files a/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img7.png and /dev/null differ diff --git a/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img8.png b/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img8.png deleted file mode 100644 index 94039534651..00000000000 Binary files a/tensorflow/lite/g3doc/models/image_classification/images/classifydemo_img8.png and /dev/null differ diff --git a/tensorflow/lite/g3doc/models/image_classification/ios.md b/tensorflow/lite/g3doc/models/image_classification/ios.md index 63e3abd7793..feb172d2d72 100644 --- a/tensorflow/lite/g3doc/models/image_classification/ios.md +++ b/tensorflow/lite/g3doc/models/image_classification/ios.md @@ -1,229 +1,221 @@ -# TensorFlow Lite iOS Image Classifier App Example +# TensorFlow Lite iOS image classification example -This tutorial provides a simple iOS mobile application to classify images using -the iOS device camera. 
In this tutorial, you will download the demo application -from the Tensorflow repository, build it on your computer, and install it on -your iOS Device. You will also learn how to customize the application to suit -your needs. +This document walks through the code of a simple iOS mobile application that +demonstrates [image classification](overview.md) using the device camera. -## Prerequisites +The application code is located in the +[Tensorflow examples](https://github.com/tensorflow/examples) repository, along +with instructions for building and deploying the app. -* You must have [Xcode](https://developer.apple.com/xcode/) installed and have - a valid Apple Developer ID, and have an iOS device set up and linked to your - developer account with all of the appropriate certificates. For these - instructions, we assume that you have already been able to build and deploy - an app to an iOS device with your current developer environment. +Example +application -* The demo app requires a camera and must be executed on a real iOS device. - You can build it and run with the iPhone Simulator but it won't have any - camera information to classify. +## Explore the code -* You don't need to build the entire TensorFlow library to run the demo, but - you will need to clone the TensorFlow repository if you haven't already: +We're now going to walk through the most important parts of the sample code. - git clone https://github.com/tensorflow/tensorflow - cd tensorflow - -* You'll also need the Xcode command-line tools: - - xcode-select --install - - If this is a new install, you will need to run the Xcode application once to - agree to the license before continuing. - -* Install CocoaPods if you don't have it: - - sudo gem install cocoapods - -### Step 1. Clone the TensorFlow source code - -lone the GitHub repository onto your computer to get the -demo application. - -``` -git clone https://github.com/tensorflow/tensorflow -``` - -### Step 2. Download required dependencies - -Execute the shell script to download the model files used by the demo app (this -is done from inside the cloned directory): - -``` - tensorflow/lite/examples/ios/download_models.sh -``` - -Run the following command to install TensorFlow Lite pod: - -``` - cd tensorflow/lite/examples/ios/camera - pod install -``` - -If you have installed this pod before and that command doesn't work, try - -``` - pod repo update -``` - -### Step 3. Build the XCode project - -Open the `tflite_camera_example.xcworkspace` project file generated in the last -step: - -``` - open tflite_camera_example.xcworkspace -``` - -Under `Project navigator -> tflite_camera_example -> Targets -> -tflite_camera_example -> General` change the bundle identifier by pre-pending -your name: - -![pre-pend your name to the bundle identifier](images/bundle_identifier.png) - -Plug in your iOS device. Note that the app must be executed with a real device with -a camera. Select the iOS device from the drop-down menu. - -![Device selection](images/device_selection.png) - -Click the "Run" button to build and run the app - -![Build and execute](images/build_and_execute.png) - -Note that, as mentioned earlier, you must already have a device set up and linked -to your Apple Developer account in order to deploy the app onto a device. - -You'll have to grant permissions for the app to use the device's camera. Point -the camera at various objects and enjoy seeing how the model classifies things! - -## Understanding iOS App Code +This example is written in both Swift and Objective-C. 
All application +functionality, image processing, and results formatting is developed in Swift. +Objective-C is used via +[bridging](https://developer.apple.com/documentation/swift/imported_c_and_objective-c_apis/importing_objective-c_into_swift) +to make the TensorFlow Lite C++ framework calls. ### Get camera input -The main logic of this app is in the Objective C++ source file -`tensorflow/lite/examples/ios/camera/CameraExampleViewController.mm`. +The main logic of this app is in the Swift source file +[`ViewController.swift`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/ImageClassification/ViewControllers/ViewController.swift). -The `setupAVCapture` method constructs a `AVCaptureSession` and set itself as a -delegate. The `captureOutput:didOutputSampleBuffer:fromConnection:` method is -called for every captured frame. It calls `runModelOnFrame` to run the model for -every frame. +The app's main view is represented by the `ViewController` class, which we +extend with functionality from `CameraFeedManagerDelegate`, a class created to +handle a camera feed. To run inference on the feed, we implement the `didOutput` +method, which is called whenever a frame is available from the camera. -### Create an interpreter +Our implementation of `didOutput` includes a call to the `runModel` method of a +`ModelDataHandler` instance. As we will see below, this class gives us access to +the TensorFlow Lite interpreter and the image classification model we are using. -To create the interpreter, we need to load the model file. The following code -will load a model and create an interpreter. +```swift +extension ViewController: CameraFeedManagerDelegate { -``` -model = tflite::FlatBufferModel::BuildFromFile([graph_path UTF8String]); + func didOutput(pixelBuffer: CVPixelBuffer) { + + // Run the live camera pixelBuffer through TensorFlow to get the result + let currentTimeMs = Date().timeIntervalSince1970 * 1000 + + guard (currentTimeMs - previousInferenceTimeMs) >= delayBetweenInferencesMs else { + return + } + + previousInferenceTimeMs = currentTimeMs + result = modelDataHandler?.runModel(onFrame: pixelBuffer) + + DispatchQueue.main.async { + + let resolution = CGSize(width: CVPixelBufferGetWidth(pixelBuffer), height: CVPixelBufferGetHeight(pixelBuffer)) + + // Display results by handing off to the InferenceViewController + self.inferenceViewController?.inferenceResult = self.result + self.inferenceViewController?.resolution = resolution + self.inferenceViewController?.tableView.reloadData() + + } + } +... ``` -Behind the scenes, the model is loaded as a memory-mapped file. It offers faster -load times and reduce the dirty pages in memory. +### TensorFlow Lite wrapper -Construct a `BuiltinOpResolver` to use the TensorFliw Lite buildin ops. Then, -create the interpreter object using `InterpreterBuilder` that takes the model -file as argument as shown below. +The app uses TensorFlow Lite's C++ library via an Objective-C wrapper defined in +[`TfliteWrapper.h`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/ImageClassification/TensorFlowLiteWrapper/TfliteWrapper.h) +and +[`TfliteWrapper.mm`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/ImageClassification/TensorFlowLiteWrapper/TfliteWrapper.mm). 
-``` -tflite::ops::builtin::BuiltinOpResolver resolver; -tflite::InterpreterBuilder(*model, resolver)(&interpreter); +This wrapper is required because currently there is no interoperability between +Swift and C++. The wrapper is exposed to Swift via bridging so that the +Tensorflow Lite methods can be called from Swift. + +### ModelDataHandler + +The Swift class `ModelDataHandler`, defined by +[`ModelDataHandler.swift`](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/ImageClassification/ModelDataHandler/ModelDataHandler.swift), +handles all data preprocessing and makes calls to run inference on a given frame +through the TfliteWrapper. It then formats the inferences obtained and returns +the top N results for a successful inference. + +The following sections show how this works. + +#### Initialization + +The method `init` instantiates a `TfliteWrapper` and loads the supplied model +and labels files from disk. + +```swift +init?(modelFileName: String, labelsFileName: String, labelsFileExtension: String) { + + // Initializes TFliteWrapper and based on the setup result of interpreter, initializes the object of this class + self.tfLiteWrapper = TfliteWrapper(modelFileName: modelFileName) + guard self.tfLiteWrapper.setUpModelAndInterpreter() else { + return nil + } + + super.init() + + tfLiteWrapper.setNumberOfThreads(threadCount) + + // Opens and loads the classes listed in the labels file + loadLabels(fromFileName: labelsFileName, fileExtension: labelsFileExtension) +} ``` -### Obtain the input buffer +#### Process input -By default, the app uses a quantized model since it's smaller and faster. The -buffer is a raw pointer to an array of 8 bit unsigned integers (`uint8_t`). The -following code obtains the input buffer from the interpreter: +The method `runModel` accepts a `CVPixelBuffer` of camera data, which can be +obtained from the `didOutput` method defined in `ViewController`. -``` -// Get the index of first input tensor. -int input_tensor_index = interpreter->inputs()[0]; -// Get the pointer to the input buffer. -uint8_t* buffer = interpreter->typed_tensor(input_tensor_index); +We crop the image, call `CVPixelBufferLockBaseAddress` to prepare the buffer to +be read by the CPU, and then create an input tensor using the TensorFlow Lite +wrapper: + +```swift +guard let tensorInputBaseAddress = tfLiteWrapper.inputTensor(at: 0) else { + return nil +} ``` -Throughout this document, it's assumed that a quantized model is used. +The image buffer contains an encoded color for each pixel in `BGRA` format +(where `A` represents Alpha, or transparency), and our model expects it in `RGB` +format. We now step through the buffer four bytes at a time, copying the three +bytes we care about (`R`, `G`, and `B`) to the input tensor. -### Pre-process bitmap image +Note: Since we are using a quantized model, we can directly use the `UInt8` +values from the buffer. If we were using a float model, we would have to convert +them to floating point by dividing by 255. -The MobileNet model that we're using takes 224x224x3 inputs, where the dimensions are -width, height, and colors (RGB). The images returned from `AVCaptureSession` is -bigger and has 4 color channels (RGBA). +```swift +let inputImageBaseAddress = sourceStartAddrss.assumingMemoryBound(to: UInt8.self) -Many image classification models (like MobileNet) take fixe-sized inputs. It's -required to scale or crop the image before feeding it into the model and change -the channels from RGBA to RGB. 
+for y in 0...wantedInputHeight - 1 { + let tensorInputRow = tensorInputBaseAddress.advanced(by: (y * wantedInputWidth * wantedInputChannels)) + let inputImageRow = inputImageBaseAddress.advanced(by: y * wantedInputWidth * imageChannels) -The code to pre-process the images is in `ProcessInputWithQuantizedModel` -function in -`tensorflow/lite/examples/ios/camera/CameraExampleViewController.mm`. It's a -simple implementation for nearest neighbor color sampling and it only copies -the first 3 bytes for each pixel. + for x in 0...wantedInputWidth - 1 { -``` -void ProcessInputWithQuantizedModel( - uint8_t* input, uint8_t* output, int image_width, int image_height, int image_channels) { - for (int y = 0; y < wanted_input_height; ++y) { - uint8_t* out_row = output + (y * wanted_input_width * wanted_input_channels); - for (int x = 0; x < wanted_input_width; ++x) { - const int in_x = (y * image_width) / wanted_input_width; - const int in_y = (x * image_height) / wanted_input_height; - uint8_t* in_pixel = input + (in_y * image_width * image_channels) + (in_x * image_channels); - uint8_t* out_pixel = out_row + (x * wanted_input_channels); - for (int c = 0; c < wanted_input_channels; ++c) { - out_pixel[c] = in_pixel[c]; - } + let out_pixel = tensorInputRow.advanced(by: x * wantedInputChannels) + let in_pixel = inputImageRow.advanced(by: x * imageChannels) + + var b = 2 + for c in 0...(wantedInputChannels) - 1 { + + // We are reversing the order of pixels since the source pixel format is BGRA, but the model requires RGB format. + out_pixel[c] = in_pixel[b] + b = b - 1 } } } ``` -Note that the code pre-processes and prepares the model input from the camera -data. Therefore, the first parameter `input` should be the camera buffer. The -second parameter `output` should be the buffer of model input. +#### Run inference -### Run inference and obtain output buffer +Running inference is a simple call to `tfLiteWrapper.invokeInterpreter()`. The +result of this synchronous call can be obtained by calling +`tfLiteWrapper.outputTensor()`. -After pre-processing and filling the data into the input buffer of the -interpreter, it's really easy to run the interpreter: +```swift +guard tfLiteWrapper.invokeInterpreter() else { + return nil +} -``` -if (interpreter->Invoke() != kTfLiteOk) { - NSLog("Failed to invoke!"); +guard let outputTensor = tfLiteWrapper.outputTensor(at: 0) else { + return nil } ``` -The result is stored in the output tensor buffer of the interpreter. The -following code obtains the pointer to the buffer: +#### Process results -``` -// Get the index of first output tensor. -const int output_tensor_index = interpreter->outputs()[0]; -// Get the pointer to the output buffer. -uint8_t* buffer = interpreter->typed_tensor(output_tensor_index); -``` +The `getTopN` method, also declared in `ModelDataHandler.swift`, interprets the +contents of the output tensor. It returns a list of the top N predictions, +ordered by confidence. -### Post-process values +The output tensor contains one `UInt8` value per class label, with a value +between 0 and 255 corresponding to a confidence of 0 to 100% that each label is +present in the image. -The output buffer contains an array of `uint8_t`, and the value range is from 0-255. -We need to convert the value to float to get the probabilities with a value range from -0.0-1.0. 
The formula of the quantization value mapping is: +First, the results are mapped into an array of `Inference` instances, each with +a `confidence` between 0 and 1 and a `className` representing the label. - float_value = (quantized_value - zero_point) * scale +```swift +for i in 0...predictionSize - 1 { + let value = Double(prediction[i]) / 255.0 -The following code converts quantized values back to float values, using the -quantizaiton parameters in tensors: + guard i < labels.count else { + continue + } -``` -uint8_t* quantized_output = interpreter->typed_output_tensor(0); -int32_t zero_point = input_tensor->params.zero_point; -float scale = input_tensor->params.scale; -float output[output_size]; -for (int i = 0; i < output_size; ++i) { - output[i] = (quantized_output[i] - zero_point) * scale; + let inference = Inference(confidence: value, className: labels[i]) + resultsArray.append(inference) } ``` -Finally, we find the best set of classifications by storing them in a priority -queue based on their confidence scores. See the `GetTopN` function in -`tensorflow/lite/examples/ios/camera/CameraExampleViewController.mm`. +Next, the results are sorted, and we return the top `N` (where N is +`resultCount`). + +```swift +resultsArray.sort { (first, second) -> Bool in + return first.confidence > second.confidence +} + +guard resultsArray.count > resultCount else { + return resultsArray +} +let finalArray = resultsArray[0..What is image classification? -If you understand image classification, you’re new to TensorFlow Lite, and -you’re working with Android or iOS, we recommend following the corresponding -tutorial that will walk you through our sample code. - -Android -iOS - -We also provide example applications you can -use to get started. +To learn how to use image classification in a mobile app, we recommend exploring +our Example applications and guides. If you are using a platform other than Android or iOS, or you are already -familiar with the -TensorFlow Lite -APIs, you can download our starter image classification model and the -accompanying labels. +familiar with the TensorFlow Lite APIs, you can download our starter image +classification model and the accompanying labels. Download starter model and labels @@ -35,16 +26,28 @@ experiment with different models to find the optimal balance between performance, accuracy, and model size. For guidance, see Choose a different model. -### Example applications +### Example applications and guides We have example applications for image classification for both Android and iOS. +For each example, we provide a guide that explains how it works. -Android -example -iOS -example +#### Android -The following screenshot shows the Android image classification example: +View +Android example + +Read the [Android example guide](android.md) to learn how the app works. + +#### iOS + +View +iOS example + +Read the [iOS example guide](ios.md) to learn how the app works. + +#### Screenshot + +The following screenshot shows the Android image classification example. Screenshot of Android example @@ -199,8 +202,8 @@ If you want to train a model to recognize new classes, see For the following use cases, you should use a different type of model:
-  • Predicting the type and position of one or more objects within an image (see object detection)
-  • Predicting the composition of an image, for example subject versus background (see segmentation)
+  • Predicting the type and position of one or more objects within an image (see Object detection)
+  • Predicting the composition of an image, for example subject versus background (see Segmentation)
Once you have the starter model running on your target device, you can @@ -226,7 +229,7 @@ analyze each frame in the time before the next frame is drawn (e.g. inference must be faster than 33ms to perform real-time inference on a 30fps video stream). -Our quantized Mobilenet models’ performance ranges from 3.7ms to 80.3 ms. +Our quantized MobileNet models’ performance ranges from 3.7ms to 80.3 ms. ### Accuracy @@ -240,7 +243,7 @@ appears as the label with the highest probability in the model’s output. Top-5 refers to how often the correct label appears in the top 5 highest probabilities in the model’s output. -Our quantized Mobilenet models’ Top-5 accuracy ranges from 64.4 to 89.9%. +Our quantized MobileNet models’ Top-5 accuracy ranges from 64.4 to 89.9%. ### Size @@ -248,13 +251,13 @@ The size of a model on-disk varies with its performance and accuracy. Size may be important for mobile development (where it might impact app download sizes) or when working with hardware (where available storage might be limited). -Our quantized Mobilenet models’ size ranges from 0.5 to 3.4 Mb. +Our quantized MobileNet models’ size ranges from 0.5 to 3.4 Mb. ### Architecture There are several different architectures of models available on List of hosted models, indicated by -the model’s name. For example, you can choose between Mobilenet, Inception, and +the model’s name. For example, you can choose between MobileNet, Inception, and others. The architecture of a model impacts its performance, accuracy, and size. All of