Added documentation for TFLite text_classification and bert_qa model
PiperOrigin-RevId: 276590063 Change-Id: Id29cf4d7be49404352455826cca2c06b6d92dafb
This commit is contained in:
parent b74e8fb4a5
commit eb0d31e7b7
BIN  tensorflow/lite/g3doc/models/bert_qa/images/screenshot.gif  (new file)
Binary file not shown. After: 625 KiB
83  tensorflow/lite/g3doc/models/bert_qa/overview.md  (new file)
@@ -0,0 +1,83 @@
# Question and Answer

Use a pre-trained model to answer questions based on the content of a given
passage.

## Get started

<img src="images/screenshot.gif" class="attempt-right" style="max-width: 300px">

If you are new to TensorFlow Lite and are working with Android, we recommend
exploring the following example applications that can help you get started.

<a class="button button-primary" href="https://github.com/tensorflow/examples/tree/master/lite/examples/bert_qa/android">Android
example</a>

If you are using a platform other than Android, or you are already familiar with
the [TensorFlow Lite APIs](https://www.tensorflow.org/api_docs/python/tf/lite),
you can download our starter question and answer model.

<a class="button button-primary" href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/bert_qa/mobilebert_qa_vocab.zip">Download
starter model and vocab</a>
## How it works

The model can be used to build a system that answers users’ questions in
natural language. It was created using a pre-trained BERT model fine-tuned on
the SQuAD 1.1 dataset.

[BERT](https://github.com/google-research/bert), or Bidirectional Encoder
Representations from Transformers, is a method of pre-training language
representations which obtains state-of-the-art results on a wide array of
Natural Language Processing tasks.

This app uses a compressed version of BERT, MobileBERT, which runs 4x faster
and has a 4x smaller model size.

[SQuAD](https://rajpurkar.github.io/SQuAD-explorer/), or Stanford Question
Answering Dataset, is a reading comprehension dataset consisting of articles
from Wikipedia and a set of question-answer pairs for each article.

The model takes a passage and a question as input, then returns the segment of
the passage that most likely answers the question. This requires moderately
complex pre-processing, including tokenization, and post-processing steps that
are described in the BERT [paper](https://arxiv.org/abs/1810.04805) and
implemented in the sample app.
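
The post-processing can be sketched as follows. This is a minimal illustration
of the span-selection idea from the BERT paper, not the sample app's actual
code; the function name, the toy logits, and the maximum-answer-length cutoff
are assumptions for the example.

```python
# Sketch of BERT QA post-processing: the model emits one "answer starts here"
# logit and one "answer ends here" logit per passage token. The answer is the
# token span (start, end) that maximizes start_logit + end_logit, subject to
# start <= end and a maximum answer length.

def best_answer_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end) token indices of the highest-scoring span."""
    best_score = float("-inf")
    best_span = (0, 0)
    for s, s_score in enumerate(start_logits):
        # Only consider ends at or after the start, within the length cap.
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = s_score + end_logits[e]
            if score > best_score:
                best_score, best_span = score, (s, e)
    return best_span

# Toy logits for a 6-token passage: token 3 scores highest as a start and
# token 4 as an end, so the answer covers tokens 3..4.
start_logits = [0.1, 0.2, 0.1, 2.5, 0.3, 0.1]
end_logits = [0.1, 0.1, 0.2, 0.4, 2.2, 0.3]
print(best_answer_span(start_logits, end_logits))  # (3, 4)
```

The selected token indices are then mapped back through the tokenizer to the
original passage text to produce the answer string.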
## Example output

### Passage (Input)

> Google LLC is an American multinational technology company that specializes in
> Internet-related services and products, which include online advertising
> technologies, search engine, cloud computing, software, and hardware. It is
> considered one of the Big Four technology companies, alongside Amazon, Apple,
> and Facebook.
>
> Google was founded in September 1998 by Larry Page and Sergey Brin while they
> were Ph.D. students at Stanford University in California. Together they own
> about 14 percent of its shares and control 56 percent of the stockholder
> voting power through supervoting stock. They incorporated Google as a
> California privately held company on September 4, 1998, in California. Google
> was then reincorporated in Delaware on October 22, 2002. An initial public
> offering (IPO) took place on August 19, 2004, and Google moved to its
> headquarters in Mountain View, California, nicknamed the Googleplex. In August
> 2015, Google announced plans to reorganize its various interests as a
> conglomerate called Alphabet Inc. Google is Alphabet's leading subsidiary and
> will continue to be the umbrella company for Alphabet's Internet interests.
> Sundar Pichai was appointed CEO of Google, replacing Larry Page who became the
> CEO of Alphabet.

### Question (Input)

> Who is the CEO of Google?

### Answer (Output)

> Sundar Pichai

## Read more about BERT

* Academic paper: [BERT: Pre-training of Deep Bidirectional Transformers for
  Language Understanding](https://arxiv.org/abs/1810.04805)
* [Open-source implementation of BERT](https://github.com/google-research/bert)
Binary file not shown. After: 305 KiB
65  tensorflow/lite/g3doc/models/text_classification/overview.md  (new file)
@@ -0,0 +1,65 @@
# Text Classification

Use a pre-trained model to categorize a paragraph into predefined groups.

## Get started

<img src="images/screenshot.png" class="attempt-right" style="max-width: 300px">

If you are new to TensorFlow Lite and are working with Android, we recommend
exploring the following example applications that can help you get started.

<a class="button button-primary" href="https://github.com/tensorflow/examples/tree/master/lite/examples/text_classification/android">Android
example</a>

If you are using a platform other than Android, or you are already familiar with
the TensorFlow Lite APIs, you can download our starter text classification
model.

<a class="button button-primary" href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/text_classification/text_classification.tflite">Download
starter model</a>
## How it works

Text classification categorizes a paragraph into predefined groups based on its
content.

This pretrained model predicts whether a paragraph's sentiment is positive or
negative. It was trained on the
[Large Movie Review Dataset v1.0](http://ai.stanford.edu/~amaas/data/sentiment/)
from Maas et al., which consists of IMDB movie reviews labeled as either
positive or negative.

Here are the steps to classify a paragraph with the model:

1. Tokenize the paragraph and convert it to a list of word ids using a
   predefined vocabulary.
1. Feed the list to the TensorFlow Lite model.
1. Get the probabilities of the paragraph being positive or negative from the
   model outputs.
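
Step 1 can be sketched as follows. This is a minimal illustration only: the toy
vocabulary, the `<START>`/`<PAD>`/`<UNK>` ids, and the sequence length are
assumptions for the example, not the vocabulary and input shape shipped with
the starter model.

```python
# Sketch of the pre-processing step: split a paragraph into word tokens, map
# each token to an id via a vocabulary (unknown words fall back to <UNK>),
# then pad the id list to the model's fixed input length.
import re

VOCAB = {"<PAD>": 0, "<START>": 1, "<UNK>": 2, "what": 3, "a": 4,
         "waste": 5, "of": 6, "my": 7, "time": 8}
SEQUENCE_LEN = 16  # illustrative; the real length comes from the model's input tensor

def to_ids(paragraph, vocab=VOCAB, seq_len=SEQUENCE_LEN):
    """Convert a paragraph to a fixed-length list of word ids."""
    tokens = re.findall(r"[\w']+", paragraph.lower())
    ids = [vocab["<START>"]] + [vocab.get(t, vocab["<UNK>"]) for t in tokens]
    ids = ids[:seq_len]
    return ids + [vocab["<PAD>"]] * (seq_len - len(ids))

ids = to_ids("What a waste of my time.")
print(ids)  # [1, 3, 4, 5, 6, 7, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# For steps 2 and 3, the ids would then be fed to the TFLite interpreter,
# along the lines of:
#   interpreter.set_tensor(input_index, np.array([ids], dtype=np.float32))
#   interpreter.invoke()
#   negative, positive = interpreter.get_tensor(output_index)[0]
```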

### Note

* Only English is supported.
* This model was trained on a movie review dataset, so you may experience
  reduced accuracy when classifying text from other domains.

## Example output

| Text                                                                     | Negative (0) | Positive (1) |
| ------------------------------------------------------------------------ | ------------ | ------------ |
| This is the best movie I’ve seen in recent years. Strongly recommend it! | 25.3%        | 74.7%        |
| What a waste of my time.                                                 | 72.5%        | 27.5%        |

## Use your training dataset

Follow this
[tutorial](https://github.com/tensorflow/examples/tree/master/lite/examples/model_customization/demo/image_classification.ipynb)
to apply the same technique used here to train a text classification model using
your own datasets. With the right dataset, you can create a model for use cases
such as document categorization or toxic comment detection.

## Read more about text classification

* [Word embeddings and tutorial to train this model](https://www.tensorflow.org/tutorials/text/word_embeddings)