Added documentation for TFLite text_classification and bert_qa models

PiperOrigin-RevId: 276590063
Change-Id: Id29cf4d7be49404352455826cca2c06b6d92dafb
Khanh LeViet, 2019-10-24 16:26:08 -07:00 (committed by TensorFlower Gardener)
commit eb0d31e7b7, parent b74e8fb4a5
4 changed files with 148 additions and 0 deletions

[Binary file added (not shown): images/screenshot.gif, 625 KiB]

@@ -0,0 +1,83 @@
# Question and Answer
Use a pre-trained model to answer questions based on the content of a given
passage.
## Get started
<img src="images/screenshot.gif" class="attempt-right" style="max-width: 300px">
If you are new to TensorFlow Lite and are working with Android, we recommend
exploring the following example application, which can help you get started.
<a class="button button-primary" href="https://github.com/tensorflow/examples/tree/master/lite/examples/bert_qa/android">Android
example</a>
If you are using a platform other than Android, or you are already familiar with
the [TensorFlow Lite APIs](https://www.tensorflow.org/api_docs/python/tf/lite),
you can download our starter question and answer model.
<a class="button button-primary" href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/bert_qa/mobilebert_qa_vocab.zip">Download
starter model and vocab</a>
## How it works
The model can be used to build a system that answers users' questions in
natural language. It was created using a pre-trained BERT model fine-tuned on
the SQuAD 1.1 dataset.
[BERT](https://github.com/google-research/bert), or Bidirectional Encoder
Representations from Transformers, is a method of pre-training language
representations which obtains state-of-the-art results on a wide array of
Natural Language Processing tasks.
This app uses a compressed version of BERT, MobileBERT, that runs 4x faster and
is 4x smaller.
[SQuAD](https://rajpurkar.github.io/SQuAD-explorer/), or Stanford Question
Answering Dataset, is a reading comprehension dataset consisting of articles
from Wikipedia and a set of question-answer pairs for each article.
The model takes a passage and a question as input, then returns the segment of
the passage that most likely answers the question. It requires non-trivial
pre-processing (including tokenization) and post-processing steps, which are
described in the BERT [paper](https://arxiv.org/abs/1810.04805) and implemented
in the sample app.
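On a platform with a Python runtime, inference with the starter model looks
roughly like the sketch below. This is a minimal sketch under assumptions, not
the sample app's implementation: the model filename is hypothetical, the
input/output tensor order should be verified with
`get_input_details()`/`get_output_details()`, and the WordPiece tokenization of
the question and passage is elided.

```python
# Minimal sketch of BERT QA inference with the TFLite Python interpreter.
# Assumptions: three int32 input tensors (token ids, attention mask, segment
# ids) and two output tensors (start/end logits), which is typical of
# BERT-style QA models. Verify the actual tensor order on your model.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="mobilebert_qa.tflite")  # hypothetical filename
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

seq_len = input_details[0]["shape"][1]
# These arrays must come from WordPiece-tokenizing the question and passage
# with the vocab file bundled with the model (elided here).
input_ids = np.zeros((1, seq_len), dtype=np.int32)
input_mask = np.zeros((1, seq_len), dtype=np.int32)
segment_ids = np.zeros((1, seq_len), dtype=np.int32)

for detail, tensor in zip(input_details, (input_ids, input_mask, segment_ids)):
    interpreter.set_tensor(detail["index"], tensor)
interpreter.invoke()

# The model emits per-token logits; the answer is the passage span that
# maximizes start_logit + end_logit.
start_logits = interpreter.get_tensor(output_details[0]["index"])[0]
end_logits = interpreter.get_tensor(output_details[1]["index"])[0]
start, end = int(np.argmax(start_logits)), int(np.argmax(end_logits))
print("Answer token span:", start, end)
```

In practice, the predicted token span still has to be mapped back to the
original passage text, and candidate spans where the end precedes the start
are discarded during post-processing.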
## Example output
### Passage (Input)
> Google LLC is an American multinational technology company that specializes in
> Internet-related services and products, which include online advertising
> technologies, search engine, cloud computing, software, and hardware. It is
> considered one of the Big Four technology companies, alongside Amazon, Apple,
> and Facebook.
>
> Google was founded in September 1998 by Larry Page and Sergey Brin while they
> were Ph.D. students at Stanford University in California. Together they own
> about 14 percent of its shares and control 56 percent of the stockholder
> voting power through supervoting stock. They incorporated Google as a
> California privately held company on September 4, 1998, in California. Google
> was then reincorporated in Delaware on October 22, 2002. An initial public
> offering (IPO) took place on August 19, 2004, and Google moved to its
> headquarters in Mountain View, California, nicknamed the Googleplex. In August
> 2015, Google announced plans to reorganize its various interests as a
> conglomerate called Alphabet Inc. Google is Alphabet's leading subsidiary and
> will continue to be the umbrella company for Alphabet's Internet interests.
> Sundar Pichai was appointed CEO of Google, replacing Larry Page who became the
> CEO of Alphabet.
### Question (Input)
> Who is the CEO of Google?
### Answer (Output)
> Sundar Pichai
## Read more about BERT
* Academic paper: [BERT: Pre-training of Deep Bidirectional Transformers for
Language Understanding](https://arxiv.org/abs/1810.04805)
* [Open-source implementation of BERT](https://github.com/google-research/bert)

[Binary file added (not shown): images/screenshot.png, 305 KiB]

@@ -0,0 +1,65 @@
# Text Classification
Use a pre-trained model to categorize a paragraph into predefined groups.
## Get started
<img src="images/screenshot.png" class="attempt-right" style="max-width: 300px">
If you are new to TensorFlow Lite and are working with Android, we recommend
exploring the following example application, which can help you get started.
<a class="button button-primary" href="https://github.com/tensorflow/examples/tree/master/lite/examples/text_classification/android">Android
example</a>
If you are using a platform other than Android, or you are already familiar with
the TensorFlow Lite APIs, you can download our starter text classification
model.
<a class="button button-primary" href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/text_classification/text_classification.tflite">Download
starter model</a>
## How it works
Text classification categorizes a paragraph into predefined groups based on its
content.
This pretrained model predicts if a paragraph's sentiment is positive or
negative. It was trained on
[Large Movie Review Dataset v1.0](http://ai.stanford.edu/~amaas/data/sentiment/)
from Maas et al., which consists of IMDB movie reviews labeled as either positive
or negative.
Here are the steps to classify a paragraph with the model (see the sketch after
this list):
1. Tokenize the paragraph and convert it to a list of word ids using a
predefined vocabulary.
1. Feed the list to the TensorFlow Lite model.
1. Get the probability of the paragraph being positive or negative from the
model outputs.
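The three steps map to the TensorFlow Lite Python API roughly as follows. This
is a minimal sketch, not the Android sample's code: the regex tokenizer and the
placeholder vocabulary are illustrative assumptions, and you should check them
against the vocab file and input shape that ship with the starter model.

```python
# A minimal sketch of the three steps above with the TFLite Python
# interpreter. The tokenizer and placeholder vocabulary are assumptions
# for illustration; the real vocab file ships with the starter model.
import re
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="text_classification.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
seq_len = inp["shape"][1]  # fixed input length expected by the model

vocab = {"<PAD>": 0, "<START>": 1, "<UNK>": 2}  # placeholder; load the real file

def tokenize(text):
    """Step 1: split into words, map each to an id, pad to seq_len."""
    words = re.findall(r"[a-z']+", text.lower())
    ids = [vocab.get(w, vocab["<UNK>"]) for w in words][:seq_len]
    return ids + [vocab["<PAD>"]] * (seq_len - len(ids))

# Step 2: feed the id list to the TensorFlow Lite model.
ids = np.array([tokenize("This is the best movie I've seen.")], dtype=inp["dtype"])
interpreter.set_tensor(inp["index"], ids)
interpreter.invoke()

# Step 3: the output holds the [negative, positive] probabilities,
# matching the (0)/(1) columns in the example table below.
negative, positive = interpreter.get_tensor(out["index"])[0]
print(f"negative={negative:.1%}, positive={positive:.1%}")
```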
### Note
* Only English is supported.
*   This model was trained on a movie reviews dataset, so you may experience
    reduced accuracy when classifying text from other domains.
## Example output
| Text                                                                      | Negative (0) | Positive (1) |
| ------------------------------------------------------------------------- | ------------ | ------------ |
| This is the best movie I've seen in recent years. Strongly recommend it!   | 25.3%        | 74.7%        |
| What a waste of my time.                                                   | 72.5%        | 27.5%        |
## Use your own training dataset
Follow this
[tutorial](https://github.com/tensorflow/examples/tree/master/lite/examples/model_customization/demo/image_classification.ipynb)
to apply the same technique used here to train a text classification model using
your own datasets. With the right dataset, you can create a model for use cases
such as document categorization or toxic comment detection.
## Read more about text classification
* [Word embeddings and tutorial to train this model](https://www.tensorflow.org/tutorials/text/word_embeddings)