Added documentation for TFLite text_classification and bert_qa model
PiperOrigin-RevId: 276590063 Change-Id: Id29cf4d7be49404352455826cca2c06b6d92dafb
This commit is contained in:
parent b74e8fb4a5
commit eb0d31e7b7
BIN  tensorflow/lite/g3doc/models/bert_qa/images/screenshot.gif  (new file)
Binary file not shown. After: 625 KiB
83  tensorflow/lite/g3doc/models/bert_qa/overview.md  (new file)
@@ -0,0 +1,83 @@
# Question and Answer

Use a pre-trained model to answer questions based on the content of a given
passage.

## Get started

<img src="images/screenshot.gif" class="attempt-right" style="max-width: 300px">

If you are new to TensorFlow Lite and are working with Android, we recommend
exploring the following example applications that can help you get started.

<a class="button button-primary" href="https://github.com/tensorflow/examples/tree/master/lite/examples/bert_qa/android">Android
example</a>

If you are using a platform other than Android, or you are already familiar with
the [TensorFlow Lite APIs](https://www.tensorflow.org/api_docs/python/tf/lite),
you can download our starter question and answer model.

<a class="button button-primary" href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/bert_qa/mobilebert_qa_vocab.zip">Download
starter model and vocab</a>
## How it works

The model can be used to build a system that answers users’ questions in
natural language. It was created using a pre-trained BERT model fine-tuned on
the SQuAD 1.1 dataset.

[BERT](https://github.com/google-research/bert), or Bidirectional Encoder
Representations from Transformers, is a method of pre-training language
representations which obtains state-of-the-art results on a wide array of
Natural Language Processing tasks.

This app uses a compressed version of BERT, MobileBERT, which runs 4x faster
and has a 4x smaller model size.

[SQuAD](https://rajpurkar.github.io/SQuAD-explorer/), or Stanford Question
Answering Dataset, is a reading comprehension dataset consisting of articles
from Wikipedia and a set of question-answer pairs for each article.

The model takes a passage and a question as input, then returns the segment of
the passage that most likely answers the question. This requires moderately
complex pre-processing, including tokenization, and post-processing steps that
are described in the BERT [paper](https://arxiv.org/abs/1810.04805) and
implemented in the sample app.
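
The post-processing can be sketched as follows. This is a minimal illustration
of the span-selection idea from the BERT paper, not the sample app's actual
code; the function name, the toy logits, and the maximum-answer-length cutoff
are assumptions for the example.

```python
# Sketch of BERT QA post-processing: the model emits one "answer starts here"
# logit and one "answer ends here" logit per passage token. The answer is the
# token span (start, end) that maximizes start_logit + end_logit, subject to
# start <= end and a maximum answer length.

def best_answer_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end) token indices of the highest-scoring span."""
    best_score = float("-inf")
    best_span = (0, 0)
    for s, s_score in enumerate(start_logits):
        # Only consider ends at or after the start, within the length cap.
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = s_score + end_logits[e]
            if score > best_score:
                best_score, best_span = score, (s, e)
    return best_span

# Toy logits for a 6-token passage: token 3 scores highest as a start and
# token 4 as an end, so the answer covers tokens 3..4.
start_logits = [0.1, 0.2, 0.1, 2.5, 0.3, 0.1]
end_logits = [0.1, 0.1, 0.2, 0.4, 2.2, 0.3]
print(best_answer_span(start_logits, end_logits))  # (3, 4)
```

The selected token indices are then mapped back through the tokenizer to the
original passage text to produce the answer string.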
## Example output

### Passage (Input)

> Google LLC is an American multinational technology company that specializes in
> Internet-related services and products, which include online advertising
> technologies, search engine, cloud computing, software, and hardware. It is
> considered one of the Big Four technology companies, alongside Amazon, Apple,
> and Facebook.
>
> Google was founded in September 1998 by Larry Page and Sergey Brin while they
> were Ph.D. students at Stanford University in California. Together they own
> about 14 percent of its shares and control 56 percent of the stockholder
> voting power through supervoting stock. They incorporated Google as a
> California privately held company on September 4, 1998, in California. Google
> was then reincorporated in Delaware on October 22, 2002. An initial public
> offering (IPO) took place on August 19, 2004, and Google moved to its
> headquarters in Mountain View, California, nicknamed the Googleplex. In August
> 2015, Google announced plans to reorganize its various interests as a
> conglomerate called Alphabet Inc. Google is Alphabet's leading subsidiary and
> will continue to be the umbrella company for Alphabet's Internet interests.
> Sundar Pichai was appointed CEO of Google, replacing Larry Page who became the
> CEO of Alphabet.

### Question (Input)

> Who is the CEO of Google?

### Answer (Output)

> Sundar Pichai

## Read more about BERT

* Academic paper: [BERT: Pre-training of Deep Bidirectional Transformers for
  Language Understanding](https://arxiv.org/abs/1810.04805)
* [Open-source implementation of BERT](https://github.com/google-research/bert)
Binary file not shown. After: 305 KiB
65  tensorflow/lite/g3doc/models/text_classification/overview.md  (new file)
@@ -0,0 +1,65 @@
# Text Classification

Use a pre-trained model to categorize a paragraph into predefined groups.

## Get started

<img src="images/screenshot.png" class="attempt-right" style="max-width: 300px">

If you are new to TensorFlow Lite and are working with Android, we recommend
exploring the following example applications that can help you get started.

<a class="button button-primary" href="https://github.com/tensorflow/examples/tree/master/lite/examples/text_classification/android">Android
example</a>

If you are using a platform other than Android, or you are already familiar with
the TensorFlow Lite APIs, you can download our starter text classification
model.

<a class="button button-primary" href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/text_classification/text_classification.tflite">Download
starter model</a>
## How it works

Text classification categorizes a paragraph into predefined groups based on its
content.

This pretrained model predicts whether a paragraph's sentiment is positive or
negative. It was trained on the
[Large Movie Review Dataset v1.0](http://ai.stanford.edu/~amaas/data/sentiment/)
from Maas et al., which consists of IMDB movie reviews labeled as either
positive or negative.

Here are the steps to classify a paragraph with the model:

1. Tokenize the paragraph and convert it to a list of word ids using a
   predefined vocabulary.
1. Feed the list to the TensorFlow Lite model.
1. Get the probabilities of the paragraph being positive or negative from the
   model outputs.
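
Step 1 can be sketched as follows. This is a minimal illustration only: the toy
vocabulary, the `<START>`/`<PAD>`/`<UNK>` ids, and the sequence length are
assumptions for the example, not the vocabulary and input shape shipped with
the starter model.

```python
# Sketch of the pre-processing step: split a paragraph into word tokens, map
# each token to an id via a vocabulary (unknown words fall back to <UNK>),
# then pad the id list to the model's fixed input length.
import re

VOCAB = {"<PAD>": 0, "<START>": 1, "<UNK>": 2, "what": 3, "a": 4,
         "waste": 5, "of": 6, "my": 7, "time": 8}
SEQUENCE_LEN = 16  # illustrative; the real length comes from the model's input tensor

def to_ids(paragraph, vocab=VOCAB, seq_len=SEQUENCE_LEN):
    """Convert a paragraph to a fixed-length list of word ids."""
    tokens = re.findall(r"[\w']+", paragraph.lower())
    ids = [vocab["<START>"]] + [vocab.get(t, vocab["<UNK>"]) for t in tokens]
    ids = ids[:seq_len]
    return ids + [vocab["<PAD>"]] * (seq_len - len(ids))

ids = to_ids("What a waste of my time.")
print(ids)  # [1, 3, 4, 5, 6, 7, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# For steps 2 and 3, the ids would then be fed to the TFLite interpreter,
# along the lines of:
#   interpreter.set_tensor(input_index, np.array([ids], dtype=np.float32))
#   interpreter.invoke()
#   negative, positive = interpreter.get_tensor(output_index)[0]
```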

### Note

* Only English is supported.
* This model was trained on a movie review dataset, so you may experience
  reduced accuracy when classifying text from other domains.

## Example output

| Text                                                                     | Negative (0) | Positive (1) |
| ------------------------------------------------------------------------ | ------------ | ------------ |
| This is the best movie I’ve seen in recent years. Strongly recommend it! | 25.3%        | 74.7%        |
| What a waste of my time.                                                 | 72.5%        | 27.5%        |

## Use your training dataset

Follow this
[tutorial](https://github.com/tensorflow/examples/tree/master/lite/examples/model_customization/demo/image_classification.ipynb)
to apply the same technique used here to train a text classification model using
your own datasets. With the right dataset, you can create a model for use cases
such as document categorization or toxic comment detection.

## Read more about text classification

* [Word embeddings and tutorial to train this model](https://www.tensorflow.org/tutorials/text/word_embeddings)