Commit Graph

100 Commits

Tiezhen WANG
fec830a3c8 TFLM: Fix hard crash on Arduino.
A long debug log caused the program to crash.

This debug log is of little use anyway.

PiperOrigin-RevId: 338264544
Change-Id: I940208380074e74a40e21cf22960fdc8f1c00c4d
2020-10-21 10:52:08 -07:00
Jaesung Chung
62feaa576a Update the remaining unchanged spots that read the builtin code
PiperOrigin-RevId: 338183423
Change-Id: Ib208da7e61475165b58dbac3771c9c330ca6f101
2020-10-20 19:30:01 -07:00
Nick Kreeger
b024551db4 Clean up bug references to the TfLiteEvalTensor API and point to the bug for adding this functionality to the MicroInterpreter.
All kernels have been ported aside from some optimized versions. The only open issues involve adding buffer/TfLiteEvalTensor API functionality to MicroInterpreter. This change simply cleans up references in the allocation and interpreter code.

PiperOrigin-RevId: 336274016
Change-Id: I0707739fa51b40a1621410639779e576b63bfcb7
2020-10-09 05:16:08 -07:00
Nick Kreeger
a7bdaeba61 Drop various buffer pointer getters in SimpleMemoryAllocator.
The API for SimpleMemoryAllocator should simply set the size of the head buffer and return a pointer to the start of that buffer. The current APIs exposed on this class are confusing and can easily be mixed up. This change drops those getters and relies on public-facing methods to expose functionality. Additionally, the head APIs are renamed to better reflect what they do: manage the single head buffer allocation.
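
A hypothetical sketch of the simplified surface described above (method names are illustrative and may not exactly match simple_memory_allocator.h):

class SimpleMemoryAllocator {
 public:
  // Sets the head buffer to exactly `size` bytes (with alignment); fails if
  // the head would collide with existing tail allocations.
  TfLiteStatus SetHeadBufferSize(size_t size, size_t alignment);
  // Returns a pointer to the start of the head buffer.
  uint8_t* GetHeadBuffer() const;
  // Tail allocations grow down from the end of the arena.
  uint8_t* AllocateFromTail(size_t size, size_t alignment);
};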

PiperOrigin-RevId: 336273280
Change-Id: Ibd4218c40962946b90633ca55169595057ea46c3
2020-10-09 05:10:03 -07:00
A. Unique TensorFlower
dd934175ec Avoid compiler crash on aggregate initialization of flexible array member
PiperOrigin-RevId: 335754239
Change-Id: Ibc812c55e7e64739a030a6f03976c9c73d799ad2
2020-10-06 17:20:05 -07:00
Nick Kreeger
ba92a24c57 Drop the stack-allocated instance of SimpleMemoryAllocator when committing the memory plan in MicroAllocator.
All allocations are now handled internally in ::CommitStaticMemoryPlan() to improve readability and tracking of allocations.

PiperOrigin-RevId: 335550726
Change-Id: Ia472939d216b950b234e9192fb60206f4a247c91
2020-10-05 19:28:00 -07:00
Nick Kreeger
1454ee0907 Reduce memory overhead with scratch buffer allocations.
This change moves scratch buffer request logic from three classes (ContextHelper, MicroInterpreter, MicroAllocator) into just the MicroAllocator. All member variable instances in ContextHelper are dropped in favor of storing temporary request structs in the head section. When a model finishes allocation, one final allocation of the ScratchBufferHandle structs is placed in the tail (one allocation per model). All temporary request placeholders in the head section are dropped once the model finishes allocation.
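
Roughly, the lifecycle described above looks like this (struct names and fields approximate the internal micro_allocator types and are not an exact copy):

// Placed in the HEAD while kernels run Prepare() and request scratch memory.
struct ScratchBufferRequest {
  size_t bytes;   // requested size
  int node_idx;   // kernel node that made the request
};

// One array of these is committed to the TAIL per model after planning.
struct ScratchBufferHandle {
  uint8_t* data;  // resolved arena address, filled in after memory planning
};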

PiperOrigin-RevId: 335009075
Change-Id: Ic8e4e821563dd00a65e85f416df791bba778588d
2020-10-02 05:33:53 -07:00
Nick Kreeger
440a655746 Change SimpleMemoryAllocator to always set a specific HEAD allocation size.
The current implementation allows EnsureHeadSize() to be called several times and ensures that the head is the size of the largest value specified. An upcoming change will reuse the head for storing kernel-requested scratch buffer allocations instead of members on a class (currently ContextHelper). This change allows TFLM to adjust the head size but always revert to the largest model head requirement when memory planning is complete for a model.

PiperOrigin-RevId: 334492115
Change-Id: Ic03e0af7b61acaccd69b2d5aeaea352d201d4c0d
2020-09-29 17:11:28 -07:00
TensorFlower Gardener
202fc9a808 Merge pull request from mansnils:scratch_tensors
PiperOrigin-RevId: 331920973
Change-Id: Ib0b176d8e6b4eac35e8c97bec5f9fa8c8fd41834
2020-09-15 21:13:48 -07:00
Måns Nilsson
2411dcbbf7 TFLu: Update Ethos-U related test with comments from review 2020-09-14 16:56:39 +02:00
Tiezhen WANG
59d177d9ac TFLM: Allow interleaving RequestScratchBuffer and AllocatePersistentBuffer in kernels.
Major changes:
- Scratch buffers are placed in the head during the prepare stage and then moved to the tail once their total length is known, before the static memory plan is committed.
- ContextHelper sends RequestScratchBuffer requests in a batch to work around a limitation with temp allocations during the Prepare stage (see the kernel-side sketch below).
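
A minimal kernel-side sketch of the interleaved calls (OpData and the kernel entry points are illustrative placeholders; the TfLiteContext callbacks are the ones named above, assuming tensorflow/lite/c/common.h):

struct OpData {
  int scratch_index;
};

void* Init(TfLiteContext* context, const char* buffer, size_t length) {
  // Persistent allocations live in the arena tail for the interpreter's
  // lifetime.
  return context->AllocatePersistentBuffer(context, sizeof(OpData));
}

TfLiteStatus Prepare(TfLiteContext* context, TfLiteNode* node) {
  OpData* data = static_cast<OpData*>(node->user_data);
  // Requests made here are batched by ContextHelper and only become real
  // arena offsets once static memory planning completes.
  TF_LITE_ENSURE_STATUS(context->RequestScratchBufferInArena(
      context, /*bytes=*/1024, &data->scratch_index));
  return kTfLiteOk;
}

TfLiteStatus Eval(TfLiteContext* context, TfLiteNode* node) {
  OpData* data = static_cast<OpData*>(node->user_data);
  void* scratch = context->GetScratchBuffer(context, data->scratch_index);
  // ... use scratch as per-invocation working memory ...
  return scratch != nullptr ? kTfLiteOk : kTfLiteError;
}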

PiperOrigin-RevId: 328945674
Change-Id: I09db5c1be0e225904f1c4bf3a5a4a2831a5db438
2020-08-28 08:59:23 -07:00
Måns Nilsson
82a899657a TFLu: Support operator-only input tensors.
Ethos-U relies on being able to differentiate between operator input tensors
that are also subgraph inputs and those that are not.
2020-08-24 18:14:45 +02:00
Nick Kreeger
d5ed5f9895 Fix and cleanup head adjustment allocation with offsets.
Currently, some platforms that have offsets during allocation (e.g. something on the Sparkfun @ 32bits) will fail to allocate. This is due to how the head was adjusted and how the allocation size was requested in MicroAllocator.cc during the memory planning phase (the part that gets committed to the head).

First, this change fixes the actual-bytes-available call by taking into account the requested offset. This bug is exposed by the new adjust-head API. All head space was requested as a temp buffer to plan memory usage, and this allocation did not account for offsets properly.

Secondly, I've simplified the API for head adjustment. The head is a value that can be set with a given requested size + offset. The watermark logic has been removed in favor of simplicity - callers (e.g. MicroAllocator) should check if they need to increase the head size before adjusting.

PiperOrigin-RevId: 324426138
Change-Id: Ifc683450ba32b9dd9fc5ba587855608a0bc6e311
2020-08-01 15:43:41 -07:00
Tiezhen WANG
6a875d4761 TFLM: Implement the last piece for multi-tenant allocator.
The major change is in SimpleMemoryAllocator to allow the head space to be reused among different models.

PiperOrigin-RevId: 323470479
Change-Id: If709181da5e9b71222742b2850e6b08d25122a49
2020-07-27 16:58:19 -07:00
A. Unique TensorFlower
dd9e169370 Fix 'unused variable' compiler error when building for ARM Cortex M4 in release mode
PiperOrigin-RevId: 323071869
Change-Id: I894b62b4c333838d45e556856f4bfef4f5d98a9c
2020-07-24 14:29:20 -07:00
Advait Jain
983d0003ad sanity -> consistency / smoke.
PiperOrigin-RevId: 322688516
Change-Id: I4d4bdc45f45063ec1a67b812107583899a101cf3
2020-07-22 17:18:31 -07:00
A. Unique TensorFlower
8d46f31b43 Change AllocatePersistentBuffer API to just return a pointer on success, or nullptr upon failure
Changes the canonical usage pattern from:

void* data = nullptr;
if (context->AllocatePersistentBuffer(context, sizeof(OpData), &data) ==
    kTfLiteError) {
  return nullptr;
}
return data;

to:

return context->AllocatePersistentBuffer(context, sizeof(OpData));

PiperOrigin-RevId: 322452971
Change-Id: If5de6a44978ce464b33605b8ed186d9767e0716d
2020-07-21 15:46:23 -07:00
Nick Kreeger
4f6c86b163 Switch TF Micro to use TfLiteEval tensors by default.
This change drastically modifies the way memory is used in TF Micro. Currently, large blocks of persistent memory are allocated for TfLiteTensor structs and any associated quantization data. Instead of this pattern, those TfLiteTensor structs and quantization data will be allocated from the "temp" section of the memory arena.

Instead of allocating a large block of TfLiteTensor structs, minimal TfLiteEvalTensor structs are allocated. This new struct serves as the source of truth for all buffers in the graph.

Everything still works in the kernel implementations with this change - they are just temporarily slower. All TfLiteTensor structs fetched from GetInput()/GetOutput()/etc. are now allocated on the fly through the temp allocation. Each kernel should be updated to fetch the TfLiteEvalTensor struct in its Eval() block, and quantization data should be cached in the op kernel's own data.
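
As a rough sketch of the intended per-kernel pattern (the tflite::micro helpers mirror those used by the ported reference kernels; OpData is a placeholder):

struct OpData {
  float input_scale;
  int32_t input_zero_point;
};

TfLiteStatus Prepare(TfLiteContext* context, TfLiteNode* node) {
  // GetInput() now returns a temp TfLiteTensor allocated on the fly; read what
  // is needed here and cache it instead of holding on to the pointer.
  const TfLiteTensor* input = GetInput(context, node, 0);
  OpData* data = static_cast<OpData*>(node->user_data);
  data->input_scale = input->params.scale;
  data->input_zero_point = input->params.zero_point;
  return kTfLiteOk;
}

TfLiteStatus Eval(TfLiteContext* context, TfLiteNode* node) {
  // TfLiteEvalTensor is the persistent source of truth for buffers.
  const TfLiteEvalTensor* input = tflite::micro::GetEvalInput(context, node, 0);
  TfLiteEvalTensor* output = tflite::micro::GetEvalOutput(context, node, 0);
  const int8_t* in = tflite::micro::GetTensorData<int8_t>(input);
  int8_t* out = tflite::micro::GetTensorData<int8_t>(output);
  // ... compute with the quantization params cached in OpData ...
  return kTfLiteOk;
}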

This CL saves up to 50% on the arena for larger conv-based models.

PiperOrigin-RevId: 322224278
Change-Id: Id32509a75c9f68177f5bb6b850ea11907afcbb1d
2020-07-20 14:27:53 -07:00
Nick Kreeger
ac2a037cf7 Enable the ability to allocate TfLiteTensor structs from temp and persistent memory.
Upcoming changes to memory allocations will remove the global TfLiteTensor allocation. This change prepares the allocator for internal adjustments to memory requests. When the class fully switches over to TfLiteEvalTensor, the TfLitePtrUnion data buffer will be used instead of the existing large allocation on TfLiteContext.

PiperOrigin-RevId: 321882339
Change-Id: Ia33fe5f3f5f10bb5fce3f4a78fbc4e97a4021dae
2020-07-17 17:06:37 -07:00
Advait Jain
c29d6434ba Make warnings in the external builds match the internal builds.
PiperOrigin-RevId: 321843582
Change-Id: I5dc287411e2c5067b530bb47c0cb24a5e47973fd
2020-07-17 13:40:16 -07:00
Nick Kreeger
f57f560b1a Enable the allocation of single TfLiteTensor structs from temp memory.
In the future, all TfLiteTensor structs should be allocated through this API. This allocation allows for a chain of TfLiteTensor objects that can be reset through "ResetTempAllocations()".

PiperOrigin-RevId: 321211032
Change-Id: I6ab86b8749338590f1457486aa81a39e036534ec
2020-07-14 12:50:26 -07:00
Advait Jain
78e1d0f299 Remove unnecessary op_type parameter from the builtin parse functions.
Now that TFLM has completely switched over to the selective registration of
builtin parse functions, we can remove the unnecessary additional parameter.

PiperOrigin-RevId: 319072289
Change-Id: I4a43953e73c54e05b1d9f815bb8cf0605dc45bb8
2020-06-30 12:20:07 -07:00
Nick Kreeger
aa1499f356 Drop InitGraphAndContextTensorData() from MicroAllocator.
This method is trivial and this file is easier to trace by just calling the two methods that it calls instead.

PiperOrigin-RevId: 318357169
Change-Id: Ib75aaaf67f4aa6908e5aabc7ce0fb7a84a87608e
2020-06-25 15:14:26 -07:00
Nick Kreeger
3ed1e3029e Remove static_assert for type checking in FlatBufferVectorToTfLiteTypeArray.
It turns out that std::is_same() has dropped the non-string argument in C++17. This breaks internal users that are building against qualcomm.

PiperOrigin-RevId: 317790812
Change-Id: If56a61d20426670251b55f370a6b5fa886a49e21
2020-06-22 20:42:21 -07:00
Nick Kreeger
a3dc11ea13 Add endian-aware flatbuffer conversions to MicroAllocator.
Currently, MicroAllocator manually maps TfLite*Array struct values directly to flatbuffer values. This change cleans up other instances inside MicroAllocator that are not endian-aware.

This works only on little-endian (LE) architecture systems because of the layout of TfLite*Array:

struct TfLiteIntArray {
  int size;
  int data[];
};

The compiler maintains the mapping, and in LE |size| and |data| are laid out as follows:

[lowest-order-byte(e.g. data) .... highest-order-byte(e.g. size)]

Casting and remapping work on LE because the vector is written in lowest-order-byte sequence. On BE systems, this memory-savings trick does not work; it requires an allocation from the arena and manual copying of values from the flatbuffer.
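
A sketch of the two paths described above (the function name and allocator call are illustrative, not the exact MicroAllocator code):

TfLiteStatus FlatBufferIntArrayToTfLiteIntArray(
    SimpleMemoryAllocator* allocator,
    const flatbuffers::Vector<int32_t>* flat_array, TfLiteIntArray** result) {
#if FLATBUFFERS_LITTLEENDIAN
  // On LE targets the flatbuffer vector already has the [size][data...] layout
  // of TfLiteIntArray, so it can be aliased at zero arena cost.
  *result = const_cast<TfLiteIntArray*>(
      reinterpret_cast<const TfLiteIntArray*>(flat_array));
#else
  // On BE targets the values must be copied into an arena allocation.
  *result = reinterpret_cast<TfLiteIntArray*>(allocator->AllocateFromTail(
      TfLiteIntArrayGetSizeInBytes(static_cast<int>(flat_array->size())),
      alignof(TfLiteIntArray)));
  if (*result == nullptr) return kTfLiteError;
  (*result)->size = static_cast<int>(flat_array->size());
  for (int i = 0; i < (*result)->size; ++i) {
    (*result)->data[i] = flat_array->Get(i);
  }
#endif
  return kTfLiteOk;
}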

PiperOrigin-RevId: 317730072
Change-Id: I1baff898356e3d82b2faed6468a50ae44acd3082
2020-06-22 14:08:31 -07:00
Nick Kreeger
75a3975ab8 Fix c-style casts and const usage in MicroAllocator.
The issue was introduced in cl/316533499 (PR: https://github.com/tensorflow/tensorflow/pull/38121). Lint was complaining about C-style casts, and fixing them also uncovered const usage that the casts had been hiding.

PiperOrigin-RevId: 317680917
Change-Id: I4d874564875e58eb5f6905c7b75562f90588bb22
2020-06-22 11:14:28 -07:00
Nick Kreeger
072c2f5d0d Reduce excessive RAM in TFLM by using the existing flatbuffer quantization data for scales.
Currently, TFLM manually allocates a tail chunk to store "quantization" tensor data on TfLiteTensor objects. The size of these allocations varies based on the type of model - conv1d/2d models tend to have the most since quantization data is stored "per channel".

This change simply points the scale data at the existing values in the flatbuffer. The flatbuffer schema stores float values as flatbuffers::Vector<float>, and the TfLiteAffineQuantization struct can point its scale pointer at these values. Unfortunately, the zero point values are stored as flatbuffers::Vector<int64_t> and cannot be reused. That allocation will be addressed in a future change.
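
The same little-endian layout trick shown in the sketch further up applies to the float scale vector; a minimal illustration (the helper name and parameters are assumptions, not the exact MicroAllocator code):

void PointScaleAtFlatbuffer(const flatbuffers::Vector<float>* flat_scales,
                            TfLiteAffineQuantization* quantization) {
  // TfLiteFloatArray is {int size; float data[];}, which matches the
  // flatbuffer vector layout on LE targets, so the scale array can alias the
  // flatbuffer data instead of being copied into the arena.
  quantization->scale = const_cast<TfLiteFloatArray*>(
      reinterpret_cast<const TfLiteFloatArray*>(flat_scales));
  // zero_point is a flatbuffers::Vector<int64_t> in the schema while
  // TfLiteIntArray holds int, so it still has to be copied (addressed in a
  // future change, per the note above).
}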

Keyword Model ~2% reduction in tail allocation:
-----------------------------------------------
[RecordingMicroAllocator] Arena allocation total 21040 bytes
[RecordingMicroAllocator] Arena allocation head 672 bytes
[RecordingMicroAllocator] Arena allocation tail 20368 bytes
[RecordingMicroAllocator] 'TfLiteTensor struct' used 6048 bytes with alignment overhead (requested 6048 bytes for 54 tensors)
[RecordingMicroAllocator] 'TfLiteTensor quantization data' used 1728 bytes with alignment overhead (requested 1728 bytes for 108 allocations)
[RecordingMicroAllocator] 'TfLiteTensor variable buffer data' used 10240 bytes with alignment overhead (requested 10240 bytes for 7 allocations)
[RecordingMicroAllocator] 'NodeAndRegistration struct' used 1200 bytes with alignment overhead (requested 1200 bytes for 15 NodeAndRegistration structs)
[RecordingMicroAllocator] 'Operator runtime data' used 148 bytes with alignment overhead (requested 148 bytes for 13 OpData structs)

Test Conv Model ~10% reduction in tail allocation:
-----------------------------------------------
[RecordingMicroAllocator] Arena allocation total 11680 bytes
[RecordingMicroAllocator] Arena allocation head 7744 bytes
[RecordingMicroAllocator] Arena allocation tail 3936 bytes
[RecordingMicroAllocator] 'TfLiteTensor struct' used 1680 bytes with alignment overhead (requested 1680 bytes for 15 tensors)
[RecordingMicroAllocator] 'TfLiteTensor quantization data' used 768 bytes with alignment overhead (requested 752 bytes for 24 allocations)
[RecordingMicroAllocator] 'TfLiteTensor variable buffer data' used 0 bytes with alignment overhead (requested 0 bytes for 0 allocations)
[RecordingMicroAllocator] 'NodeAndRegistration struct' used 560 bytes with alignment overhead (requested 560 bytes for 7 NodeAndRegistration structs)
[RecordingMicroAllocator] 'Operator runtime data' used 136 bytes with alignment overhead (requested 136 bytes for 5 OpData structs)

PiperOrigin-RevId: 316556393
Change-Id: Iadadab51019d2787d11af9713b3639f087afa7bc
2020-06-15 15:27:20 -07:00
Jens Elofsson
708ecda43e Merge remote-tracking branch 'upstream/master' into offline_memory_planner 2020-06-15 10:06:36 +02:00
Nick Kreeger
cd6a929e30 Track variable tensor buffer allocation in the "recording" MicroAllocator.
The RecordingMicroAllocator class currently doesn't track variable tensor allocations. This was noticed because the measured allocations for the keyword model had ~10kb of tail space unaccounted for. This change tracks variable tensor allocations for the keyword model (the test conv model does not have any variable tensors).

Total and tail allocation creep up a bit here to handle the additional fields in RecordingMicroAllocator:

TestKeywordModelMemoryThreshold:
-------------------------------
[RecordingMicroAllocator] Arena allocation total 21472 bytes
[RecordingMicroAllocator] Arena allocation head 672 bytes
[RecordingMicroAllocator] Arena allocation tail 20800 bytes
[RecordingMicroAllocator] 'TfLiteTensor struct' used 6048 bytes with alignment overhead (requested 6048 bytes for 54 tensors)
[RecordingMicroAllocator] 'TfLiteTensor quantization data' used 2160 bytes with alignment overhead (requested 2160 bytes for 162 allocations)
[RecordingMicroAllocator] 'TfLiteTensor variable buffer data' used 10240 bytes with alignment overhead (requested 10240 bytes for 7 allocations)
[RecordingMicroAllocator] 'NodeAndRegistration struct' used 1200 bytes with alignment overhead (requested 1200 bytes for 15 NodeAndRegistration structs)
[RecordingMicroAllocator] 'Operator runtime data' used 148 bytes with alignment overhead (requested 148 bytes for 13 OpData structs)

TestConvModelMemoryThreshold:
-----------------------------
[RecordingMicroAllocator] Arena allocation total 12128 bytes
[RecordingMicroAllocator] Arena allocation head 7744 bytes
[RecordingMicroAllocator] Arena allocation tail 4384 bytes
[RecordingMicroAllocator] 'TfLiteTensor struct' used 1680 bytes with alignment overhead (requested 1680 bytes for 15 tensors)
[RecordingMicroAllocator] 'TfLiteTensor quantization data' used 1216 bytes with alignment overhead (requested 1216 bytes for 36 allocations)
[RecordingMicroAllocator] 'TfLiteTensor variable buffer data' used 0 bytes with alignment overhead (requested 0 bytes for 0 allocations)
[RecordingMicroAllocator] 'NodeAndRegistration struct' used 560 bytes with alignment overhead (requested 560 bytes for 7 NodeAndRegistration structs)
[RecordingMicroAllocator] 'Operator runtime data' used 136 bytes with alignment overhead (requested 136 bytes for 5 OpData structs)
PiperOrigin-RevId: 316166016
Change-Id: I7d806f901b39e5d6a73c3baaf11d85fa7f6e17b1
2020-06-12 13:39:35 -07:00
Jens Elofsson
2f9642602d Adapt to changes in micro_allocator. 2020-06-12 10:49:57 +02:00
Nick Kreeger
b6d13bb0a8 Remove TF Micro tests that use the "name" field in TfLiteTensor.
The TFLM team is preparing to provide an "optimized" memory build option. This build option will eliminate non-essential fields from core TFLite structs. The first big change is to reduce the number of pointers on TfLiteTensor. Many models have many tensors (e.g. the keyword benchmark has 54), and each pointer adds up for TFLM. This cleanup pass removes the soon-to-be-unused 'name' field from TfLiteTensor.

PiperOrigin-RevId: 316000388
Change-Id: I230865014d5a59b78c1c1c9f5eda784f6d611e77
2020-06-11 16:42:03 -07:00
Nick Kreeger
d8881eb71d Add a memory threshold allocation test for the Keyword model.
This new test ensures that TF Micro does not regress current allocations (on x86-64 systems) for a canonical model. As RAM reduction changes are introduced, the values in this test can be updated from the console log of this test.

Current output for the keyword model:
Testing TestKeywordModelMemoryThreshold
[RecordingMicroAllocator] Arena allocation total 21440 bytes
[RecordingMicroAllocator] Arena allocation head 672 bytes
[RecordingMicroAllocator] Arena allocation tail 20768 bytes
[RecordingMicroAllocator] 'TfLiteTensor struct allocation' used 6048 bytes (requested 6048 bytes 54 times)
[RecordingMicroAllocator] 'TfLiteTensor quantization data allocations' used 2160 bytes (requested 2160 bytes 162 times)
[RecordingMicroAllocator] 'NodeAndRegistration struct allocation' used 1200 bytes (requested 1200 bytes 15 times)
[RecordingMicroAllocator] 'Operator runtime data allocation' used 148 bytes (requested 148 bytes 13 times)

PiperOrigin-RevId: 315958032
Change-Id: I226f6a01aa555970805388632559241a41ff8342
2020-06-11 12:54:09 -07:00
Rajeshwar Reddy T
72715d61cf Merge branch 'master' into offline_memory_planner 2020-06-10 17:54:31 -07:00
Nick Kreeger
26ee75e596 Decouple the model and TfLiteContext instance from the allocator and interpreter.
This change simplifies the interaction between the MicroInterpreter and MicroAllocator. All allocation for a given model is staged between MicroAllocator::StartModelAllocation() and MicroAllocator::FinishModelAllocation().

This change prepares for two upcoming features:
1.) Multi-tenant memory arena
2.) An easy-to-use RecordingMicroInterpreter to allow auditing of recorded memory arena allocations.

PiperOrigin-RevId: 315736762
Change-Id: Ia9da1f6edcd1001e3aad975c117905054f172e18
2020-06-10 12:12:35 -07:00
Jens Elofsson
dc3c76758e Merge remote-tracking branch 'upstream/master' into offline_memory_planner 2020-06-10 09:17:13 +02:00
TensorFlower Gardener
e8bf41c3c6 Merge pull request from frreiss:issue-s390-lite-4
PiperOrigin-RevId: 315534740
Change-Id: I69b5813ebfe81f956c79dcae5f3f7427a5e592c0
2020-06-09 12:33:38 -07:00
Nick Kreeger
1a90749db9 Enable the ability to pass a MicroAllocator instance into a MicroInterpreter instance.
This change is a stepping stone that enables users to:
1.) Use a single MicroAllocator/arena for multiple models (see the sketch below).
2.) Use the new recording allocation APIs for auditing arena allocations.
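
A minimal sketch of sharing one allocator and arena across two interpreters (the Create() and constructor signatures are approximations of the TFLM headers from this period, and the model data arrays are assumed to exist elsewhere):

#include <cstdint>

#include "tensorflow/lite/micro/all_ops_resolver.h"
#include "tensorflow/lite/micro/micro_allocator.h"
#include "tensorflow/lite/micro/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"

extern const unsigned char model_a_data[];  // assumed flatbuffer model arrays
extern const unsigned char model_b_data[];

void RunTwoModelsFromOneArena() {
  static uint8_t arena[16 * 1024];
  static tflite::MicroErrorReporter error_reporter;
  static tflite::AllOpsResolver op_resolver;

  tflite::MicroAllocator* allocator =
      tflite::MicroAllocator::Create(arena, sizeof(arena), &error_reporter);

  tflite::MicroInterpreter interpreter_a(tflite::GetModel(model_a_data),
                                         op_resolver, allocator,
                                         &error_reporter);
  tflite::MicroInterpreter interpreter_b(tflite::GetModel(model_b_data),
                                         op_resolver, allocator,
                                         &error_reporter);
  interpreter_a.AllocateTensors();
  interpreter_b.AllocateTensors();
}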

PiperOrigin-RevId: 315414448
Change-Id: Ied1ea56deb73c09bb64b3e41fd3502b5a4cd5bb8
2020-06-08 21:27:47 -07:00
Jens Elofsson
8eee6751af Merge remote-tracking branch 'upstream/master' into offline_memory_planner 2020-06-08 14:42:01 +02:00
Jens Elofsson
bacc5e5927 Fix build issues. 2020-06-08 14:38:12 +02:00
frreiss
4969b3b501 Merge branch 'master' of https://github.com/tensorflow/tensorflow into issue-s390-lite-4 2020-06-06 21:47:23 -07:00
frreiss
a5b24dc2d1 Add static_cast to comparison with unsigned int 2020-06-06 21:47:18 -07:00
Nick Kreeger
9d572b8d5e Introduce a "recording" MicroAllocator class.
This new class enables TFLM to measure, audit, and report memory usage in the shared tensor arena. Users may opt in by simply passing an instance of this class into a MicroInterpreter instance.
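
A short usage sketch (Create() and PrintAllocations() follow the recording_micro_allocator header; exact signatures are assumptions, and arena, op_resolver, error_reporter, and model_data are set up as in the sharing sketch earlier in this log):

tflite::RecordingMicroAllocator* allocator =
    tflite::RecordingMicroAllocator::Create(arena, sizeof(arena),
                                            &error_reporter);
tflite::MicroInterpreter interpreter(tflite::GetModel(model_data), op_resolver,
                                     allocator, &error_reporter);
interpreter.AllocateTensors();
// Dumps the per-category "[RecordingMicroAllocator] ..." lines quoted in the
// commits above.
allocator->PrintAllocations();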

PiperOrigin-RevId: 314995667
Change-Id: I6a451944d55b0498a98f1cfd54244f9008e578d2
2020-06-05 14:25:46 -07:00
Advait Jain
d54578dba2 Add APIs to enable selective registration of the builtin parse function.
With this CL:
 * We have the hooks needed to register an operator-specific parse function with
   MicroMutableOpResolver and then retrieve it without ParseOpData being used.

 * This CL is still passing in ParseOpData as the operator-specific parse
   function; that will be changed in a follow-on CL.

PiperOrigin-RevId: 314982707
Change-Id: I174259aabd66e97184a8a282832f6c71580366c9
2020-06-05 13:13:27 -07:00
Jens Elofsson
b1e74b227c Fix compile errors. 2020-06-04 09:22:37 +02:00
frreiss
dfd48bd61d Merge branch 'master' of https://github.com/tensorflow/tensorflow into issue-s390-lite-4 2020-05-29 17:42:25 -07:00
Nick Kreeger
7d0ab61788 Add special "recording" SimpleMemoryAllocator class to help with logging tail allocations.
This new helper class will enable TFLM to log and record where the allocations in the shared arena are going. A future change will use this new class in a special "recording" MicroAllocator subclass. All these logging mechanisms will be opt-in by code.

PiperOrigin-RevId: 313843072
Change-Id: I3fc9205e475e89b4a3795c3cc79c31d2166da2c8
2020-05-29 14:22:59 -07:00
Advait Jain
33689c48ad Add MicroOpResolver interface class.
This will allow us to implement selective registration of the builtin parse
functions without changing the OpResolver base class in TFLite.

* MicroOpResolver is now an interface (matching the OpResolver name in TFLite).
* MicroMutableOpResolver is the implementation of the MicroOpResolver
  interface that should be used by applications that do not want to use
  AllOpsResolver (see the usage sketch below).
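
For illustration, selective registration ends up looking roughly like this (the templated op count and per-op Add helpers follow the later TFLM API and are assumptions here):

#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"

// Register only the ops the model needs instead of pulling in AllOpsResolver.
tflite::MicroMutableOpResolver<3> op_resolver;  // capacity for 3 ops

void RegisterOps() {
  op_resolver.AddFullyConnected();
  op_resolver.AddSoftmax();
  op_resolver.AddReshape();
  // op_resolver is then handed to MicroInterpreter via the MicroOpResolver
  // interface.
}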

PiperOrigin-RevId: 313691276
Change-Id: I0a9f51f6584326a3b3dd645cde083ba42116083d
2020-05-28 17:38:37 -07:00
Jens Elofsson
f409152691 Merge remote-tracking branch 'upstream/master' into offline_memory_planner 2020-05-28 20:24:46 +02:00
A. Unique TensorFlower
290487b03e Fix for build errors with constexpr TfLiteIntArray.
PiperOrigin-RevId: 313599824
Change-Id: Ia37465dd2f782e234839bdfbe991516d9fc06c40
2020-05-28 09:22:20 -07:00
Jens Elofsson
8b0f5d1d12 Merge remote-tracking branch 'upstream/master' into offline_memory_planner 2020-05-27 20:19:59 +02:00