Remove profiling dependency in common header to help TF Lite Micro portability

The background to this is that the TensorFlow Lite version for MCUs has to be very sparing with the external headers it brings in, since a lot of libraries like pthreads, or even standard C/C++ library routines like malloc, may not be available on embedded platforms. We're trying to keep the core code as shared as possible, which means that headers like kernels/internal/common.h are used by both regular and micro flavors of TFL. Unfortunately it's not clearly documented which headers are shared like this, so it's easy to accidentally introduce a breaking dependency in one of them. That's what happened in this case: common.h started relying on gemmlowp's instrumentation header, which then pulls in pthreads.

One solution would be to introduce #ifdef TFL_MICRO guards around non-micro-compatible code chunks in places like these, but some of our target platforms (like Arduino) don't have easy ways to set macros as part of the build process, so we've tried to avoid that. Another approach we've taken has been to break headers into smaller chunks, and only include the portable parts from micro. In this case, the easiest approach is to remove the profiling call from the one function it was added to, and so remove the instrumentation.h dependency entirely. I'm also open to splitting up the header in a bigger refactor if it's important to keep the profiling call.

Also, I'm working on getting nightly, and eventually presubmit, CI builds going that test the MCU compilation path, along with more documentation, so we can catch issues earlier.

PiperOrigin-RevId: 247780213
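For reference, the "break headers into smaller chunks" alternative mentioned above could look roughly like the sketch below. This is not what the commit does (it simply drops the profiling call), and every name here is hypothetical: the `sketch` namespace, `BiasAndClampPortable`, `BiasAndClampProfiled`, and `ScopedLabelStandIn` (a dependency-free stand-in for gemmlowp's `ScopedProfilingLabel`) are illustrations only. The scalar loop is assumed to match the intent of the non-NEON path of `BiasAndClamp`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

namespace sketch {

// Portable part: imagine this lives in a hypothetical "common_portable.h".
// It uses only plain C++ arithmetic, so a micro build could include it
// without pulling in pthreads or any profiling header.
inline void BiasAndClampPortable(float clamp_min, float clamp_max,
                                 int bias_size, const float* bias_data,
                                 int array_size, float* array_data) {
  // Mirrors the precondition checked with TFLITE_DCHECK_EQ in the original.
  assert(bias_size > 0 && array_size % bias_size == 0);
  for (int offset = 0; offset < array_size; offset += bias_size) {
    for (int i = 0; i < bias_size; ++i) {
      const float biased = array_data[offset + i] + bias_data[i];
      array_data[offset + i] =
          std::min(std::max(biased, clamp_min), clamp_max);
    }
  }
}

// Non-portable part: imagine a hypothetical "common_profiled.h" that only
// the regular (non-micro) build includes. The real code would use
// gemmlowp::ScopedProfilingLabel here; this stand-in just counts how many
// labels were created so the sketch stays self-contained.
class ScopedLabelStandIn {
 public:
  explicit ScopedLabelStandIn(const char* name) : name_(name) { ++created(); }
  const char* name() const { return name_; }
  static int& created() {
    static int count = 0;
    return count;
  }

 private:
  const char* name_;
};

// Thin wrapper that adds the profiling label around the portable kernel.
inline void BiasAndClampProfiled(float clamp_min, float clamp_max,
                                 int bias_size, const float* bias_data,
                                 int array_size, float* array_data) {
  ScopedLabelStandIn label("BiasAndClamp");
  BiasAndClampPortable(clamp_min, clamp_max, bias_size, bias_data, array_size,
                       array_data);
}

}  // namespace sketch
```

With this split, micro code would call `sketch::BiasAndClampPortable` directly, while the regular build would call `sketch::BiasAndClampProfiled`. The cost is an extra header and an extra call layer, which is presumably why removing the single profiling call was the easier fix here.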
2019-05-11 15:59:03 -07:00
commit 8bec6bfadf
parent a6a5267352
2 changed files with 0 additions and 3 deletions
--- a/tensorflow/lite/kernels/internal/BUILD
+++ b/tensorflow/lite/kernels/internal/BUILD
@@ -175,7 +175,6 @@ cc_library(
    deps = [
        ":types",
        "@gemmlowp//:fixedpoint",
-        "@gemmlowp//:profiler",
    ] + select({
        ":haswell": tflite_deps_intel,
        ":ios_x86_64": tflite_deps_intel,
--- a/tensorflow/lite/kernels/internal/common.h
+++ b/tensorflow/lite/kernels/internal/common.h
@@ -46,7 +46,6 @@ limitations under the License.
 #endif

 #include "fixedpoint/fixedpoint.h"
-#include "profiling/instrumentation.h"
 #include "tensorflow/lite/kernels/internal/types.h"

 namespace tflite {
@@ -96,7 +95,6 @@ inline void BiasAndClamp(float clamp_min, float clamp_max, int bias_size,
  //   return (array.colwise() + bias).cwiseMin(clamp_max).cwiseMin(clamp_max).
  // This turned out to severely regress performance: +4ms (i.e. 8%) on
  // MobileNet v2 / 1.0 / 224. So we keep custom NEON code for now.
-  gemmlowp::ScopedProfilingLabel label("BiasAndClamp");
  TFLITE_DCHECK_EQ((array_size % bias_size), 0);
 #ifdef USE_NEON
  float* array_ptr = array_data;