STT-tensorflow

Author	SHA1	Message	Date
A. Unique TensorFlower	e647a3b425	Add experimental C API to access EagerContext context ID. PiperOrigin-RevId: 317476439 Change-Id: I9e97bce61cf526695f0c903b5f4f837116fef455	2020-06-20 11:32:20 -07:00
Gaurav Jain	d5b3ec27d1	Allow dynamically configuring device placement Enable setting soft device placement as well as logging dynamically. This required ensuring the device placement policy was part of the cache key. Further, we fix the logging to ensure in eager mode if a kernel is retrieved from the kernel cache, then the execution is still logged. We also log closer to the actual op execution to avoid logging before all checks have been done. PiperOrigin-RevId: 311271808 Change-Id: I9765228894f84a3447cc03332a2559f6d933165b	2020-05-12 23:17:39 -07:00
Yujing Zhang	7e6ea21148	Support running a function with packed input handles through C APIs. Introduce a C API TFE_CreatePackedTensorHandle which creates a TFE_TensorHandle referring to multiple TFE_TensorHandles. PiperOrigin-RevId: 310610230 Change-Id: Icc0ffd5c58ad7780eca38d552c1a2f4617f04891	2020-05-08 12:53:55 -07:00
Allen Lavoie	6e3bea20a1	Less pointer indirection for TFE_OpAttrs, add TFE_OpGetAttrs We'll want this for implementing copy for `TF_AbstractOp`s backed by `TFE_Op`s (since we want to copy the type/attributes but not the inputs). PiperOrigin-RevId: 309756974 Change-Id: I07a8c48f50ab6d3c8a7d7db972fb60202b86434d	2020-05-04 09:24:03 -07:00
Allen Lavoie	e0606af65f	Small cleanups for experimental TFE attribute APIs The op name was included twice, and TFE_OpGetAttrs is unusable without a way to allocate a TFE_OpAttrs on the heap (and so has no callers). I'm removing it for now. PiperOrigin-RevId: 308859222 Change-Id: Ibb3901a1821ffc2e9ebc0efb26592e5b3d8bb88f	2020-04-28 11:20:43 -07:00
Allen Lavoie	d8c89c1cd7	Fix API exports for the experimental TFE_RegisterCustomDevice Extern, plus it was missing TF_CAPI_EXPORT which is probably the main reason it wasn't in the Windows DLL PiperOrigin-RevId: 305795200 Change-Id: I7ab3d847f3f60f71588f19bfa962a861d02bba44	2020-04-09 17:36:59 -07:00
Mihai Maruseac	3f70de8266	Automated rollback of `17b7bc01bc` PiperOrigin-RevId: 305779485 Change-Id: Ifa9eda9d594916b0e9ddb57d52cb69cb534ae56a	2020-04-09 16:07:38 -07:00
Allen Lavoie	17b7bc01bc	Add a way to register custom devices with the Python TFE_Context The API accepts TFE_RegisterCustomDevice arguments as PyCapsules, so each custom device will need some method to create those. Presumably most custom devices will end up wrapping the PyCapsule creation+registration rather than exposing it to the user. No public API yet, but this is roughly what I have in mind at the moment. This only works with --config=monolithic or when the custom device registration is bundled with pywrap_tensorflow.so right now since that has its own copy of the C API. Something like this could work if we switched pywrap_tensorflow.so to instead rely on libtensorflow.so for the C API, then custom device extensions could link against that. PiperOrigin-RevId: 305762978 Change-Id: I4d2d9bd9c01ba22391e138244a3948bae8963c5c	2020-04-09 14:42:08 -07:00
Gaurav Jain	9b576164f1	Add Tensor & TensorHandle C APIs taking a Context The existing TF_AllocateTensor & TFE_NewTensorHandle APIs do not take a TFE_Context which is undesirable as the TFE_Context indicates ownership of the tensor. Thus we add new APIs to supersede the existing ones. PiperOrigin-RevId: 305126310 Change-Id: I9863ebc692d48875c61b79197ab418f29503a8c6	2020-04-06 15:10:09 -07:00
Gaurav Jain	907a55ebad	Remove implicit mirroring toggle Implicit mirroring is set to true by default already and is essential for eager performance. This CL just removes dead code since there is no API to disable mirroring for tensors. We also shouldn't have this in the TensorHandleInterface class since mirroring is a runtime-specific implementation detail. PiperOrigin-RevId: 304421014 Change-Id: I383fa24da08a86028cabb3a4b1c5f2612d57336d	2020-04-02 09:58:19 -07:00
Gaurav Jain	857f0c9557	Add option to enable tfrt eager context PiperOrigin-RevId: 303254195 Change-Id: Ibee9c3a9cb4f0abf2e1738ed09c7a9ec326b5b64	2020-03-26 21:12:29 -07:00
Allen Lavoie	1e59f1e54d	Custom devices: devices take a TFE_Context explicitly This will be useful for switching between graph building and eager execution (although that may need a different context type), but also gives us the option to pass a custom device representation into language bindings without requiring them to expose their TFE_Context directly (they still expose it to the custom device when executing operations). PiperOrigin-RevId: 300630552 Change-Id: I41083c63db1b137af60f932114f1fcaae8ac2eb0	2020-03-12 15:15:03 -07:00
Allen Lavoie	28b4039a1a	Custom devices: add a TF_Status return, disallow duplicate registrations PiperOrigin-RevId: 300159589 Change-Id: I4e8cfcdc54999c04a41c7351f7b016a85234ac0d	2020-03-10 13:03:02 -07:00
Haoyu Zhang	957792181b	Introduce async_wait and async_clear_error primitives. Add tests to demonstrate the usage of the primitives in handling exceptions thrown in remote async execution. PiperOrigin-RevId: 297041596 Change-Id: Ibc9ffa7c5eaaa9b62c6849e815c0c933ff0ec86c	2020-02-24 22:07:20 -08:00
Allen Lavoie	ed52a7c6f5	Add an eager C API for deserializing generic op attributes, TFE_OpSetAttrValueProto (like TF_SetAttrValueProto for graph building) Also adds an experimental eager C API for serializing op attributes as generic name->value mappings It's a bit sad that being generic requires serializing here, but I don't see a great way around it if the attributes will be used generically (e.g. to build a FunctionDef). We can add special cases that don't require serialization for fetching attributes when the type is known. PiperOrigin-RevId: 297003316 Change-Id: Id6e65bc7a8178fbbb8a85a542bd31def08225fe6	2020-02-24 17:00:11 -08:00
Gaurav Jain	6bf2895298	Reduce overhead of protecting tensors for eager The eager executor tried to prevent forwarding of any input tensors by incrementing the reference count of any "non-consumed" inputs. This involved highly delicate logic which first signaled "non-consumed" inputs as those with a reference count greater than 1 (1 from python and another from the EagerOperation class), which require "protecting" by incrementing the reference count of the underlying tensor buffer. This logic is highly heavyweight for the common case of synchronous execution. We thus simplify the logic by having all TensorHandle Tensors protected at construction and "unprotect" them if the reference count is 1. - Hold 2 reference counts on a TensorHandle's backing Tensor. This protects the Tensor from being forwarded. - Add the ability to unprotect a TensorHandle's backing Tensor when the reference count is 1. - Split ExecuteNode into Async implementation. The sync ExecuteNode class can avoid various copies such as the list of inputs and the forwarding map. - Remove the experimental TFE_OpConsumeInput API. Input forwarding can be achieved by releasing the handle after calling TFE_OpAddInput as demonstrated by the added tests. - Fix TF_AllocateTensor to return a forwardable tensor; it was previously disabled due to re-using the logic in TF_NewTensor. - Save mirror tensor when calling TFE_TensorHandleResolve. PiperOrigin-RevId: 296225251 Change-Id: I484cfccbef8b44e82757b8bda0981cd7fd2f8096	2020-02-20 09:19:23 -08:00
Allen Lavoie	1f5bc8a979	Add an experimental eager C API for generically fetching and setting op attributes. Right now you can only fetch the whole attribute map and set it wholesale, but we can add more fine-grained attribute control in the future. This allows the custom device API to pass in attributes, and custom devices to forward these to their own TFE_Execute calls. This is required for creating variables. PiperOrigin-RevId: 296096192 Change-Id: I98c23bdcd13e479235b3e27850b1bb0bd7a53bba	2020-02-19 17:56:18 -08:00
Akshay Modi	fa5cdeae7e	Add a functiondef getter to the context PiperOrigin-RevId: 296002833 Change-Id: I238a2984a9320c084b7157e6eeb30b30aa132036	2020-02-19 10:48:38 -08:00
Jose Baiocchi	767e4d5dab	Move profiler API implementation to _pywrap_profiler PiperOrigin-RevId: 295240754 Change-Id: I3664efc053696a3c521d18527c04747688cac932	2020-02-14 15:50:39 -08:00
Gaurav Jain	b41b89dc75	Add local mirroring support to tensor handles We allow a TensorHandle to reference multiple tensors on the local host. This allows us to essentially cache any implicit copies that occur before executing an op. This helps avoid repeated copies if a tensor is constantly fed to an op on a different device. Additional clean-ups: - Move CustomDevice TensorHandle constructor to separate constructor - If the TensorHandle is on the host CPU device, ensure that device_ is set to nullptr. - Clean up CAPI test to use ASSERT_EQ instead of ASSERT_TRUE PiperOrigin-RevId: 294180977 Change-Id: I26892e9058973eebac557fc529b46de793418e12	2020-02-10 02:47:37 -08:00
Jose Baiocchi	21bb9be2c1	Decouple ProfilerSession wrapper from pywrap_tfe PiperOrigin-RevId: 293882403 Change-Id: I947e32807447460b6fc7ca1b19bf9ca276c3e994	2020-02-07 13:33:01 -08:00
Haoyu Zhang	bdba822d97	Adding barrier message to clear remote executors in order to support catching OutOfRangeErrors. PiperOrigin-RevId: 293716720 Change-Id: I0768c99baf080f817e0985e188ffe330b3e15dcc	2020-02-06 17:44:52 -08:00
Allen Lavoie	a4064a389e	Experimental API for custom devices in TFE. Custom devices are an experimental hook into eager op execution, allowing experimentation outside the TensorFlow codebase. These devices do not work in traced code at the moment. PiperOrigin-RevId: 293615055 Change-Id: I031da213e964caa7d4e11e0f491a3985d034b175	2020-02-06 10:00:55 -08:00
Maher Jendoubi	215dab52c6	Contributing: fix typos	2020-01-26 13:47:00 +01:00
Dong Lin	7bfb8380a7	Place all py_func op on the local host's address space if eager execution is enabled. PiperOrigin-RevId: 290993424 Change-Id: I0c33cdf781fa4b3c401ea5e8649f606137e42862	2020-01-22 11:28:15 -08:00
Gaurav Jain	77aaa1ef2d	Move functionality from TFE_Op to EagerOperation A lot of functionality in TFE_Op was simply a pass-through to EagerOperation. We instead want the TFE_Op to be a simple struct and have the functionality defined in the operation member. The following changes were made: - Remove a pointer to the TFE_Context in TFE_Op as the context is stored in EagerOperation. - Modify the constructor of EagerOperation to only take an EagerContext pointer and require the caller to call Reset. This allows callers to handle any errors from construction. - We expect the context to not be null. We enforce this with references and clean up the code to ensure that an eager context is never reset with a different context. As a result the `ctx` parameter has been removed from TFE_OpReset. - Move OpInferenceContext into EagerOperation PiperOrigin-RevId: 290386452 Change-Id: I3ffb62b01dce230ddc555d84d6ae39fd4ec90b2f	2020-01-17 20:12:26 -08:00
A. Unique TensorFlower	f80f6c6056	Place all py_func op on the local host's address space. PiperOrigin-RevId: 290008258 Change-Id: If68f84ed37f83ed0aac0689df70e8df69a2d256f	2020-01-15 23:35:10 -08:00
Dong Lin	24ceca6744	Place all py_func op on the local host's address space. PiperOrigin-RevId: 290005443 Change-Id: I7294676d17d6e2f37fc939bd9d685d71aad8feeb	2020-01-15 23:00:23 -08:00
A. Unique TensorFlower	f18ffa8204	Place all py_func op on the local host's address space. PiperOrigin-RevId: 289903686 Change-Id: I38f3b8020cea5b3eab1e5d9141c32350473dadfa	2020-01-15 11:44:57 -08:00
Dong Lin	30936d89ac	Place all py_func op on the local host's address space. PiperOrigin-RevId: 289883431 Change-Id: I5990df1fa6825729dcd843e708574451bc16111d	2020-01-15 10:15:16 -08:00
Amit Patankar	50fae6026e	Decouple ProfilerSession wrapper from pywrap_tfe PiperOrigin-RevId: 286641101 Change-Id: Ic5046b977d1b42ed6a1e9038e3b6ec40a0a82e2f	2019-12-20 14:37:32 -08:00
Jose Baiocchi	df6698712a	Decouple ProfilerSession wrapper from pywrap_tfe PiperOrigin-RevId: 286609127 Change-Id: Ic70e27ad3820a2f1399dc414ded40bef811e5653	2019-12-20 11:15:12 -08:00
Alexandre Passos	3f8a370b5a	Allow accessing the GPU device memory from the TF C API. Addresses #34846; might help with #24453. PiperOrigin-RevId: 284921061 Change-Id: I2b31e474bf961f731f67a85aad39bfeaefd3998a	2019-12-10 22:47:39 -08:00
A. Unique TensorFlower	495e179730	[Perf] Skip EagerOperation::SetDeviceName(...) call if input device name didn't change. PiperOrigin-RevId: 284700133 Change-Id: I7716abe6968b0686df00ea15dec3d85bf16e8cf5	2019-12-09 22:04:13 -08:00
Yuefeng Zhou	33a2ba1b47	Add a check_alive to context to check whether a remote worker is alive. PiperOrigin-RevId: 280080043 Change-Id: Id152c198ebf20256fc14b2ea1e16b8c5db71844c	2019-11-12 16:22:31 -08:00
Yujing Zhang	206d6af149	Add lazy_remote_inputs_copy to TFE_ContextOptions to control lazy remote tensor copy. Disable it by default. PiperOrigin-RevId: 279212487 Change-Id: Ie46de71fd2902b79281e6257ff28c06d9aaa73d4	2019-11-07 18:35:44 -08:00
Haoyu Zhang	7349bf5e09	Support adding / removing servers when executing distributed ops and functions. Introduce `update_server_def`, which supports running remote ops and functions with dynamic cluster membership in a cluster. The client will register new contexts on the newly added workers, remove old contexts from the removed servers, and rebuild the connections between workers for proper communication. PiperOrigin-RevId: 271234187	2019-09-25 19:46:26 -07:00
Yujing Zhang	b1efc03535	Reuse the same tensorflow::EagerOperation object across multiple ops in same thread by adding a Reset method PiperOrigin-RevId: 268942974	2019-09-13 11:46:47 -07:00
Xiao Yu	4411b77626	Executor api clean up: 1. Remove async_wait() and async_clear_error() in EagerContext. 2. Allow getting current executor from EagerContext. 3. Remove StartAsync() method in EagerExecutor. PiperOrigin-RevId: 262445965	2019-08-08 16:34:52 -07:00
Derek Murray	086bf1c5d1	Change name of TFE_NewExecutor() argument to `is_async`. This fixes a breakage on Python 3.7+, where the SWIG wrapper uses the reserved keyword `async` as a parameter name. This was recently fixed in https://github.com/swig/swig/pull/1382. PiperOrigin-RevId: 260301284	2019-07-27 08:36:27 -07:00
Xiao Yu	77cc4bcd61	Adds new python APIs which allow specifying an eager executor for current thread. This change also uses a new Executor to execute pyfunc, which can avoid pyfunc deadlock in async mode. PiperOrigin-RevId: 260181580	2019-07-26 11:51:32 -07:00
A. Unique TensorFlower	4024cedbc1	Remove ProfilerContext (no longer used) PiperOrigin-RevId: 259983256	2019-07-25 11:41:48 -07:00
Derek Murray	27fe47055e	Add experimental implementation of cancelable eager function execution. The experimental interface uses `cancellation.CancellationManager`: ```python c_mgr = cancellation.CancellationManager() @tf.function def f(...): ... cancelable_f = c_mgr.get_cancelable_function(f.get_concrete_function(...)) # Call a function that might run for a long time. cancelable_f(...) # Asynchronously: c_mgr.start_cancel() ``` A subsequent change will add a publicly-accessible (probably experimental) API endpoint for `CancellationManager`. PiperOrigin-RevId: 258648702	2019-07-17 15:11:54 -07:00
Gaurav Jain	1ea9f63103	Export TFE_ContextOptionsSetMirroringPolicy Additionally - Move functions to eager/c_api_experimental.cc - Misc lint fixes PiperOrigin-RevId: 256506413	2019-07-04 01:16:01 -07:00
Derek Murray	987046e078	Add a SWIG wrapper for the `tensorflow::CancellationManager` class. This change is a step towards supporting user-driven cancellation for eager function calls. In a future change, I plan to add an experimental method for calling a `tf.function` and passing a `CancellationManager` argument, so that the caller can cancel execution asynchronously. PiperOrigin-RevId: 256369003	2019-07-03 08:28:39 -07:00
Gaurav Jain	e75d8dc058	Add mirroring for remote tensor handles When executing on a remote worker, we may have to copy the TensorHandle for each executed op. To avoid duplicated work, we expand the TensorHandle to keep track of mirrors which are tied to the lifetime of the TensorHandle. If a mirror already exists on a remote worker, no additional copy is needed. The change consists of the following: - Add map of remote mirrors in TensorHandle. - Add `mirror` boolean argument to EagerCopyToDevice which indicates to try configuring a mirror if possible. - Add Device argument to RemoteAddress to handle mirrors. - Expose a ContextMirroringPolicy for the EagerContext. We plan to add additional policies in the future, such as local tensor mirroring. - Rename ContextDevicePlacementPolicy variables to be consistent with ContextMirroringPolicy. PiperOrigin-RevId: 253945140	2019-06-19 00:39:29 -07:00
Xiao Yu	6c5d79930c	Fix an issue where start_profiler_server complains 'AssertionError: Context must be initialized first.' PiperOrigin-RevId: 253093414	2019-06-13 13:31:32 -07:00
A. Unique TensorFlower	514004a234	Add a StartMonitoring Python API. PiperOrigin-RevId: 253079116	2019-06-13 12:18:31 -07:00
A. Unique TensorFlower	80ad7b024a	Move TF_Status to the last argument of StartTracing api PiperOrigin-RevId: 251497647	2019-06-04 13:12:00 -07:00
A. Unique TensorFlower	74a9032d8c	Add status to StartTracing API. PiperOrigin-RevId: 251299535	2019-06-03 14:40:01 -07:00
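
Commits e75d8dc058 and b41b89dc75 above describe tensor-handle mirroring: a handle remembers the copies already made on other devices, so feeding the same tensor repeatedly to an op on a different device does not copy it again. The idea can be sketched as a small per-device cache. This is only an illustrative toy in plain Python, not the TensorFlow implementation; the class `MirroredHandle`, its fields, and the device strings are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MirroredHandle:
    """Hypothetical stand-in for a tensor handle that caches per-device copies."""

    value: List[float]   # data on the handle's home device
    device: str          # name of the home device (illustrative string)
    mirrors: Dict[str, List[float]] = field(default_factory=dict)
    copies_made: int = 0  # counts simulated device-to-device copies

    def on_device(self, target: str) -> List[float]:
        """Return the data as seen from `target`, copying at most once per device."""
        if target == self.device:
            return self.value
        if target not in self.mirrors:
            # Simulate the implicit copy and remember it as a mirror.
            self.mirrors[target] = list(self.value)
            self.copies_made += 1
        return self.mirrors[target]


handle = MirroredHandle([1.0, 2.0], device="cpu:0")
handle.on_device("gpu:0")
handle.on_device("gpu:0")  # second request reuses the cached mirror
print(handle.copies_made)
```

In this sketch the mirrors live inside the handle object, so they are released together with it, which loosely mirrors the commit's note that mirrors are tied to the lifetime of the TensorHandle.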

63 Commits