TensorFlow 2.18 Changes
TensorFlow 2.18 has been released! Highlights of this release (and 2.17) include NumPy 2.0, the LiteRT repository, a CUDA update, Hermetic CUDA, and more. For the full release notes, visit the TensorFlow 2.18 GitHub here.
NumPy 2.0
TensorFlow 2.18 includes support for NumPy 2.0. While most TensorFlow APIs function seamlessly with NumPy 2.0, some edge cases of usage may break, e.g., out-of-bounds conversions now raise errors and NumPy scalar representations have changed. The following common solutions can help.
NumPy's type promotion rules have changed (see NEP 50 for details). This may change the precision at which computations happen, leading either to type errors or to numerical changes in results. Please see the NumPy 2 migration guide.
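For example, under NEP 50 Python scalars are "weak" and no longer bump the result precision, and NumPy 2 also changed how scalars print. A rough sketch of both behaviors (these are NumPy facts, independent of TensorFlow):

```python
import numpy as np

x = np.float32(3.0)

# NEP 50 type promotion: the Python float is "weak", so the result
# stays float32 under NumPy 2; NumPy 1.x promoted this to float64.
print((x + 3.0).dtype)  # float32 under NumPy 2

# Scalar representation: NumPy 2 prints the type explicitly.
print(repr(x))  # NumPy 2: np.float32(3.0)   NumPy 1.x: 3.0
```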
TensorFlow has updated some of its tensor APIs to maintain compatibility with NumPy 2.0 while preserving the out-of-bounds conversion behavior of NumPy 1.x.
LiteRT Repository
The TensorFlow team is changing how LiteRT (formerly known as TFLite) is developed. Over the coming months, the TFLite codebase will be gradually transitioned to LiteRT. Once the migration is complete, contributions will be accepted directly through the LiteRT repository. There will no longer be any standalone binary TFLite releases, and developers should switch to LiteRT for the latest updates.
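For the Python interpreter API, the move is close to a drop-in change. A minimal sketch, assuming the ai-edge-litert package is installed and "model.tflite" is a placeholder path:

```python
# Before: deprecated in TensorFlow 2.18 and slated for removal.
import tensorflow as tf
interpreter = tf.lite.Interpreter(model_path="model.tflite")

# After: the LiteRT equivalent described in the migration guide.
from ai_edge_litert.interpreter import Interpreter
interpreter = Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
```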
Hermetic CUDA
If you build TensorFlow from source, Bazel will now download specific versions of the CUDA, cuDNN, and NCCL distributions and use those tools as dependencies in various Bazel targets. This enables more reproducible builds for Google ML projects and supported CUDA versions, because the build no longer relies on locally installed versions. More details are provided here.
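Regardless of how TensorFlow was built, you can check which CUDA and cuDNN versions a given binary was compiled against:

```python
import tensorflow as tf

# The version keys are present on CUDA-enabled builds.
info = tf.sysconfig.get_build_info()
print(info.get("cuda_version"), info.get("cudnn_version"))
```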
CUDA Update
TensorFlow binary distributions now ship with dedicated CUDA kernels for GPUs with a compute capability of 8.9. This improves performance on popular Ada-generation GPUs such as the NVIDIA RTX 40-series, L4, and L40/L40S.
To keep Python wheel sizes in check, the team decided to no longer ship CUDA kernels for compute capability 5.0. That means the oldest NVIDIA GPU generation supported by the precompiled Python packages is now the Pascal generation (compute capability 6.0). For Maxwell support, they recommend either sticking with TensorFlow 2.16 or compiling TensorFlow from source; the latter will remain possible as long as the CUDA version used still supports Maxwell GPUs. You can check where your GPU falls, as shown below.
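To query the compute capability of your GPUs from TensorFlow:

```python
import tensorflow as tf

# Prints the compute capability of each visible GPU, e.g. (8, 9) for
# Ada-generation cards; anything below (6, 0) is no longer covered by
# the precompiled 2.18 wheels.
for gpu in tf.config.list_physical_devices("GPU"):
    details = tf.config.experimental.get_device_details(gpu)
    print(gpu.name, details.get("compute_capability"))
```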
TensorFlow 2.18 Other Changes & Bugfixes
Breaking Changes
- tf.lite
- Interpreter:
- tf.lite.Interpreter now issues a deprecation warning and a redirection notice pointing to its new location, ai_edge_litert.interpreter. See the migration guide for details.
- C API:
- An optional fourth parameter was added to TfLiteOperatorCreate as a step toward a cleaner API for TfLiteOperator. TfLiteOperatorCreate was added recently, in TensorFlow Lite 2.17.0 (released on 7/11/2024), so little code is expected to use it yet. Any breakage can be resolved by passing nullptr as the new fourth parameter.
- TensorRT support is disabled in CUDA builds for code health improvement.
- Hermetic CUDA support is added.
Hermetic CUDA uses a specific downloadable version of CUDA instead of the user's locally installed CUDA. Bazel will download CUDA, cuDNN, and NCCL distributions, and then use CUDA libraries and tools as dependencies in various Bazel targets. This enables more reproducible builds for Google ML projects and supported CUDA versions.
Bug Fixes and Other Changes
- tf.data
- Add optional synchronous argument to map, to specify that the map should run synchronously, as opposed to being parallelizable when options.experimental_optimization.map_parallelization=True. This saves memory compared to setting num_parallel_calls=1. (Both new map arguments are sketched after this list.)
- Add optional use_unbounded_threadpool argument to map, to specify that the map should use an unbounded threadpool instead of the default pool sized by the number of cores on the machine. This can improve throughput for map functions that perform IO or otherwise release the CPU.
- Add tf.data.experimental.get_model_proto to allow users to peek into the analytical model inside of a dataset iterator.
- tf.lite
- Dequantize op supports TensorType_INT4.
- This change includes per-channel dequantization.
- Add support for stablehlo.composite.
- EmbeddingLookup op supports per-channel quantization and TensorType_INT4 values.
- FullyConnected op supports TensorType_INT16 activation and TensorType_INT4 weight per-channel quantization.
- tf.tensor_scatter_update, tf.tensor_scatter_add, and the other scatter ops with reduction variants
- Now support a bad_indices_policy argument.
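A minimal sketch of the two new map arguments, based only on the descriptions above (the dataset and IO function are illustrative):

```python
import tensorflow as tf

ds = tf.data.Dataset.range(1_000)

# Keep this particular map single-threaded even when map parallelization
# is enabled globally; per the notes, this saves memory compared to
# setting num_parallel_calls=1.
opts = tf.data.Options()
opts.experimental_optimization.map_parallelization = True
ds_sync = ds.with_options(opts).map(lambda x: x * 2, synchronous=True)

# Use an unbounded threadpool for an IO-bound map function instead of
# the default pool sized by the machine's core count.
def read_record(x):
    # Illustrative IO-bound work; the path is a placeholder.
    return tf.io.read_file(tf.strings.join(["data/", tf.strings.as_string(x)]))

ds_io = ds.map(read_record,
               num_parallel_calls=tf.data.AUTOTUNE,
               use_unbounded_threadpool=True)
```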
Keras 3.6 Changes
Highlights
- New file editor utility: keras.saving.KerasFileEditor. Use it to inspect, diff, modify and resave Keras weights files. See basic workflow here.
- New keras.utils.Config class for managing experiment config parameters.
Breaking Changes
- When using keras.utils.get_file with extract=True or untar=True, the return value is now the path of the extracted directory rather than the path of the archive.
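A quick sketch of the behavioral change (the URL is a placeholder):

```python
from keras.utils import get_file

# Keras 3.6: with extract=True, `path` points to the extracted
# directory; previously it pointed to the downloaded archive itself.
path = get_file(origin="https://example.com/data.tar.gz", extract=True)
```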
Other Changes and Additions
- Logging is now asynchronous in fit(), evaluate(), predict(). This enables 100% compact stacking of train_step calls on accelerators (e.g. when running small models on TPU).
- If you are using custom callbacks that rely on on_batch_end, this will disable async logging. You can force it back by adding self.async_safe = True to your callbacks. Note that the TensorBoard callback isn't considered async safe by default. Default callbacks like the progress bar are async safe. (A sketch of opting a callback back in appears after this list.)
- Added keras.saving.KerasFileEditor utility to inspect, diff, modify, and resave Keras weights files.
- Added keras.utils.Config class. It behaves like a dictionary, with a few nice features (see the usage sketch after this list):
- All entries are accessible and settable as attributes, in addition to dict-style (e.g. config.foo = 2 or config["foo"] are both valid)
- You can easily serialize it to JSON via config.to_json().
- You can easily freeze it, preventing future changes, via config.freeze().
- Added bitwise numpy ops:
- bitwise_and
- bitwise_invert
- bitwise_left_shift
- bitwise_not
- bitwise_or
- bitwise_right_shift
- bitwise_xor
- Added math op keras.ops.logdet.
- Added numpy op keras.ops.trunc.
- Added keras.ops.dot_product_attention.
- Added keras.ops.histogram.
- Allow infinite PyDataset instances to use multithreading.
- Added argument verbose in keras.saving.ExportArchive.write_out() method for exporting TF SavedModel.
- Added epsilon argument in keras.ops.normalize.
- Added Model.get_state_tree() method for retrieving a nested dict mapping variable paths to variable values (as numpy arrays or, by default, backend tensors). This is useful for rolling out custom JAX training loops.
- Added image augmentation/preprocessing layers keras.layers.AutoContrast, keras.layers.Solarization.
- Added keras.layers.Pipeline class to apply a sequence of layers to an input. It is useful for building preprocessing pipelines. Compared to a Sequential model, Pipeline has a few important differences (see the sketch after this list):
- It's not a Model, just a plain layer.
- When the layers in the pipeline are compatible with tf.data, the pipeline will also remain tf.data compatible, independently of the backend you use.
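A minimal sketch of a custom callback that opts back in to async logging, as described above (the callback body is illustrative):

```python
import keras

class BatchLossLogger(keras.callbacks.Callback):
    def __init__(self):
        super().__init__()
        # Declare this callback safe for async logging even though it
        # uses on_batch_end; logged values may lag slightly.
        self.async_safe = True

    def on_batch_end(self, batch, logs=None):
        if logs and batch % 100 == 0:
            print(f"batch {batch}: loss={logs.get('loss')}")
```

The keras.utils.Config features listed above, in one short example:

```python
from keras.utils import Config

config = Config()
config.learning_rate = 1e-3   # attribute-style access
config["batch_size"] = 32     # dict-style access works too
print(config.to_json())       # serialize to JSON
config.freeze()               # subsequent modifications raise an error
```

And a sketch of keras.layers.Pipeline as a preprocessing pipeline; the layer choices are illustrative:

```python
import keras

preprocessing = keras.layers.Pipeline([
    keras.layers.Rescaling(1.0 / 255),
    keras.layers.RandomFlip("horizontal"),
])
# Not a Model, just a plain layer; when its layers are tf.data-compatible,
# it can be applied inside a tf.data pipeline on any backend.
# images = preprocessing(images)
```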
Accelerate Training with an Exxact Multi-GPU Workstation
With the latest CPUs and the most powerful GPUs available, accelerate your deep learning and AI projects, optimized for your deployment, budget, and desired performance!
Configure Now