This roadmap highlights the main OSS efforts for the TFX team in 2019 and H1 2020. Contributions in any of these areas are always welcome, especially those that extend TFX into infrastructure not currently in wide use at Google.
- Democratize access to machine learning (ML) best practices, tools, and code.
- Enable users to easily run production ML pipelines on public clouds, on premises, and in heterogeneous computing environments.
- Help enterprises realize large-scale production ML capabilities similar to what we have available at Google. We recognize that every enterprise has unique infrastructure challenges, and we want TFX to be open and adaptable to those challenges.
- Stimulate innovation: Machine learning is a rapidly evolving field, and we want TFX to help researchers and engineers both realize and contribute to that innovation. Likewise, we want TFX to be interoperable with other ML efforts in the open source community.
- Usability: We want the journey to deploy a model in production to be as frictionless as possible, from the initial efforts building a model to the final touches of deploying in production.
- Encourage the discovery and reuse of external contributions.
- Participate in and extend support for other OSS efforts, initially: Apache Beam, ML Metadata, Kubeflow, TensorBoard, and TensorFlow 2.0.
- Extend portability across additional cluster computing frameworks, orchestrators, and data representations.
- Better distributed training support (DistributionStrategy).
- Better telemetry for users to understand the behavior of components in a TFX pipeline.
- Support of TensorFlow 2.0 in two phases:
  - The first phase will provide the following:
    - Existing TFX pipelines can continue to use TensorFlow 1.X. To switch to TensorFlow 2.X, see the TensorFlow migration guide.
    - New TFX pipelines should use Keras (via tf.keras.estimator.model_to_estimator()) and TensorFlow 2.X.
  - The second phase will enable the remainder of TensorFlow 2.X functionality, including tf.distribute and Keras without Estimator.
- Integration with TensorBoard and TF Hub/AI Hub.
- Improving the testing capabilities for OSS developers.
- Increased interoperability with Kubeflow Pipelines.
- Support for training on continuously arriving data.
- More pipeline code examples, including DIY orchestrators and custom components.
- Formalize Special Interest Groups (SIGs) for specific aspects of TFX to accelerate community innovation and collaboration.
- Early access to new features.
- Q3 2019: Support for local orchestrator through Apache Beam.
- Q3 2019: Experimental support for interactive development in Jupyter notebooks.
- Q3 2019: Experimental support for TFX CLI released.
- Q3 2019: Multiple public RFCs published to the tensorflow/community project.
- Q2 2019: Support for Python 3.
- Q2 2019: Apache Spark and Apache Flink runners (with examples).
- Q2 2019: Custom executors (with examples).
- Q1 2019: TFX end-to-end pipeline, config, and orchestration initial release.
- Q1 2019: ml.metadata initial release.
- Q3 2018: TensorFlow Data Validation initial release.
- Q1 2018: TensorFlow Model Analysis initial release.
- Q1 2017: TensorFlow Transform initial release.
- Q1 2016: TensorFlow Serving initial release.