The workflow automation platform for complex, mission-critical data and ML processes at scale

Flyte makes it easy to create concurrent, scalable, and maintainable workflows for machine learning and data processing.
Flyte is used in production at Lyft, Spotify, Freenome, and others. At Lyft, Flyte has served production model training and data processing for over four years, becoming the de facto platform for teams like Pricing, Locations, ETA, Mapping, Autonomous, and more. In fact, Flyte manages over 10,000 unique workflows at Lyft, totaling over 1,000,000 executions every month, 20 million tasks, and 40 million containers.

Hosted, multi-tenant, and serverless

Flyte frees you from wrangling infrastructure, allowing you to concentrate on business problems rather than machines. Because Flyte is a multi-tenant service, you work in your own isolated repo, and you deploy and scale without affecting the rest of the platform. Your code is versioned and containerized with its dependencies, and every execution is reproducible.

To provide this level of isolation, Flyte is built directly on Kubernetes and gets all the benefits containerization provides: portability, scalability, reliability, and more.

Elastic Scale

Flyte was born out of the need to scale to multiple tenants, yet be intuitive and simple to use. It is built from the ground up to be a distributed, fault-tolerant control plane, with no single point of failure. It can scale to multiple clusters, thousands of nodes, and thousands of concurrent workflows.

Parameters, Data Lineage, and Caching

All Flyte tasks and workflows have strongly typed inputs and outputs. This makes it possible to parameterize your workflows, track rich data lineage, and reuse cached versions of pre-computed artifacts. If, for example, you’re doing hyperparameter optimization, you can easily pass different parameters to each run. Additionally, if a run invokes a task that has already been computed with the same inputs, regardless of who executed it, Flyte reuses the cached output, saving you both time and money.
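
To make the caching idea concrete, here is a minimal conceptual sketch in plain Python. It is not Flyte's API or implementation: the `cached_task` decorator, the in-memory `_CACHE`, and the `train` function are hypothetical names chosen for illustration. The sketch assumes a cache key derived from the task name, a version string, and the typed inputs.

```python
import hashlib
import json
from typing import Callable, Dict, get_type_hints

# Hypothetical in-memory cache; a real platform would use a shared service.
_CACHE: Dict[str, object] = {}
# Counts how often the task body really runs, so cache hits are visible.
CALLS = {"count": 0}


def cached_task(version: str) -> Callable:
    """Illustrative decorator: check input types, then memoize results
    on (task name, version, inputs)."""

    def wrap(fn: Callable) -> Callable:
        hints = get_type_hints(fn)

        def run(**kwargs):
            # Enforce the declared input types before doing any work.
            for name, value in kwargs.items():
                if not isinstance(value, hints[name]):
                    raise TypeError(f"{name} must be {hints[name].__name__}")
            # Identical name, version, and inputs produce an identical key.
            payload = json.dumps([fn.__name__, version, kwargs], sort_keys=True)
            key = hashlib.sha256(payload.encode()).hexdigest()
            if key not in _CACHE:
                _CACHE[key] = fn(**kwargs)
            return _CACHE[key]

        return run

    return wrap


@cached_task(version="1")
def train(learning_rate: float, epochs: int) -> float:
    CALLS["count"] += 1
    return learning_rate * epochs  # stand-in for an expensive training step
```

Calling `train(learning_rate=0.5, epochs=4)` twice runs the body once and serves the second call from the cache, while changing either parameter (or the version string) produces a new key and a fresh computation.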

Versioned, Reproducible, and Shareable

Every entity in Flyte is immutable, with every change explicitly captured as a new version. This makes it easy and efficient for you to iterate on, experiment with, and roll back your workflows. Furthermore, Flyte enables you to share these versioned tasks across workflows, speeding up your dev cycle by avoiding repetitive work across individuals and teams.
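
The immutability model can be pictured with a small plain-Python sketch. This is an assumption-laden illustration, not Flyte's registration API: `TaskVersion`, `Registry`, and the `image` field are hypothetical names. The point is that an existing version is never overwritten, so rolling back only means referring to an earlier version.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TaskVersion:
    """Hypothetical immutable record of one registered task version."""
    name: str
    version: str
    image: str


class Registry:
    """Hypothetical append-only store keyed by (name, version)."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], TaskVersion] = {}

    def register(self, entry: TaskVersion) -> None:
        key = (entry.name, entry.version)
        # A change must be registered under a new version, never in place.
        if key in self._entries:
            raise ValueError(f"{entry.name}@{entry.version} already exists")
        self._entries[key] = entry

    def get(self, name: str, version: str) -> TaskVersion:
        return self._entries[(name, version)]
```

With this shape, registering `v2` of a task leaves `v1` untouched, and any workflow can keep pinning or return to `v1` by asking for it explicitly.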

Dynamic and extensible

Flyte is framework-agnostic and has a growing collection of plugins to assist with all of your workflow needs, including Spark on K8s, AWS Batch, Array Jobs, Hive Qubole, Containers, Pods, AWS SageMaker, AWS Athena, Pandera data correctness, and more. It’s easy to contribute a plugin, too!

Flyte is designed to be polyglot, and it ships with SDKs for Python, Java, and Scala. Say hi to the community on Slack, or create an issue to brainstorm ideas for other languages or plugins.