Puts the "engineering" back into data science because it borrows concepts from software engineering and applies them to machine-learning code. It is the foundation for clean, data science code.
Kedro's pipeline visualisation plugin, Kedro-Viz, shows a blueprint of your data and machine-learning workflows as they develop, provides data lineage, tracks machine-learning experiments and makes it easier to collaborate with business stakeholders.
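For illustration, here is a minimal sketch of the kind of pipeline Kedro-Viz renders; it assumes kedro>=0.19, and every dataset and node name in it is a placeholder. With the plugin installed, running `kedro viz` (or `kedro viz run` on recent plugin versions) from the project root serves the interactive flowchart.

```python
# Hypothetical two-node pipeline of the kind Kedro-Viz draws as a flowchart.
# Assumes kedro>=0.19; dataset and node names are placeholders.
import pandas as pd

from kedro.pipeline import node, pipeline


def clean_companies(companies: pd.DataFrame) -> pd.DataFrame:
    """Drop incomplete rows from the raw table."""
    return companies.dropna()


def score_companies(clean: pd.DataFrame) -> pd.DataFrame:
    """Attach a toy score column."""
    return clean.assign(score=range(len(clean)))


# Kedro resolves the dependency between the two nodes from the shared
# "companies_clean" dataset name; that graph is the lineage Kedro-Viz shows.
demo_pipeline = pipeline(
    [
        node(clean_companies, inputs="companies", outputs="companies_clean", name="clean"),
        node(score_companies, inputs="companies_clean", outputs="companies_scored", name="score"),
    ]
)
```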
A series of lightweight data connectors used to save and load data across many different file formats and file systems. Supported datasets include those backed by Pandas, Spark, Dask, NetworkX, Pickle, Plotly, Matplotlib and many more. The Data Catalog supports S3, GCP, Azure, SFTP, DBFS and local filesystems, and also provides data and model versioning for file-based systems.
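As a rough sketch of how this looks in the Python API (Kedro projects usually declare the same datasets in a `catalog.yml` file), assuming kedro>=0.19 with the separate kedro-datasets package installed; the dataset names and file paths below are placeholders:

```python
# Illustrative Data Catalog declared in Python. Assumes kedro>=0.19 and
# kedro-datasets>=2.0 (earlier releases spell the class CSVDataSet).
import pandas as pd

from kedro.io import DataCatalog
from kedro_datasets.pandas import CSVDataset

catalog = DataCatalog(
    datasets={
        # Local and remote paths look the same because I/O goes through fsspec.
        "companies": CSVDataset(filepath="companies.csv"),
        "reviews": CSVDataset(filepath="s3://my-bucket/reviews.csv"),  # placeholder bucket
    }
)

catalog.save("companies", pd.DataFrame({"id": [1, 2]}))  # write through the connector
companies = catalog.load("companies")                    # read it back as a DataFrame
```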
Kedro integrates with Apache Spark, Pandas, Dask, Matplotlib, Plotly, fsspec, Apache Airflow, Jupyter Notebook and Docker.
You can standardise how configuration, source code, tests, documentation, and notebooks are organised with an adaptable, easy-to-use project template. Create your own Cookiecutter project templates with Starters.
You can find the Kedro community on Slack.
We also maintain a list of extensions, plugins, articles, podcasts, talks, and Kedro showcase projects in the awesome-kedro repository.
Kedro is an open-source Python framework hosted by the Linux Foundation (LF AI & Data). Kedro standardises how data science code is created to ensure it is reproducible, maintainable, and modular; it uses software engineering best practices to help you build production-ready data science code.
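To make that concrete, the following self-contained sketch wires the pieces together, assuming kedro>=0.19 (every name in it is illustrative): a pure Python function becomes a node, nodes form a pipeline, and a runner executes the pipeline against a Data Catalog.

```python
# End-to-end sketch of the core abstractions, assuming kedro>=0.19.
# All names here are illustrative, not taken from the Kedro docs.
import pandas as pd

from kedro.io import DataCatalog, MemoryDataset
from kedro.pipeline import node, pipeline
from kedro.runner import SequentialRunner


def double_scores(df: pd.DataFrame) -> pd.DataFrame:
    """A pure function: easy to test, reuse and compose."""
    return df.assign(score=df["score"] * 2)


catalog = DataCatalog(
    datasets={"raw_scores": MemoryDataset(pd.DataFrame({"score": [1, 2, 3]}))}
)
demo = pipeline([node(double_scores, inputs="raw_scores", outputs="doubled")])

# Outputs that are not registered in the catalog are returned by the runner.
print(SequentialRunner().run(demo, catalog))
```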