It has been four years since Kedro was released as an open-source project!
This anniversary is a perfect opportunity to reflect on our journey, milestones, and impact. Join us as we celebrate!
Celebrating the journey
In June 2019, we released Kedro as an open-source framework to enable data scientists, data engineers and machine-learning engineers worldwide to create maintainable, modular, and reproducible code.
Kedro resulted from two years of work by a multi-disciplinary team of colleagues and was one of the first open-source projects from QuantumBlack Labs, a McKinsey company and a centre of learning dedicated to driving innovation and experimentation in AI.
Before our open-source launch, we encountered the challenge of choosing a meaningful name for the framework. We wanted one that reflected our values without infringing upon any existing trademarked product names. It took a surprising amount of time and discussion.
We also faced an unexpected challenge in interpreting open-source licenses accurately. We discovered that a library used behind our command-line interface was licensed under LGPL, which conflicted with our decision to adopt the Apache 2.0 license, and this took time and discussion to resolve. Unfortunately, at that time, we did not have the advantage of a standard tool to check the “dependencies of dependencies”. We strongly recommend that other teams embarking on a similar journey allocate time to address potential licensing issues.
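To illustrate what checking the “dependencies of dependencies” can look like, here is a minimal sketch in Python that walks the requirements declared by installed packages and collects the license each one reports. It is not the process the Kedro team used. The function names (transitive_licenses, flag_copyleft and the helpers) are hypothetical, the sketch relies only on the standard-library importlib.metadata module, and it assumes that packages declare their license metadata accurately, which is not always the case.

```python
import re
from importlib import metadata


def requirement_name(requirement):
    """Return the bare package name from a requirement string such as 'click>=7.0'."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    return match.group(0) if match else ""


def normalise(name):
    """Normalise a package name so 'Foo_Bar' and 'foo-bar' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_requires(name):
    """Requirement strings declared by an installed package (empty if not installed)."""
    try:
        return metadata.requires(name) or []
    except metadata.PackageNotFoundError:
        return []


def installed_license(name):
    """Best-effort license string taken from an installed package's metadata."""
    try:
        meta = metadata.metadata(name)
    except metadata.PackageNotFoundError:
        return "not installed"
    classifiers = [
        c.split(" :: ")[-1]
        for c in (meta.get_all("Classifier") or [])
        if c.startswith("License ::")
    ]
    if classifiers:
        return "; ".join(classifiers)
    return meta.get("License-Expression") or meta.get("License") or "unknown"


def transitive_licenses(root, requires=installed_requires, license_of=installed_license):
    """Map the root package and every transitive dependency to its reported license."""
    seen = {}
    stack = [root]
    while stack:
        name = normalise(stack.pop())
        if not name or name in seen:
            continue
        seen[name] = license_of(name)
        for requirement in requires(name):
            # Assumption: optional extras are skipped to keep the walk simple.
            if re.search(r"\bextra\s*==", requirement):
                continue
            stack.append(requirement_name(requirement))
    return seen


def flag_copyleft(licenses):
    """Names whose license string mentions GPL (this also matches LGPL and AGPL)."""
    return sorted(name for name, text in licenses.items() if "GPL" in text.upper())
```

In an environment where a package is installed, calling flag_copyleft(transitive_licenses("some-package")) lists the dependencies that deserve a closer look. The result is a prompt for human review rather than a legal verdict.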
Finally, as we prepared to release as an open-source project, we set out to consolidate all available documentation into a user-friendly resource. We needed to spend some time organising it for our typical users, and even now, we are still reviewing and revising how we help people learn about Kedro.
Key Kedro milestones
Some of the key Kedro releases include the following:
Kedro was released as version 0.14.0 in June 2019, while Kedro-Viz 1.0.0 was released later the same month. Both hit the ground running with releases to add features incrementally.
In February 2020, Kedro 0.15.6 revamped Kedro’s datasets, decorators and dataset transformers to use fsspec to access a variety of data stores, including local file systems, network file systems, cloud object stores (including S3 and GCP), and Hadoop.
In May 2020, Kedro 0.16.0 added Hooks.
In December 2020, we made it possible to use the Data Catalog as an individual component in Kedro 0.17.0 and focussed on decoupling the framework components. In the same month, we introduced a new graph layout engine to Kedro-Viz.
The Kedro team partnered with Astronomer to release Kedro-Airflow 0.4.0 in March 2021.
“Kedro does an outstanding job of allowing data scientists to apply good software engineering principles to their code and make it modular, but Kedro pipelines need a separate scheduling and execution environment to run at scale. Given this need, there was a natural bond between Kedro pipeline and Airflow: we wanted to do everything we could to build a great developer experience at the intersection of the two tools.”
Pete DeJoy, one of the founding members of Astronomer.
In November 2021, Kedro-Viz 4.1.0 added experiment tracking capabilities.
In March 2022, Kedro 0.18.0 added support for Python 3.9 and 3.10, introduced a micro-packaging workflow and streamlined integration with IPython and Jupyter. In the same month, Kedro-Viz 4.6.0 added support for Plotly chart visualisation.
In May 2023, Kedro-Viz 6.2.0 extended experiment tracking to add collaborative features with data sharing for experiments.
Some Kedro highlights
Following the open-source launch of Kedro and Kedro-Viz in June 2019, here are some of the other highlights:
In 2020, we focussed on engineering, making a series of releases as described above, and worked to grow the community of “Kedroids”.
We continued community building in 2021, adding discussion features and telemetry to support our users and to better understand how they were using Kedro and what they needed from it.
We announced our adoption by the Linux Foundation (AI & Data) in January 2022.
We launched the Kedro website in May 2022.
Kedro passed 8000 stars on GitHub in January 2023.
The Kedro blog launched in March 2023, with a first post about experiment tracking, which had recently been added to Kedro-Viz.
Kedro's new branding was unveiled in June 2023, with changes to the identity that align it with our vision of simplicity and collaboration.
Celebrating the Kedroid community
When we evaluated the release of Kedro as open source, we believed in the culture of collaboration, transparency, and community-driven development associated with this approach. In 2020, Lais Carvalho, our community manager, helped to build an enthusiastic community of Kedroids worldwide, with help from contributors across the community, such as DataEngineerOne’s videos and Waylon Walker’s blog posts and plugin contributions.
As the worldwide pandemic challenged the pace and nature of work, we saw the Kedro community and the team at QuantumBlack Labs nurture the project through a dynamic and challenging period.
We anticipated an increase in innovation, creativity, and knowledge sharing, but in the four years since we released Kedro as open source, our community has outshone our expectations by creating and sharing Kedro plugins, datasets, articles, and tutorials about how they use Kedro.
We have encouraged community participation from the outset, and the recent addition of Juan-Luis Cano to the Kedro team as our permanent developer advocate means that we are now in an even better position to support Kedro users and contributors. As a team, we have big plans for Kedro training, documentation, features and community outreach.
Looking ahead
We are excited that Kedro will sit at the heart of QuantumBlack Horizon, an integrated and flexible suite of AI development tools announced earlier in June 2023. QuantumBlack Horizon is a product suite that delivers value through an industrialised and cohesive production system, tech stack, and operating model. The suite achieves four key objectives:
clean, organised, and accurate data across internal and external sources;
scalable, repeatable AI models that build on each other;
a factory-like approach to model development and monitoring;
performance transparency that enables quick, reliable decision-making.
Kedro's new brand will scale to become the look and feel for QuantumBlack Horizon too.
As we commemorate the anniversary of our open-source journey, we express our gratitude to the community, contributors, and users who have supported and shaped our software. The spirit of openness has empowered us to reach new heights, foster innovation, and create a lasting impact.
Together, let us embrace the future and strive to make a positive impact during this dynamic and exciting period in the field of data science and machine learning. In the Kedro team, we will continue to champion openness, collaboration, and the limitless potential of open source. We look forward to meeting our user community online during regular training and update meetings and on Slack.
Find out more about Kedro
There are many ways to learn more about Kedro:
Join our Slack organisation to reach out to us directly if you’ve a question or want to stay up to date with news. There's an archive of past conversations on Slack too.
Read our documentation or take a look at the Kedro source code on GitHub.
Check out our video course on YouTube.