Best 17 DataOps Engineers

DataOps – applying DevOps principles to data engineering – has emerged as a critical discipline ensuring reliable, agile data pipelines for modern analytics and AI. The field’s top practitioners include visionary open-source contributors, startup founders who still code, influential bloggers, big-tech engineers, and even coding competition champions.

Below is a curated, global list of leading DataOps engineers, selected for their impactful tools, thought leadership, community contributions, and innovative achievements:

  1. Jeremiah Lowin
  2. Arvind Prabhakar
  3. Tyler Akidau
  4. Joe Witt
  5. Michel Tricot
  6. Tristan Handy
  7. George Fraser
  8. Douwe Maan
  9. Bruno Aziza
  10. Nick Schrock
  11. Christopher Bergh
  12. Chad Sanderson
  13. Erik Bernhardsson
  14. Ben Rogojan
  15. Andreas Kretz
  16. Cassie Kozyrkov
  17. Matt Casters

Now, let’s dive into each of these experts’ profiles:

Jeremiah Lowin

Nationality: American

Jeremiah is the founder and CEO of Prefect, an open-source DataOps orchestration tool often dubbed the “next-generation Airflow”. A former quant finance engineer, he became a committer on Airflow and in 2018 set out to address its limitations by creating Prefect.

Prefect introduced a hybrid flow model and a focus on data pipeline reliability, attracting a robust open-source community. Under Lowin’s leadership, Prefect has run millions of pipelines and raised over $30M, proving his vision of a new “dataflow automation” paradigm. Unusually for a CEO, Jeremiah remains a hands-on coder in the Prefect core.

He also shares insights on building healthy OSS communities, making him both a technical and community influencer in the ETL/DataOps space.

Arvind Prabhakar

Nationality: Indian

Arvind is an open-source luminary in data integration, known for creating Apache Flume and co-founding StreamSets, a prominent DataOps platform. As a software architect at Cloudera, he authored Flume in 2011 to collect and move log data at scale, and served as its Apache PMC chair.

In 2014, he teamed with Girish Pancha to launch StreamSets, aiming to provide continuous, “always-on” data pipelines with observability. As CTO at StreamSets (acquired by Software AG in 2022), Arvind spent years driving development of its smart data pipeline platform. He’s also contributed to Apache Sqoop and Storm.

Across a decade of work – from durable log collection (Flume) to modern data pipeline automation (StreamSets) – Prabhakar has consistently pushed the envelope in ETL system design, earning him recognition among top DataOps engineers.

Tyler Akidau

Nationality: American

Tyler is a pioneer of streaming DataOps. At Google, he spent a decade designing massive-scale data processing systems and was the lead engineer behind the Apache Beam model and Google Cloud Dataflow service. He co-created Google’s internal MillWheel stream engine and advocated the philosophy that batch and streaming are “two sides of the same coin.”
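To make the “two sides of the same coin” idea concrete, here is a minimal sketch in plain Python. It is not the Apache Beam API; the function name `fixed_window_sums`, the `(event_time, value)` event shape, and the 60-second window are all assumptions chosen for illustration. The same windowing function is applied to a bounded list (batch) and to a generator standing in for a stream, and both produce the same result.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape: (event_time_seconds, value).
Event = Tuple[int, int]


def fixed_window_sums(events: Iterable[Event], window_size: int) -> Dict[int, int]:
    """Sum values per fixed event-time window, for any iterable of events."""
    sums: Dict[int, int] = defaultdict(int)
    for event_time, value in events:
        # Assign each event to the window that contains its event time.
        window_start = (event_time // window_size) * window_size
        sums[window_start] += value
    return dict(sums)


def event_stream() -> Iterator[Event]:
    """Stand-in for a streaming source; events arrive out of order."""
    yield (5, 1)
    yield (65, 10)
    yield (12, 2)  # arrives late, but its event time still falls in window 0
    yield (70, 20)


# Batch: a bounded, already-collected dataset.
batch_result = fixed_window_sums([(5, 1), (12, 2), (65, 10), (70, 20)], 60)

# "Streaming": the same logic consuming events one at a time.
stream_result = fixed_window_sums(event_stream(), 60)
```

Because windows are keyed on event time rather than arrival order, the late event lands in the right window in both runs. A real streaming engine also has to decide when a window is complete enough to emit, which this sketch leaves out entirely.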

Akidau’s influential articles “Streaming 101” and “Streaming 102” educated tens of thousands of engineers on modern stream processing. He also co-authored the book Streaming Systems. Now an engineer at Snowflake, Tyler continues to shape streaming ETL best practices. His work unifying batch/stream paradigms in Beam has fundamentally advanced real-time DataOps, making him a thought leader in streaming analytics.

Joe Witt

Nationality: American

Joe is the co-creator of Apache NiFi, a powerful open-source platform for real-time data ingestion and ETL that originated at the U.S. NSA. He spent a decade in government research developing “Niagara Files”, which was open-sourced as NiFi in 2014. In 2015 Joe co-founded Onyara, a startup to productize NiFi, which was quickly acquired by Hortonworks as NiFi gained industry traction.

Joe continued to steward NiFi’s growth as an engineering leader at Hortonworks/Cloudera, guiding it to become a top solution for building secure, high-throughput data flows (from IoT sensor streams to enterprise pipelines). He remains an Apache NiFi PMC member and is now applying his dataflow expertise as co-founder of a new venture (Datavolo).

Witt’s journey from intel agencies to open source highlights his unique influence on DataOps for streaming and edge data.

Michel Tricot

Nationality: French

“We want to make data integration as easy and modular as possible, so companies can focus on what matters: using their data to make decisions.”

Michel is the co-founder and CEO of Airbyte, the open-source data integration (ELT) platform that has rapidly become a standard for syncing data from APIs and databases into data warehouses. With 15+ years in data engineering (previously leading data ingestion at LiveRamp), Michel saw the need for an open, community-driven approach to ETL connectors.

He launched Airbyte in 2020 to “commoditize data integration”, providing a framework of modular connectors maintained by the community. The project’s popularity exploded – Airbyte raised over $180M and built a contributor base around its Singer-compatible connector protocol. Under Tricot’s technical leadership, Airbyte is now used by thousands of companies to move data.

His vision of an open, extensible ELT ecosystem has made data integration more accessible, cementing Michel as a top innovator in DataOps pipelines.

Tristan Handy

Nationality: American

Tristan is the founder and CEO of dbt Labs, creator of dbt (data build tool) – an open-source framework that has redefined data transformation in the modern analytics stack. He launched dbt in 2016 to enable data analysts to own the “T” in ELT using software engineering best practices (modular SQL, version control).

Under Tristan’s leadership, dbt grew from a small open-source project to a global movement with over 50,000 users and a $4.2B-valued company. He popularized the term “analytics engineering” and fostered an active community via the dbt blog, newsletter, and podcast. With two decades in data, Handy’s vision of bringing agile development to data pipelines has had a profound impact on DataOps workflows, empowering analysts to produce production-grade data pipelines.

George Fraser

Nationality: American

George is the co-founder and CEO of Fivetran, a pioneer in fully managed “ELT as a service”. A former neuroscientist turned entrepreneur, George started Fivetran in 2012 (with co-founder Taylor Brown) after observing how cumbersome hand-built ETL pipelines were. He led Fivetran through Y Combinator and over the next decade built it into a global company valued over $5 billion.

Fivetran’s platform automatically extracts and loads data from dozens of sources (from SaaS apps to databases), enabling analytics teams to have reliable data pipelines without maintenance. Fraser remains deeply involved in product strategy and has written about data engineering best practices (e.g., on TechCrunch).

By popularizing the idea of “pipelines as a managed service”, George has streamlined DataOps for thousands of organizations, making him a prominent figure in the data integration domain.

Douwe Maan

Nationality: Dutch

Douwe is the founder and CEO of Meltano, an open-source DataOps platform that brings together data integration and transformation. As employee #10 at GitLab, he started Meltano as an internal project and then spun it out as an independent startup in 2021 (with backing from Alphabet’s GV).

Meltano provides an end-to-end framework that combines Singer taps (for extraction) with dbt (for transformation), reflecting Douwe’s vision of a modular, “single platform” for the data lifecycle. During his years at GitLab, Maan was a developer and engineering manager, which prepared him to lead Meltano’s community-driven development. Under his leadership, Meltano has gained traction among data engineers seeking a flexible alternative to proprietary ELT tools.

Douwe’s journey – leaving a high-growth company to follow his passion for data engineering – has made him an influential voice in the open-source DataOps community.

Bruno Aziza

Nationality: French (U.S.-based)

Bruno is a well-known thought leader in data analytics and business intelligence, currently the Head of Data & Analytics at Google Cloud. He has a rich history of leading product and marketing teams at tech companies big and small – from Oracle and Microsoft (where he was Worldwide BI Strategy Lead) to startup AtScale (where he was CMO) – giving him a 360° perspective on the data tools landscape.

At Google, Bruno focuses on helping enterprises adopt modern data analytics solutions, and he frequently shares industry insights via webinars, podcasts, and social media. He has co-authored two books on data analytics and performance management. With his energetic style, Bruno is a familiar face at conferences, championing topics like data democratization and data culture.

His extensive background in data and his ability to forecast trends have made him an influential voice in guiding DataOps and analytics strategies for many organizations.

Nick Schrock

Nationality: American

Nick is the founder of Dagster, an open-source data orchestrator that brings software engineering rigor to DataOps. A former Facebook engineer, Nick was one of the co-creators of GraphQL in 2012 and helped build the tooling behind products like the Facebook News Feed. Drawing on that experience, he launched Dagster (through his company Elementl) to improve how data pipelines are built and observed.

Dagster introduces a novel type-safe, testable approach to defining ETL workflows, treating pipelines as modular software components rather than black boxes. Under Schrock’s hands-on leadership (he codes and designs Dagster’s core), the tool has gained a devoted following among data engineers for its developer-friendly model.
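The idea of treating pipeline steps as typed, testable components can be illustrated without any orchestrator at all. The sketch below is plain Python and deliberately does not use Dagster's API; the `Order` type, the step names, and the sample rows are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Order:
    """Typed record passed between steps instead of loose dictionaries."""
    order_id: int
    amount_cents: int


def extract_orders(raw_rows: List[Dict[str, str]]) -> List[Order]:
    """Parse raw string rows into typed Order records."""
    return [
        Order(order_id=int(row["id"]), amount_cents=int(row["amount_cents"]))
        for row in raw_rows
    ]


def total_revenue(orders: List[Order]) -> int:
    """Pure transformation: easy to unit test with hand-built inputs."""
    return sum(order.amount_cents for order in orders)


def run_pipeline(raw_rows: List[Dict[str, str]]) -> int:
    """Compose the steps; each one can also be tested on its own."""
    return total_revenue(extract_orders(raw_rows))


sample_rows = [
    {"id": "1", "amount_cents": "250"},
    {"id": "2", "amount_cents": "750"},
]
result = run_pipeline(sample_rows)
```

Each step has explicit input and output types and no hidden state, so a test can call `total_revenue` directly with a couple of `Order` objects rather than running the whole pipeline.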

Nick’s unique background in developer frameworks (GraphQL, React) and data orchestration makes him a thought leader in bridging best practices from traditional software into the DataOps realm.

Christopher Bergh

Nationality: American

Chris is often regarded as the father of DataOps as a formal methodology. He co-founded DataKitchen in 2013 and as CEO (dubbed “Head Chef”) has spent the last decade evangelizing DataOps practices to the industry. Bergh is the co-author of the “DataOps Cookbook” and the “DataOps Manifesto”, seminal works which codified the principles of agile, continuous improvement for data analytics.

With 25+ years in software and data (including early work at MIT Lincoln Lab and as a CTO/VP in multiple companies), Chris brought lessons from DevOps and manufacturing (lean/TQM) into the data sphere. He pioneered techniques like pipeline versioning, automated testing for data flows, and running analytics pipelines like assembly lines. Today, DataKitchen’s platform embodies many of these ideas, and Bergh remains a sought-after speaker on DataOps at conferences.

By turning a philosophy into a movement, Chris has helped countless teams reduce data errors and speed up delivery, solidifying his status as a DataOps thought leader.

Chad Sanderson

Nationality: American

Chad is a rising star in the data engineering community, known for his advocacy of Data Quality and Data Contracts. He is the CEO of Gable.ai, a startup (founded 2023) that offers a “shift-left” data management platform to enforce data quality in software development. Prior to founding Gable, Chad led data platform teams at companies like Convoy (head of data), Microsoft, Sephora, Subway, and Oracle.

These experiences gave him firsthand insight into the pain of broken pipelines and the need for better collaboration between software engineers and data teams. Sanderson authors the popular “Data Products” newsletter and is writing an O’Reilly book on Data Contracts – an approach he pioneered to formalize agreements between data producers and consumers, thereby preventing downstream data issues. His outspoken LinkedIn posts and talks about data trust, modeling, and treating data as a product have attracted a large following.

By injecting product and engineering discipline into DataOps, Chad has become an influential voice shaping the future of data engineering culture.
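As a rough illustration of what a data contract between a producer and a consumer might look like, here is a minimal sketch in plain Python. It is not Gable's product or any published contract format; the `ORDER_CONTRACT` fields and the `contract_violations` helper are hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical contract: field name -> Python type the consumer relies on.
ORDER_CONTRACT: Dict[str, type] = {
    "order_id": int,
    "customer_email": str,
    "amount_cents": int,
}


def contract_violations(record: Dict[str, Any], contract: Dict[str, type]) -> List[str]:
    """Return human-readable violations; an empty list means the record conforms."""
    problems: List[str] = []
    for field, expected_type in contract.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    return problems


good_record = {"order_id": 17, "customer_email": "a@example.com", "amount_cents": 500}
bad_record = {"order_id": "17", "amount_cents": 500}

good_problems = contract_violations(good_record, ORDER_CONTRACT)
bad_problems = contract_violations(bad_record, ORDER_CONTRACT)
```

Running a check like this in the producer's test suite, before a change ships, is one way to read the “shift-left” idea: the breaking change is caught where it is made rather than in a downstream dashboard.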

Erik Bernhardsson

Nationality: Swedish

Erik created Luigi, one of the first open-source Python frameworks for orchestrating complex batch data pipelines, which he developed at Spotify in 2012 to automate the company’s music recommendation jobs. Luigi (10K+ GitHub stars) influenced later workflow tools like Airflow and set early standards for DAG-based pipeline schedulers.
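The core of a DAG-based scheduler is working out an execution order that respects task dependencies. The sketch below shows that idea with Python's standard-library `graphlib` module (Python 3.9+); it is not Luigi's API, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical task graph: each task maps to the set of tasks it depends on.
dependencies: Dict[str, Set[str]] = {
    "extract_plays": set(),
    "extract_users": set(),
    "join_plays_users": {"extract_plays", "extract_users"},
    "train_recommender": {"join_plays_users"},
}


def run_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every task comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
```

The two extract tasks have no dependency on each other, so their relative order is not fixed; a scheduler is free to run them in parallel. A cycle in the graph would raise `graphlib.CycleError`, which is the kind of error a pipeline tool wants to surface before any task runs.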

Beyond Luigi, Erik open-sourced Annoy and built Spotify’s original music recommendation system. He later served as CTO at Better.com and in 2022 founded Modal, a serverless cloud platform for data applications. Impressively, Bernhardsson is also a former gold medalist in the International Olympiad in Informatics, underscoring his algorithmic excellence.

Through his widely read blog posts on data engineering and culture, and his advocacy of functional data engineering, Erik has significantly shaped how practitioners approach DataOps and ETL pipeline design.

Ben Rogojan

Nationality: American

Ben, better known as the Seattle Data Guy, is a data engineering influencer who has made a name through content and consulting. An ex-Meta (Facebook) data engineer, he now runs a data infrastructure consulting firm and produces popular educational content on YouTube and Substack.

Ben’s approachable videos and articles break down data engineering concepts, cloud ETL solutions, and career advice, garnering him over 100k followers on both YouTube and LinkedIn. He exemplifies the “practitioner-turned-educator” trend in DataOps – having built pipelines in industry, he now helps a broad audience learn those skills. Rogojan frequently discusses modern data stack tools, data architecture trade-offs, and real-world pipeline war stories, making formerly esoteric topics accessible to aspiring engineers.

By combining hands-on expertise with a knack for communication, Ben has become one of the most visible online personalities in DataOps and data engineering.

Andreas Kretz

Nationality: German

Andreas is known as the “Plumber of Data Science”, an educator and influencer helping thousands launch careers in data engineering. Based in Germany, he built a large following through his YouTube channel and blog “Learn Data Engineering”, where he demystifies data pipelines, workflows, and infrastructure.

Andreas’s content focuses on the practical “how-to” of DataOps – from building your first data pipeline to deploying Kafka clusters – all explained in an approachable way. He has over 170k YouTube subscribers and 120k LinkedIn followers, reflecting his status as a go-to mentor for data engineering newcomers. Formerly a data engineer in industry, Kretz shares insights drawn from real projects, and often live-streams his coding of pipeline projects.

By equating data engineers to modern-day plumbers (who connect and maintain the flow of data), Andreas has popularized the craft of DataOps with a global audience. His enthusiasm and clarity have inspired many to join the field.

Cassie Kozyrkov

Nationality: South African

Cassie is an influential data scientist and evangelist who pioneered the field of Decision Intelligence at Google. As Google’s first Chief Decision Scientist, she trained over 20,000 Googlers in how to make better decisions with data. Cassie is known for demystifying data science and bridging the gap between data teams and business leadership – a perspective highly relevant to DataOps success.

Her vivid analogies and clear explanations (often shared on her blog and Medium) have made her a top voice in the data community. Kozyrkov emphasizes practical, impact-focused data analytics rather than “analysis for its own sake”, urging companies to focus on decision-making processes. She’s also a major advocate for diversity in tech.

With a huge following on LinkedIn and Twitter, Cassie’s thought leadership helps ensure the output of DataOps (data products) truly drives intelligent business outcomes.

Matt Casters

Nationality: Belgian

Matt is the original creator of Pentaho Kettle (a.k.a. PDI), one of the earliest open-source ETL tools, and more recently a co-founder of Apache Hop (the evolution of Kettle for modern workloads). He began developing Kettle in 2001 and open-sourced it in 2005, providing the data community with a free, user-friendly alternative to expensive ETL software.

After Pentaho acquired Kettle, Matt served as Chief Architect of Data Integration, guiding the tool’s growth into an enterprise-grade suite used by thousands of organizations. In 2020, he helped initiate Apache Hop to rebuild Kettle’s concepts on new technology, continuing his two-decade mission of making ETL accessible. Casters also spent time at Neo4j as a solutions architect, extending ETL principles to graph databases.

Matt’s enduring impact – from pioneering open-source ETL to mentoring a new generation through Apache Hop – secures his place among the DataOps elite.

Wrap Up

These legends represent exceptional talent, making them extremely challenging to headhunt. However, there are thousands of other highly skilled IT professionals available to hire with our help. Contact us, and we will be happy to discuss your hiring needs.

Note: We’ve dedicated significant time and effort to creating and verifying this curated list of top talent. However, if you believe a correction or addition is needed, feel free to reach out. We’ll gladly review and update the page.

Frequently Asked Questions

What is the difference between a DataOps engineer and a data engineer?

In short, data engineers build and manage data systems, while DataOps specialists optimize and streamline the processes around these systems.
