Is Big Data growing or declining?

Big Data is growing. Organizations continue to invest heavily in analytics, machine learning, and real-time processing to improve decision-making and efficiency.

What is the hourly rate for Big Data consultants?

Rates depend on region and expertise. In the US and Western Europe, consultants often charge $80–$150 per hour. In Eastern Europe or Latin America, rates usually range from $40–$80 per hour.

What do Big Data consultants do?

They design and implement data pipelines, set up data lakes and warehouses, optimize storage and processing, and help organizations extract insights using analytics and visualization tools.

Is it hard to find Big Data programmers?

Yes, skilled specialists are in high demand. Expertise in tools like Hadoop, Spark, Kafka, and cloud platforms makes these professionals harder to source than general developers.

What companies use Big Data?

Companies such as Amazon, Netflix, Uber, Facebook, and banks worldwide rely on Big Data to manage operations, personalize services, and forecast trends.

Big Data Experts: Top 17 to Hire (2026)

In the fast-evolving world of big data, a select group of industry professionals stands out for its impactful contributions.

Below we present leading big data experts active in recent years, each excelling through open-source innovation, startup leadership, influential blogging or speaking, major roles at tech companies, or prize-winning competition results. These profiles highlight why they’re at the forefront of big data and include links to their public profiles for further insight:

Hilary Mason
Matei Zaharia
Jay Kreps
Neha Narkhede
Doug Cutting
DJ Patil
Jeremy Howard
Wes McKinney
Shay Banon
Maxime Beauchemin
Kirk Borne
Matthew Rocklin
Zhamak Dehghani
Abhishek Thakur
Sidhant Dorge
Ali Ghodsi
Jeff Dean

Now, let’s delve into each expert’s background and why they are notable.

Hilary Mason

Nationality: American

Hilary is a leading voice in data science and big data analytics.

She was Chief Scientist at Bitly, where she applied machine learning to understand internet attention patterns in real-time. In 2014, Hilary co-founded Fast Forward Labs, an R&D startup focused on emerging AI and data technologies, later acquired by Cloudera. As CEO of Fast Forward Labs, she led research on practical machine learning innovations. Hilary is also known for her influential writing and speaking: she’s blogged insights on data strategy, appeared in publications like Fast Company and Scientific American, and has been honored in Forbes 40 under 40 in tech.

A prominent advocate for the data science community (co-founder of HackNY, DataGotham), Hilary Mason has helped shape how businesses realize value from big data through intelligent algorithms and intuitive storytelling.

LinkedIn: Hilary Mason
X (Twitter): @hmason
Website: hilarymason.com

Matei Zaharia

Nationality: Romanian

Matei is the original creator of Apache Spark, a landmark open-source engine for large-scale data processing.

He co-founded Databricks to commercialize Spark, serving as its CTO while maintaining an active role in its development and in related projects like MLflow and Delta Lake. An accomplished computer scientist (associate professor at UC Berkeley), Matei has received awards like the ACM Doctoral Dissertation Award for his work on Spark. Under his technical leadership, Databricks has grown into one of the world’s most valuable big data companies (valued at $62 billion in 2025).

His blend of academic expertise and hands-on coding makes him a pivotal figure bridging research and industry in big data.

LinkedIn: Matei Zaharia
X (Twitter): @matei_zaharia
GitHub: mateiz

Jay Kreps

“We were not going to make short-term decisions. We wanted to set the company up to execute over the longer term, and there’s a really significant opportunity in the data streaming space. [If we] don’t build something for that larger opportunity, then we’re going to miss out.”

Nationality: American

Jay is best known as a co-creator of Apache Kafka, the distributed streaming platform that has become a backbone for real-time data pipelines.

While at LinkedIn, he helped design Kafka to handle high-throughput event data, and later co-founded Confluent in 2014 to build a company around Kafka’s ecosystem. As Confluent’s CEO, Jay has guided its growth while remaining deeply technical: he has written influential essays (including one proposing the Kappa architecture for streaming data) and continues to advocate for developer-friendly data infrastructure.

Under his leadership, Confluent has made Kafka enterprise-ready, and Jay’s vision for data streaming has shaped how modern organizations integrate and react to big data in real time.

LinkedIn: Jay Kreps
X (Twitter): @jaykreps

Neha Narkhede

Nationality: Indian

Neha played a key role in building the big data streaming revolution.

As a software engineer at LinkedIn, she co-created Apache Kafka in 2011 to handle the site’s massive data feed. In 2014, Neha co-founded Confluent and as CTO led its technology development, helping companies adopt Kafka for mission-critical use (from Goldman Sachs trading to Netflix recommendations). She has since been recognized as one of the top young technology innovators (MIT Technology Review’s Innovators Under 35) for “teaching companies to swim” in torrents of data.

Today, Neha continues to innovate as founder of a new startup (Oscilar) and as an investor, while remaining a prominent voice in streaming analytics and an inspiration for women in big data tech.

LinkedIn: Neha Narkhede
X (Twitter): @nehanarkhede
Website: nehanarkhede.com

Doug Cutting

Nationality: American

Doug is often dubbed the “father of Hadoop” for pioneering the open-source framework that ushered in the era of Big Data.

He co-created Apache Hadoop (with Mike Cafarella) by implementing Google’s MapReduce paper in open source, enabling reliable, distributed processing of huge datasets. Before Hadoop, Doug created Apache Lucene (a popular search engine library) and co-created Nutch (a web crawler), key components that influenced search and big data indexing. In the last decade, Doug served as Chief Architect at Cloudera, guiding Hadoop’s enterprise adoption. He remains an Apache Software Foundation advocate and board member, championing open-source data ecosystems.

Doug’s contributions – from HDFS storage to the MapReduce processing paradigm – are foundational to today’s big data platforms.

LinkedIn: Doug Cutting
X (Twitter): @cutting

DJ Patil

Nationality: American

Dhanurjay “DJ” Patil is a pioneer of the data science profession in industry and government.

In 2008, he (along with Jeff Hammerbacher) famously coined the job title “Data Scientist” to describe their work applying big data at LinkedIn and Facebook. DJ went on to become the first Chief Data Scientist of the United States, appointed in 2015, where he led national initiatives on data-driven policymaking. In the private sector, he has held senior roles at eBay, PayPal, and LinkedIn, and later served as Head of Data Products at RelateIQ (Salesforce). DJ is known for promoting the power of open data and data ethics, and has been a top influencer shaping data strategy across industries.

He continues to build bridges between tech and society – currently as a General Partner at a venture firm – and remains a sought-after advisor for companies aiming to leverage big data responsibly and at scale.

LinkedIn: DJ Patil
X (Twitter): @dpatil

Jeremy Howard

Nationality: Australian

Jeremy is an Australian data scientist and entrepreneur who has achieved global recognition in both the big data competition arena and open-source education.

He was a top-ranked Kaggle competitor (winning various machine learning competitions) and later became President and Chief Scientist of Kaggle, helping grow the platform for data science contests. Jeremy co-founded fast.ai, a research lab and online learning platform making deep learning more accessible. Through fast.ai he has developed the popular fastai library and taught thousands of students practical machine learning. Previously, he founded analytics startup Optimal Decisions (acquired in 2011) and led data products at Singularity University.

Jeremy’s current focus is on democratizing AI; his influential courses and MOOCs have enabled many people to enter the field. He’s a frequent keynote speaker and was recognized as a Young Global Leader by the World Economic Forum for his contributions.

LinkedIn: Jeremy Howard
X (Twitter): @jeremyphoward
Website: jeremy.fast.ai

Wes McKinney

Nationality: American

Wes is the software developer behind pandas, the Python data analysis library ubiquitous in data science.

He created pandas in 2008 to simplify working with tabular data in Python, and it has since become a cornerstone tool for data manipulation. In recent years, Wes led the design of Apache Arrow, an open-source columnar memory format accelerating big data interoperability. He co-founded Voltron Data (2020) to unify and advance the Arrow ecosystem across languages. Previously, Wes worked at Two Sigma and Cloudera, and authored the definitive book Python for Data Analysis. His work as pandas “BDFL” (Benevolent Dictator for Life) and Arrow champion has massively improved the productivity of data engineers and scientists dealing with large datasets.

Wes continues to innovate in high-performance computing, focusing on making big data processing faster and more accessible in the Python community.

LinkedIn: Wes McKinney
X (Twitter): @wesmckinn
GitHub: wesm
Website: wesmckinney.com

Shay Banon

Nationality: Israeli

Shay is the original author of Elasticsearch, the open-source distributed search and analytics engine that has become a de facto standard in big data search technology.

He developed Elasticsearch in 2010 (inspired by a project to help his wife search recipes) and open-sourced it, later co-founding Elastic to support and expand the stack (Elasticsearch, Kibana, Beats, Logstash). As CEO and now CTO of Elastic, Shay oversaw the company’s growth from a small project to a public company and a vibrant community, with Elastic’s products used for everything from enterprise log analysis to security data lakes. He has remained deeply involved in the technical roadmap, guiding features like real-time distributed indexing and search scalability.

Shay’s work has greatly democratized powerful search and querying of big data, enabling organizations worldwide to turn large datasets into actionable insights quickly.

LinkedIn: Shay Banon
X (Twitter): @kimchy

Maxime Beauchemin

Nationality: French

Maxime has built some of the most widely used open-source tools in data engineering.

He created Apache Airflow in 2014 while at Airbnb to programmatically orchestrate complex data pipelines; Airflow is now a top-level Apache project widely used for ETL workflow scheduling. He also created Apache Superset, a popular open-source business intelligence and data visualization platform. Maxime later founded Preset (2019) to bring Superset to the cloud and continue evolving modern data exploration. With past stints at Facebook, Airbnb, and Lyft, Maxime has deep practical insight into scaling data systems. He’s an active blogger (“The Rise of the Data Engineer” is one of his noted essays) and speaks frequently on the state of data tooling.

By open-sourcing Airflow and Superset, Maxime empowered thousands of organizations to build reliable data pipelines and democratize data insights without expensive proprietary software.

LinkedIn: Maxime Beauchemin
X (Twitter): @mistercrunch

Kirk Borne

Nationality: American

Dr. Kirk Borne is a globally recognized big data evangelist who has consistently ranked among the top worldwide influencers in data science and AI since 2013.

An astrophysicist by training (he was a NASA researcher on the Hubble mission), Kirk transitioned to data science and helped pioneer the use of big data in astronomy. He served as Principal Data Scientist at Booz Allen Hamilton, advising large enterprises on data strategy, and is now Chief Science Officer at DataPrime. Kirk is extremely active on social media and blogging – with over 300k followers on X (Twitter), he shares insights on big data, machine learning, IoT, and data literacy daily. He also mentors at universities and speaks at dozens of conferences, spreading best practices in data management and analytics.

For his contributions to data advocacy and education, Kirk Borne has been consistently hailed as a leading voice making big data approachable and exciting for broad audiences.

LinkedIn: Kirk Borne
X (Twitter): @KirkDBorne
Website: Data Leadership Group

Matthew Rocklin

Nationality: American

Matthew is the initial author of Dask, a flexible Python library for parallel computing that scales popular data science workflows to multi-core machines and clusters.

Created in the mid-2010s, Dask has become a vital tool to extend PyData (NumPy, pandas, scikit-learn) for “big data” scenarios by distributing work across clusters. Matthew led Dask’s development first at Anaconda and then at NVIDIA, and in 2020 he founded Coiled to provide cloud-hosted Dask solutions. At Coiled (where he is CEO), he continues to write code, focusing on improving Python’s scalability for big data analytics. Matthew is an open-source pragmatist: he actively maintains many Dask-related projects and engages the community through blog posts and talks. With a PhD in computer science and an undergraduate background in physics, he brings scientific rigor to computing.

By enabling Python devs to handle large datasets with familiar tools, Rocklin’s work significantly lowers barriers in big data computing for scientists and businesses alike.

LinkedIn: Matthew Rocklin
X (Twitter): @mrocklin
GitHub: mrocklin
Website: matthewrocklin.com

Zhamak Dehghani

Nationality: Iranian-American

Zhamak is known for introducing the concept of Data Mesh in 2019, a paradigm shift in big data architecture.

While a technology director at ThoughtWorks, she identified challenges in monolithic data lakes and proposed Data Mesh as a decentralized, product-driven approach to make enterprise data more agile and scalable. Her thought leadership (through influential articles and a 2022 book “Data Mesh: Delivering Data-Driven Value at Scale”) has sparked a global movement among companies to reorganize how they manage analytics data. Zhamak recently founded Nextdata (2022) to build platforms supporting data mesh principles. With a background in software engineering and distributed systems, she remains a hands-on innovator.

Zhamak is a frequent keynote speaker and blogger on data architecture and has quickly become one of the most respected thought leaders in big data design, helping enterprises treat data as a product, not just an afterthought.

LinkedIn: Zhamak Dehghani
X (Twitter): @zhamakd

Abhishek Thakur

Nationality: Indian

Abhishek is a superstar in the competitive data science world and an influential content creator.

He earned fame by becoming the world’s first Quadruple Grandmaster on Kaggle, achieving top-tier Grandmaster status in all four of Kaggle’s categories: competitions, notebooks (formerly kernels), discussions, and datasets. This reflects dozens of gold medals and winning solutions in machine learning contests. Abhishek has applied this expertise in industry as well, previously serving as Chief Data Scientist at Boost.ai and currently building AutoML tools at Hugging Face. He is the author of the popular book “Approaching (Almost) Any Machine Learning Problem” (2020), where he shares pragmatic advice from his competition experience.

Abhishek also runs a YouTube channel and blog where he breaks down complex ML topics for a wide audience. His achievements in global competitions and passion for knowledge-sharing have cemented him as a big data and ML influencer in the community.

LinkedIn: Abhishek Thakur
X (Twitter): @abhi1thakur
Kaggle: abhishek

Sidhant Dorge

Nationality: Indian

Sidhant is a Lead Software Engineer at Persistent Systems based in Pune, and a passionate advocate for Big Data and AI/ML solutions.

In his LinkedIn article titled “Big Data Technology Stack” he outlines a practical, end‑to‑end technology ecosystem (from data ingestion and storage frameworks to distributed processing engines and visualization tools) that reflects his hands‑on experience with scalable data pipelines. As a Python and Mendix developer turned Big Data enthusiast, Dorge has applied that stack knowledge in real-world software engineering projects, integrating cloud-native tools and platforms to drive analytics at scale.

His professional trajectory showcases a blend of engineering depth, thought leadership in Big Data infrastructure, and active engagement with AI/ML trends.

LinkedIn: Sidhant Dorge

Ali Ghodsi

Nationality: Swedish-Iranian

Ali Ghodsi is the CEO and co-founder of Databricks, a leading big data and AI platform company valued at $62 billion in 2025.

With a PhD in distributed computing, Ali was an early contributor to the Apache Spark project that Databricks emerged from, working alongside its creators at UC Berkeley. He helped design aspects of Spark’s cluster management (he also co-created Apache Mesos in academia). Transitioning from academia to industry, Ali has led Databricks since 2016, overseeing its growth and the development of the Unified Data Analytics Platform (which integrates Spark with Delta Lake, MLflow, etc.). He is known for his product vision and for keeping Databricks closely tied to open-source innovation. Ali is also an Executive Chairman of Anyscale (company for the Ray project), reflecting his continued passion for cutting-edge distributed systems.

His unique journey from researcher to CEO exemplifies how to turn big data research into impactful enterprise technology.

LinkedIn: Ali Ghodsi
X (Twitter): @alighodsi

Jeff Dean

Nationality: American

Jeff is one of the key architects behind Google’s large-scale data and machine learning infrastructure.

As a Senior Fellow and Chief Scientist at Google DeepMind, he has co-designed foundational systems such as MapReduce, Bigtable, Spanner, and TensorFlow. These technologies underpin Google’s ability to process massive datasets reliably and at global scale. Jeff has authored hundreds of highly cited research papers and has played a central role in bridging distributed systems and machine learning.

His work has influenced nearly every major big data platform, directly or indirectly, setting patterns that the broader industry continues to follow.

LinkedIn: Jeff Dean
X (Twitter): @JeffDean

Wrap Up

These experts represent exceptional talent, making them extremely challenging to headhunt. However, there are thousands of other highly skilled IT professionals available to hire with our help. Contact us, and we will be happy to discuss your hiring needs.

Note: We’ve dedicated significant time and effort to creating and verifying this curated list of top talent. However, if you believe a correction or addition is needed, feel free to reach out. We’ll gladly review and update the page.