Top 20 Data Science Experts You Should Know

The data science landscape is filled with practitioners who combine coding prowess, real-world impact, and community leadership.

Below is a curated (and globally diverse) list of 20 top data science experts currently active in industry, selected for their open-source contributions, startup leadership (while still coding), influential writing and social presence, roles at major tech companies, and competitive achievements. These individuals are not primarily academics, but hands-on professionals pushing the field forward.

  1. Dr. Vincent Granville
  2. Cassie Kozyrkov
  3. Hilary Mason
  4. Hadley Wickham
  5. DJ Patil
  6. Monica Rogati
  7. Tianqi Chen
  8. Chip Huyen
  9. Soumith Chintala
  10. Benn Stancil
  11. Andrej Karpathy
  12. Abhishek Thakur
  13. Barr Moses
  14. Kirk Borne
  15. François Chollet
  16. Cathy O’Neil
  17. Wes McKinney
  18. Bojan Tunguz
  19. Claudia Perlich
  20. Gabriela de Queiroz

Now, let’s delve deeper into their remarkable contributions and impact.

Dr. Vincent Granville

Nationality: French-American

Vincent Granville is a veteran data scientist and one of the early influencers in the data mining and machine learning field. He co-founded Data Science Central in 2012, which became one of the most popular online communities for data science professionals (acquired by TechTarget in 2020).

Through Data Science Central, and now his blog MLTechniques.com, Vincent has published hundreds of articles covering topics from machine learning recipes and statistical tricks to career advice for data scientists. He has over 20 years of industry experience, including stints at tech companies and banks. Vincent’s background in mathematics (he was a postdoc in number theory) gives him a deep theoretical perspective, yet he’s adept at explaining concepts in accessible terms. He has also authored books and holds multiple patents in machine learning. In recent years, Vincent has styled himself as a Chief AI Scientist, focusing on emerging areas like generative AI and synthetic data. His consistent output and willingness to engage with readers have made him a respected elder statesman in the data science world.

Cassie Kozyrkov

Don’t ask your data scientists to decide for you. Ask them to inform your decisions.

Nationality: South African

Cassie Kozyrkov pioneered the field of Decision Intelligence at Google, where she served as the company’s first Chief Decision Scientist. A statistician by training, Cassie has trained over 20,000 Googlers in data-driven decision making.

She is an influential voice in the data community, known for her vivid analogies and clear explanations of data science concepts on her blog and Medium articles. Cassie’s work bridges data science and business strategy, emphasizing how to make decisions with data rather than just building models. Raised in South Africa, she advocates for diversity in tech and regularly shares practical wisdom (and humor) with her large following.

Hilary Mason

Nationality: American

Hilary Mason is an acclaimed data scientist turned entrepreneur with a track record of bridging research and industry. She was Chief Scientist at Bitly, where she built systems to analyze massive real-time data streams.

In 2014, Hilary co-founded Fast Forward Labs, an applied machine intelligence research company that helped organizations prototype with emerging AI techniques. (Fast Forward Labs was later acquired by Cloudera.) Hilary is now co-founder and CEO of Hidden Door, a startup using generative AI for gaming and interactive storytelling. A prominent speaker and mentor, she’s known for explaining complex data topics in approachable ways. Hilary also co-founded hackNY, a nonprofit that connects student technologists with New York startups, and has been a relentless advocate for practical AI innovation and ethical tech.

Hadley Wickham

Nationality: New Zealander

Hadley Wickham is the Chief Scientist at Posit (formerly RStudio) and the mastermind behind the tidyverse—a collection of R packages that fundamentally improved data science workflows in R. He created libraries beloved by statisticians and analysts: ggplot2 for data visualization, dplyr for data manipulation, tidyr for data tidying, and many more.

As the leader of the tidyverse team, Hadley’s design philosophy has been to make R more accessible and powerful for data analysis, emphasizing coherent APIs and user experience. He has authored multiple books (like “R for Data Science”) that have become standard references for R practitioners. Hadley is originally from New Zealand and is known for fostering a welcoming, inclusive community around R, reflected in his support for initiatives like R-Ladies and his patient engagement with users on forums. His impact on the tooling for data science is immense, and he continues to innovate at the intersection of software engineering and statistics.

DJ Patil

Nationality: American

DJ Patil has been a pioneer of the data science profession in both industry and government. He was the first U.S. Chief Data Scientist, appointed by President Obama (2015–2017) to help harness data for public good.

Prior to that, DJ led the development of data products at LinkedIn as its Chief Scientist. With Thomas H. Davenport, he co-authored the influential Harvard Business Review article “Data Scientist: The Sexiest Job of the 21st Century.” He has also held key roles at eBay, Skype, and PayPal. After government service, DJ returned to industry and helped build technology at the healthcare startup Devoted Health. He continues to advise companies and speak about data strategy. With a career spanning academia, big tech, startups, and the public sector, DJ is celebrated for proving how data science can deliver real-world impact on a grand scale.

Monica Rogati

Nationality: American

Monica Rogati is known for turning data into products at some of Silicon Valley’s top companies. As one of the early data scientists at LinkedIn, she worked on the company’s early job recommendation engine and its “People You May Know” feature, helping to drive LinkedIn’s growth.

She later became Vice President of Data at Jawbone, where she applied data science to wearable tech and consumer insights. Now an independent AI/Data Science advisor, Monica works as a “fractional Chief Data Officer” to multiple startups, guiding their data strategy. She frequently shares her “AI Hierarchy of Needs” framework, illustrating the data foundations companies must have before AI. Monica’s blend of hands-on algorithm development and C-suite advising has made her a respected thought leader on how to integrate data science into business innovation.

Tianqi Chen

Nationality: Chinese

Tianqi “TQ” Chen is a prodigious systems engineer who has built some of the most important engines in machine learning. He is the original creator of XGBoost, the gradient boosting library that became a go-to tool for Kaggle champions and data scientists for tabular data.

Tianqi also co-developed Apache MXNet (a deep learning framework) and Apache TVM (a deep learning compiler), showcasing his talent for optimizing ML on different hardware. He co-founded OctoML, a startup focused on automating model optimization and deployment, bringing cutting-edge research to enterprise applications. Recently, Tianqi joined Carnegie Mellon University as an assistant professor, bridging academia and industry. At CMU he leads the MLC (Machine Learning Compilation) group, pushing forward techniques to run large models efficiently anywhere. His work exemplifies engineering excellence in data science – enabling algorithms to run faster and at scale.

Chip Huyen

Nationality: Vietnamese

Chip Huyen (Huyen Chip) is a prominent figure in the machine learning engineering community, known for her focus on MLOps and production ML systems. She co-founded Claypot AI, a startup that built a platform for real-time machine learning (the company was acquired by data infrastructure firm Voltron Data in 2023).

Chip is also the author of “Designing Machine Learning Systems” (2022), an O’Reilly book that became a #1 bestseller and is praised for its practical guidance on building and deploying AI products. She has worked at Snorkel AI and NVIDIA, and she teaches Machine Learning Systems Design at Stanford, sharing her curriculum openly with the community. Chip’s blog posts on topics like data distribution shifts, model monitoring, and why end-to-end ML pipelines fail have earned her a large following. Through her writing and open-source snippets, Chip advocates for making ML robust in the real world, emphasizing latency, feedback loops, and continuous learning.

Soumith Chintala

Nationality: Indian

Soumith Chintala is a deep learning engineer best known as the co-founder of PyTorch, the wildly popular open-source deep learning framework. At Meta AI (Facebook), Soumith led the team that created PyTorch, which by 2019 had become the tool of choice for researchers and industry practitioners alike due to its flexibility and ease of use.

He remains a lead maintainer of PyTorch, guiding its evolution (including the recent move to become part of the Linux Foundation). Before PyTorch, Soumith was an active contributor to Torch7 and worked on computer vision research — he co-authored early significant papers on Generative Adversarial Networks (GANs). He is also passionate about robotics and was involved with projects at NYU. Soumith’s impact lies in both the code he writes and the community he’s fostered: he has championed open-source AI and collaboration, making advanced AI capabilities accessible to all.

Benn Stancil

Nationality: American

Benn Stancil is a prominent data analyst and thought leader who has shaped how organizations approach analytics. He co-founded Mode Analytics in 2013 and served as its Chief Analytics Officer, building a platform widely used for collaborative data exploration and reporting.

(Mode was acquired by ThoughtSpot in 2023, a testament to its impact in the modern data stack.) Benn became well-known beyond his company through his writing: his weekly Substack newsletter is a must-read in the data community, offering witty and insightful takes on data work, tools, and culture. He has penned essays on topics like the role of dashboards, why data teams need better product thinking, and the societal impacts of metrics. With a knack for storytelling, Benn often uses humor and real-world anecdotes to get his points across. He remains an active voice on “Analytics Twitter” and continues to push for analytics to be more human, impactful, and fun.

Andrej Karpathy

If you torture the data long enough, it will confess to anything. But don’t confuse correlation with causation.

Nationality: Slovak-Canadian

Andrej Karpathy is a household name in AI, known for his work at the cutting edge of both AI research and applied AI in industry. A Stanford PhD alumnus under Fei-Fei Li, he was one of the early researchers at OpenAI. Andrej then became the Director of AI at Tesla, where he led the computer vision team for Autopilot, developing deep learning systems for self-driving cars.

Many credit him for bringing academic AI research rigor into Tesla’s fast-paced engineering environment. After five years at Tesla, Andrej returned to OpenAI in 2023 to work on GPT-4 and beyond, then left in 2024 to found Eureka Labs, an AI-focused education startup. He’s an enthusiastic teacher: his online course “Convolutional Neural Networks for Visual Recognition” and his blog posts (like “A Hacker’s Guide to Neural Networks” and the famous “Software 2.0” essay) have influenced thousands. Andrej’s blend of research skill, coding ability (he’s known to write and open-source actual training code, e.g. minGPT), and communication clarity makes him a role model for industry AI practitioners.

Abhishek Thakur

Nationality: Indian

Abhishek Thakur holds the distinction of being the world’s first Quadruple Grandmaster on Kaggle (across competitions, notebooks, discussions, and datasets). An engineer by background, Abhishek has won multiple Kaggle competitions and is revered for sharing solutions and mentoring others.

He authored the popular book “Approaching Almost Any Machine Learning Problem”, which distills practical tricks from his competition experience. Until recently, Abhishek worked at Hugging Face, where he built AutoML tools (such as AutoTrain) to automate model building. In 2023, he embarked on a new startup (Arcee AI) to simplify AI integrations. He also runs a YouTube channel with over 100k subscribers and is active on LinkedIn, where he demystifies advanced techniques for a broad audience. Abhishek exemplifies the competitive spirit and open knowledge-sharing that define the Kaggle generation of data scientists.

Barr Moses

Nationality: Israeli

Barr Moses is the co-founder and CEO of Monte Carlo, a pioneer in the fast-growing field of data observability. Monte Carlo (founded in 2019) created a new category of tools to tackle “data downtime” – periods when data is missing or broken – by monitoring data pipelines for reliability.

Barr’s vision has resonated: Monte Carlo achieved unicorn status and has been widely adopted by data teams to ensure data trustworthiness. Barr’s background is unusual: she served as an officer in the Israeli Air Force, then worked as a consultant at Bain & Company, and later led customer operations at Gainsight. These experiences gave her a multifaceted view of leadership and operational excellence. She co-authored the book “Data Quality Fundamentals” and often speaks about how organizations can become data-driven safely. As a female founder and CEO in data, Barr is also active in mentoring and highlighting women in tech. Her contributions underscore the importance of making data reliable and actionable for businesses.

Kirk Borne

Nationality: American

Kirk Borne is a veteran data scientist who transitioned from a 20-year career at NASA to become one of the most influential data science evangelists. During his NASA years, Kirk was the Project Scientist for the Hubble Space Telescope’s data archive, among other roles, supporting major space science missions.

He later served as a professor of astrophysics and then as Principal Data Scientist at Booz Allen Hamilton, where he helped enterprises adopt data analytics. Kirk is now the Chief Science Officer at an AI startup (most recently at DataPrime) and the founder of Data Leadership Group. As a LinkedIn Top Voice with hundreds of thousands of followers, he shares insights on everything from big data strategy to AI for good. His approachable style and passion for data literacy have inspired many to enter the field.

François Chollet

Nationality: French

François Chollet is the creator of Keras, one of the most widely used open-source libraries in deep learning. He released Keras in 2015 as a user-friendly API for building neural networks, which helped catalyze the growth of AI by enabling higher-level model development.

François worked as a Research Scientist at Google for nearly a decade, where he contributed to the Google Brain team and authored influential papers (including the Xception CNN architecture). In 2024, he left Google to start a new AI venture, but remains deeply involved in guiding Keras’s future. Beyond coding, François has written the book “Deep Learning with Python” and often shares philosophical musings on AI progress. He has a massive following on Twitter, where his concise observations on intelligence and learning spark discussion.

Cathy O’Neil

Nationality: American

Cathy O’Neil is a data scientist-turned-activist who has shone a light on the ethical pitfalls of big data. With a Harvard PhD in math, she left academia to work as a quant on Wall Street and later worked as a data scientist at startups. However, Cathy is best known for her 2016 book “Weapons of Math Destruction”, a seminal critique of how algorithms can amplify inequality and harm the vulnerable.

This bestselling book (longlisted for the National Book Award) made terms like algorithmic bias part of the public conversation. After witnessing such issues firsthand, Cathy founded ORCAA (O’Neil Risk Consulting & Algorithmic Auditing), one of the first companies offering algorithm audits to companies and governments. Through ORCAA, she and her team assess high-stakes AI systems (from hiring algorithms to predictive policing software) and recommend improvements. Cathy also writes op-eds for Bloomberg and Slate, and is a familiar commentator in media. Her work exemplifies the principle that with great data power comes great responsibility, urging the tech world to account for fairness and transparency.

Wes McKinney

Nationality: American

Wes McKinney is a giant in the data science software world—the original author of pandas, the ubiquitous Python library for data analysis. Wes created pandas in 2008 to make data manipulation faster and more intuitive in Python, and it has since become a cornerstone of the PyData ecosystem.

He later co-created Apache Arrow, a framework for high-performance in-memory data exchange, and Ibis, a tool for Python data analysis on SQL engines. Wes co-founded Voltron Data, a company dedicated to accelerating open-source analytics, and now serves on its advisory board. Currently, he is a Principal Architect at Posit (formerly RStudio) and remains an Apache Software Foundation member. Wes has also authored the reference book “Python for Data Analysis.” His work is driven by a vision of more efficient, interoperable data tools, and he continues to lead and inspire in open-source communities.

Bojan Tunguz

Nationality: Bosnian-American

Bojan Tunguz is a data scientist and one of the elite Kaggle Grandmasters who has excelled across all of the platform’s categories. With a PhD in physics from Stanford, Bojan brings analytical rigor to his machine learning work.

He has 7 gold medals in Kaggle competitions and at one point ranked in the world’s top 10 for contests. Bojan spent several years as a Senior Data Scientist at NVIDIA, where he applied his expertise to GPU-accelerated machine learning and often spoke about bridging research and practice. Recently, he became Head of Data Science at a startup (FeatureByte), reflecting his passion for building data products. Bojan is also a prolific reviewer and contributor in the online ML community – known for explaining solutions in forums and even for being a top Amazon book reviewer in his spare time. His journey from war-torn Bosnia to the pinnacle of competitive data science is an inspiration, showing how diverse experiences can fuel success in AI.

Claudia Perlich

Nationality: German

Claudia Perlich is a data scientist with a remarkable track record in both competitions and business applications of data mining. She first made waves by winning the KDD Cup (a prestigious data mining competition) three years in a row in the late 2000s, showcasing her ability to build predictive models across domains.

Claudia was the Chief Scientist at the ad-tech firm Dstillery (formerly Media6Degrees) from 2010 to 2017, where she developed machine learning algorithms for targeted advertising at scale. Her work there demonstrated how sophisticated models could drive marketing performance, and she openly cautioned about pitfalls like bias in advertising data. Currently, Claudia works as a Senior Data Scientist at Two Sigma, a leading quantitative hedge fund, applying machine learning in finance. She also frequently shares her expertise at conferences and in webinars, where she is known for emphasizing the importance of intuition and understanding data context alongside algorithms. As one of the earlier women in data science to gain prominence, Claudia remains a role model, balancing cutting-edge technical skill with real-world problem-solving savvy.

Gabriela de Queiroz

Nationality: Brazilian

Gabriela de Queiroz is a leader in the data science community, especially known for her advocacy of diversity and inclusion in tech. Hailing from Brazil, Gabriela founded R-Ladies in 2012 – a now-global organization that has trained and supported thousands of women in learning R programming and data science skills.

Her initiative helped create chapters in over 50 countries, diversifying the data science talent pipeline. Professionally, Gabriela worked as Chief Data Scientist at IBM, where she led open-source AI efforts and helped teams deploy ML (she and her team contributed to projects like TensorFlow at IBM). She recently joined Microsoft as Director of AI & Cloud Advocacy, focusing on helping startups and developers succeed with AI tools. Gabriela also launched AI Inclusive, a group aimed at increasing representation in AI. Through her technical contributions, mentorship, and community building, she embodies how inclusivity and open source can drive innovation in data science.

Wrap Up

These legends represent exceptional talent, making them extremely challenging to headhunt. However, there are thousands of other highly skilled IT professionals available to hire with our help. Contact us, and we will be happy to discuss your hiring needs.

Note: We’ve dedicated significant time and effort to creating and verifying this curated list of top talent. However, if you believe a correction or addition is needed, feel free to reach out. We’ll gladly review and update the page.