THE DATA SCIENCE HANDBOOK
Conversations With The World's Top Data Scientists
120 Data Science Interview Questions
In preparation for the launch of The Data Science Handbook in a few months, we've just published 120 Data Science Interview Questions. You can check it out at datasciencequestions.com. These are 120 real questions drawn from actual data science interviews. The guide also contains insights and tips from individuals who've landed data science roles at top technology companies and financial firms. Get it at datasciencequestions.com. Thanks for all your support, Carl, Max, William and Henry
Advice from top data scientists
We are conducting in-depth interviews with top data scientists. Here are a few people we've talked to so far:
DJ Patil, RelateIQ
VP of Product
DJ is VP of Product at RelateIQ. He has held roles across academia, industry, and government, at places like LinkedIn, eBay and PayPal. He has also worked on national security and computational social science, and helped build educational platforms in Iraq.
Hilary Mason, Accel Partners
Data Scientist in Residence
Hilary is currently the data scientist in residence at venture capital firm Accel Partners. She was formerly Chief Scientist at Bit.ly and is the co-host of New York City's Data Gotham as well as the co-founder of hackNY.
Peter Skomoroch, Formerly LinkedIn
Former Principal Data Scientist at LinkedIn
Pete was a Principal Research Scientist at LinkedIn, where he led teams of Data Scientists focused on Reputation, Inferred Identity and Data Products. He was lead Data Scientist and creator of LinkedIn Skills & Endorsements, one of the fastest growing new products in LinkedIn's history. He is also the founder of Data Wrangling, which offers consulting services for data mining and predictive analytics. Previously, he was the Director of Advanced Analytics at Juice Analytics, Sr. Research Engineer at AOL Search, and a researcher at MIT Lincoln Laboratory. Skomoroch received a B.A. in Mathematics and a B.S. in Physics from Brandeis University in 2000.
Michael Hochster, LinkedIn
Director of Data Science
Michael is Director of Data Science at LinkedIn. Previously, he held positions at Google, working in Search and Ads, and at Microsoft. He holds a PhD in statistics from Stanford.
Jonathan Goldman, Intuit
Director of Data Science and Analytics
Jonathan is Director of Data Science and Analytics at Intuit. He co-founded Level Up Analytics, a premier consulting company focused on data science, big data, and analytics, which Intuit acquired in 2013. From 2006–2009 he led the product analytics team at LinkedIn, which was responsible for creating new data-driven products. While at LinkedIn he invented the People You May Know product and algorithm, which was directly responsible for getting millions of users connected and more engaged with LinkedIn. He received a PhD in physics in 2005 from Stanford, where he worked on quantum computing, and a BS in physics from MIT.
George Roumeliotis, Intuit
Senior Data Scientist and Data Business Partner
George Roumeliotis is a Senior Data Scientist and Data Business Partner at Intuit. After starting his scientific research career at Stanford, working in theoretical and computational plasma astrophysics, he decided to make the leap into entrepreneurship. He started two companies, Dynaptics and JRG (acquired by CDC Software), which collectively raised more than 15 million dollars in venture capital. He has created analytics-powered products in fields as diverse as advertising, manufacturing, and finance. George loves to teeter on the border of Data Science and New Business Development.
Riley Newman, AirBnB
Head of Data Science and Analytics
Riley Newman is the Head of Data Science at AirBnB. He joined AirBnB as one of the first ten employees, helping the organization scale to its current size and make rigorous data-driven decisions. Before AirBnB, Riley was a senior research associate at an economic research and forecasting firm. He graduated with honors from the University of Washington, paying his way through college by serving as part of the U.S. Coast Guard. He completed a Master's in Applied Economics at Cambridge.
Kevin Novak, Uber
Data Science & Analytics Lead
Kevin earned his MS in theoretical physics from Michigan State where his academic work involved statistical verification of heavy ion collision models. Since moving to SF and joining Uber in July 2011, he's helped build a variety of data projects, most recently building Uber's dynamic pricing product.
Jace Kohlmeier, Khan Academy
Dean of Analytics, Khan Academy
Jace Kohlmeier is the Dean of Analytics at Khan Academy. He earned degrees in Mathematics and Computer Science as a Kansas State Wildcat, and a Master’s degree in Computer Science from Princeton. He spent six years at Citadel Investment Group, where he cofounded the High Frequency Trading group and oversaw its trading in fixed income, currency, commodities and futures. He is also the cofounder of Teza Technologies, where he served as President and Head of Quantitative Research. After volunteering in 2010, he joined Khan Academy full-time on a mission to apply state-of-the-art data science toward optimized learning.
Chris Moody, Square
Chris is a data scientist at Square. He recently finished his PhD in Astrophysics at UC Santa Cruz, where he contributed to yt, a large community-developed and open-source analysis toolkit for astrophysical simulation data.
Erich Owens, Facebook
Erich completed an M.S. in Applied Mathematics at Brown and worked at NASA JPL and the startups Quid and Newsle before his current role at Facebook, where he leverages data and computer science to build data products.
Luis Sanchez, ttwick
Founder / Data Scientist
Luis has held Senior Quantitative Analyst positions at a number of financial institutions including Lehman Brothers, AIG and Deutsche Bank. In those roles, he developed machine-learning models applied to trading and structuring of securities backed by exotic assets. Luis completed his MBA as a Fulbright/LASPAU scholar and started his Wall Street career as Director of Quantitative Analysis for a macro hedge fund. At ttwick, he is developing cutting-edge analytical methods for a wide range of applications.
Eithon Cadag, Ayasdi
Senior Data Scientist
Eithon Cadag is a data scientist at Ayasdi, where he works on computation and analysis in the pharmaceutical and healthcare domain. Before joining Ayasdi, Eithon worked for the US government as a statistician and researcher in the biological and chemical defense space. Eithon received a PhD in Biomedical & Health Informatics from the University of Washington, where his dissertation was on data integration and statistical techniques for pathogen characterization.
Sean Gourley, Quid
Co-founder and CTO at Quid
Sean is a physicist, decathlete, political advisor, and TED fellow. He is originally from New Zealand, where he ran for national elected office and helped start New Zealand’s first nanotech company. Sean studied at Oxford as a Rhodes Scholar, where he received a PhD for his research on the mathematical patterns that underlie modern war. This research has taken him all over the world, from the Pentagon to the United Nations and Iraq. Previously Sean worked at NASA on self-repairing nano-circuits, and he is a two-time New Zealand track and field champion. Sean is now based in San Francisco, where he is the co-founder and CTO of Quid, an augmented intelligence company.
Clare Corthell, Mattermark
Clare Corthell is a Data Scientist Developer at Mattermark, a data-driven deal intelligence platform. She is the originator of the Open Source Data Science Masters and has spent her professional career founding and working with early-stage companies in the US, Europe, and East Africa. She has designed and built products such as music apps, transit systems, wind turbines, leadership development schools, competitive intelligence, educational reading systems, and data-driven decision-making tools.
Diane Wu, Palantir
Diane finished her PhD in Genetics at Stanford, where she elucidated RNA editing function using bioinformatic analysis of mRNA and small RNA populations. She currently works as a Data Scientist at Palantir, where she uses hypothesis testing and statistics to help Palantir make smarter products.
Joe Blitzstein, Harvard
Statistics & Data Science Professor
Joe is a Professor of the Practice of Statistics with the Harvard Statistics Department, moving to Harvard after obtaining his PhD in Mathematics from Stanford. He co-teaches Harvard's inaugural Data Science class and Harvard's popular introduction to probability class, Stat 110 (available online). He regularly tweets at @stat110.
Josh Wills, Cloudera
Senior Director of Data Science, Cloudera
Josh Wills is Cloudera's Senior Director of Data Science, working with customers and engineers to develop Hadoop-based solutions across a wide range of industries. He is the founder and VP of the Apache Crunch project for creating optimized MapReduce pipelines in Java. Prior to joining Cloudera, Josh was at Google, where he worked on the ad auction system and then led the development of the analytics infrastructure used in Google+.
Michelangelo D'Agostino, Civis Analytics
Senior Data Scientist
A reformed physicist turned data scientist, Michelangelo is currently a Senior Data Scientist with Civis Analytics, where he works on statistical models and writes software for data analysis. Formerly, he worked at Braintree and Obama for America 2012. With the Obama campaign, he helped to optimize the campaign's email fundraising juggernaut and analyzed social media data.
Bradley Voytek, UCSD + Uber
Computational Neuroscience Professor, Data Evangelist @ Uber
Brad is a professor of computational neuroscience at UC San Diego and Uber's Data Evangelist. Brad earned his PhD in neuroscience from UC Berkeley, created the neuroscience meta-analytic tool and hypothesis generation system, brainSCANr, and is the world's leading expert on the zombie brain.
Mike Dewar, New York Times
Data Scientist at the New York Times R&D Lab
Mike Dewar is a Data Scientist at the New York Times R&D Lab. Mike has a PhD from the University of Sheffield, UK, where he studied the modelling of complex systems using data. His work now focuses on building tools to study behaviour. Before joining the New York Times, Mike worked at the New York tech company bit.ly, and completed postdoctoral positions at Sheffield, Edinburgh and Columbia Universities. Mike is a data ambassador for the non-profit organization DataKind, and has published widely on signal processing, machine learning and data visualization.
Kunal Punera, Bento Labs
Kunal is currently Co-Founder and CTO of Bento Labs. Previously, he was the 4th engineer at RelateIQ and led the Data Products Engineering team that built products such as Strength of Relationships and Automated Follow-up Suggestions, which mined relationship intelligence from contacts/communication data. Before RelateIQ, Kunal was a Senior Research Scientist at Yahoo! Research, where his work focused on topics in Web Search, Social Media, and Abuse Detection. He has published over 30 papers in top-tier computer science conferences and holds over 20 patents. He received his PhD in Machine Learning in 2007 from the University of Texas at Austin.
Subscribe to get notified when we launch!
Curator & Data Scientist
is a Data Scientist at Topological Data Science startup Ayasdi. He believes strongly in self-improvement through hard work and perseverance. He is also fascinated with the big picture - and communicating subtle patterns through emphatic writing.
Curator & Data Scientist
is the author of The Product Manager's Handbook, containing interviews from Product Managers at companies like Twitter, Facebook and Google. He holds a bachelor's in Statistics with high honors from UC Berkeley and was selected as a 2014 Data Science for Social Good Fellow.
Storyteller & Data Scientist
is a senior at Harvard studying statistics, and will be joining Quora as a Data Scientist after graduation. He likes learning, teaching and telling stories with data. You can follow him on Twitter at: @wzchen or on his Quora blog: Data Stories.
does conventional and alternative energy growth investing with New Zealand's sovereign wealth fund. He takes a simulated annealing approach towards life and has spent considerable time in the U.S., Asia, and South America.
FRIENDS DON'T LET FRIENDS STRUGGLE TO BECOME DATA SCIENTISTS
Share with a friend interested in data science today!
Data Stories: Our Blog