Probability for Data Science

Stanley H. Chan

Digital versions: PDF
Open source: No
Exercises: Yes
Solutions: Yes, available to instructors upon request
Solution videos: Yes, available online
Lecture videos and slides: Yes
Python, Matlab, Julia, and R tutorials: Yes
License: Copyright held by the author
  • Text for an applied probability course with motivating applications in data science
  • PDF Version: 10 chapters, 687 pages
  • Print version available from Amazon
  • Supplementary resources available at the book’s website

This text was written to support an applied probability and data science course for electrical engineering and computer science undergraduates and first-year graduate students. The author writes, “We need a book that balances theory and practice,” and the book accordingly adopts an informal style that aims to develop motivation and insight. A more mathematical audience will find that some terminology is nonstandard and some presentations lack mathematical precision. Mathematics students would benefit from prior experience with probability and linear algebra; even so, the applications to data science are well developed, with meaningful datasets and programming support.

Table of Contents

  1. Mathematical Background
  2. Probability
  3. Discrete Random Variables
  4. Continuous Random Variables
  5. Joint Distributions
  6. Sample Statistics
  7. Regression
  8. Estimation
  9. Confidence and Hypothesis
  10. Random Processes