Beginner math books for Machine Learning

I'm a Computer Science engineer with no background in statistics or advanced math.

I'm studying the book Python Machine Learning by Raschka and Mirjalili, but when I tried to understand the math behind machine learning, I wasn't able to follow The Elements of Statistical Learning, the great book that a friend suggested to me.

Do you know any easier statistics and math books for Machine Learning? If you don't, how should I proceed?

Topic esl mathematics reference-request statistics machine-learning

Category Data Science


Introduction to Linear Algebra is a good starting point. Make sure you are comfortable with probability theory, linear algebra, and statistics. Very in-depth knowledge may not be necessary, but a solid working knowledge is required.


Although you asked for books, I recommend the following courses for understanding the statistics used in machine learning and other data science tasks. They are free.

If I had to recommend a book, I would recommend the following one, which is free under a CC license. It has nice examples and is very practical; moreover, it contains a lot of code that helps you get a feel for statistics through real-world examples.

Also the following link may help:


Keep in mind that while I have a Masters in Applied Statistics, I'm going to give you a very simple answer: take a course on probability.

Most modern ML programming frameworks take the large majority of the math out of data science; you really just won't need it in most scenarios. But you will always need the ability to understand your results, and the majority of results are expressed as probabilities. If I were new to data science, I would take a (brief) course on probability, seek to understand what proportions and percentages really mean, and then work to know a framework (like TensorFlow) really, really well. If you can do that, you can write some really interesting algorithms without having to be obsessive about the math.
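To make the point about proportions and percentages concrete, here is a minimal sketch in plain Python. The labels and the always-predict-negative "model" are made up for illustration and are not from any real dataset or framework. It shows why a headline accuracy figure only means something when compared against the base rate.

```python
# Made-up labels: 1 = positive, 0 = negative (illustrative data only).
y_true = [0] * 95 + [1] * 5

# A hypothetical "model" that always predicts the majority class.
y_pred = [0] * 100


def proportion(flags):
    """Return the fraction of True values in an iterable of booleans."""
    flags = list(flags)
    return sum(flags) / len(flags)


# Accuracy: the proportion of predictions that match the true label.
accuracy = proportion(t == p for t, p in zip(y_true, y_pred))

# Base rate: the proportion of examples that are actually positive.
base_rate = proportion(t == 1 for t in y_true)

# Recall: of the actual positives, the proportion the model caught.
recall = proportion(p == 1 for t, p in zip(y_true, y_pred) if t == 1)

print(f"accuracy  = {accuracy:.0%}")
print(f"base rate = {base_rate:.0%}")
print(f"recall    = {recall:.0%}")
```

The accuracy looks high only because positives are rare; the recall shows that this model never finds a single one. Reading results this way is the kind of probability literacy I mean.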


I cannot tell from your question how adept you are at mathematics or where your learning stops. I'll assume since you are a computer software engineer that you're familiar with algebra, geometry, and perhaps some calculus.

I'd recommend you start your learning by reading up on statistics and understanding concepts like descriptive statistics, exploratory data analysis, correlation, distributions, and so on. I see that you prefer books rather than videos, so I'll meet you halfway and provide you with a few books that are online, as well as a book or two that you can buy in print.

First, I'd recommend Penn State's online graduate course curriculum in statistics. You can explore each of their courses using the menu on the left. Once you select a course, scroll down on the course's webpage and click on the link that reads "online course notes". The course notes for these courses are much more than notes and read like full books. They are very instructive. Also, check out Penn State's online undergraduate course curriculum in statistics, too, in case you find something in the graduate coursework that is too advanced and want a "simpler" explanation.

Second, review the Handbook of Biological Statistics by John H. McDonald. Don't let the title fool you; this book is an excellent primer on statistics and data analysis that is applicable to any domain.

Third, review The Little Handbook of Statistics by Gerard Dallal. Again, don't let the title fool you; this book is another gem that walks you through some important statistics fundamentals.

Fourth, check out the book Think Stats by Allen Downey. There's a free version online of an earlier edition; the most recent edition you'll have to buy. It's worth it though, especially if you work in Python. In this book, the author teaches you statistics and data analysis using Python to analyze real-world (toy) datasets. This is a really great book to work through.

Lastly, check out Data Science from Scratch by Joel Grus. This book focuses more on data analysis (instead of statistics fundamentals) and places a greater emphasis on machine learning and modeling. It uses Python (and the Python data science stack) to walk you through analyzing and conducting predictive analytics on real-world (toy) datasets. Another great book to work through.
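To give a flavor of the descriptive statistics and correlation I mentioned at the start, here is a minimal sketch using only Python's standard library. The numbers (hours studied and exam scores) are invented for illustration, and the `pearson_r` helper is my own name, not something taken from any of the books above.

```python
import math
import statistics

# Invented toy data: hours studied and the resulting exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Descriptive statistics for the scores.
print("mean  :", statistics.mean(scores))
print("median:", statistics.median(scores))
print("stdev :", round(statistics.stdev(scores), 3))  # sample standard deviation

# Strength of the linear relationship between hours and scores.
r = pearson_r(hours, scores)
print("r     :", round(r, 4))
```

Writing the correlation out by hand once, as the books above encourage, makes it much clearer what a library function is doing when you call it later.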


Before starting my master's in Analytics, my seniors suggested I go through the following books to learn more about machine learning and statistics.

Namely:

  1. Discovering Statistics Using SPSS / Discovering Statistics Using R - Andy Field
  2. R Beginner and R for Everyone
  3. Predictive Analytics - The Power to Predict Who Will Click, Buy, Lie or Die
  4. Data Science for Business and many more

If you cannot find these books online, let me know and I will share the link; I have them on my drive. These books helped me understand the basics of stats, with examples explained in layman's terms.

If you are looking for online courses, let me know and I can suggest a couple of good ones (most of them are free).
