Data science / machine learning books for mathematicians

I have found other requests for references here, in particular "Where to start, which books" and "Books about the Science in Data Science?"

I have glanced at:

  • Artificial Intelligence: A Modern Approach (Russell & Norvig)
  • Machine Learning: The Art and Science of Algorithms that Make Sense of Data (Flach)
  • Learning From Data (Abu-Mostafa et al.)
  • Introduction to Statistical Learning (James et al.)
  • Elements of Statistical Learning (Hastie et al.)
  • Pattern Recognition and Machine Learning (Bishop)

It is difficult to evaluate whether they would fit my needs, because generally only a few pages are available online. However, my first impression is that they do not. In the appendices of Artificial Intelligence: A Modern Approach I can read:

Mathematicians define a vector as a member of a vector space, but we will use a more concrete definition: a vector is an ordered sequence of values.

This is exactly the kind of approach I am not looking for.

I'm looking for a book that assumes the reader has a good understanding of set theory, abstract algebra, measure and probability theory, statistics, topology, graph theory, complexity theory, etc., and a preference for formal and axiomatic explanations rather than lengthy, so-called "intuitive" approaches based on basic mathematical objects and examples. Furthermore, I don't want something that looks like a recipe book from the very beginning. I want a book that first formalizes the abstract, common shape of all data science methods as well as their common aim. Only after that should it explain the different categories, explicitly stating which further hypotheses each category assumes and which cases/problems/domains it is known to handle efficiently or not.

Finally, to be clear, I have no problem with being shown concrete examples and their treatment in a specific programming language, for example. I just want this to come second, as an illustration of the conceptual explanation, not as a substitute for it.



I would recommend

Linear Algebra and Learning from Data by Gilbert Strang

as a nice introduction for someone with an undergrad math background.

It's not particularly wide in scope and contains some probably unnecessary summaries of linear algebra (insightful nonetheless) and probability (very basic), but the portions on data are nice introductions.

You can see some sample chapters at https://math.mit.edu/~gs/learningfromdata/


I refer you to a previous question I asked on CrossValidated with a similar request; I still stand by my answer to it.

Per @Coffee's recommendation, I would recommend the text Machine Learning: A Bayesian and Optimization Perspective by Sergios Theodoridis along with Pattern Recognition by the same author.

These two texts combined are 2,000 pages total and cover everything from undergrad-level probability to linear models, and (as far as I can tell) everything covered by Elements of Statistical Learning, in addition to time series, probabilistic graphical models, deep learning, and Monte Carlo methods.

The author makes an excellent effort to make all notation clear and consistent (thank you for bolding all of your vectors!) and seems to have used carefully chosen exercises.

A background in probability and in statistics at the level of Casella and Berger would be extremely helpful before pursuing these texts. There is some discussion of UMVUEs in here.


Hastie et al. is at the mathematical level you require, being written by statistics academics with strong mathematical pedigree (Hastie is currently a mathematics professor, for example), and the complete text is available for free online via the authors' website. It is probably about the best general survey of machine learning for people with a mathematical and statistical background at the graduate student level. That said, it is still a survey, and individual topics will require follow-up elsewhere, though useful recommended reading is provided.

Bishop also assumes a reasonable degree of mathematical maturity, although the table of contents may make the content appear simpler than it is, for example by listing a review of probability distributions, including the Gaussian, as Chapter 2.

Russell & Norvig isn't about machine learning or data science, but rather the wider field of artificial intelligence, in which it includes machine learning as a smallish subset, and data science effectively not at all. For example, it discusses a number of different kinds of pre-programmed AI systems, the exact opposite of machine learning. It is interesting if you want to understand the wider world of automation, but it will do little to help you understand ML.


First of all, data only comes in so many forms that it might make sense to stick to a more "concrete definition". Data Science is necessarily practical. But here are a few other books with a more theoretical grounding. Others will certainly know many more...

However, research in machine learning is mostly found in journals and papers. It will be hard to find one or a few books that cover everything you want to know.
