Computational Thinking
There is another interesting difference that is worth pondering. Consider the problem of estimating a mixture of Gaussians. In Statistics we think of this as a solved problem: you use, for example, maximum likelihood, which is implemented by the EM algorithm. But the EM algorithm does not solve the problem. There is no guarantee that the EM algorithm will actually find the MLE; it's a shot in the dark. The same comment applies to MCMC methods. In ML, when you say you've solved the problem, you mean that there is a polynomial time algorithm with provable guarantees. There is, in fact, a rich literature in ML on estimating mixtures that does provide polynomial time algorithms. Furthermore, these come with theorems telling you how many observations you need if you want the estimator to be within a certain distance of the truth, with probability at least 1 − δ. This is typical of what is expected of an estimator in ML: a provable polynomial time algorithm together with a finite sample (non-asymptotic) guarantee on the estimator.
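To make the "shot in the dark" point concrete, here is a minimal sketch (mine, not from the text): it runs EM on simulated mixture data from several random starting points, with the data and settings chosen arbitrarily for illustration. Different starts can stop at different local maxima of the likelihood, which is exactly why a single run of EM does not certify that you have found the MLE.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Simulated data from three overlapping Gaussian components (illustrative choice).
x = np.concatenate([
    rng.normal(0.0, 1.0, 300),
    rng.normal(1.5, 1.0, 300),
    rng.normal(3.0, 1.0, 400),
]).reshape(-1, 1)

# Run EM from several random initializations; each run climbs to a local
# maximum of the likelihood, which need not be the global MLE.
for seed in range(5):
    gm = GaussianMixture(n_components=3, n_init=1, init_params="random",
                         random_state=seed, max_iter=500)
    gm.fit(x)
    print(f"start {seed}: avg log-likelihood = {gm.score(x):.4f}, "
          f"estimated means = {np.sort(gm.means_.ravel()).round(2)}")
```

Comparing the reported log-likelihoods across starts is the usual practical check, but it still comes with no guarantee of having reached the MLE.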
ML puts heavier emphasis on computational thinking. Consider, for example, the difference between P and NP problems. This is at the heart of theoretical Computer Science and of ML. Running MCMC on an NP-hard problem is often meaningless. Instead, it is usually better to approximate the NP-hard problem with a simpler problem. How often do we teach this to our students?
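One hedged illustration of this trade (my example, not the author's): exact best-subset regression requires searching over all 2^p subsets of variables and is NP-hard in general, whereas the lasso replaces the subset constraint with a convex ℓ1 penalty that can be optimized in polynomial time. The sketch below uses simulated data and an arbitrary penalty level.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, p = 100, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]          # only three truly nonzero coefficients
y = X @ beta + rng.normal(scale=0.5, size=n)

# Convex surrogate: the lasso yields a sparse fit without enumerating
# all 2^p candidate subsets, which is the intractable combinatorial problem.
lasso = Lasso(alpha=0.1).fit(X, y)
selected = np.flatnonzero(lasso.coef_)
print("indices of nonzero estimated coefficients:", selected)
```

The design choice being illustrated is generic: rather than attack the intractable formulation directly, solve a tractable relaxation whose solution is provably (or at least empirically) close to what you wanted.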