Saturday 28 March 2020

Book review: A tutorial introduction to the mathematics of deep learning

Artificial Intelligence Engines:
An introduction to the Mathematics
of Deep Learning
by Dr James V. Stone
the book and Github repository.
(c) 2019 Sebtel Press
Deep learning and associated connectionist approaches are now applied routinely in industry and academic research, from image analysis to natural language processing and areas as cool as reinforcement learning. As practitioners, we use these techniques as shipped black-box algorithms from well-designed, well-tested and reliable libraries such as TensorFlow or PyTorch. However, most practitioners lack the mathematical foundations and a core algorithmic understanding. Unfortunately, many academic books and papers subliminally try to give an impression of superiority and avoid a simple pedagogical approach. In this post we review a unique book that tries to fill this gap with a pedagogical approach to the mathematics of deep learning, one that avoids showing off mathematical complexity and aims to convey an understanding of how things work from the ground up. Moreover, the book provides pseudo-code that can be used to implement things from scratch, along with a supporting implementation in a GitHub repository. The author, Dr James V. Stone, a trained cognitive scientist and researcher in mathematical neuroscience, has taken this approach in his other books for many years now, writing for students, not to show off to his peers. One important note: this is not a cookbook or a practical tutorial but an upper-intermediate level academic book.

Building associations and classifying with a network

The logical and conceptual separation of association and classification tasks is introduced in the initial chapters. The book sensibly starts with learning one association with one connection, with a gentle introduction to gradient descent for learning the weights, before moving on to two associations with two connections and then to many. This reminds me of George Gamow's "1, 2 and infinity" as a pedagogical principle. The perceptron is introduced later, showing how classification rules can be generated by a network and the difficulty it runs into with the XOR problem.
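To make these two ideas concrete, here is a minimal sketch in plain Python. It is my own illustration, not the book's pseudo-code: the function names, the learning rate, the squared-error loss and the input/target values are all assumptions chosen for the example. The first function learns one association through one connection by gradient descent; the second trains a classic perceptron and reports whether it ever classifies every example correctly.

```python
def learn_one_association(x, target, lr=0.5, steps=200):
    """Learn a single weight w so that w * x approximates target."""
    w = 0.0
    for _ in range(steps):
        y = w * x                  # network output for this one connection
        grad = (y - target) * x    # dE/dw for E = 0.5 * (y - target)**2
        w -= lr * grad             # gradient descent step
    return w


def train_perceptron(data, max_epochs=100):
    """Return True if the perceptron learning rule reaches zero errors."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for (x1, x2), t in data:
            y = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            if y != t:
                errors += 1
                # Move the decision boundary towards the misclassified point.
                w[0] += (t - y) * x1
                w[1] += (t - y) * x2
                b += (t - y)
        if errors == 0:
            return True
    return False


# Illustrative values only (assumed for this sketch).
w = learn_one_association(x=0.8, target=0.2)
print("learned weight:", w, "output:", w * 0.8)

AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
print("AND learned:", train_perceptron(AND_DATA))
print("XOR learned:", train_perceptron(XOR_DATA))
```

AND is linearly separable, so the perceptron settles on a solution; XOR is not, so no single-layer perceptron can get all four cases right however long it trains.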

Backpropagation, Hopfield networks and Boltzmann machines

A detailed implementation of backpropagation is provided from scratch, with great clarity and without too much cluttering index notation. This is probably the best explanation I have ever encountered. The following chapters introduce Hopfield networks and Boltzmann machines from the ground up to an applied level. Unfortunately, many modern deep learning books skip these two great models, but Dr Stone's chapters make them implementable for a practitioner. It is very impressive. I am a bit biased towards Hopfield networks, as I see them as an extension of Ising models and their stochastic counterparts, but I have not seen anywhere else such an explanation of how to use Hopfield networks in learning, with a pseudo-code algorithm that can be used in a real task.
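For readers who have not met Hopfield networks, the following is a minimal textbook-style sketch in plain Python, assuming Hebbian weights and asynchronous sign updates. It is not taken from the book or its repository; the function names and the two stored patterns are hypothetical choices for illustration.

```python
def train_hebbian(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns (Hebbian rule)."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:              # no self-connections
                    w[i][j] += p[i] * p[j]
    return w


def energy(w, s):
    """Ising-like energy of state s; asynchronous updates never increase it."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))


def recall(w, state, max_sweeps=10):
    """Update units one at a time until the state stops changing."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            h = sum(w[i][j] * s[j] for j in range(len(s)))  # local field
            new = 1 if h >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:
            break
    return s


# Two orthogonal patterns, assumed for this example.
P1 = [1, 1, 1, 1, -1, -1, -1, -1]
P2 = [1, -1, 1, -1, 1, -1, 1, -1]
W = train_hebbian([P1, P2])

probe = list(P1)
probe[0] = -probe[0]                    # corrupt one unit
restored = recall(W, probe)
print("restored equals stored pattern:", restored == P1)
print("energy before:", energy(W, probe), "after:", energy(W, restored))
```

The energy function is where the link to Ising models shows up: stored patterns sit at local minima, and recall is a descent into the nearest one.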

Advanced topics

Personally, I see the remaining chapters as advanced topics: deep Boltzmann machines, variational autoencoders, GANs and an introduction to reinforcement learning. The probable exception is deep backpropagation in Chapter 9. I would say that what is now known as deep learning began with the architectures mentioned in Sections 9.1 to 9.7.

Glossary, basic linear algebra and statistics

The appendices provide a fantastic conceptual introduction to the jargon and the basics of the main mathematical techniques. Of course, this is not a replacement for a fully fledged linear algebra and statistics book, but it provides immediate, concise explanations.

Not a cookbook: not an "import tensorflow as tf" book

One other crucial advantage of this book is that it is definitely not a cookbook. Unfortunately, almost all books related to deep learning are written in a cookbook style. This book is not. However, it is supplemented by a full implementation in a repository, supporting each chapter.


This little book achieves so much with its down-to-earth approach, introducing basic concepts with a respectful attitude and assuming the reader is very smart but inexperienced in the field. Whether you are a beginner or an experienced research scientist, this is a must-have book. I still see this as an academic book, and it can be used as the main text in an upper-undergraduate elective such as "Mathematics of Deep Learning".

Enjoy reading and learning from this book. Thank you, Dr Stone, for your efforts in making academic books more accessible.

Disclosure: I received a review copy of the book but I have bought another copy for a family member. 

Sunday 22 March 2020

Computational Epidemiology and Data Scientists: Don't post analysis on outbreak arbitrarily


Many data scientists are trained or experienced in using tools for statistical modelling, forecasting or machine learning. This does not necessarily mean that they should jump in, do an ad hoc analysis of the publicly available data on the COVID-19 outbreak, draw policy conclusions and publish them on their blogs or other media. As a rule of thumb, before doing such a thing you should have at least one published paper, article or software solution related to outbreaks that appeared before December 2019. Please be considerate: epidemiological modelling is not merely fitting an exponential curve.
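As a toy illustration of that last point, the sketch below compares a naive exponential extrapolation with a basic SIR compartmental model integrated by simple Euler steps. Everything in it is an assumption made for the example: the function names, the transmission rate, the recovery rate and the initial infected fraction are illustrative values, not estimates for any real outbreak, and a bare SIR model is itself far from what epidemiologists actually use.

```python
import math


def simulate_sir(beta, gamma, i0, days, dt=0.1):
    """Euler integration of the SIR model; returns daily infected fractions."""
    s, i, r = 1.0 - i0, i0, 0.0
    steps_per_day = round(1 / dt)
    infected = [i]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        infected.append(i)
    return infected


def exponential_extrapolation(beta, gamma, i0, days):
    """Early-phase growth rate (beta - gamma) extended indefinitely."""
    return [i0 * math.exp((beta - gamma) * t) for t in range(days + 1)]


# Illustrative parameters only (assumed, not fitted to any data).
BETA, GAMMA, I0, DAYS = 0.3, 0.1, 1e-4, 200

sir = simulate_sir(BETA, GAMMA, I0, DAYS)
expo = exponential_extrapolation(BETA, GAMMA, I0, DAYS)

for day in (10, 30, 60, 200):
    print(f"day {day:3d}  SIR infected: {sir[day]:.3e}  exponential: {expo[day]:.3e}")
```

The two curves agree in the first days, but the exponential soon exceeds the whole population, while the SIR curve peaks and declines as susceptibles run out. Real models add much more on top of this (contact structure, interventions, reporting delays, uncertainty), which is exactly why the expertise matters.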

Please refrain from posting blog posts or similar on infection modelling, and from giving advice based on an ad hoc data analysis you did over your lunch break, if you have not worked on computational epidemiology before. There is a vast academic literature on computational epidemiology. Let the experts in those fields express their modelling efforts first. Let us value expertise in an area.

Appendix: Computational epidemiology introductory resources

Here we provide a limited set of pointers to the computational epidemiology literature. Google Scholar is your friend for finding many more resources.

  • Computational Epidemiology
    Madhav Marathe, Anil Kumar S. Vullikanti
    Communications of the ACM, July 2013, Vol. 56 No. 7, Pages 88-96
  • Broadwick: a framework for computational epidemiology
    O’Hare, A., Lycett, S.J., Doherty, T. et al.
    BMC Bioinformatics 17, 65 (2016).
  • Mathematical Tools for Understanding Infectious Disease Dynamics
    (Princeton Series in Theoretical and Computational Biology)
    Odo Diekmann, Hans Heesterbeek, and Tom Britton
    Princeton Press
  • Agent-Based Simulation Tools in Computational Epidemiology
    Patlolla P., Gunupudi V., Mikler A.R., Jacob R.T. (2006)
  • DIMACS 2002-2011 Special Focus on Computational and Mathematical Epidemiology Rutgers working group
  • Containment strategy for an epidemic based on fluctuations in the SIR model
    Philip Bittihn, Ramin Golestanian
    Oxford/Max Planck
  • SIAM Epidemiology Collection (2020)
  • American Physical Society (APS) Physical Review COVID-19 collection
  • Modeling epidemics by the lattice Boltzmann method
    Alessandro De Rosis
    Phys. Rev. E 102, 023301
  • Implications of vaccination and waning immunity
    Heffernan, Keeling
    2009 Jun 7; 276(1664): 2071–2080

(c) Copyright 2008-2024 Mehmet Suzen (suzen at acm dot org)

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License