Thursday, 16 November 2017

Teaching to machines: What is learning in machine learning entails?

Preamble

Ebbinghaus (Wikipedia


Machine Learning (ML)
 is now a de-facto skill for every quantitative job and almost every industry embraced it, even though fundamentals of the field is not new at all. However, what does it mean to teach to a machine? Unfortunately, for even moderate technical people coming from different backgrounds, answer to this question is not apparent in the first instance. This sounds like a conceptual and jargon issue, but it lies in the very success of supervised learning algorithms.

What is a machine in machine learning

First of all here, machine does not mean a machine in conventional sense, but computational modules or set of instructions. It is called machine because of thermodynamics can be applied to analyse this computational modules, their algorithmic complexity can be map to an energy consumption and work production, recall Landauer Principle. Charles Bennett has an article on this principle, here.

Learning curve for machines

What about learning bit? Learning is a natural process for humans, specially for the younger ones. We learn new things naturally without explicitly thinking with instruction and some training, such as practised in University of Bologna a millennia ago. Usually,  it takes time for us to learn new thing, and the process is called learning curve, due to Hermann Ebinghauss. Essentially, learning in machine learning exactly roots in Ebinghauss approach.  But question remains, how this manifests quantitatively in machine learning. Answer to this question is going to be given here.

The concept of learning curve and  effect of sample size in training machine learning models for both experienced and novice practitioners. It is often this type of analysis is omitted in producing supervised learning solutions. In the advent of deep learning architecture, multi-layered neural networks, this concept becomes more pronounced.

Quantifying learning curve lies in measuring performance over increasing experience. If the performance of the machine, or human, increases over experience we denote that learning is achieved. We distinguish good learner.

On the misconception of unsupervised learning


Figure 2: Donald Webb,
father of unsupervised
learning.
(Source: Wikipedia)
A generic misconception appear on what unsupervised learning means. Clustering, or categorising unlabelled data, is not learning in Ebbinghaus sense. The goodness of fit or cluster validation exercise do not account an increase in experience, at least this is not established in the literature, judging from cluster validation techniques, see, Jain-Dubes's Algorithms for clustering data. Wikipedia defines unsupervised learning "inferring a function to describe hidden structure from "unlabeled" data", this is a function inference is not learning,.

Then, what is unsupervised learning? It originates from Hebbian Theory from neuroscience that "Cells that fire together wire together". This implies, unsupervised learning is about how information is encoded, not how it is labelled. One good practical model that could be classified as unsupervised learning, so called spiking network model.

Outlook

The concept of learning from Machine Learning perspective is summarised in this post. In data science, it is a good practice to differentiate what we call learning and function inference/optimisation. Being attentive to this details would help us.