This post continues the earlier “The Three Pillars of Reasoning, Part I”, in which we considered the three forms of reasoning: deduction, induction, and abduction. Here we zoom out to a more general context to better understand their strengths and weaknesses, and to promote critical thinking.
Generalized Curve-Fitting
Induction is a general form of reasoning that allows us to make inferences about the future based on observed instances from the past.
{observations} → general pattern
Curve fitting is a more specific problem in which we try to find a mathematical function that best fits a set of data points.
{data cloud} → mathematical formula
Conversely, induction can be seen as a generalized curve-fitting problem if we treat observations as data points.
The data may also include a time dimension, in which case generalized curve fitting becomes extrapolation, or prediction. Such problems are studied in machine learning, or statistical learning (we will use the two terms interchangeably, although some argue there are differences).
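As a minimal sketch of this idea (the data and the choice of a degree-1 polynomial family are purely illustrative assumptions), fitting a curve to past observations and then extrapolating it into the future might look like this in Python:

```python
import numpy as np

# Hypothetical observations: noisy samples of an unknown trend over time
# (all numbers here are made up for illustration).
rng = np.random.default_rng(0)
t = np.arange(20, dtype=float)                      # past time steps
y = 0.5 * t + 3.0 + rng.normal(0.0, 1.0, t.size)    # the {data cloud}

# Curve fitting: choose a family of formulas (degree-1 polynomials here)
# and find the member that best fits the observations.
coeffs = np.polyfit(t, y, deg=1)                    # -> mathematical formula
model = np.poly1d(coeffs)

# Extrapolation: apply the fitted formula to future time steps.
t_future = np.arange(20.0, 25.0)
print(model(t_future))                              # inductive predictions
```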
In fact, this has been the most successful application area of AI, thanks to the availability of big data and the variety of sophisticated machine-learning algorithms such as deep learning.
Learning by Example, in which a system generalizes from labeled instances to unseen ones, is another paradigm subsumed under induction.
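A nearest-neighbour classifier is perhaps the simplest sketch of learning by example; in the toy code below (the points and labels are made up), an unseen instance is labeled by generalizing from the labeled examples:

```python
import numpy as np

# Labeled examples: points in the plane with class labels 0 or 1 (made up).
examples = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
labels = np.array([0, 0, 1, 1])

def predict(x):
    """1-nearest-neighbour: label an unseen point by its closest example."""
    distances = np.linalg.norm(examples - x, axis=1)
    return labels[np.argmin(distances)]

print(predict(np.array([0.1, 0.2])))  # -> 0, inferred from nearby examples
```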
Philosophically, however, one should not forget that the problem of induction, as formulated by David Hume in the 18th century, remains unsolved. Hume argued that inductive reasoning cannot be rationally justified because it relies on an unstated assumption that the future will resemble the past, and that assumption can itself only be defended inductively. There is no guarantee that the “model” will not change.
Statistical Small-World Models
To understand what this “model” is, we need to zoom out and consider the wider context of statistical learning.
Looking again at
{data cloud} → mathematical formula,
a statistical model answers two questions: how the data cloud was generated, and which family of formulas contains all the candidate solutions.
For example, the data could be drawn from normal distributions with certain parameters, and the family of formulas could be all linear functions, splines, or neural networks built from activation functions (e.g. ReLU, sigmoid, tanh, softmax).
Given a statistical model, machine-learning algorithms can produce from the data a probabilistic function that fits it.
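As a concrete sketch, suppose the statistical model assumes linear functions plus Gaussian noise (both assumptions, and the data, are illustrative). Least squares then selects the best member of the linear family, and the residuals estimate the noise, so the result is a probabilistic function rather than a bare formula:

```python
import numpy as np

rng = np.random.default_rng(1)

# Statistical model (assumed): y = a*x + b + Gaussian noise.
# Family of formulas: all linear functions a*x + b.
x = rng.uniform(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, x.size)

# Least squares picks the best-fitting member of the linear family.
A = np.column_stack([x, np.ones_like(x)])
(a, b), residuals, *_ = np.linalg.lstsq(A, y, rcond=None)

# The residual variance estimates the noise, giving a probabilistic
# function: y_new ~ Normal(a*x_new + b, sigma^2).
sigma = np.sqrt(residuals[0] / (x.size - 2))
print(f"y ≈ {a:.2f}x + {b:.2f}, noise std ≈ {sigma:.2f}")
```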
Induction is always relative to the assumed model, just as deduction is relative to the logical system.
The statistical model is a set of simplifying assumptions that approximates the real world. Such a model is sometimes called a “Small-World Model”, and combined with Big Data it can have many practical applications, despite being, strictly speaking, wrong.
However, we must always remember that, as time goes by, a fixed model can drift further and further from the real world and so become increasingly inaccurate, which is exactly the risk Hume's problem of induction warns about.
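The drift can even be demonstrated computationally; in the toy sketch below (all numbers invented), a model fitted once on early data is held fixed while the data-generating process slowly changes, and its prediction error grows without bound:

```python
import numpy as np

rng = np.random.default_rng(2)

def world(t, drift):
    """The real data-generating process, whose slope slowly changes."""
    return (0.5 + drift * t) * t + rng.normal(0.0, 0.5)

# Fit a fixed linear model on early observations (no drift yet visible).
t_train = np.arange(0, 10, dtype=float)
y_train = np.array([world(t, drift=0.0) for t in t_train])
a, b = np.polyfit(t_train, y_train, deg=1)

# Evaluate the same frozen model as the world drifts away from it.
for t in [10.0, 50.0, 100.0]:
    actual = world(t, drift=0.01)       # the world has changed
    predicted = a * t + b               # the model has not
    print(f"t={t:5.0f}  error={abs(actual - predicted):8.1f}")
```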