Minimum Message Length (MML)


Minimum message length (MML) inference was devised by Chris Wallace and David Boulton c. 1968 and developed by Chris Wallace and many colleagues. MML is a Bayesian method of inference:

Bayes's theorem:
pr(H&D) = pr(H).pr(D | H) = pr(D).pr(H | D)
pr(H | D) = pr(H&D) / pr(D) ∝ pr(H).pr(D | H)
Shannon:
msgLen(E) = I(E) = -log2(pr(E)) bits
so
msgLen(H&D) = msgLen(H)+msgLen(D | H) = msgLen(D)+msgLen(H | D)

for hypothesis H, data D, and event E. MML is a practical realisation of Ockham's razor. Some have assumed that MML is the same as maximum a posteriori (MAP) inference, but in general it is not. And unlike the later minimum description length (MDL) principle, MML favours explicit priors and fully parameterised models.
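The correspondence between probabilities and message lengths can be illustrated numerically. A minimal sketch, where the priors and likelihoods are made-up values for illustration only:

```python
import math

def msg_len(p):
    """Shannon information content of an event of probability p, in bits."""
    return -math.log2(p)

# Two candidate hypotheses: assumed priors pr(H) and likelihoods pr(D | H).
pr_H = {"H1": 0.7, "H2": 0.3}
pr_D_given_H = {"H1": 0.1, "H2": 0.4}

# Two-part message: msgLen(H & D) = msgLen(H) + msgLen(D | H).
two_part = {h: msg_len(pr_H[h]) + msg_len(pr_D_given_H[h]) for h in pr_H}

# MML chooses the hypothesis giving the shortest total message; since
# msgLen(H & D) = msgLen(D) + msgLen(H | D) and msgLen(D) is common to all
# hypotheses, this is the hypothesis of greatest posterior pr(H | D).
best = min(two_part, key=two_part.get)
```

Here the prior favours H1, but the data pull the total message length in favour of H2 (-log2(0.12) ≈ 3.06 bits versus -log2(0.07) ≈ 3.84 bits).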

Key points are that every continuous (real, floating point) variable has some limited measurement accuracy and that every continuous parameter has some optimal limited precision to which it should be inferred and stated. A consequence is that even continuous data and continuous parameters have non-zero probabilities (and hence finite message lengths), not just probability densities, and therefore Bayes's theorem still applies as is. Interestingly, there are many cases where even a discrete parameter must be estimated to less precision than its discreteness would seem to allow.
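The first point can be sketched as follows: a continuous datum recorded to measurement accuracy eps has probability approximately pdf(x)·eps, not just a density, and hence a finite message length. The values of x, mu, sigma and eps below are arbitrary illustrative choices:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of the Normal (Gaussian) distribution N(mu, sigma) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma * sigma)) \
        / (sigma * math.sqrt(2 * math.pi))

x, mu, sigma = 1.5, 0.0, 1.0
eps = 0.01                            # assumed measurement accuracy of x

# pr(x) ~= pdf(x) * eps: a genuine (non-zero) probability, so Bayes's
# theorem and Shannon's msgLen apply as-is.
pr_x = normal_pdf(x, mu, sigma) * eps
msg_len_bits = -math.log2(pr_x)       # finite, roughly 9.6 bits here
```

Halving eps adds exactly one bit to the message length, which is why the accuracy of measurement matters to the coding.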

Some statistical models that have been MML-ed include:

Discrete.
Binomial, 2-state.
Multinomial, k-state.
Integer.
Geometric.
Poisson.
Universal distributions.
Continuous.
Normal (Gaussian).
Linear regression.
von Mises - Fisher (vMF) and von Mises distributions.
Student's t-Distribution.
Structured.
Mixture models (clustering, classification).
(Hidden) Markov models, PFSA.
Classification- (decision-) trees and graphs, etc.
Regression- and Model-trees.
Sequence segmentation.
Megalithic stone circles!
Bayesian networks.
Supervised learning.
Unsupervised learning.
Trees and Graphs.
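For the simplest model above (binomial, 2-state), the trade-off that fixes a parameter's optimal precision can be made concrete. The following is a toy two-part coding sketch only, not the MML87 approximation: it assumes a uniform prior over a grid of candidate rates of spacing delta, so part 1 costs log2(1/delta) bits and part 2 is the binomial negative log-likelihood.

```python
import math

def two_part_len(k, n, delta):
    """Toy two-part message length (bits) for k successes in n Bernoulli
    trials, with the rate stated on a uniform grid of spacing delta."""
    best = float("inf")
    m = round(1 / delta)
    for i in range(m):
        theta = (i + 0.5) * delta          # centre of the i-th grid cell
        part1 = -math.log2(delta)          # stating the parameter
        part2 = -(math.log2(math.comb(n, k))
                  + k * math.log2(theta)
                  + (n - k) * math.log2(1 - theta))  # data given parameter
        best = min(best, part1 + part2)
    return best

# Too fine a grid wastes bits on part 1; too coarse a grid inflates part 2.
lens = {d: two_part_len(50, 100, d) for d in (0.5, 0.1, 0.02, 0.001)}
best_delta = min(lens, key=lens.get)
```

With 50 successes in 100 trials, the intermediate spacing 0.1 yields the shortest total message of the four candidates: the parameter should be stated to limited, not maximal, precision.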

MML has theoretical support in the shape of Kolmogorov complexity.

Strict MML (SMML) is a sort of MML "gold standard". Unfortunately, SMML is computationally intractable for all but the simplest problems but, happily, accurate and feasible approximations to SMML exist.

© L. Allison   http://www.allisons.org/ll/   (or as otherwise indicated),
