Snob

SNOB is written in Fortran 77 and, more recently, C, and is available from:
[Snob][2003], [Snob][2001].
(NB. So-called "vanilla" Snob implements multi-state and normal distributions as of 2004.)

SNOB is Chris Wallace's computer program for unsupervised classification of multivariate data. The classification problem is sometimes called clustering, mixture modelling or numerical taxonomy. SNOB uses the Minimum Message Length (MML) principle, also discussed under the names MML encoding and Minimum Description Length (MDL), to decide upon the best classification of the data: the preferred classification is the one that gives the shortest encoding of the hypothesis together with the data. MML encoding is a realisation of Ockham's razor.
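
As a rough illustration of the two-part message idea (and not of Snob's actual coding scheme), the sketch below compares two hypotheses about a sequence of coin flips. The function names, the 1-bit cost for naming the hypothesis, and the 15-value grid used to state the bias are all assumptions made for this example. The biased-coin hypothesis costs more bits to state, so it wins only when it shortens the encoding of the data by more than that extra cost.

```python
import math

def data_bits(heads, tails, p):
    """Bits needed to encode the flips, given head-probability p (0 < p < 1)."""
    return -(heads * math.log2(p) + tails * math.log2(1.0 - p))

def message_bits(heads, tails, p, hypothesis_bits):
    """Two-part message: first state the hypothesis, then the data given it."""
    return hypothesis_bits + data_bits(heads, tails, p)

def best_model(heads, tails):
    """Return (name, total_bits) of the hypothesis with the shorter message."""
    # H0: a fair coin. Assumed cost: 1 bit to say which hypothesis is used.
    fair = message_bits(heads, tails, 0.5, 1.0)

    # H1: a biased coin. Assumed cost: the same 1 bit, plus log2(15) bits
    # to state p as one of the 15 grid values 1/16, 2/16, ..., 15/16.
    grid = [k / 16.0 for k in range(1, 16)]
    cost = 1.0 + math.log2(len(grid))
    biased = min(message_bits(heads, tails, p, cost) for p in grid)

    return ("fair", fair) if fair <= biased else ("biased", biased)

if __name__ == "__main__":
    print(best_model(10, 10))  # balanced data: the simpler hypothesis wins
    print(best_model(18, 2))   # skewed data: the extra parameter pays for itself
```

The same trade-off, applied to the number of classes and their parameters rather than to a coin's bias, is what lets an MML-based classifier choose the number of classes instead of being told it.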

SNOB is efficient: it can quickly classify many tens of thousands of "things", where each thing can have tens of "attributes" (variables). An attribute can be continuous (real-valued) or discrete (multi-state).

A "class" is defined by distributions on one or more, but not necessarily all, attributes. The number of classes, the classes, their defining attributes and distributions, and class memberships are all inferred by SNOB.
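
A minimal sketch of how such classes might be represented, and how a thing's membership probabilities could be computed from them, follows. It is not taken from Snob's source: the dictionary layout, the names `memberships` and `background`, and the choice to model attributes that a class does not define with a shared population-wide distribution are assumptions made for illustration. The calculation itself is the usual Bayes-rule step used in expectation-maximization (EM) for mixture models.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def attribute_prob(value, dist):
    """Probability (or density) of one attribute value under a distribution."""
    kind = dist[0]
    if kind == "normal":          # continuous attribute: ("normal", mean, sd)
        return normal_pdf(value, dist[1], dist[2])
    if kind == "multistate":      # discrete attribute: ("multistate", {state: prob})
        return dist[1][value]
    raise ValueError("unknown distribution kind: " + kind)

def memberships(thing, classes, background):
    """Posterior probability of each class for one thing, by Bayes' rule."""
    weighted = []
    for cls in classes:
        p = cls["weight"]  # relative abundance of the class
        for name, value in thing.items():
            # Assumption: attributes a class does not define are modelled
            # by the shared background distribution.
            dist = cls["dists"].get(name, background[name])
            p *= attribute_prob(value, dist)
        weighted.append(p)
    total = sum(weighted)
    return [w / total for w in weighted]

# Hypothetical example: two attributes, two classes.
background = {
    "height": ("normal", 170.0, 15.0),
    "colour": ("multistate", {"red": 0.5, "blue": 0.5}),
}
classes = [
    {"weight": 0.5, "dists": {"height": ("normal", 160.0, 5.0)}},
    {"weight": 0.5, "dists": {"height": ("normal", 185.0, 5.0),
                              "colour": ("multistate", {"red": 0.1, "blue": 0.9})}},
]

if __name__ == "__main__":
    print(memberships({"height": 183.0, "colour": "blue"}, classes, background))
```

Here the first class is defined on height alone, while the second is defined on both attributes; a tall, blue thing is assigned almost entirely to the second class.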

Chris wrote a later, more powerful version, 'Factor Snob', which includes hierarchical classes and factor analysis.

Selected Bibliography:

C. S. Wallace. Statistical and Inductive Inference by Minimum Message Length. Springer, 2005.
C. S. Wallace. Vanilla Snob. 2002 [www]['02].
C. S. Wallace. Classification by Minimum-Message-Length Inference. Advances in Computing and Information - ICCI '90. Springer Verlag LNCS 468 pp.72-81, 1990.
C. S. Wallace & P. R. Freeman. Estimation and Inference by Compact Encoding. J. R. Stat. Soc. B 49 pp.240-265, 1987.
D. M. Boulton & C. S. Wallace. An Information Measure for Hierarchic Classification. Computer Journal 16(3) pp.254-261, 1973.
D. M. Boulton & C. S. Wallace. A Program for Numerical Classification. Computer Journal 13(1) pp.63-69, 1970.
C. S. Wallace & D. M. Boulton. An Information Measure for Classification. Computer Journal 11 pp.185-195, 1968.
L. Allison. Models for machine learning and data mining in functional programming. J. Functional Programming 15(1) pp.15-32, Jan. 2005. (Includes the source code of the core expectation-maximization (EM) algorithm for clustering.)
© L.A. / 1994, 1999, 2000, 2001, 2002, 2003, 2005, 2011
© L. Allison   http://www.allisons.org/ll/   (or as otherwise indicated),