- IEEE Data Compression Conference, pp.169-178, 1998
This paper describes a Minimum Message Length (MML) approach
to finding the most appropriate Hidden Markov Model (HMM) to
describe a given sequence of observations. An MML estimate for the
expected length of a two-part message stating a specific HMM and
the observations given this model is presented, along with an effective
strategy for finding the best number of states for the model.
The information estimate enables two models with different numbers
of states to be compared fairly, which is necessary if the search of this
complex model space is to avoid the worst locally optimal solutions.
The general-purpose MML classifier `Snob' has been extended and the
new program `tSnob' is tested on `synthetic' data and a large `real world'
dataset. The MML measure is found to be an improvement on
the Bayesian Information Criterion (BIC) and the unsupervised search