The MML estimator for an M-state distribution gives
θi = (ni + 1/2) / (N + M/2),
where ni is the number of observations of state i
during a total of N observations,
N = ∑i=1..M ni.
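As a quick illustration (a minimal sketch, not from the source), the estimator is straightforward to compute from the counts:

```python
def mml_estimate(counts):
    """MML estimate theta_i = (n_i + 1/2) / (N + M/2) for an M-state
    distribution, given the observed counts n_1..n_M."""
    M = len(counts)
    N = sum(counts)
    return [(n + 0.5) / (N + M / 2) for n in counts]

# Unlike the maximum-likelihood estimate n_i/N, an unobserved state
# (n_i = 0) still receives a small non-zero probability:
print(mml_estimate([3, 1, 0]))
```

The estimates always sum to one, since the numerators total N + M/2.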
In general, the uncertainty region for the MML estimate
for k parameters,
θ = ⟨θ1, θ2, ..., θk⟩,
is approximately
√(12^k / F(θ)),
where F(θ) is the
[Fisher information].
Note that k = M-1 for the multi-state distribution.
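For example (an illustrative sketch, assuming the M = 2 case of the Fisher information given later, F(θ) = N/(θ1·θ2), with k = 1), the uncertainty region shrinks like 1/√N:

```python
import math

def two_state_uncertainty(theta1, N):
    """Approximate MML uncertainty region, sqrt(12^k / F(theta)), for a
    2-state distribution: k = 1 and F(theta) = N / (theta1 * theta2)."""
    theta2 = 1.0 - theta1
    fisher = N / (theta1 * theta2)
    return math.sqrt(12.0 / fisher)

print(two_state_uncertainty(0.5, 100))
print(two_state_uncertainty(0.5, 10000))  # 100x the data, 10x smaller
```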
3-States, M=3
A 3-state source has
two parameters, θ1 and θ2 in [0,1];
i.e., θ=⟨θ1,θ2⟩.
It is convenient to define θ3, also in [0,1], where
θ3=1-θ1-θ2,
but θ3 is not a third (free) parameter.
We observe n1 occurrences of state 1,
n2 of state 2, and
n3 of state 3, where
N = n1 + n2 + n3.
The likelihood is
LH = θ1^n1 · θ2^n2 · θ3^n3,
so -log LH = -n1·log θ1 - n2·log θ2 - n3·log θ3, and
d2/dθ1 dθ2 {-log LH} = n3/(1-θ1-θ2)^2 = n3/θ3^2 = d2/dθ2 dθ1 {-log LH},
while d2/dθ1^2 {-log LH} = n1/θ1^2 + n3/θ3^2, and similarly for θ2.
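The mixed partial derivative can be checked numerically (a quick sketch using a central finite difference; the counts and evaluation point are arbitrary):

```python
import math

def neg_log_lh(t1, t2, n1, n2, n3):
    """-log likelihood of a 3-state sample, with theta3 = 1 - theta1 - theta2."""
    t3 = 1.0 - t1 - t2
    return -(n1 * math.log(t1) + n2 * math.log(t2) + n3 * math.log(t3))

def mixed_partial(t1, t2, n1, n2, n3, h=1e-5):
    """Central-difference estimate of d2/(dtheta1 dtheta2) of -log LH."""
    f = lambda a, b: neg_log_lh(a, b, n1, n2, n3)
    return (f(t1 + h, t2 + h) - f(t1 + h, t2 - h)
            - f(t1 - h, t2 + h) + f(t1 - h, t2 - h)) / (4 * h * h)

n1, n2, n3 = 5, 3, 2
t1, t2 = 0.5, 0.3
print(mixed_partial(t1, t2, n1, n2, n3))   # numerical estimate
print(n3 / (1.0 - t1 - t2) ** 2)           # exact: n3 / theta3^2
```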
The expectation of n1 over the data space
is N·θ1, and similarly for n2 and n3,
so the Fisher information F(θ) is the determinant ...
F(θ) = det | N/θ1 + N/θ3    N/θ3         |
           | N/θ3           N/θ2 + N/θ3  |

     = (N^2/θ3^2) · det | (1-θ2)/θ1    1           |
                        | 1            (1-θ1)/θ2   |

     = (N^2/θ3^2) · { ((1-θ2)/θ1) · ((1-θ1)/θ2) - 1 }

     = (N^2/θ3^2) · (1-θ1-θ2)/(θ1·θ2) = N^2/(θ1·θ2·θ3)
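The last step can be sanity-checked numerically (a quick sketch; the values of θ1, θ2 and N are arbitrary):

```python
def fisher_det_3state(t1, t2, N):
    """Determinant of the 2x2 expected Fisher matrix for a 3-state
    distribution, computed directly from its entries."""
    t3 = 1.0 - t1 - t2
    a = N / t1 + N / t3   # E[d2/dtheta1^2 of -log LH]
    b = N / t3            # mixed partial
    d = N / t2 + N / t3   # E[d2/dtheta2^2 of -log LH]
    return a * d - b * b

t1, t2, N = 0.2, 0.5, 1000
t3 = 1.0 - t1 - t2
print(fisher_det_3state(t1, t2, N))   # direct determinant
print(N ** 2 / (t1 * t2 * t3))        # closed form N^2/(t1*t2*t3)
```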
M-States
It can be shown
that for M states, i.e., M-1 parameters, with probabilities
θ1, θ2, ..., θM-1, and
θM = 1-θ1-...-θM-1, θ = ⟨θ1, ..., θM-1⟩,
that
F(θ) = N^(M-1) / (θ1·θ2·...·θM).
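This general result can be verified exactly for small M, since the (M-1)×(M-1) expected Fisher matrix is a diagonal matrix plus the constant N/θM, so the matrix determinant lemma applies (a sketch, not from the source; exact arithmetic via Fraction):

```python
from fractions import Fraction

def fisher_det(thetas, N):
    """Determinant of the (M-1)x(M-1) expected Fisher matrix with entries
    N/theta_i on the diagonal plus N/theta_M everywhere, via the matrix
    determinant lemma: det(D + c*J) = det(D) * (1 + c * sum(1/d_i))."""
    tM = 1 - sum(thetas)              # thetas holds theta_1 .. theta_{M-1}
    det_D = 1
    inv_trace = Fraction(0)
    for t in thetas:
        det_D *= Fraction(N) / t
        inv_trace += t / Fraction(N)  # 1 / (N/theta_i)
    c = Fraction(N) / tM
    return det_D * (1 + c * inv_trace)

thetas = [Fraction(1, 5), Fraction(1, 2), Fraction(1, 10)]   # M = 4
N = 100
tM = 1 - sum(thetas)
closed = Fraction(N) ** 3 / (thetas[0] * thetas[1] * thetas[2] * tM)
print(fisher_det(thetas, N) == closed)
```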
Use the HTML FORM below to generate a data sample
for specified probabilities and length N.
The 'code' button calculates message lengths for various codes.
Note, the approximations used may break down for very small values of N.
Requirements:
1 ≤ M ≤ 10, |θ[1,M]| > 0 (will be normalised),
N ≥ 0, and
sample ∈ [0, M-1]^N.
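The form itself is not reproduced here, but the generation step amounts to normalising the given weights and sampling N states (a rough Python equivalent, with a hypothetical function name):

```python
import random

def generate_sample(weights, N, seed=None):
    """Draw N observations from states 0..M-1 with the given
    (unnormalised) weights; weights are normalised to probabilities."""
    total = sum(weights)
    probs = [w / total for w in weights]
    rng = random.Random(seed)
    return rng.choices(range(len(weights)), weights=probs, k=N)

sample = generate_sample([2, 1, 1], N=20, seed=0)
print(sample)  # 20 values, each in [0, 2]
```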
D. M. Boulton & C. S. Wallace.
The Information Content of a Multistate Distribution.
J. Theor. Biol., 23, pp.269-278, 1969.
When these papers were written there were
different notions of the information content of a sequence
in use in the literature. W&B showed that if the calculations were
done correctly and if all information was truly taken into account
then these notions gave essentially the same answer.
And a delightful piece of trivia about dice
via Dean McKenzie [7/1999]:
'Several decades ago, the Harvard statistician Frederick Mosteller had
an opportunity to test the [dice-tossing] model against the behavior of
real dice tossed by a real person. A man named Willard H. Longcor, who
had an obsession with throwing dice, came to him with an amazing offer
to record the results of millions of tosses. Mosteller accepted, and
some time later he received a large crate of big manilla envelopes, each
of which contained the results of twenty thousand tosses with a single
die and a written summary showing how many runs of different kinds had
occurred. "The only way to check the work was by checking the runs and
then comparing the results with theory," Mosteller recalls.
"It turned out [Longcor] was very accurate." Indeed, the results
even highlighted some errors in the then-standard theory of
the distribution of runs'. -
Peterson, I. (1998) The Jungles of Randomness. Penguin, London. pp7-8.
(originally published 1998 by Wiley, New York).