Ordered Discrete Data 

If the frequency counts for the categories are quite large, the data can be modelled successfully by a multistate distribution. If the frequency counts are small, i.e. the amount of data is small relative to the number of categories, the cost of stating all the parameters of an unconstrained multistate distribution may be prohibitive. This can cause a problem, e.g. in decision trees where the amount of data arriving at a particular leaf can be small. It is best to keep, and make use of, the fact that the data are ordered (class Ord in Haskell) and discrete; the ordering can be used to advantage, e.g. in [partitioning] such dataspaces.

Priors

The key question is what prior should be placed on distributions for ordered discrete data, i.e. what kind of distribution should be favoured?
It is clear that, given sufficiently convincing data, either

Unimodal

Given M categories, suppose the largest probability is T_{i}, 1<=i<=M. Then T_{1}, ..., T_{i-1} must be nondecreasing, and T_{i+1}, ..., T_{M} must be nonincreasing.
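
The unimodal constraint can be checked mechanically. The following is a minimal sketch, not taken from the original code; the function name `is_unimodal` is hypothetical, and ties between adjacent categories are assumed to be allowed, matching "nondecreasing" and "nonincreasing" above.

```python
def is_unimodal(probs):
    """Return True if probs rises (weakly) to a peak and then falls (weakly)."""
    n = len(probs)
    i = 0
    # Climb while the sequence is nondecreasing.
    while i + 1 < n and probs[i] <= probs[i + 1]:
        i += 1
    # Descend while the sequence is nonincreasing.
    while i + 1 < n and probs[i] >= probs[i + 1]:
        i += 1
    # Unimodal exactly when the two passes together reach the last category.
    return i >= n - 1


if __name__ == "__main__":
    print(is_unimodal([0.1, 0.2, 0.4, 0.2, 0.1]))
    print(is_unimodal([0.3, 0.1, 0.2, 0.4]))
```

A monotone distribution counts as unimodal here, with its mode at the first or last category.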

Message length

A reasonable approach is to code the parameters of a unimodal distribution by the method used for the (unconstrained) multistate distribution. There is some slight inefficiency if the probabilities of two adjacent categories are close in value with respect to their uncertainties.

Estimator (search)

There is unlikely to be a closed form for the MML estimator for a unimodal distribution. However, a constrained search of the unimodal region should not be too difficult. A "smoothed" distribution derived from the obvious unconstrained M-state estimate may provide a good starting point.

LA, CSW, 26/3/'02
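
For reference, the unconstrained multistate coding that the message length discussion builds on can be sketched as follows. This is an illustration under stated assumptions, not the page's own code: it assumes a uniform prior over the M probabilities, the usual MML87 approximation with the lattice constant taken as 1/12, at least one datum, and lengths measured in nits (natural logarithms). The function name `multistate_mml` is hypothetical.

```python
import math


def multistate_mml(counts):
    """Return (estimates, approximate message length in nits) for M-state data.

    Assumes a uniform prior on the probabilities and sum(counts) >= 1.
    """
    m = len(counts)
    n = sum(counts)
    # Unconstrained MML estimate: each count is increased by one half.
    probs = [(c + 0.5) / (n + m / 2.0) for c in counts]
    msg_len = (
        0.5 * (m - 1) * (math.log(n / 12.0) + 1.0)  # parameter precision term
        - math.lgamma(m)                            # log (M-1)!, uniform prior
        - sum((c + 0.5) * math.log(p) for c, p in zip(counts, probs))
    )
    return probs, msg_len


if __name__ == "__main__":
    estimates, length = multistate_mml([3, 1])
    print(estimates, length)
```

The approximation is rough when the counts are very small, which is exactly the situation that motivates the constrained, unimodal alternative.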
The `uni1' code is based on counting
frequencies of letters, from the left,
while forcing the frequency counts to remain unimodal at all times.
The `uni2' code is better, based on the unordered MML code,
smoothed to make it unimodal if necessary.
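
One way to smooth an unconstrained estimate into a unimodal one is sketched below. It is an assumption about how such smoothing might be done, not a reconstruction of the `uni2' code: for each candidate mode it pools adjacent violators (a least-squares monotone fit) on either side, then keeps the candidate that gives the data the highest weighted log-probability. The names `pava_nondecreasing` and `smooth_unimodal` are hypothetical.

```python
import math


def pava_nondecreasing(values):
    """Pool adjacent violators: least-squares nondecreasing fit to values."""
    blocks = []  # each block is [total, count]
    for v in values:
        blocks.append([v, 1])
        # Merge while the previous block's mean exceeds the last block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


def smooth_unimodal(counts):
    """Return a unimodal smoothing of the unconstrained M-state estimate."""
    m = len(counts)
    n = sum(counts)
    est = [(c + 0.5) / (n + m / 2.0) for c in counts]
    best, best_score = None, -math.inf
    for k in range(m):  # k is the candidate position of the mode
        left = pava_nondecreasing(est[:k + 1])
        # A nonincreasing fit is a nondecreasing fit of the reversed list.
        right = pava_nondecreasing(est[k + 1:][::-1])[::-1]
        cand = left + right
        if cand[k] < max(cand):
            continue  # the peak did not land on k, so cand is not unimodal
        score = sum((c + 0.5) * math.log(q) for c, q in zip(counts, cand))
        if score > best_score:
            best, best_score = cand, score
    return best


if __name__ == "__main__":
    print(smooth_unimodal([1, 5, 2, 6, 1]))
```

Pooling replaces values by block averages, so the smoothed probabilities still sum to one, and an estimate that is already unimodal is returned unchanged. The result could also serve as a starting point for a constrained search of the unimodal region.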


© L. Allison, www.allisons.org/ll/ (or as otherwise indicated).