Integer Distributions 

This section considers positive integers, n≥1, unless otherwise stated (if you have non-negative integers, n≥0, just add one first). This is the first, and perhaps the most fundamental, infinite dataspace. A code over positive integers can be used to transmit data from any enumerable dataspace. The Geometric and Poisson distributions are examples of parameterised probability distributions; the probability distributions below have no parameters.

A parameterless probability distribution for n≥1 is

P(n) = 1/(n(n+1)) = 1/n − 1/(n+1)

which sums to one because the series telescopes. The expectation of the distribution is infinite:

sum_{n≥1} n.P(n) = sum_{n≥1} 1/(n+1), which diverges.
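As a quick numerical check (a sketch added here, not part of the original page): the partial sums of P(n) telescope to 1 − 1/(N+1), while the partial expectation grows without bound, roughly like ln(N).

```python
# Numerical check: P(n) = 1/(n*(n+1)) sums to 1, but its expectation diverges.

def partial_mass(N):
    """Sum of P(n) for n = 1..N; telescopes to 1 - 1/(N+1)."""
    return sum(1.0 / (n * (n + 1)) for n in range(1, N + 1))

def partial_expectation(N):
    """Sum of n*P(n) = 1/(n+1) for n = 1..N; grows like ln(N)."""
    return sum(n / (n * (n + 1)) for n in range(1, N + 1))

print(partial_mass(1000))          # close to 1 - 1/1001
print(partial_expectation(10**6))  # still growing, ~ ln(N)
```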
Note that P(n) ~ 1/n cannot be a proper probability distribution because the harmonic series, sum_{n≥1} 1/n, diverges, so no normalising constant can make it sum to one.
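The divergence follows from the standard grouping argument (spelled out here for completeness):

```latex
\sum_{n\ge 1} \frac{1}{n}
  = 1 + \frac12
      + \underbrace{\frac13+\frac14}_{>\,1/2}
      + \underbrace{\frac15+\cdots+\frac18}_{>\,1/2} + \cdots
  \;\ge\; 1 + \frac{k}{2} \quad\text{after } 2^k \text{ terms,}
```

so the partial sums exceed any bound.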
Log_{2}^{*} and Relatives

If you know that an integer, n, lies in the interval [1,N] (or in [0,N−1]) then it can be encoded in log_{2}(N) bits (and this is an optimal code if the probability distribution is uniform). What to do when there is no such bound N? Obviously transmit the length of the code word for n first. But how to transmit the length? Transmit its length first, of course! A sound code can in fact be based on this intuitive idea; note that log^{k}(n) decreases very rapidly as k increases. The leading bit of n is necessarily '1' so there is no need to transmit it, except that it can be used as a flag to determine whether the current value is a length or the final value of n proper; lengths are thus given a leading '0'. Such a prefix code can be used to code integers of arbitrary size. Unfortunately the length of a code word, as a function of n, is neither convex nor smooth, although it is monotonic increasing.
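To see how quickly the iterated logarithm log^{k}(n) falls, here is a small sketch (the function name is my own, not from the original):

```python
import math

def iterated_log2_terms(n):
    """Successive values log2(n), log2(log2(n)), ..., while positive."""
    terms = []
    x = float(n)
    while True:
        x = math.log2(x)
        if x <= 0:
            break
        terms.append(x)
    return terms

print(iterated_log2_terms(10**9))
# roughly [29.9, 4.9, 2.3, 1.2, 0.26] -- only a handful of terms even for large n
```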
The code above is valid but not efficient. In fact we can do better, i.e., achieve a non-redundant code, by using lengths minus one:
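A sketch of one way to realise this "lengths minus one" scheme (my own reconstruction under the conventions just described, not the page's exact code): a group with leading '1' is the final value n; a group with leading '0' is a length, decoded by restoring the '1' and adding one, which tells the decoder how many bits the next group has. The decoder starts by expecting a 1-bit group.

```python
def encode(n):
    """Prefix code for n >= 1: e.g. 1 -> '1', 2 -> '010', 4 -> '000100'."""
    assert n >= 1
    groups = [bin(n)[2:]]            # value group, leading bit '1'
    k = len(groups[0])
    while k > 1:                     # send each needed length, minus one,
        g = bin(k - 1)[2:]           # with its leading '1' flipped to '0'
        groups.insert(0, '0' + g[1:])
        k = len(g)
    return ''.join(groups)

def decode(bits):
    """Inverse of encode; reads one codeword from the front of `bits`.
    Returns (n, number of bits consumed)."""
    pos, length = 0, 1
    while True:
        g = bits[pos:pos + length]
        pos += length
        if g[0] == '1':
            return int(g, 2), pos              # value group: done
        length = int('1' + g[1:], 2) + 1       # length group: restore '1', add 1

# Round-trip check:
for n in [1, 2, 3, 4, 100, 12345]:
    w = encode(n)
    assert decode(w) == (n, len(w))
print(encode(4))  # '000100'
```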
The probability distribution implied by the code, P(n) = 2^{−|code(n)|}, has an infinite expectation: for large n the probability of n is greater than under the 1/(n(n+1)) distribution, which itself has an infinite expectation. Rissanen (1983) gives the code length r(n) = log_{2}(c₀) + log_{2}(n) + log_{2}log_{2}(n) + ..., summing only the positive terms, where c₀ ≈ 2.865.
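Rissanen's code length can be computed numerically; a sketch (the constant c₀ ≈ 2.865 is Rissanen's normaliser, and the function name is mine):

```python
import math

C0 = 2.865064  # Rissanen's normalising constant

def rissanen_length(n):
    """r(n) = log2(C0) + log2(n) + log2(log2(n)) + ... (positive terms only)."""
    total = math.log2(C0)
    x = float(n)
    while True:
        x = math.log2(x)
        if x <= 0:
            break
        total += x
    return total

for n in [1, 2, 100, 10**6]:
    print(n, round(rissanen_length(n), 2))
```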
Notes
Wallace Tree Code

The definition of (strict) binary trees is:

Tree = Leaf | Fork Tree Tree

and a tree is coded in prefix (depth-first) order: '1' for each Fork, followed by the codes of its two subtrees, and '0' for each Leaf.
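A sketch of the tree code, assuming a hypothetical representation of my own choosing (None for a Leaf, a pair for a Fork):

```python
def encode_tree(t):
    """Prefix code for strict binary trees: '1' = Fork, '0' = Leaf."""
    if t is None:
        return '0'
    left, right = t
    return '1' + encode_tree(left) + encode_tree(right)

def decode_tree(bits, pos=0):
    """Inverse of encode_tree; returns (tree, next position)."""
    if bits[pos] == '0':
        return None, pos + 1
    left, pos = decode_tree(bits, pos + 1)
    right, pos = decode_tree(bits, pos)
    return (left, right), pos

w = encode_tree(((None, None), None))     # Fork(Fork(Leaf,Leaf), Leaf)
print(w)                                  # '11000'
assert w.count('0') == w.count('1') + 1   # one more zero than ones
assert decode_tree(w)[0] == ((None, None), None)
```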
Note that a codeword always contains one more zero than it has ones. This allows the end of a code word to be detected, and also allows the word to be decoded. Note, for example, that '1' and '10' are not code words.

The code is efficient, that is to say the sum over all code words, w, of 2^{−|w|} is one. The code is equivalent to giving a tree with code w a probability of 2^{−|w|}. This would be difficult to prove combinatorially, but consider all infinitely long strings over {0,1}. We make them all equally probable in the sense that if they are truncated after k bits, then the 2^{k} truncated strings are all equally likely, i.e., of probability 1/2^{k}, for all k. The sum of the probabilities of the truncated strings of an arbitrary length k is clearly 1. Now 0 (pr=0.5) and 1 (pr=0.5) can be taken as the steps in a random walk. It is well known that a one-dimensional random walk returns to its starting point with probability 1.0. Taking the set of infinitely long strings over {0,1}, almost every one of them, i.e., all but a subset having total probability 0.0, will at some point have one more '0' than '1's. Truncate each such string at the first such point. The probability of such a truncated string, w, is 1/2^{|w|}; this also equals the sum of the probabilities of all of its infinite extensions. I.e., the sum of the probabilities of all such code words is 1.0.

This can be used as the basis of a code for positive integers: enumerate the code words in order of increasing length and, within a given length, lexicographically, say:
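The enumeration can be sketched as follows (my own code, not from the original page): a string is a code word exactly when the first point at which its '0's outnumber its '1's is its last bit; generate these in order of length, lexicographically within a length, and let the n-th one code the integer n.

```python
from itertools import count, islice, product

def is_codeword(w):
    """True iff the first point where '0's outnumber '1's is the last bit."""
    balance = 0
    for i, b in enumerate(w):
        balance += 1 if b == '1' else -1
        if balance < 0:
            return i == len(w) - 1
    return False

def codewords():
    """Code words in increasing length, lexicographically within a length."""
    for length in count(1):
        for bits in product('01', repeat=length):
            w = ''.join(bits)
            if is_codeword(w):
                yield w

# n = 1, 2, 3, ... is coded by the n-th word of the enumeration:
print(list(islice(codewords(), 6)))
# ['0', '100', '10100', '11000', '1010100', '1011000']
```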
We see that the codeword lengths increase in smaller and more regular jumps than is the case for the log^{*} code.

Notes


© L. Allison, www.allisons.org/ll/ (or as otherwise indicated). Created with "vi (Linux)", charset=iso-8859-1, fetched Wednesday, 19-Sep-2018 09:44:51 EDT.