2State and Binomial Distributions 

'A' is an event for which P(A)=p and hence P(not A)=1p. Consider 'n' independent trials (also known as Bernoulli trails). The random variable X is defined to be the number of times that A occurred. Then X has a binomial distribution with parameters n and p. The following is taken from Wallace & Boulton (1968) to follow the historical development of MML. The general MML form, explicitly based on the [Fisher] information, is given later. CodingTo transmit a sequence of 'n' cointosses containing 'h' heads, This is a multistate distribution with m=2. A uniform prior is assumed for P(H). The value of 'n' is either common knowledge or is transmitted by standard methods. The receiver does not know h or t. h = #heads, t = #tails, n = h+t r(H) = h/n = frequency of heads, r(T) = ... P(H) = probability of a head, P(T) =... P'(H) = estimate of P(H), P'(T) =... We describe three methods to transmit the sequence of
cointosses, from Wallace & Boulton (1968): Method 1
requires log(n!/(h!(nh)!)) bits. Method 2
eg. H T T H T T T H T T H count: 1 2 2 2 3 3 3 3 4 4 T count: 1 1 2 3 3 4 5 6 6 7 sum : 2 3 4 5 6 7 8 9 10 11 count 1 1 2 2 3 4 5 3 6 7 p =  =           sum 2 3 4 5 6 7 8 9 10 11 p = h!.(nh)!/(n+1)! msg len = log( (n+1)!/(h!(nh)!) ) So method 1 and method 2 require the same message length. Method 3
Total msg len = msglen(P'(H))  h.log(P'(H))  t.log(P'(T))
The values of P(H) range over G=[0,1] uniformly and a particular estimate will be used over some range of r(H). This range will depend on n and on P'(H).
Note that for a given a(H), D is large for P'(H) close to 0 or to 1, and also that D is symmetric for a(H) +ve or ve, so P'(H) should be at the centre of the range over which it is used. Average value of a(H)^{2} over the range [P'(H)x/2, P'(H)+x/2] of length x is
avgD = n . x^{2} / (24 . P'(H).(1P'(H)))The message length to state the interval [P'(H)x/2, P'(H)+x/2] and thus P'(H) is log(x)  'cos of the uniform prior in P(H)Total avg cost to transmit interval and also D due to mismatch of P'(H) to r(H) is log(x) + n . x^{2} /(24 . P'(H).(1P'(H)))to minimise avg increase, differentiate w.r.t. x and find zero 1/x + n . x / (12 . P'(H).(1P'(H))) = 0 x^{2} = 12 . P'(H) . (1P'(H)) / nSo min' increase is
To compare Method 3 with methods 1 and 2: Stirling's approx: log(x!) = x.log(x)  x + 1/2 log(x) + 1/2log(2 pi) + ... method 1 or 2: log( (n+1)!/(h! t!) ) = log(n+1) + log(n!/(h! t!)) = log(n+1) + n log n  n + 1/2 log n + 1/2 log(2 pi)  h log h + h  1/2 log h  1/2 log(2 pi)  t log t + t  1/2 log t  1/2 log(2 pi) = log(n+1)  h log(h/n)  t log(t/n) + 1/2(log n log h log t log(2 pi)) = 1/2(log((n+1)^{2} . n / (h t))  log(2 pi) )  h.log(h/n)  t.log(t/n) = 1/2{log((n+1)^{2}/n)  log(h/n)  log(t/n) log(2 pi)}  h.log ...... = 1/2{log((n+1)^{2}/n)log(2 pi)log(r(H))log(r(T))} h.log(h/n)t.log(t/n) method 3  method 1 = 1/2{log(n/12) + 1  Σ_{[m:HT]}log(r(m))} h.log(h/n)t.log(t/n) {1/2{log((n+1)^{2}/n)log(2 pi)log(r(H))log(r(T))} h.log(h/n)t.log(t/n)} = 1/2{2log(1+1/n) log(12)+1+log(2 pi)} ~ 1/2 { log(pi/6) + 1 }  for large n = 1/2 log(pi.e/6) = 0.176 nits So method 3 is just marginally(!) worse than
DemonstrationUse the HTML FORM below to generate a sequence of coin tosses for a specified pr(H) and length N. The 'code' button calculates code length for a fixed code (nominate an estimate p' for P(H)), combinatorial code, adaptive code, and a code which states pr(H) to an optimal level of accuracy. NB. the appoximations used may break down for very small values of N.
Don't worry if JavaScript gives a ridiculous number of decimal places, it's just its way. Method 4, sparse list
This result is relevant when considering the adjacencymatrix v. adjacencylists representations of graphs (networks). NotesThis page is directly based on the appendix in the paper Wallace & Boulton (1968); any errors by L.A..


↑ © L. Allison, www.allisons.org/ll/ (or as otherwise indicated). Created with "vi (Linux)", charset=iso88591, fetched Saturday, 17Aug2019 21:51:01 EDT. Free: Linux, Ubuntu operatingsys, OpenOffice officesuite, The GIMP ~photoshop, Firefox webbrowser, FlashBlock flash on/off. 