- Often, $P(x|p) = (1-p)^{x-1} p$, integer $x \ge 1$, $\mu = 1/p$, $\mu \ge 1$, but
- here, $P(x|p) = (1-p)^{x} p$, integer $x \ge 0$, $\mu = (1/p) - 1$, $\mu \ge 0$, $p = 1/(\mu+1)$, $1-p = \mu/(\mu+1)$.
-
- In $\mu$-space:
- $p = 1/(\mu+1)$, so
- $P(x|\mu) = (1 - 1/(\mu+1))^{x} / (\mu+1)$
- $= (\mu/(\mu+1))^{x} / (\mu+1)$
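
As a quick sanity check on the $\mu$-space form, the following sketch evaluates $P(x|\mu)$ and confirms numerically that it sums to 1 and has mean $\mu$. The function name `geometric_pmf`, the value of `mu`, and the truncation point of the sums are illustrative assumptions, not part of the derivation.

```python
def geometric_pmf(x: int, mu: float) -> float:
    """P(x | mu) = (mu / (mu + 1))**x / (mu + 1), for integer x >= 0 and mu > 0."""
    if x < 0:
        return 0.0
    return (mu / (mu + 1.0)) ** x / (mu + 1.0)

mu = 2.5  # illustrative value only
# The sums are infinite in principle; the tail beyond x = 500 is negligible here.
total = sum(geometric_pmf(x, mu) for x in range(500))
mean = sum(x * geometric_pmf(x, mu) for x in range(500))
print(total, mean)
```

Both printed values should agree with 1 and `mu` respectively, up to floating-point rounding.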
-
- Given $n$ data, $x_1, \ldots, x_n$, the likelihood
- $= P(x_1, \ldots, x_n | \mu) = (\mu/(\mu+1))^{\sum x_i} / (\mu+1)^{n}$
-
- neg log likelihood
- $L = (\sum x_i)(\log(\mu+1) - \log\mu) + n\log(\mu+1)$
- 1st derivative
- $dL/d\mu = (\sum x_i)(1/(\mu+1) - 1/\mu) + n/(\mu+1)$
-
If we equate this to zero and multiply through by $\mu(\mu+1)$,
$(\sum x_i)\mu - (\sum x_i)(\mu+1) + n\mu = 0$,
$\mu_{maxLH} = (\sum x_i)/n$.
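
The maximum-likelihood result can be checked with a small sketch: it computes the negative log likelihood $L$ and the estimate $(\sum x_i)/n$, then confirms that $L$ is larger on either side of the estimate. The function names and the data list `xs` are made up for illustration.

```python
import math

def neg_log_likelihood(xs, mu):
    """L = (sum x_i)(log(mu+1) - log mu) + n log(mu+1); requires mu > 0."""
    s, n = sum(xs), len(xs)
    return s * (math.log(mu + 1.0) - math.log(mu)) + n * math.log(mu + 1.0)

def mu_max_lh(xs):
    """Maximum-likelihood estimate of the mean: the sample mean."""
    return sum(xs) / len(xs)

xs = [0, 3, 1, 4, 2, 0, 5]  # illustrative data, not from the text
mu_hat = mu_max_lh(xs)
# L should be no smaller at nearby values of mu than at mu_hat.
print(mu_hat,
      neg_log_likelihood(xs, 0.9 * mu_hat),
      neg_log_likelihood(xs, mu_hat),
      neg_log_likelihood(xs, 1.1 * mu_hat))
```

Note that if every $x_i$ is zero the estimate is $\mu = 0$, where $\log\mu$ is undefined, so `neg_log_likelihood` cannot be evaluated at that point.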
- 2nd derivative
- $d^2L/d\mu^2 = (\sum x_i)(1/\mu^2 - 1/(\mu+1)^2) - n/(\mu+1)^2$
- which has expectation (using $E[\sum x_i] = n\mu$), i.e., Fisher information, $F_\mu$
- $= n\mu(1/\mu^2 - 1/(\mu+1)^2) - n/(\mu+1)^2$
- $= n/\mu - n\mu/(\mu+1)^2 - n/(\mu+1)^2$
- $= n(1/\mu - 1/(\mu+1))$
- $= n/(\mu(\mu+1))$
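
The closed form for $F_\mu$ can be cross-checked numerically. The sketch below takes the expected negative log likelihood (with $\sum x_i$ replaced by its expectation $n\mu$) and estimates its second derivative by a central difference. The function names, the step size `h`, and the values of `n` and `mu` are assumptions chosen for illustration.

```python
import math

def fisher_information(n: int, mu: float) -> float:
    """Closed form F_mu = n / (mu (mu + 1))."""
    return n / (mu * (mu + 1.0))

def expected_nll(n: int, mu: float, mu_true: float) -> float:
    """E[L] evaluated at mu when the data have mean mu_true, so E[sum x_i] = n * mu_true."""
    return (n * mu_true * (math.log(mu + 1.0) - math.log(mu))
            + n * math.log(mu + 1.0))

def numeric_fisher(n: int, mu: float, h: float = 1e-3) -> float:
    """Central-difference second derivative of E[L] with respect to mu, at mu = mu_true."""
    f = lambda m: expected_nll(n, m, mu)
    return (f(mu + h) - 2.0 * f(mu) + f(mu - h)) / (h * h)

n, mu = 10, 2.0  # illustrative values only
closed = fisher_information(n, mu)
approx = numeric_fisher(n, mu)
print(closed, approx)
```

The two values should agree to several decimal places; the remaining difference comes from the finite-difference approximation.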
-
- Assume an exponential prior, $h(\mu) = (1/A) e^{-\mu/A}$, which has mean $A$.
-
- The two-part message length, $m$
- $= -\log h(\mu) + L + (1/2)\log F_\mu + (-\log 12 + 1)/2$
- $= \log A + \mu/A$
  $- (\sum x_i)\log(\mu/(\mu+1)) + n\log(\mu+1)$
  $+ (1/2)\log n - (1/2)\log\mu - (1/2)\log(\mu+1)$
  $+ (-\log 12 + 1)/2$
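
The message length can be sketched as a function that adds up the four parts term by term. Natural logarithms are assumed, so the result is in nits; the function name `message_length` and the example values of `xs`, `mu` and `A` are hypothetical.

```python
import math

def message_length(xs, mu, A):
    """Two-part message length m in nits, for mu > 0 and prior mean A > 0."""
    s, n = sum(xs), len(xs)
    neg_log_prior = math.log(A) + mu / A                    # -log h(mu)
    neg_log_lh = (s * (math.log(mu + 1.0) - math.log(mu))   # L
                  + n * math.log(mu + 1.0))
    half_log_fisher = 0.5 * (math.log(n) - math.log(mu)     # (1/2) log F_mu
                             - math.log(mu + 1.0))
    constant = 0.5 * (1.0 - math.log(12.0))                 # (-log 12 + 1)/2
    return neg_log_prior + neg_log_lh + half_log_fisher + constant

xs = [0, 3, 1, 4, 2, 0, 5]  # illustrative data, not from the text
print(message_length(xs, mu=2.0, A=10.0))
```

Keeping the four parts as separate variables makes it easy to compare each one against the corresponding term in the expansion of $m$.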
-
- To estimate $\mu$, differentiate $m$ with respect to $\mu$
- $dm/d\mu$
- $= 1/A + (\sum x_i)\{1/(\mu+1) - 1/\mu\} + n/(\mu+1) - 1/(2\mu) - 1/(2(\mu+1))$
- $= 1/A + (1/(\mu+1))\{\sum x_i + n - 1/2\} - (1/\mu)\{\sum x_i + 1/2\}$
- equate to zero, multiply by $\mu(\mu+1)$
- $0 = \mu(\mu+1)/A + \mu\{\sum x_i + n - 1/2\} - (\mu+1)\{\sum x_i + 1/2\}$
- $= \mu^2/A + \mu\{1/A + n - 1\} - 1/2 - \sum x_i$
- (Note that if $A$ is "very large", the $1/A$ terms vanish and $\mu_{MML} \approx (\sum x_i + 1/2)/(n - 1)$.)
- The quadratic has solutions
- $\mu_{MML} = (1 - n - 1/A \pm \sqrt{n^2 + 1/A^2 + 1 + 2n/A - 2/A - 2n + 2/A + 4(\sum x_i)/A}) / (2/A)$
- $= (1 - n - 1/A \pm \sqrt{n^2 + 1/A^2 + 1 + 2n/A - 2n + 4(\sum x_i)/A}) / (2/A)$
- only the "+" solution is admissible.
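
A minimal sketch of the estimator follows, assuming the quadratic coefficients $a = 1/A$, $b = 1/A + n - 1$, $c = -(\sum x_i + 1/2)$ read off from the expression above. The "+" root is computed in the algebraically equivalent form $-2c/(b + \sqrt{b^2 - 4ac})$, which avoids cancellation when $A$ is large; that rearrangement is a choice made for this sketch and does not come from the derivation. The function name `mu_mml` and the data are hypothetical.

```python
import math

def mu_mml(xs, A):
    """The '+' root of mu^2/A + mu(1/A + n - 1) - 1/2 - sum(x_i) = 0."""
    s, n = sum(xs), len(xs)
    a = 1.0 / A
    b = 1.0 / A + n - 1.0
    c = -(s + 0.5)
    disc = b * b - 4.0 * a * c  # positive, because a > 0 and c < 0
    # Same value as (-b + sqrt(disc)) / (2a), rearranged for numerical stability.
    return -2.0 * c / (b + math.sqrt(disc))

xs = [0, 3, 1, 4, 2, 0, 5]  # illustrative data, not from the text
for A in (1.0, 10.0, 1e9):
    print(A, mu_mml(xs, A))
```

For very large `A` the result should approach $(\sum x_i + 1/2)/(n - 1)$, and substituting the returned value into $dm/d\mu$ should give a value close to zero.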
-- L.A., July 2007.
Thanks to Daniel Schmidt and Enes Makalic.
See [IP 1.2] for an implementation.