## von Mises - Fisher (vMF)

The von Mises-Fisher (vMF) distribution is a probability distribution on directions in R^D. It is natural to think of it as a distribution on the (D-1)-sphere of unit radius, that is, on the surface of the D-ball of unit radius.

The von Mises-Fisher probability density function is
pdf(v | μ, κ) = C_D(κ) e^{κ μ·v},
where datum v is a normalised D-vector, equivalently a point on the (D-1)-sphere,
μ (mu) is the mean direction (a normalised D-vector), and
κ (kappa) ≥ 0 is the concentration parameter (a scalar).
The distribution's normalising constant is
C_D(κ) = κ^{D/2-1} / {(2π)^{D/2} I_{D/2-1}(κ)},
where I_ν(·) is the modified Bessel function of the first kind of order ν.
In the special case that D = 3,
C_3(κ) = κ / {2π (e^κ - e^{-κ})}.
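As a quick numerical check, the normalising constant can be evaluated with SciPy's modified Bessel function `iv`; a minimal sketch (the helper name `log_C` is illustrative only):

```python
# Sketch: evaluate log C_D(kappa) and check the D = 3 closed form.
import numpy as np
from scipy.special import iv  # modified Bessel function of the first kind

def log_C(D, kappa):
    """log C_D(kappa) = (D/2-1) log kappa - (D/2) log 2pi - log I_{D/2-1}(kappa)."""
    nu = D / 2 - 1
    return nu * np.log(kappa) - (D / 2) * np.log(2 * np.pi) - np.log(iv(nu, kappa))

# For D = 3 the closed form is C_3(kappa) = kappa / {2 pi (e^kappa - e^-kappa)}.
kappa = 1.7
closed_form = kappa / (2 * np.pi * (np.exp(kappa) - np.exp(-kappa)))
assert np.isclose(np.exp(log_C(3, kappa)), closed_form)
```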

The negative log pdf is
- log pdf(v | μ, κ) = - log C_D(κ) - κ μ·v,
and
log C_D(κ) = (D/2 - 1) log κ - (D/2) log 2π - log I_{D/2-1}(κ).

Given data
{v_0, ..., v_{N-1}}, define their sum (a D-vector),
R = Σ_{i=0..N-1} v_i,
and the mean resultant length
Rbar = ||R|| / N.

The negative log likelihood is
- logLH = - N log C_D(κ) - κ μ·R.
It is obvious that the maximum likelihood estimate of μ is R normalised,
μ_ML = R / ||R||,
and that the MML estimate is the same,
μ_MML = μ_ML = R / ||R||,
the natural prior for μ being the uniform distribution over the (D-1)-sphere.
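A small sketch of the sufficient statistics and the mean-direction estimate, on made-up data (the array `vs` and all names here are illustrative only):

```python
# Sketch: sufficient statistics R, Rbar and the estimate mu_ML = R / ||R||.
import numpy as np

rng = np.random.default_rng(0)
# Illustrative data: unit vectors clustered loosely around (1, 0, 0).
vs = rng.normal(size=(100, 3)) * 0.4 + np.array([1.0, 0.0, 0.0])
vs /= np.linalg.norm(vs, axis=1, keepdims=True)  # project onto the 2-sphere

R = vs.sum(axis=0)                  # resultant vector, a D-vector
Rbar = np.linalg.norm(R) / len(vs)  # mean resultant length, in [0, 1]
mu_ML = R / np.linalg.norm(R)       # maximum likelihood mean direction

assert 0.0 <= Rbar <= 1.0
assert np.isclose(np.linalg.norm(mu_ML), 1.0)  # mu_ML is a unit vector
```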

For given μ and κ, the expected value of Rbar equals A_D(κ), where
A_D(κ) = I_{D/2}(κ) / I_{D/2-1}(κ),
and the (less obvious) maximum likelihood estimate of κ is
κ_ML = A_D^{-1}(Rbar).
This is because
∂/∂κ (- logLH) = - N {∂/∂κ log C_D(κ)} - μ·R,
which is zero if
- ∂/∂κ log C_D(κ) = μ·R / N,
where
∂/∂κ log C_D(κ)
= ω/κ - I'_ω(κ) / I_ω(κ),     where ω = D/2 - 1,
= {ω I_ω(κ) - κ I'_ω(κ)} / (κ I_ω(κ))
= {κ/2 (I_{ω-1}(κ) - I_{ω+1}(κ)) - κ/2 (I_{ω-1}(κ) + I_{ω+1}(κ))} / (κ I_ω(κ))
= - I_{D/2}(κ) / I_{D/2-1}(κ),
using the well-known recurrence relations
I_ν(z) = z/(2ν) {I_{ν-1}(z) - I_{ν+1}(z)},
and
I'_ν(z) = 1/2 {I_{ν-1}(z) + I_{ν+1}(z)},     (I'_0(z) = I_1(z)).
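Since A_D(κ) = I_{D/2}(κ) / I_{D/2-1}(κ) is increasing in κ, κ_ML = A_D^{-1}(Rbar) can be found by simple root-finding. A sketch using SciPy (`A` and `kappa_ML` are illustrative names; `ive` is the exponentially scaled Bessel function, whose scaling cancels in the ratio and avoids overflow for large κ):

```python
# Sketch: A_D(kappa) = I_{D/2}(kappa) / I_{D/2-1}(kappa), inverted numerically.
import numpy as np
from scipy.special import iv, ive
from scipy.optimize import brentq

def A(D, kappa):
    # ive(nu, z) = iv(nu, z) * exp(-z); the scaling cancels in the ratio.
    return ive(D / 2, kappa) / ive(D / 2 - 1, kappa)

def kappa_ML(D, Rbar, hi=1e4):
    """Solve A_D(kappa) = Rbar by bracketing root-finding (no closed form)."""
    return brentq(lambda k: A(D, k) - Rbar, 1e-8, hi)

# Check d/dkappa log C_D(kappa) = -A_D(kappa) against a finite difference.
D, k, h = 3, 2.0, 1e-6
logC = lambda kk: (D/2 - 1) * np.log(kk) - (D/2) * np.log(2 * np.pi) \
                  - np.log(iv(D/2 - 1, kk))
assert np.isclose(-(logC(k + h) - logC(k - h)) / (2 * h), A(D, k), atol=1e-5)

# Round trip: kappa -> Rbar = A_D(kappa) -> kappa_ML recovers kappa.
assert np.isclose(kappa_ML(D, A(D, 2.0)), 2.0, atol=1e-6)
```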

The MML estimate, κ_MML, is the value that minimises the two-part message length; no closed form is known for κ_MML. The message length calculations also require a choice of prior for κ, and the vMF's Fisher information, F.

### The Fisher information of the vMF distribution
The expected second derivative of - logLH w.r.t. κ is
∂^2/∂κ^2 (- logLH) = N A'_D(κ).
The vMF distribution is symmetric about μ on the (D-1)-sphere; there is no preferred orientation around μ. A direction, such as μ, has D - 1 degrees of freedom. The expected second derivative of - logLH w.r.t. any one of μ's degrees of freedom is N κ A_D(κ), for the following reason.
Without loss of generality, let μ = (1, 0, ...), and perturb it to μ → (cos δ, sin δ, 0, ...), where δ is small. Then
∂/∂δ (- logLH) = κ ||R|| sin δ,
∂^2/∂δ^2 (- logLH) = κ ||R|| cos δ   ≈ κ ||R||, as δ is small,
and the expected value of ||R|| is N A_D(κ), giving N κ A_D(κ).
Symmetry implies that the off-diagonal elements of F for μ are zero; and since μ is a position parameter and κ a scale parameter, the off-diagonal elements between μ and κ are also zero.
F, the Fisher information of the vMF distribution, is therefore
F = N^D (κ A_D(κ))^{D-1} A'_D(κ).
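A'_D(κ) can be evaluated via the standard Bessel-ratio identity A'_D(κ) = 1 - A_D(κ)^2 - (D-1) A_D(κ)/κ (a well-known vMF result, not derived above). A sketch, with the identity sanity-checked against a finite difference; all function names are illustrative:

```python
# Sketch: F = N^D (kappa A_D(kappa))^{D-1} A'_D(kappa).
import numpy as np
from scipy.special import ive  # exponentially scaled I_nu; scaling cancels below

def A(D, kappa):
    return ive(D / 2, kappa) / ive(D / 2 - 1, kappa)

def A_prime(D, kappa):
    # Standard identity: A'_D = 1 - A_D^2 - (D-1) A_D / kappa.
    a = A(D, kappa)
    return 1.0 - a * a - (D - 1) * a / kappa

def fisher_info(N, D, kappa):
    return N**D * (kappa * A(D, kappa))**(D - 1) * A_prime(D, kappa)

# Sanity check A'_D against a central finite difference of A_D.
D, k, h = 3, 1.5, 1e-6
assert np.isclose((A(D, k + h) - A(D, k - h)) / (2 * h), A_prime(D, k), atol=1e-6)
assert fisher_info(10, D, k) > 0  # F is positive for kappa > 0
```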

### Sources
Search for [vonMises direction] in the [Bib], and see section 6.5, p.266, of Wallace's book (2005).

P. Kasarapu & L. Allison, Minimum message length estimation of mixtures of multivariate Gaussian and von Mises-Fisher distributions, Machine Learning (Springer), March 2015.

The special case of the probability distribution where D = 2 is known as the von Mises distribution, for directions in R^2, that is, for angles and periodic quantities such as annual events.
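A quick check of this correspondence: at D = 2 the vMF density over the angle θ reduces to e^{κ cos θ} / (2π I_0(κ)), which matches the von Mises density as implemented by `scipy.stats.vonmises` (the values of κ and θ here are illustrative):

```python
# Sketch: the D = 2 vMF density equals the von Mises density on angles.
import numpy as np
from scipy.special import iv
from scipy.stats import vonmises

kappa, theta = 2.0, 0.8  # mean direction mu = 0
vmf_D2 = np.exp(kappa * np.cos(theta)) / (2 * np.pi * iv(0, kappa))
assert np.isclose(vmf_D2, vonmises.pdf(theta, kappa))
```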