KL-distance from N_{μ1,σ1} to N_{μ2,σ2}
(Also known as KL-divergence.)
- The general form is
-
- ∫x { pdf1(x).{ log(pdf1(x)) - log(pdf2(x)) } }
-
- We have two normals, so pdf1(x) is N_{μ1,σ1}(x), etc.
-
- = ∫x N_{μ1,σ1}(x).{ log(N_{μ1,σ1}(x)) - log(N_{μ2,σ2}(x)) }
-
- = ∫x N_{μ1,σ1}(x).{ (1/2)( - ((x-μ1)/σ1)^2 + ((x-μ2)/σ2)^2 ) + ln(σ2/σ1) }
-
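- We can replace x with x+μ1, so that x has mean zero under pdf1.
The expected value of x^2 is then σ1^2.
Terms that are odd in x, and otherwise
symmetric about zero, cancel out over [-∞,∞],
leaving the ...x^2 and ...constant terms;
in particular the expected value of (x+μ1-μ2)^2 is σ1^2 + (μ1-μ2)^2.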
-
- = (1/2){ - (σ1/σ1)^2 + (σ1/σ2)^2 + ((μ1-μ2)/σ2)^2 } + ln(σ2/σ1)
-
- = { (μ1 - μ2)^2 + σ1^2 - σ2^2 } / (2.σ2^2) + ln(σ2/σ1)
-
- This is zero if μ1=μ2 and σ1=σ2.
For fixed σ1 and σ2 it increases with |μ1-μ2|, and it
has rather complex behaviour with σ1 and σ2
(and is consistent with P&R, with KL2 in S,J,R&S, and
with J&S where σ1=σ2).
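The closed form above can be checked against the defining integral. The following is a minimal sketch, not part of the original page: the function names (`kl_normal`, `log_normal_pdf`, `kl_numeric`), the midpoint rule, and the integration window of 12 standard deviations either side of μ1 are all assumptions chosen for illustration, and natural logarithms are used so the result is in nats.

```python
import math

def kl_normal(mu1, sigma1, mu2, sigma2):
    """Closed-form KL distance from N(mu1, sigma1) to N(mu2, sigma2), in nats."""
    return (((mu1 - mu2) ** 2 + sigma1 ** 2 - sigma2 ** 2) / (2.0 * sigma2 ** 2)
            + math.log(sigma2 / sigma1))

def log_normal_pdf(x, mu, sigma):
    """Natural log of the normal density, computed directly to avoid log(0)."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))

def kl_numeric(mu1, sigma1, mu2, sigma2, steps=20000, width=12.0):
    """Midpoint-rule estimate of the defining integral on mu1 +/- width*sigma1."""
    lo = mu1 - width * sigma1
    h = 2.0 * width * sigma1 / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        lp1 = log_normal_pdf(x, mu1, sigma1)
        lp2 = log_normal_pdf(x, mu2, sigma2)
        # pdf1(x) . { log(pdf1(x)) - log(pdf2(x)) }, weighted by the step size
        total += math.exp(lp1) * (lp1 - lp2) * h
    return total

if __name__ == "__main__":
    print(kl_normal(0.0, 1.0, 0.0, 1.0))   # identical normals
    print(kl_normal(0.0, 1.0, 1.0, 2.0), kl_numeric(0.0, 1.0, 1.0, 2.0))
    print(kl_normal(1.0, 2.0, 0.0, 1.0))   # arguments swapped: a different value
```

Swapping the two normals generally changes the value, which is why the direction "from ... to ..." matters in the title.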
- KL(N(μq,σq) ||
N(μp,σp)), p.18 of
Penny & Roberts, PARG-00-12, 2000.
- Symmetric KL2:
KL2(N(μ1,σ1), N(μ2,σ2))
= (μ1-μ2)^2.(1/σ1^2 + 1/σ2^2) + σ1^2/σ2^2 + σ2^2/σ1^2,
e.g. Siegler, Jain, Raj, Stern.
- KL(N(μ1,σ), N(μ2,σ)) = (μ1-μ2)^2/(2.σ^2),
Johnson & Sinanovic, NB. a common σ.
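How the closed form relates to the two special cases quoted above can be seen numerically. This is a hedged sketch: `kl_normal` and `kl2` are illustrative names, and the relation tested, that KL2 as quoted equals twice the sum of the two directed distances plus 2, comes from expanding the closed form in both directions rather than from the cited papers.

```python
import math

def kl_normal(mu1, sigma1, mu2, sigma2):
    """Closed-form KL distance from N(mu1, sigma1) to N(mu2, sigma2), in nats."""
    return (((mu1 - mu2) ** 2 + sigma1 ** 2 - sigma2 ** 2) / (2.0 * sigma2 ** 2)
            + math.log(sigma2 / sigma1))

def kl2(mu1, sigma1, mu2, sigma2):
    """Symmetric KL2 exactly as quoted in the list above."""
    return ((mu1 - mu2) ** 2 * (1.0 / sigma1 ** 2 + 1.0 / sigma2 ** 2)
            + sigma1 ** 2 / sigma2 ** 2
            + sigma2 ** 2 / sigma1 ** 2)

if __name__ == "__main__":
    a = (0.0, 1.0)   # (mu1, sigma1)
    b = (1.0, 2.0)   # (mu2, sigma2)
    both_ways = kl_normal(*a, *b) + kl_normal(*b, *a)
    # The ln terms cancel in the sum, leaving KL2 = 2.(sum) + 2.
    print(kl2(*a, *b), 2.0 * both_ways + 2.0)
    # Common sigma: the closed form reduces to (mu1-mu2)^2 / (2.sigma^2).
    print(kl_normal(0.0, 3.0, 2.0, 3.0), (0.0 - 2.0) ** 2 / (2.0 * 3.0 ** 2))
```

KL2 is unchanged when the two normals are swapped, unlike the directed distance.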
- Note that the distance is convenient to integrate over, say, a range
of μ1 & σ1, because it splits into a part that depends only on μ1 and a part that depends only on σ1:
-
- ∫[μ1 = μ1min..μ1max] ∫[σ1 = σ1min..σ1max] {
(μ1-μ2)^2/(2.σ2^2) + ln σ2 - 1/2   (NB no σ1 here ...)
+ σ1^2/(2.σ2^2) - ln σ1   (... & no μ1)
}
-
- let f(μ1) = (μ1-μ2)^3/(6.σ2^2) + μ1 . (ln σ2 - 1/2)
and g(σ1) = σ1^3/(6.σ2^2) - σ1 . (ln σ1 - 1)
-
- =
(f(μ1max) - f(μ1min))
. (σ1max - σ1min)
+ (μ1max - μ1min)
. (g(σ1max) - g(σ1min))
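The integrated form can be compared with a brute-force double sum. This sketch rests on an assumption: the cubic terms in `f` and `g` are taken to be the antiderivatives of the (μ1-μ2)^2/(2.σ2^2) and σ1^2/(2.σ2^2) terms of the closed form, since those fractions did not survive in the text above. The function names and the grid size are illustrative only.

```python
import math

def kl_normal(mu1, sigma1, mu2, sigma2):
    """Closed-form KL distance from N(mu1, sigma1) to N(mu2, sigma2), in nats."""
    return (((mu1 - mu2) ** 2 + sigma1 ** 2 - sigma2 ** 2) / (2.0 * sigma2 ** 2)
            + math.log(sigma2 / sigma1))

def f(mu1, mu2, sigma2):
    """Antiderivative in mu1 of the part of the distance with no sigma1 in it."""
    return (mu1 - mu2) ** 3 / (6.0 * sigma2 ** 2) + mu1 * (math.log(sigma2) - 0.5)

def g(sigma1, sigma2):
    """Antiderivative in sigma1 of the part of the distance with no mu1 in it."""
    return sigma1 ** 3 / (6.0 * sigma2 ** 2) - sigma1 * (math.log(sigma1) - 1.0)

def kl_box_integral(mu_min, mu_max, s_min, s_max, mu2, sigma2):
    """Integral of the distance over mu1 in [mu_min, mu_max], sigma1 in [s_min, s_max]."""
    return ((f(mu_max, mu2, sigma2) - f(mu_min, mu2, sigma2)) * (s_max - s_min)
            + (mu_max - mu_min) * (g(s_max, sigma2) - g(s_min, sigma2)))

def kl_box_numeric(mu_min, mu_max, s_min, s_max, mu2, sigma2, n=200):
    """Two-dimensional midpoint-rule estimate of the same integral, as a check."""
    h_mu = (mu_max - mu_min) / n
    h_s = (s_max - s_min) / n
    total = 0.0
    for i in range(n):
        mu1 = mu_min + (i + 0.5) * h_mu
        for j in range(n):
            sigma1 = s_min + (j + 0.5) * h_s
            total += kl_normal(mu1, sigma1, mu2, sigma2)
    return total * h_mu * h_s

if __name__ == "__main__":
    box = (-1.0, 1.0, 0.5, 2.0)   # mu1min, mu1max, sigma1min, sigma1max
    print(kl_box_integral(*box, 0.0, 2.0), kl_box_numeric(*box, 0.0, 2.0))
```

Note that σ1min must be positive, since ln σ1 appears in the integrand.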