
 [University Rankings]



1932 'Brave New World',
1949 'Nineteen Eighty Four',
1990s Great Firewall of China,
2010+ Wikileaks,
2013 Prism etc.
(NSA, awareness of).

 A case of 2 × 2 ≠ 2 ⊕ 2 :
A dataset, D, consists of N pairs.
A pair contains two binary (boolean) variables, (X, Y).
D's cases          Y
                 T           F
X   T       p=#(T,T)    q=#(T,F)    v=p+q
    F       r=#(F,T)    s=#(F,F)    w=r+s
                                    N=v+w
 Encode the dataset using two different methods. Note that
each method can take advantage of positive or negative correlation
between X and Y (dare one say of causality, X→Y?).
 (i) Encode the data under a 4-state distribution^{†},
{1:(T,T), 2:(T,F), 3:(F,T), 4:(F,F)}:
 pr_{1}(D) = p! q! r! s! 3! / (N+3)!
(code_length_{1}
= −log_{2} pr_{1} bits.)
 (ii) Encode X as a 2-state distribution, and
Y as one of two 2-states,
one for each case of X:
 pr_{2}(D)
= {v! w! / (N+1)!}
{p! q! / (v+1)!} {r! s! / (w+1)!}
= p! q! r! s! / {(N+1)! (v+1) (w+1)}
 So,
pr_{1} / pr_{2}
= 3! (v+1) (w+1) / ((N+2) (N+3))
< 6 / 4,
and pr_{2} / pr_{1} ≤ (N+2) (N+3) / (6 (N+1)), which is close to (N+3) / 6 for large N.
 If v ≈ w, method (i) has the shorter code length,
about log_{2}1.5 bits less than that of method (ii),
that is, at most a fraction of a bit less
for the entire dataset.
But if v/N→1 say, w/N→0,
then pr_{2} > pr_{1},
and method (ii)'s code length is
up to roughly log_{2}N bits shorter for the dataset
(unbounded per data set, but < log_{2}(N)/N per datum).
(Of course, similar considerations also apply
to (ii') Y; (X|Y).)
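
 A minimal sketch for checking the two code lengths numerically. The function names are illustrative, not from any library; factorials are taken through the log-gamma function so that large counts do not overflow.

```python
from math import lgamma, log, log2

def log2_factorial(n):
    """log2(n!) computed via log-gamma, since n! = gamma(n+1)."""
    return lgamma(n + 1) / log(2)

def code_length_4state(p, q, r, s):
    """Method (i): -log2 pr_1(D), one 4-state distribution."""
    n = p + q + r + s
    log2_pr = (log2_factorial(p) + log2_factorial(q)
               + log2_factorial(r) + log2_factorial(s)
               + log2_factorial(3) - log2_factorial(n + 3))
    return -log2_pr

def code_length_x_then_y(p, q, r, s):
    """Method (ii): -log2 pr_2(D), X as a 2-state, then Y given X."""
    v, w = p + q, r + s
    n = v + w
    log2_pr = (log2_factorial(v) + log2_factorial(w) - log2_factorial(n + 1)
               + log2_factorial(p) + log2_factorial(q) - log2_factorial(v + 1)
               + log2_factorial(r) + log2_factorial(s) - log2_factorial(w + 1))
    return -log2_pr

# A balanced table (v = w) and a lopsided one (v/N near 1); made-up counts.
for table in [(25, 25, 25, 25), (50, 49, 1, 0)]:
    l1 = code_length_4state(*table)
    l2 = code_length_x_then_y(*table)
    print(table, "method (i):", round(l1, 2), "method (ii):", round(l2, 2))
```

 On the balanced table method (i) should come out ahead by a little under log_{2}1.5 bits; on the lopsided one method (ii) should win by a few bits.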
 So why are the probabilities in (i) and (ii) different?
Method (i) assumes a uniform prior over the
4-state's three parameters, ⟨pr((T,T)), pr((T,F)), pr((F,T))⟩.
Method (ii) assumes a uniform prior over the
parameter, pr(X=T), of X's 2-state, and a
uniform prior on the parameter of each of Y's 2-states,
pr(Y=T|X=T) and pr(Y=T|X=F).
These are subtly different assumptions.
 ^{†}(Recall that the adaptive code
(Boulton & Wallace
[1969],
[MML])
transmits a dataset of k-state values,
[1..k]^{N}, in
−log_{2}((#1! ... #k! (k−1)!) / (N+k−1)!) bits.
It is optimal for a uniform prior; for a non-uniform prior
initialise the "counters" to values other than one.)
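
 The adaptive code can be sketched in a few lines: keep one counter per state, start every counter at one, and charge −log_{2}(counter/total) bits for each value before incrementing. The function names below are made up for illustration; the second function is the closed form, for comparison.

```python
from math import lgamma, log, log2

def adaptive_code_length(data, k):
    """Bits to transmit `data` (values in 1..k), counters initialised to one."""
    counts = [1] * (k + 1)          # index 0 is unused
    total = k                       # sum of the k counters
    bits = 0.0
    for x in data:
        bits -= log2(counts[x] / total)   # cost of x under current estimates
        counts[x] += 1
        total += 1
    return bits

def closed_form_code_length(data, k):
    """-log2( #1! ... #k! (k-1)! / (N+k-1)! ), using lgamma(m+1) = log(m!)."""
    n = len(data)
    log_pr = lgamma(k) - lgamma(n + k)
    for state in range(1, k + 1):
        log_pr += lgamma(data.count(state) + 1)
    return -log_pr / log(2)

sample = [1, 2, 2, 4, 1, 1, 3, 1]   # an arbitrary 4-state example
print(adaptive_code_length(sample, 4), closed_form_code_length(sample, 4))
```

 The two figures should agree up to rounding, and the total does not depend on the order of the data.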

 From observation at the local lake,
about half of the birds are coots,
30% are ducks, and
20% are swans.

Most ducks and swans, say 90%, have been seen to waddle.
No coot has been seen to waddle (but maybe one could),
pr(B waddles | B is a coot) = 0.1, say.

Most ducks, say 90%, have been heard to quack.
No coot has been heard quacking,
pr(B quacks | B is a coot) = 0.1, say.
Similarly for swans.

Someone reports that a certain bird, X, was observed to waddle and to quack.
What species, S, is X?

pr(B is a S | B waddles & B quacks)
∝ pr(B is a S) . pr(B waddles | B is a S)
. pr(B quacks | B is a S),^{†}

pr(X is a coot) ∝ 0.5 × 0.1 × 0.1 = 0.005,
pr(X is a duck) ∝ 0.3 × 0.9 × 0.9 = 0.243,
pr(X is a swan) ∝ 0.2 × 0.9 × 0.1 = 0.018,
total 0.005 + 0.243 + 0.018 = 0.266.

pr(X is a duck | X waddles, X quacks) = 0.243 / 0.266 = 0.91;
if it walks like a duck and talks like a duck it is
(probably) a duck, according to naive Bayes.
(Bayes because of the use of
Bayes's
theorem^{†}, and
naive because waddling and quacking are assumed to be independent.)
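
 The arithmetic above, as a small sketch. The dictionaries hold the lake figures quoted in the text; the function name is only illustrative.

```python
prior = {"coot": 0.5, "duck": 0.3, "swan": 0.2}
pr_waddles = {"coot": 0.1, "duck": 0.9, "swan": 0.9}
pr_quacks = {"coot": 0.1, "duck": 0.9, "swan": 0.1}

def naive_bayes_posterior(prior, *likelihoods):
    """Multiply the prior by each likelihood (assumed independent), then normalise."""
    weight = dict(prior)
    for likelihood in likelihoods:
        for species in weight:
            weight[species] *= likelihood[species]
    total = sum(weight.values())
    return {species: w / total for species, w in weight.items()}

posterior = naive_bayes_posterior(prior, pr_waddles, pr_quacks)
for species, pr in posterior.items():
    print(species, round(pr, 3))
```

 Further observed attributes, if there were any, would simply be passed as extra likelihood tables.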

 The Federal Court
of .au ruled [FCA 65]
against 'Cancer Voices [.au],' and
for "US-based company Myriad Genetics and
Melbourne-based Genetic Technologies, over the
patent on a breast and ovarian cancer gene known as BRCA1 ...
Justice John Nicholas ruled that the gene could be patented, as it
had been isolated completely separately from the human body.",
 [abc][15/2/2013].
Also see
FCA65@austlii [www][2/2013].
A pity, I think.
 13 June 2013:
Good news and worse news?
"... we hold that a naturally occurring DNA
segment is a product of nature and not patent eligible
merely because it has been isolated, but that cDNA [complementary DNA]
is patent eligible because it is not naturally occurring. ..."
 Justice Clarence Thomas,
[supremecourt.gov][13/6/2013] (No. 12-398).
(Also see [bbc],
[the G.].)
 7 October 2015, not patentable:
"The [High] court [of Australia] found that
while the discovery of the [BRCA1] gene was a product of human action,
to consider it an invention would stretch the law too far."
 [abc][7/10/2015].

 The International Table Soccer Federation
[ITSF]
(i.e., foosball) has
[rules]
and videos of past championships
[www] online.

 Have finally disentangled the mathematics in the
various meandering explanations of the
von Mises-Fisher
probability distribution on directions in R^{D}
and of MMLing it.

 In some cultures sons are valued more than daughters and the
male:female sex ratio at birth is
much higher than one (ultrasound, abortion, ...);
the ratio is reported to be
as high as 1.19:1 in China (WDB).
Fisher (1930)
showed that natural selection drives
the ratio to 1:1 :
Every child has one mother and one father.
If there is an excess of males, a male has a lower chance of having
children than a female. (And v.v. if there is an excess of females.)
So, someone having a daughter in such a culture is more likely
to have grandchildren than someone having a son.
A tendency to have daughters is
being selected for. Just give nature time.
 (Note,
selection drives the ratio at reproductive age to 1:1.
The argument does not hold for all species, e.g.,
where females have multiple young, over time, after a single mating, say.
Search for [sex ratio biology] in
the [Bib].)
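
 The counting step in Fisher's argument can be made concrete. The sketch below assumes, purely for illustration, that the 1.19:1 figure also holds among those who reach reproductive age, and that the next generation has 200 children; both numbers and the function name are assumptions, not data.

```python
def expected_children(males, females, children):
    """Every child has one father and one mother, so each sex as a whole
    has `children` offspring, shared among its members."""
    return children / males, children / females

per_son, per_daughter = expected_children(males=119, females=100, children=200)
advantage = per_daughter / per_son
print(round(per_son, 3), round(per_daughter, 3), round(advantage, 3))
```

 Whatever the number of children, the daughter's advantage equals the male:female ratio, so it vanishes exactly when the ratio reaches 1:1.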

 The alien computer design in
A for Andromeda
(1961) still looks more than a match for a human
in terms of neuron numbers, but not in synapses;
there again, there's the matter of speed.

 The
stable marriage problem
featured in the 2012 Nobel prize for Economics.

 Dilbert
is a documentary.

 The good old
Iterated Prisoners' Dilemma (IPD).

 Enumerating all sequences of n pairs of
matched brackets
is equivalent to generating rooted, ordered, k-ary trees.
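
 One way to see the correspondence, as a sketch: the first bracket pair encloses some number of pairs and is followed by the rest, just as a binary tree node has a left and a right subtree. The generator name is illustrative.

```python
def brackets(n):
    """Yield every sequence of n pairs of matched brackets."""
    if n == 0:
        yield ""
        return
    # The first pair encloses i pairs and is followed by the other n-1-i pairs.
    for i in range(n):
        for inside in brackets(i):
            for after in brackets(n - 1 - i):
                yield "(" + inside + ")" + after

for seq in brackets(3):
    print(seq)
```

 The number of sequences for n = 0, 1, 2, ... follows the Catalan numbers, the same count as for the trees.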

 The
Jacobi
algorithm finds Eigen things of a real, symmetric matrix.
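
 A minimal pure-Python sketch of the cyclic Jacobi method, assuming a small, dense, real, symmetric matrix given as a list of lists. Each plane rotation zeroes one off-diagonal entry; sweeps repeat until the off-diagonal part is negligible. The names, tolerance and sweep limit are illustrative choices.

```python
from math import sqrt

def jacobi_eigen(matrix, tol=1e-20, max_sweeps=50):
    """Return (eigenvalues, v); column j of v is the eigenvector for eigenvalue j."""
    n = len(matrix)
    a = [row[:] for row in matrix]            # work on a copy
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:                         # sum of squared off-diagonals
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation (c, s) chosen so that the new a[p][q] is zero.
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                sign = 1.0 if theta >= 0.0 else -1.0
                t = sign / (abs(theta) + sqrt(theta * theta + 1.0))
                c = 1.0 / sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):            # columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):            # rows p and q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
                for k in range(n):            # accumulate eigenvectors: V <- V J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p] = c * vkp - s * vkq
                    v[k][q] = s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v

values, vectors = jacobi_eigen([[2.0, 1.0], [1.0, 2.0]])
print(sorted(values))
```

 The eigenvalues come back in no particular order; sort them together with the columns of v if an order is needed.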

 I really wish I had invented the
Burrows-Wheeler
transform, in which case it would not be known as the BWT.
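
 For reference, a naive sketch of the transform and its inverse: sort all rotations of the text and keep the last column. It assumes an end marker, "$" here, that does not occur in the text and sorts before every other character; practical implementations use suffix arrays rather than materialising the rotations.

```python
def bwt(text, end="$"):
    """Last column of the sorted rotations of text + end marker."""
    if end in text:
        raise ValueError("end marker must not occur in the text")
    s = text + end
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)

def inverse_bwt(last_column, end="$"):
    """Rebuild the sorted rotation table one column at a time."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        # Prepend the last column to every row, then re-sort.
        table = sorted(last_column[i] + table[i] for i in range(n))
    original = next(row for row in table if row.endswith(end))
    return original[:-1]

encoded = bwt("banana")
print(encoded, inverse_bwt(encoded))
```

 Equal characters tend to cluster in the output, which is what makes it useful ahead of a compressor.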

