Tracy T. Transmitter and Richard R. Receiver get together and
select a set of hypotheses,
{H0, H1, ...}, to describe data, and
design a code book to transmit two-part messages,
where each message consists of
(i) an hypothesis and
(ii) a data-set given that hypothesis.
This allows T and R to write an encoder program, P, and
a decoder program, P⁻¹.
Naturally T and R want to use short code words in a message
but, at this stage, any data are purely hypothetical, and
so they must design the code book based on expected data.
Then T and R move apart and the following happens . . .
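The code-book design step can be sketched as follows. This is a minimal illustration, not T and R's actual protocol: the hypothesis names, the prior probabilities, and the `make_code_book` helper are all assumptions made up for the example.

```python
from math import log2

def make_code_book(prior):
    """Map each hypothesis to an ideal Shannon code length,
    |code(H)| = -log2(pr(H)) bits, based on the agreed prior."""
    return {h: -log2(p) for h, p in prior.items()}

# A toy prior over three hypotheses about a coin's bias.
prior = {
    "H0: fair":        0.50,
    "H1: biased 0.75": 0.25,
    "H2: biased 0.90": 0.25,
}
code_book = make_code_book(prior)
# code_book["H0: fair"] is 1.0 bit; the other two cost 2.0 bits each.
```

Because the code lengths come from an agreed prior over expected data, T and R can fix them before any actual data-set exists.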
T gets an actual data-set, D.
T chooses an H from the set.
T transmits H;D to R.
|msgLen| = |part1| + |part2|
part1: code(H)
part2: code(D|H)
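As a concrete, purely hypothetical instance of |msgLen| = |part1| + |part2|: suppose D is a sequence of coin flips and H asserts a probability of heads. The helper `part2_len`, the flip data, and the probabilities below are illustrative assumptions, not part of the original account.

```python
from math import log2

def part2_len(data, p_heads):
    """|code(D|H)| in bits: sum of -log2 of each flip's
    probability under the bias-p_heads hypothesis."""
    return sum(-log2(p_heads if flip == "H" else 1.0 - p_heads)
               for flip in data)

data = "HHTHHHHT"            # 6 heads, 2 tails
part1 = 1.0                  # e.g. |code(H)| when pr(H) = 1/2
part2 = part2_len(data, 0.75)
msg_len = part1 + part2      # total two-part message length in bits
```

A different choice of H would change both |code(H)| and |code(D|H)|; T naturally prefers the H that makes the total shortest.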
[Figure: T's encoder P, run on some UTM_T, produces the message H;D,
which is transmitted to R's decoder P⁻¹, run on some UTM_R.]
R receives H;D.
R now knows the data-set, D,
& also T's opinion, H, of D.
- UTM : A universal Turing machine.
-
- Shannon, |code(X)| = -log(pr(X)), and
  Bayes, |code(H&D)| = |code(H)| + |code(D|H)|
                     = |code(D)| + |code(H|D)|,
  give -log(pr(H|D)) = |code(H)| + |code(D|H)| - |code(D)|
                     ~ |code(H)| + |code(D|H)|,
  because |code(D)| does not depend on the choice of H.
-
- The selection of {H0, H1, ... },
and the issue of what data each Hi best covers,
must be considered together in the design of the code book.
-
- Being very sensible, T will select an H that is a good model of D,
  but a less sensible individual might not, and
  yet R could still recover D, although the message would be longer:
    -log(pr(Hi|D) / pr(Hj|D))
      = |code(Hi)| + |code(D|Hi)| - (|code(Hj)| + |code(D|Hj)|),
  the negative log posterior-odds ratio.
-
- Note, depending on the application area,
a data-set could be a single thing, e.g., a
genome.
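The identity in the notes above can be checked numerically. In this sketch every probability is a made-up illustration; it only confirms that the difference of two two-part message lengths equals the negative log posterior-odds ratio, using Bayes' pr(H|D) ∝ pr(H)·pr(D|H):

```python
from math import log2

def two_part_len(pr_h, pr_d_given_h):
    """|code(H)| + |code(D|H)| in bits, from Shannon code lengths."""
    return -log2(pr_h) - log2(pr_d_given_h)

# Hypothetical priors and likelihoods for two hypotheses and one D.
pr_hi, pr_d_hi = 0.50, 0.01
pr_hj, pr_d_hj = 0.25, 0.03

diff = two_part_len(pr_hi, pr_d_hi) - two_part_len(pr_hj, pr_d_hj)

# pr(H|D) is proportional to pr(H) * pr(D|H), so the posterior-odds
# ratio is (pr_hi * pr_d_hi) / (pr_hj * pr_d_hj).
log_odds = -log2((pr_hi * pr_d_hi) / (pr_hj * pr_d_hj))
# diff equals log_odds, up to floating-point rounding.
```

The shared normalising constant pr(D) cancels in the ratio, which is why message-length differences alone suffice to compare hypotheses.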