Genes, DNA->RNA->protein
| stage | layout, 5' to 3' |
|---|---|
| DNA {A,T,C,G} | 5' [upstream] [promoter(s)] [exon1] [gt intron1 ag] [exon2] [gt intron2 ag] [exon3] [downstream] 3' |
| transcribed to RNA {A,U,C,G} | 5' [exon1] [gu intron1 ag] [exon2] [gu intron2 ag] [exon3] 3' |

The RNA is then spliced (edited), and the spliced RNA is translated to protein.

atg~aug->MET (& starts translation)
Stop translation codons: taa~uaa, tag~uag, tga~uga
UTS = UnTranslated Sequence
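The transcription and splicing steps in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not a model of the real cellular machinery: the function names `transcribe` and `splice` are hypothetical, the toy sequence is made up, and the intron coordinates are assumed to be known in advance (the sketch only checks the gu...ag boundaries shown above).

```python
def transcribe(dna: str) -> str:
    """Return the RNA copy of a coding-strand DNA sequence (T becomes U)."""
    return dna.upper().replace("T", "U")


def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as half-open (start, end) index pairs.

    Assumes the introns do not overlap; raises ValueError if an intron
    does not start with GU and end with AG.
    """
    exons = []
    pos = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}:{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[pos:start])  # keep the exon before this intron
        pos = end
    exons.append(pre_mrna[pos:])  # keep the final exon
    return "".join(exons)


# Toy gene (made-up sequence): exon1 + intron1 + exon2.
dna = "ATGGCC" + "GTAAAAAG" + "TTTTAA"
pre_mrna = transcribe(dna)
mrna = splice(pre_mrna, [(6, 14)])
print(pre_mrna)  # AUGGCCGUAAAAAGUUUUAA
print(mrna)      # AUGGCCUUUUAA
```

In practice the hard part is finding the intron boundaries; gt and ag occur far too often by chance for the two-letter signals alone to locate them.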
Genetic Code
There are four DNA (RNA) bases {A,T(U),C,G}.
There are twenty (common) amino acids making up proteins.
mRNA is read in codons - groups of three bases -
and translated into protein according to
the genetic code below:
| position 1 | position 2: U/T (pyrimidine) | position 2: C (pyrimidine) | position 2: A (purine) | position 2: G (purine) | position 3 |
|---|---|---|---|---|---|
| U/T (pyrimidine) | UUU Phe[F] | UCU Ser[S] | UAU Tyr[Y] | UGU Cys[C] | U |
| U/T | UUC Phe[F] | UCC Ser[S] | UAC Tyr[Y] | UGC Cys[C] | C |
| U/T | UUA Leu[L] | UCA Ser[S] | UAA Stop! | UGA Stop! | A |
| U/T | UUG Leu[L] | UCG Ser[S] | UAG Stop! | UGG Trp[W] | G |
| C (pyrimidine) | CUU Leu[L] | CCU Pro[P] | CAU His[H] | CGU Arg[R] | U |
| C | CUC Leu[L] | CCC Pro[P] | CAC His[H] | CGC Arg[R] | C |
| C | CUA Leu[L] | CCA Pro[P] | CAA Gln[Q] | CGA Arg[R] | A |
| C | CUG Leu[L] | CCG Pro[P] | CAG Gln[Q] | CGG Arg[R] | G |
| A (purine) | AUU Ile[I] | ACU Thr[T] | AAU Asn[N] | AGU Ser[S] | U |
| A | AUC Ile[I] | ACC Thr[T] | AAC Asn[N] | AGC Ser[S] | C |
| A | AUA Ile[I] | ACA Thr[T] | AAA Lys[K] | AGA Arg[R] | A |
| A | AUG Met[M] | ACG Thr[T] | AAG Lys[K] | AGG Arg[R] | G |
| G (purine) | GUU Val[V] | GCU Ala[A] | GAU Asp[D] | GGU Gly[G] | U |
| G | GUC Val[V] | GCC Ala[A] | GAC Asp[D] | GGC Gly[G] | C |
| G | GUA Val[V] | GCA Ala[A] | GAA Glu[E] | GGA Gly[G] | A |
| G | GUG Val[V] | GCG Ala[A] | GAG Glu[E] | GGG Gly[G] | G |
There are three stop codons (UAA, UAG and UGA), which end translation. All coding regions begin with AUG (Met).
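As a rough sketch of how the genetic code and the start/stop rules combine, the Python below builds the 64-entry codon table from a compact string and translates an mRNA from its first AUG to the first stop codon. The names `CODON_TABLE` and `translate` are hypothetical, and starting at the first AUG found is a simplifying assumption made for this example.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino-acid codes in table order: position 1, then position 2,
# then position 3, each running U, C, A, G. '*' marks a stop codon.
AMINO = (
    "FFLLSSSSYY**CC*W"  # UUU .. UGG
    "LLLLPPPPHHQQRRRR"  # CUU .. CGG
    "IIIMTTTTNNKKSSRR"  # AUU .. AGG
    "VVVVAAAADDEEGGGG"  # GUU .. GGG
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to, but not including, the first stop."""
    mrna = mrna.upper().replace("T", "U")  # accept DNA spelling as well
    start = mrna.find("AUG")
    if start < 0:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("GGAUGGCCUUUUAAGG"))  # MAF
```

Here the reading frame is fixed by the AUG at index 2, giving the codons AUG, GCC, UUU and UAA. Any character outside A, C, G, U/T raises a `KeyError` in this sketch.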
Any "sufficiently long" stretch of DNA that, in some reading frame (offset of 0, 1 or 2), contains no stop codon is called an open reading frame (ORF) and is a candidate for being part of a gene.
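The ORF definition above translates fairly directly into a scan over the three reading frames. The sketch below is one way to do it, under assumptions of my own: the function name `open_reading_frames` and the `min_codons` threshold (standing in for "sufficiently long") are hypothetical, and only the given strand is scanned.

```python
STOPS = {"TAA", "TAG", "TGA"}  # stop codons in DNA spelling


def open_reading_frames(dna: str, min_codons: int = 3) -> list[tuple[int, int, int]]:
    """Return (frame, start, end) for each maximal stop-free run of codons.

    start and end are half-open indices into dna; only runs of at least
    min_codons complete codons are reported.
    """
    dna = dna.upper()
    found = []
    for frame in range(3):
        run_start = frame  # where the current stop-free run began
        last = frame       # end of the last complete codon seen
        for i in range(frame, len(dna) - 2, 3):
            if dna[i:i + 3] in STOPS:
                if (i - run_start) // 3 >= min_codons:
                    found.append((frame, run_start, i))
                run_start = i + 3  # next run starts after the stop
            last = i + 3
        # A run may also reach the end of the sequence without a stop.
        if (last - run_start) // 3 >= min_codons:
            found.append((frame, run_start, last))
    return found


for frame, start, end in open_reading_frames("ATGAAACCCGGGTAAAT", min_codons=4):
    print(frame, start, end)
```

With a threshold as low as 4 codons, this short toy sequence yields a stop-free run in every frame, which is why the length requirement matters: short ORFs arise by chance all the time.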
AA Properties
Rows run from hydrophilic (top) to hydrophobic (bottom); columns run from large (left) to small (right).

| | large | . | . | . | . | . | small |
|---|---|---|---|---|---|---|---|
| hydrophilic | . | . | K | Q | E D | . | . |
| . | . | H | R | N | . | . | . |
| . | . | . | . | . | . | P | G |
| . | W | . | . | * | T | S A | . |
| . | . | . | M | . | . | . | . |
| . | . | F L | . | I V | . | . | . |
| hydrophobic | . | Y | . | . | . | . | C |

Approx(!) AA similarity ~ Swanson 84
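One way to make the layout above usable in a program is to store each amino acid's grid position and compare positions. The sketch below does that with a Manhattan distance on the grid. This distance is an assumption introduced here for illustration, not Swanson's measure, and the names `GRID`, `POSITION` and `grid_distance` are hypothetical. The `*` cell is treated as a marker rather than an amino acid, which is also an assumption.

```python
# Rows: hydrophilic (top) to hydrophobic (bottom).
# Columns: large (left) to small (right). '.' is an empty cell.
GRID = [
    ". . K Q ED . .",
    ". H R N . . .",
    ". . . . . P G",
    "W . . * T SA .",
    ". . M . . . .",
    ". FL . IV . . .",
    ". Y . . . . C",
]

# Map each one-letter amino-acid code to its (row, column) cell.
POSITION = {}
for row, line in enumerate(GRID):
    for col, cell in enumerate(line.split()):
        for aa in cell:
            if aa.isalpha():  # skips '.' and the '*' marker
                POSITION[aa] = (row, col)


def grid_distance(a: str, b: str) -> int:
    """Manhattan distance between two amino acids on the grid (0 = same cell)."""
    (r1, c1), (r2, c2) = POSITION[a], POSITION[b]
    return abs(r1 - r2) + abs(c1 - c2)


print(grid_distance("F", "L"))  # 0: same cell
print(grid_distance("K", "R"))  # 1: adjacent cells
print(grid_distance("W", "G"))  # 7: far apart
```

Small distances suggest similar size and hydrophobicity; like the diagram itself, this is only approximate.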