Discovery of the Genetic Code
The problem is that there are only four different bases in the genetic code, yet 20 amino acids must be specified. One base per amino acid gives only 4 possibilities, and two bases would not be enough because 4² = 16. Three bases give 4³ = 64 combinations, far more than 20, which means that many amino acids can be specified by more than one codon: the code is degenerate.
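The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch; the names `BASES`, `codon_count` and `min_codon_length` are invented here and are not part of the original notes.

```python
BASES = "ACGU"      # the four RNA bases
AMINO_ACIDS = 20    # number of amino acids that must be specified


def codon_count(length: int) -> int:
    """Number of distinct codons of the given length (4 ** length)."""
    return len(BASES) ** length


def min_codon_length(n_amino_acids: int = AMINO_ACIDS) -> int:
    """Smallest codon length that gives at least n_amino_acids codons."""
    length = 1
    while codon_count(length) < n_amino_acids:
        length += 1
    return length


for n in (1, 2, 3):
    print(n, codon_count(n))   # prints: 1 4, then 2 16, then 3 64

print(min_codon_length())      # prints: 3
```

With 64 codons available for 20 amino acids there are 44 codons to spare, which is the arithmetic basis for the degeneracy described above.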
The code could be overlapping: in the sequence ABCDEFGHIJK, for example, ABC would code for one amino acid and BCD for the next, and so on. In fact it is non-overlapping: ABC codes for the first amino acid, then DEF, then GHI, etc. A number of experiments revealed this.
In experiments on bacteriophage T4, Crick and Brenner found that a single deletion mutation could cause a protein to lose its function entirely, but that a second mutation, in which a nucleotide was inserted somewhere nearby, caused the gene to function again. They concluded that the gene is read sequentially, starting from a fixed point. An insertion or deletion shifts the reading frame (the grouping in which the nucleotides are read as codons), so such mutations are called frameshift mutations.
In further experiments it was found that two deletions or two insertions still deactivate the gene, but that three deletions or three insertions restore gene function. This shows that the code is a triplet code.
Consider a sentence of three-letter words: THE BIG RED FOX ATE THE EGG, with the spaces having no physical significance and only indicating the reading frame. Deleting the 4th letter (the B) destroys the meaning of the sentence: THE IGR EDF OXA TET HEE GG. By analogy, all codons after a deletion specify the wrong amino acids.
But an insertion at the ninth position restores the reading frame: THE IGR EDX FOX ATE THE EGG. Such a sentence might still be intelligible, particularly if the two mutations are close together. However, two insertions or two deletions would not restore the reading frame, although three would, e.g. THE BXI GYR EDZ FOX ATE THE EGG. Since any sequence can be read in three possible reading frames, it would be possible for a polynucleotide to code for two or three overlapping polypeptides, and indeed some viruses do this because they are particularly short of space.
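The sentence analogy can be reproduced mechanically. The sketch below treats the sentence as a string of letters read in groups of three from a fixed starting point; the helper names `read_frame`, `delete` and `insert` are hypothetical and chosen only for this illustration.

```python
def read_frame(seq: str, size: int = 3) -> str:
    """Group letters into words of `size`, reading from the first letter."""
    return " ".join(seq[i:i + size] for i in range(0, len(seq), size))


def delete(seq: str, pos: int) -> str:
    """Remove the letter at 1-based position `pos`."""
    return seq[:pos - 1] + seq[pos:]


def insert(seq: str, pos: int, letter: str) -> str:
    """Insert `letter` so that it ends up at 1-based position `pos`."""
    return seq[:pos - 1] + letter + seq[pos - 1:]


original = "THEBIGREDFOXATETHEEGG"

# One deletion shifts every later word out of frame.
deleted = delete(original, 4)
print(read_frame(deleted))    # THE IGR EDF OXA TET HEE GG

# A nearby insertion brings the later words back into frame.
restored = insert(deleted, 9, "X")
print(read_frame(restored))   # THE IGR EDX FOX ATE THE EGG

# Three insertions add one whole word, so the frame is also restored.
triple = insert(insert(insert(original, 5, "X"), 8, "Y"), 12, "Z")
print(read_frame(triple))     # THE BXI GYR EDZ FOX ATE THE EGG
```

In each case only the words between the first and last mutation are garbled, which is why closely spaced compensating mutations can leave a protein functional.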
In the 1960s the technologies for making mRNA sequences were fairly limited: nucleotides were linked at random using a bacterial enzyme. However, the homopolymers poly(U), poly(A), poly(C) and poly(G) could all be made. Each was added separately (so that it was clear which mRNA gave which polypeptide) to a cell homogenate containing ATP, GTP and amino acids, in which DNase had shut off production of the cell's own mRNA. The system then synthesised a polypeptide: poly-Phe from poly(U), poly-Lys from poly(A), poly-Gly from poly(G) and poly-Pro from poly(C). This identified UUU as a codon for Phe, AAA as a codon for Lys, etc.
In the absence of GTP, which is necessary for translation, trinucleotides are almost as effective as longer mRNAs at binding specific tRNAs to ribosomes. Ribosomes with bound tRNAs are retained by a nitrocellulose filter, but unbound tRNAs are not. The bound tRNAs, with their amino acids attached, were then identified using specifically radioactively labelled amino acids. For example, UUU was found to cause binding of Phe-tRNA, UUG of Leu-tRNA, UGU of Cys-tRNA and GUU of Val-tRNA, so UUG is a codon for Leu, etc. In this way 50 codons were identified. For the remaining codons the binding assay was ambiguous or negative (no binding, e.g. for what were later found to be stop codons).
The dictionary was completed by H. Gobind Khorana, who made mRNAs with specific repeating sequences, e.g. UCUCUCUCUC, which reads UCU-CUC-UCU… and is translated as Ser-Leu-Ser-Leu… Sequences of three repeating bases, e.g. poly(UAC), specify three different homopolypeptides because the ribosome may initiate synthesis at any point on the mRNA.
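The behaviour of these repeating mRNAs can be sketched by reading a repeated unit in each of the three possible frames. The codon table below is deliberately partial and contains only codons needed for the examples in the text (assignments taken from the standard genetic code); the function names `translate` and `products` are invented for this illustration.

```python
# Partial codon table: only the codons needed for these examples.
CODONS = {
    "UUU": "Phe", "AAA": "Lys",
    "UCU": "Ser", "CUC": "Leu",
    "UAC": "Tyr", "ACU": "Thr", "CUA": "Leu",
}


def translate(mrna: str, frame: int = 0) -> list[str]:
    """Read complete codons starting at offset `frame` (0, 1 or 2)."""
    codons = [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]
    return [CODONS.get(codon, "?") for codon in codons]


def products(repeat_unit: str, repeats: int = 6) -> dict[int, list[str]]:
    """Translate a repeating mRNA in all three reading frames."""
    mrna = repeat_unit * repeats
    return {frame: translate(mrna, frame) for frame in range(3)}


# A two-base repeat gives the same alternating copolymer in every frame.
print(products("UC")[0])      # ['Ser', 'Leu', 'Ser', 'Leu']

# A three-base repeat gives a different homopolypeptide in each frame.
for frame, peptide in products("UAC").items():
    print(frame, set(peptide))
```

For poly(UAC) the three frames read UAC, ACU and CUA, giving poly-Tyr, poly-Thr and poly-Leu respectively. Which frame is used in the cell-free system depends only on where the ribosome happens to start.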