NUCLEOTIDES

DNA is a long chain of building blocks called nucleotides (or bases). The nucleotides in DNA are:

  • Adenine (A),
  • Cytosine (C),
  • Guanine (G) and
  • Thymine (T).

At first sight, DNA seems to be a random sequence of As, Cs, Gs and Ts, but the order of these letters is highly important. In fact, the order of the letters defines the genetic code and determines whether a correctly functioning protein is produced.

BASE PAIRS

Each nucleotide forms a pair (through hydrogen bonds) with its opposite nucleotide on the other strand. The As are always paired with the Ts and the Cs with the Gs.

A  C  G        G  A  T        G  C  A
|  |  |        |  |  |        |  |  |
T  G  C        C  T  A        C  G  T
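The pairing rule can also be written as a small lookup table. The sketch below is in Python; the names PAIRS and complement are chosen for illustration only and do not come from any standard library.

```python
# Pairing rule from the text: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the opposite strand, letter by letter (illustrative helper)."""
    return "".join(PAIRS[base] for base in strand.upper())


# The three groups shown in the diagram above.
print(complement("ACG"))  # TGC
print(complement("GAT"))  # CTA
print(complement("GCA"))  # CGT
```

A letter other than A, C, G or T raises a KeyError, which is a simple way to notice a typing error in the sequence.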

TRIPLET FORMS A CODON

Three nucleotides together form a triplet, also called a codon, which codes for one amino acid. The 4 nucleotides (bases) can form 64 different codons.
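The number 64 follows from the 4 choices at each of the 3 positions: 4 x 4 x 4 = 64. As a minimal sketch, the Python lines below list all possible codons; the variable names are illustrative.

```python
from itertools import product

BASES = "ACGT"

# Every combination of 3 letters, each chosen from the 4 bases.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(codons))  # 64
print(codons[:4])   # ['AAA', 'AAC', 'AAG', 'AAT']
```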

In the article ‘From DNA to protein’ we explained that the route from DNA to protein takes 3 steps: copying (transcription), splicing (non-coding introns are cut out) and conversion (translation). Transcription and splicing yield the mRNA; the conversion/translation of the mRNA eventually produces the protein.

A schematic figure of DNA with an open reading frame running from a start codon through to a stop codon. The figure comes from the National Human Genome Research Institute.

SIGNAL CODONS

For transcription, splicing and translation to run smoothly, the DNA must contain signals that mark the start of a gene (a DNA fragment that contains the coding material for a specific protein). In other words, these signals indicate where the process of transcription, splicing and conversion/translation must start and where it must stop again. These signals are formed by specific codons (triplets of nucleotides).

For example, the codon consisting of the nucleotides ATG gives the start signal for the conversion/translation of the mRNA into protein.

The codons consisting of the nucleotide combinations TAA, TGA and TAG are called the stop codons. They indicate when the translation must stop.
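To illustrate how start and stop codons mark out a readable stretch, the Python sketch below reads codons from the first ATG until the first stop codon. It is a simplified illustration: the function name and the short sequence are made up for this example, and real gene finding involves much more than this.

```python
STOP_CODONS = {"TAA", "TGA", "TAG"}


def find_open_reading_frame(dna: str) -> list[str]:
    """Return the codons from the first ATG up to and including the first
    stop codon in the same reading frame (illustrative helper).

    Returns an empty list if there is no ATG. If no stop codon follows,
    all complete codons up to the end of the sequence are returned.
    """
    start = dna.find("ATG")
    if start == -1:
        return []
    codons = []
    # Step through the sequence 3 letters at a time, starting at the ATG.
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        codons.append(codon)
        if codon in STOP_CODONS:
            break
    return codons


# Made-up sequence, for illustration only.
print(find_open_reading_frame("CCATGGCCTGTTAAGG"))
# ['ATG', 'GCC', 'TGT', 'TAA']
```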

AMINO ACIDS

The codons (triplets of 3 nucleotides) are read (transcription) and translated (conversion) into amino acids, which are strung together to form a protein. An incorrect order of the nucleotides results in a faulty code and directly affects the production and the functioning of a protein.
For example:

1 The codon TGT represents the amino acid Cysteine. If the first T is changed into a C, the codon becomes CGT, which represents the amino acid Arginine. Therefore, Arginine instead of Cysteine will be built into the protein, and this may have far-reaching consequences.

2 Another possibility is that a mutation (change) puts an extra nucleotide (base) into a gene (insertion) or removes a nucleotide (deletion). The number of letters is then no longer a multiple of 3 and the ‘reading frame’ shifts (frame shift). Incorrect codons will be read until a stop codon is encountered.
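The first example (TGT changing into CGT) can be written out as a short Python sketch. The lookup table contains only the two codons mentioned in the text, not the full table of 64 codons, and the function name is illustrative.

```python
# Only the two codons from the example; the full table has 64 entries.
CODON_TO_AMINO_ACID = {"TGT": "Cysteine", "CGT": "Arginine"}


def substitute(codon: str, position: int, new_base: str) -> str:
    """Replace the letter at the given position (0 = first letter)."""
    return codon[:position] + new_base + codon[position + 1:]


original = "TGT"
mutated = substitute(original, 0, "C")  # change the first T into a C

print(original, "->", CODON_TO_AMINO_ACID[original])  # TGT -> Cysteine
print(mutated, "->", CODON_TO_AMINO_ACID[mutated])    # CGT -> Arginine
```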

READING FRAME SHIFTS

A reading frame is the way in which non-overlapping triplets/codons are read. The reading frame can shift as a result of a mutation (change). We call this a frame shift. The consequence is that incorrect codons are read and that a stop codon may interrupt the production of the protein partway through the translation.

1 A frame shift mutation caused by a deletion:

~~~~GCC TTC GAG TTC CAC TGC CTA AGT~~~~

The deletion removes the C from the second codon (TTC) and all following letters shift 1 position to the left:

~~~~GCC TTG AGT TCC ACT GCC TAA GT ~~~~

A few codons further on, at the seventh codon, a stop codon (TAA) is suddenly created.

2 A stop codon caused by a change of a single nucleotide: if the codon TCG changes into TAG, it becomes a stop codon, which will prematurely end the translation.

In short, a change of a nucleotide may not only lead to the production of an incorrect and different amino acid, but can also create a stop codon.
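The deletion example above can be reproduced in a few lines of Python. This is only a sketch; the helper names are illustrative. It removes the C from the second codon, splits the result into triplets again and looks for the first stop codon.

```python
STOP_CODONS = {"TAA", "TGA", "TAG"}


def split_into_codons(dna: str) -> list[str]:
    """Cut the sequence into groups of 3 letters, starting at the first letter."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def delete_base(dna: str, index: int) -> str:
    """Remove the letter at the given index (0 = first letter)."""
    return dna[:index] + dna[index + 1:]


original = "GCCTTCGAGTTCCACTGCCTAAGT"
shifted = delete_base(original, 5)  # the C of the second codon (TTC)

print(" ".join(split_into_codons(original)))
# GCC TTC GAG TTC CAC TGC CTA AGT
print(" ".join(split_into_codons(shifted)))
# GCC TTG AGT TCC ACT GCC TAA GT

# Position of the first stop codon in the shifted sequence (0 = first codon).
first_stop_index = next(
    i for i, codon in enumerate(split_into_codons(shifted)) if codon in STOP_CODONS
)
print(first_stop_index + 1)  # 7, i.e. the seventh codon
```

The original sequence contains no stop codon in its reading frame; the stop codon only appears because the frame has shifted.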

GENETIC OUTCOME

If you have been diagnosed with Usher Syndrome, you have usually undergone a DNA test first. The DNA result states the place (= notation) of the change (= mutation) in the DNA and, consequently, in the RNA. The prefix ‘c.’ means that the notation refers to the ‘coding’ sequence, i.e. the DNA. If the notation describes the consequence at protein level, it is preceded by ‘p.’ (e.g. p.(Cys759Phe)).

Examples:

1 600delC means that the 600th nucleotide (base), a C, has been removed (del = deletion)

2 1645insC means that an extra C was added after the 1645th nucleotide (base) (ins = insertion)

3 c.2276G>T means that at position 2276 a ‘G’ (= Guanine) was changed into a ‘T’ (= Thymine)
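As a rough illustration of how such notations are built up, the Python sketch below reads the three example forms shown above and describes them in words. It is a simplified reader written for these examples only: the pattern and the function name are assumptions made for illustration, and it does not cover the full official mutation nomenclature.

```python
import re

# Simplified pattern for the three example forms only:
# an optional 'c.' prefix, a position, then either del/ins plus one base,
# or a change of one base into another (e.g. G>T).
PATTERN = re.compile(
    r"^(?:c\.)?(?P<position>\d+)"
    r"(?:(?P<kind>del|ins)(?P<base>[ACGT])|(?P<ref>[ACGT])>(?P<alt>[ACGT]))$"
)


def describe(notation: str) -> str:
    """Describe a notation in words (illustrative helper, not a full parser)."""
    match = PATTERN.match(notation)
    if match is None:
        raise ValueError(f"Notation not recognised by this sketch: {notation}")
    position = match["position"]
    if match["kind"] == "del":
        return f"deletion of {match['base']} at position {position}"
    if match["kind"] == "ins":
        return f"insertion of {match['base']} after position {position}"
    return f"substitution of {match['ref']} by {match['alt']} at position {position}"


for example in ["600delC", "1645insC", "c.2276G>T"]:
    print(example, "=", describe(example))
```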