November 27, 2020

I, Science

The science magazine of Imperial College

Mutations are ticks on the molecular clock, and can now be used to measure evolutionary distances

Cuckoo, pocket, analogue, digital – clocks and watches come in a variety of shapes and sizes, but all have the common purpose of allowing us to measure time. But what if we wanted to count for more than just a few seconds, minutes, or hours? Many of the speciation events that gave rise to the diversity of life we see today occurred millions of years ago. Fortunately, nature has its own way of documenting time, called the molecular clock. This has allowed scientists to measure time over evolutionary scales, revealing the ages of different lineages on the tree of life.

The idea of the molecular clock was first formulated in the 1960s, when two scientists, Pauling and Zuckerkandl, compared the amino acid sequences of haemoglobin, a protein found in the blood, taken from related species. They found that the number of differences between two sequences correlated closely with the time since the species had diverged from a common ancestor, as estimated from the fossil record.

This observation led to the idea that, if the rate of molecular evolution (measured as the substitution of nucleotide bases in a genome or of amino acids in a protein) were relatively constant over time, then the total number of sequence differences could give a reliable estimate of the relative ages of the lineages being compared. A greater number of substitutions would mean that two lineages split longer ago. Unlike a stopwatch, which counts in seconds, the molecular clock counts in genetic mutations: each one is a tick.
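
As a rough illustration of this counting idea, the sketch below tallies the positions at which two aligned sequences differ. The three sequences are invented for the example and do not come from any real organism, and the function name is an arbitrary choice. It is a minimal sketch rather than a method used in practice: a raw count like this misses cases where the same site has changed more than once.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Hypothetical aligned DNA fragments, invented purely for illustration.
species_x = "ACGTACGTACGTACGTACGT"
species_y = "ACGTACGTACGAACGTACGT"
species_z = "ACGAACGTTCGAACGTACCT"

d_xy = count_differences(species_x, species_y)
d_xz = count_differences(species_x, species_z)

# Under a constant rate, the pair with more differences split longer ago.
older_split = "x/z" if d_xz > d_xy else "x/y"
print(f"x vs y: {d_xy} differences; x vs z: {d_xz} differences")
print(f"older split: {older_split}")
```

Here species x and z differ at more sites than x and y, so under the clock assumption x and z would share the more distant common ancestor. The count alone gives only a relative ordering, not an age in years.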

The increasing availability of genomic sequences now allows us to carry out a greater number of such comparisons. To give an absolute estimate of the time in years since divergence, though, the molecular clock also needs to be calibrated. This can be done using dated fossils, which give a minimum or maximum time for divergence estimates, or, more recently, dated genomic sequences from ancient or rapidly evolving samples. For example, samples of HIV have recently been used to date the emergence of the main epidemic strain (HIV-1) to the first half of the 20th century. This information also helped to locate the likely origin of the epidemic, believed to be Kinshasa in the Democratic Republic of the Congo, and to identify the factors involved in its spread.
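
The arithmetic behind calibration can be sketched as follows, assuming a single constant rate. All of the numbers are hypothetical: a pair of species with a fossil-dated split of 10 million years is used to work out a rate, which is then applied to a second pair whose split is unknown. The function names are illustrative and not taken from any particular software package.

```python
def substitution_rate(differences: int, sites: int, split_age_years: float) -> float:
    """Substitutions per site per year along one lineage."""
    distance = differences / sites  # proportion of sites that differ
    # Both lineages have been accumulating changes since the split,
    # so the distance between them reflects twice the elapsed time.
    return distance / (2 * split_age_years)


def divergence_time(differences: int, sites: int, rate: float) -> float:
    """Years since two lineages split, assuming a constant rate."""
    return (differences / sites) / (2 * rate)


# Hypothetical calibration pair: 20 differences over 1,000 sites,
# with a fossil suggesting the split happened 10 million years ago.
calibrated_rate = substitution_rate(differences=20, sites=1000, split_age_years=10_000_000)

# Hypothetical undated pair: 50 differences over the same 1,000 sites.
estimated_years = divergence_time(differences=50, sites=1000, rate=calibrated_rate)
print(f"Estimated split: {estimated_years / 1e6:.1f} million years ago")
```

With two and a half times as many differences as the calibration pair, the undated pair comes out at about 25 million years. Because a fossil usually gives only a minimum or maximum age, a real analysis would treat the calibration as a range rather than a single number.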

However, as genomic data have accumulated, it has also become apparent that one of the major assumptions of the molecular clock may not hold true: the rate of molecular evolution may not be constant through time after all.

The original clock recognised that different genes may evolve at different rates within a genome. This depends in part on whether mutations are likely to alter a protein’s function and whether this has a negative impact on an organism’s fitness. For example, the sequences of histone proteins, which interact closely with DNA in order to package and stabilise it, are often highly conserved and thus evolve slowly, because their function requires a very specific structure. This variation between genes can prove useful, since different genes can be used to examine events over different timescales.

More seriously, however, the rate of molecular evolution can also differ between species, and the mechanisms underlying this variation are only just beginning to be understood. For example, rodents are suggested to have a faster nucleotide substitution rate than humans because of sloppier DNA repair mechanisms, whilst in other species metabolic rates and generation times may also affect rates of molecular evolution. These lineage effects may also differ amongst genes. For example, endosymbiotic bacteria often have many non-functional genes, because they have become reliant on the host for certain gene products. These “pseudogenes” may therefore accumulate mutations faster than the corresponding genes in free-living relatives, without any effect on fitness.

Such discrepancies can lead to significant error in our estimates of time and have resulted in considerable controversy over the usefulness of the clock. Different clock models have been developed, for example the strict clock, the pacemaker model and the relaxed clock, which account for gene and lineage effects to different extents. However, there is considerable debate over which model is the most suitable, as they can produce wildly different results. This model variation comes on top of the fact that inaccurate calibration points and phylogenetic trees can also contribute to the error surrounding estimates.
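
To see how a difference in rate between lineages can distort a date, consider the sketch below. It assumes two lineages that really split 10 million years ago, one of which evolves twice as fast as the other, and then applies a single constant rate to the resulting distance. The rates and the age are hypothetical values chosen to keep the arithmetic simple.

```python
def expected_distance(rate_a: float, rate_b: float, years: float) -> float:
    """Expected substitutions per site separating two lineages after a split."""
    return (rate_a + rate_b) * years


def constant_rate_estimate(distance: float, assumed_rate: float) -> float:
    """Years since the split if both lineages are assumed to share one rate."""
    return distance / (2 * assumed_rate)


true_years = 10_000_000  # hypothetical true age of the split
slow_rate = 1e-9         # hypothetical substitutions per site per year
fast_rate = 2e-9         # hypothetical lineage evolving twice as fast

distance = expected_distance(slow_rate, fast_rate, true_years)

# Applying the slow lineage's rate to both branches inflates the estimate.
estimate = constant_rate_estimate(distance, assumed_rate=slow_rate)
print(f"True age: {true_years / 1e6:.0f} million years")
print(f"Constant-rate estimate: {estimate / 1e6:.0f} million years")
```

In this toy case the constant-rate estimate is half as large again as the true age. Models that allow rates to vary aim to reduce this kind of bias, which is one reason the choice of model matters so much.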

Should we trust the clock, then? The molecular clock has been, and will continue to be, an important tool for counting back in time and for identifying interesting aspects of the evolutionary process. It has already proved its use in a wide range of disciplines, from palaeontology to virology. However, challenges for the future include investigating the mechanisms underlying rate variation, building models that predict it, and incorporating this variation into new clocks so as to provide increasingly reliable estimates.

This is online-only content relating to the Autumn 2014 iScience magazine #29 themed on ‘Time’. Images from Flickr under Creative Commons license (Top: 1471-2148-11-296-1 by Phylogeny Figures; Bottom: Neon drawing: human WD repeat domain protein 5 (WDR5) by Chris Beckett).