
Sequencing DNA

DNA sequencing is a technique used to determine the order of the four nucleotide bases that make up a DNA strand. Several methods have been developed for this process, each consisting of four key steps. First, DNA is removed from the cell, either mechanically or chemically. The second phase involves breaking up the DNA and inserting its pieces into vectors, carrier DNA molecules that replicate inside host cells, for cloning. In the third phase the DNA clones are placed with a dye-labelled primer (a short stretch of DNA that promotes replication) into a thermal cycler, a machine which automatically raises and lowers the temperature to drive replication. The final phase consists of electrophoresis, whereby the DNA fragments are placed in a gel and subjected to an electrical current which moves them. Originally the gel was placed on a slab, but today it is inserted into a very thin glass tube known as a capillary. When subjected to an electrical current the smaller DNA fragments move faster than the larger ones. Electrophoresis therefore sorts the DNA fragments by size. The different nucleotide bases in the DNA fragments are identified by their dyes, which are activated when they pass through a laser beam. All the information is fed into a computer and the DNA sequence is displayed on a screen for analysis.
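The read-out step described above can be illustrated with a minimal sketch. It assumes a simplified model in which every fragment is a copy of the DNA that stopped at one position and carries a dye matching the base at that position. The dye colours and the function names are hypothetical and chosen only for illustration; real instruments also have to handle noise, overlapping signals and base calling, none of which is modelled here.

```python
import random

# Hypothetical mapping of bases to dye colours; the colours are illustrative only.
DYE_FOR_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}


def make_labelled_fragments(sequence):
    """Return one (length, dye) pair per position, in random order.

    Each pair stands for a fragment that stopped at a given position and
    carries the dye matching the base at that position.
    """
    fragments = [(i + 1, DYE_FOR_BASE[base]) for i, base in enumerate(sequence)]
    random.shuffle(fragments)  # the fragments start out mixed together
    return fragments


def read_sequence(fragments):
    """Sort fragments by size (shortest first) and read the dyes in order.

    Sorting plays the role of electrophoresis, and the dye lookup plays
    the role of the laser detector.
    """
    by_size = sorted(fragments, key=lambda fragment: fragment[0])
    return "".join(BASE_FOR_DYE[dye] for _, dye in by_size)


if __name__ == "__main__":
    mixed = make_labelled_fragments("GATTACA")
    print(read_sequence(mixed))
```

Because the shortest fragments arrive at the detector first, reading the dyes in order of fragment size recovers the original order of the bases.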

The two scientists in the photograph are reading the genetic code for a DNA sample on a highlighted light board. Such analysis is usually done by a computer. Credit: National Cancer Institute.


DNA sequencing is a method used to determine the precise order of the four nucleotide bases (adenine, guanine, cytosine and thymine) that make up a strand of DNA. These bases provide the underlying genetic basis (the genotype) for telling a cell what to do, where to go and what kind of cell to become (the phenotype). Nucleotides are not the only determinants of phenotypes, but are essential to their formation. Each individual and organism has a specific nucleotide base sequence.


DNA sequencing played a pivotal role in mapping out the human genome, completed in 2003, and is an essential tool for many basic and applied research applications today. It has for example provided an important tool for determining the thousands of nucleotide variations associated with specific genetic diseases, like Huntington's, which may help to better understand these diseases and advance treatment. DNA sequencing also underlies pharmacogenomics. This is a relatively new field which is leading the way to more personalised medicine. Pharmacogenomics studies how a person's individual genome variations affect their response to a drug. Such data is being used to determine which drug gives the best outcome in particular patients. Currently over 140 drugs approved by the FDA include pharmacogenomic information in their labelling. Such labelling is not only important in terms of matching patients to their most appropriate drug, but also for working out what their drug dose should be and their level of risk in terms of adverse events. Individual genetic profiling is already being used routinely to prescribe therapies for patients with HIV, breast cancer, lymphoblastic leukaemia and colon cancer and in the future will be used to tailor treatments for cardiovascular disease, cancer, asthma, Alzheimer's disease and depression. Drug developers are also using pharmacogenomic data to design drugs which can be targeted at subgroups of patients with specific genetic profiles.


While DNA was found to have a double helix structure in 1953, it took scientists many more years before they could analyse DNA fragments. In part this was because even small DNA molecules contain several thousand nucleotides and it was difficult to obtain large quantities of homogeneous DNA. Scientists also lacked the means to degrade DNA, which was important for sequence analysis. A foundation for sequencing DNA was, however, laid in the 1960s with the emergence of techniques to sequence ribonucleic acid (RNA). Ray Wu, a Chinese American biologist based at Cornell University, published one of the first methods for sequencing DNA in 1970. Using highly labelled deoxynucleotides (single units of DNA) and DNA polymerase he found a way to sequence the terminal region of a DNA molecule. Critically, Wu's approach broke the DNA sequence down into different components for analysis, thereby circumventing the need for large quantities of homogeneous DNA. Subsequently, in 1971, Wu demonstrated that his method could sequence the ends of DNA in lambda phage, and two years later that it had the capacity to determine the sequence of any DNA. Over the course of the 1970s Wu's method would be modified by Fred Sanger at the Laboratory of Molecular Biology in Cambridge, UK. In 1975 Sanger, together with Alan Coulson, published what became known as the 'Plus and Minus' technique. This enabled the sequencing of up to 80 nucleotides in one go. Two years later, in 1977, Sanger and his colleagues announced their development of another technique called the 'Sanger method' or 'dideoxy sequencing'. This made it possible to sequence much longer stretches of DNA very rapidly. Their approach appeared alongside the reporting of another technique by Allan Maxam and Walter Gilbert at Harvard University.

While the Maxam-Gilbert method initially proved highly popular, it soon fell out of favour because it required hazardous chemicals and radioisotopes and its technical complexity prevented its use in standard molecular biology kits. It was also difficult to scale up. By contrast, the Sanger method soon gained in popularity because it was easier to use and more reliable. It was also amenable to automation, paving the way to the first generation of DNA sequencers. The first automated DNA sequencer was devised in 1986 by Leroy Hood and colleagues at the California Institute of Technology together with a team including Lloyd Smith and Michael and Tim Hunkapiller. These machines used capillary electrophoresis rather than gel electrophoresis using slabs. Several new DNA sequencing methods and machines have been developed since the 1990s. These were built following the introduction of microfluidic separation devices, which improved sample injection and sped up separation times. Such innovations improved both the efficiency and accuracy of sequencing, allowing for high-throughput sequencing, and radically lowered the cost. Between 2001 and 2011 the cost of sequencing a genome shrank from $100 million to $10,000.
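To put the cost figures above in perspective, the short calculation below works out the size of the reduction and, as a rough guide, how quickly the cost would have had to halve to achieve it. The two cost figures come from the text. The assumption that the cost fell at a steady exponential rate across the decade is a simplification made here for illustration and is not a claim about how prices actually moved.

```python
import math

cost_2001 = 100_000_000  # US$, figure quoted in the text
cost_2011 = 10_000       # US$, figure quoted in the text
years = 2011 - 2001

# How many times cheaper sequencing a genome became over the period.
fold_reduction = cost_2001 / cost_2011

# Assumption: the cost fell at a constant exponential rate over the decade.
halving_time_years = years / math.log2(fold_reduction)

print(f"Fold reduction: {fold_reduction:,.0f}")
print(f"Implied halving time: {halving_time_years * 12:.1f} months")
```

Under that assumption a 10,000-fold fall over ten years corresponds to the cost halving roughly every nine months.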


DNA sequencing provides the means to know how the nucleotide bases are arranged in a piece of DNA. The method was pivotal to the international Human Genome Project. Costing over US$3 billion and taking 13 years to complete, this project provided the first complete human DNA sequence in 2003. This data has provided for the first time a tool to map out the genetic mutations that underlie specific genetic diseases. It has also opened up a path to more personalised medicine, enabling scientists to examine the extent to which a patient's response to a drug is determined by their genetic profile. The genetic profile of a patient's tumour, for example, can now be used to work out the most effective treatment for an individual. It is also hoped that in the future the information from the human DNA sequence will provide a means to work out a person's predisposition to certain diseases, such as heart disease, cancer and type II diabetes, which could pave the way to better preventative care. The Human Genome Project is also helping fuel the development of gene therapy, which is being used to replace defective genes in certain genetic disorders. It is also being used for the design of drugs to target specific genes that cause disease. Another offshoot of DNA sequencing is genetic testing for paternity and other family relationships. It is also used to identify crime suspects and victims involved in a catastrophe. The technique is also being used to detect bacteria and other organisms that may pollute air, water, soil and food. In addition the method is important to the study of the evolution of different population groups and their migratory patterns, as well as to determining pedigree for seed or livestock.

Sequencing DNA: timeline of key events

Date | Event | People | Places
August 26, 1895 | Johann Friedrich Miescher died | Miescher | -
June 3, 1929 | Werner Arber was born in Granichen, Switzerland | Arber | University of Geneva
December 1944 | Kary Banks Mullis was born | - | Cetus Corporation
1952 | First observation of the modification of viruses by bacteria | Luria, Human | University of Illinois
1955 | Sanger completes the full sequence of amino acids in insulin | Sanger | Cambridge University
1957 | Victor Ingram breaks the genetic code behind sickle-cell anaemia using Sanger's sequencing technique | Ingram, Sanger | Cambridge University
1958 | Sanger awarded his first Nobel Prize in Chemistry | Sanger | Cambridge University
1960 | National Biomedical Research Foundation established | Ledley | Georgetown University
1960 | Sanger begins to devise ways to sequence nucleic acids, starting with RNA | Sanger | Cambridge University
1962 | Concept of restriction and modification enzymes born | Arber, Dussoix | University of Geneva
1962 | Sanger moves to the newly created Laboratory of Molecular Biology in Cambridge | Sanger | Laboratory of Molecular Biology
1965 | Transfer RNA is the first nucleic acid molecule to be sequenced | Holley | Cornell University
1965 | Werner Arber predicts restriction enzymes could be used as a laboratory tool to cleave DNA | Arber | University of Geneva
1965 | Atlas of Protein Sequence and Structure published | Dayhoff | National Biomedical Research Foundation
1965 | Ledley publishes Uses of Computers in Biology and Medicine | Ledley | National Biomedical Research Foundation
1965 | Sanger and colleagues publish two-dimensional partition sequencing method | Sanger, Brownlee, Barrell | Laboratory of Molecular Biology
1967 | First automatic protein sequencer developed | Edman, Begg | St Vincent's School of Medical Research
1968 - 1970 | Restriction enzymes found to act as chemical knives to cut DNA | Smith, Nathans | University of Geneva, University of California in Berkeley, Johns Hopkins University
1968 | The first partial sequence of a viral DNA is reported | Wu, Kaiser | Cornell University, Stanford University Medical School
1969 | First principles for PCR published | Khorana, Kleppe | University of Wisconsin-Madison
1969 | New species of bacterium is isolated from hot spring in Yellowstone National Park by Thomas Brock | Brock | Case Western Reserve University
July 1970 | First restriction enzyme isolated and characterised | Smith, Wilcox | Johns Hopkins University
1971 | Process called repair replication for synthesising short DNA duplexes and single-stranded DNA by polymerases is published | Khorana, Kleppe | MIT
May 1971 | Complete sequence of the cohesive ends of bacteriophage lambda DNA reported | Wu, Taylor | Cornell University
December 1971 | First experiments published demonstrating the use of restriction enzymes to cut DNA | Danna, Nathans | Johns Hopkins University
1973 | The sequencing of 24 base pairs is reported | Gilbert, Maxam | Harvard University
1975 | Sanger and Coulson publish their plus minus method for DNA sequencing | Sanger, Coulson | Laboratory of Molecular Biology
1977 | Complete sequence of bacteriophage phi X174 DNA determined | Sanger | Laboratory of Molecular Biology
1977 | First computer programme written to help with the compilation and analysis of DNA sequence data | McCallum | Laboratory of Molecular Biology
February 1977 | Two different DNA sequencing methods published that allow for the rapid sequencing of long stretches of DNA | Sanger, Maxam, Gilbert | Harvard University, Laboratory of Molecular Biology
October 1978 | Nobel Prize for discovery and understanding of restriction enzymes | Arber, Nathans, Smith | Johns Hopkins University, University of Geneva
1980 | Sanger awarded his second Nobel Prize in Chemistry | Sanger, Gilbert | Harvard University, Laboratory of Molecular Biology
January 1980 | European Molecular Biology Laboratory convenes meeting on Computing and DNA Sequences | - | EMBL
September 1980 | First DNA sequence database created | Dayhoff | National Biomedical Research Foundation
1980 | Largest nucleic acid sequence database in the world made available free over telephone network | Dayhoff | National Biomedical Research Foundation
1982 | Whole genome sequencing method is introduced for DNA sequencing | - | -
June 1982 | NIH agrees to provide US$3.2 million over 5 years to establish and maintain a nucleic sequence database | - | -
1983 | Polymerase chain reaction (PCR) starts to be developed as a technique to amplify DNA | Mullis | Cetus Corporation
June 1984 | Results from PCR experiments start being reported | Mullis | Cetus Corporation
March 1985 | Mullis and Cetus Corporation filed patent for the PCR technique | Mullis | Cetus Corporation
March 1985 | DNA fingerprinting principle laid out | Jeffreys | University of Leicester
December 20, 1985 | The Polymerase Chain Reaction technique was published | Mullis | Cetus Corporation
1986 | First machine developed for automating DNA sequencing | Hood, Smith, Hunkapiller | California Institute of Technology, Applied Biosystems
1986 | Human Genome Organization founded | - | -
1988 | US Congress funds genome sequencing | - | -
April 1988 | Development of first rapid search computer programme to identify genes in a new sequence | Pearson, Lipman | -
April 1988 | First pitch for US Human Genome Project | - | -
October 1990 | Human Genome Project formally launched | - | -
1992 | GenBank is integrated into the NIH National Center for Biotechnology Information | - | -
July 1995 | Craig Venter's team at The Institute for Genomic Research (TIGR) published the first complete sequence of the 1.8 Mbp genome of a free-living organism (the bacterium Haemophilus influenzae) | Venter | The Institute for Genomic Research
1996 | Complete genome sequence of the first eukaryotic organism, the yeast S. cerevisiae, is published | - | -
1996 | Pyrosequencing is introduced for DNA sequencing | Ronaghi, Nyren | Royal Institute of Technology
May 1998 | Commercial Human Genome Project launched | Venter | Celera Genomics
December 1998 | Complete genome sequence of the first multicellular organism, the nematode worm Caenorhabditis elegans, is published | - | Sanger Institute, Washington University
1999 | First human chromosome sequence published | - | -
2000 | Complete sequences of the genomes of the fruit fly Drosophila and the first plant, Arabidopsis, are published | - | -
June 2000 | Human genome draft sequence announced | - | -
December 2000 | First plant DNA sequenced | - | -
February 2001 | First consensus sequence of human genome published | - | Celera
2002 | Complete genome sequence of the first mammalian model organism, the mouse, is published | - | -
July 2002 | Poliovirus synthesised | - | Stony Brook University
October 2002 | Genomic sequence of the principal malaria parasite and vector reported | - | -
April 2003 | The sequence of the first human genome was published | - | -
April 2003 | First DNA microarray diagnostic device approved | - | -
May 2006 | Last human chromosome is sequenced | - | -
January 2011 | DNA sequencing proves useful to documenting the rapid evolution of Streptococcus pneumoniae in response to the application of vaccines | - | Wellcome Trust Sanger Institute
June 2012 | DNA sequencing helps identify the source of an MRSA outbreak in a neonatal intensive care unit | Peacock, Parkhill | Cambridge University, Wellcome Trust Sanger Institute
December 2012 | DNA sequencing utilised for identifying neurological disease conditions different from those given in the original diagnosis | - | University of California San Diego
