Emergence of microbial population genetics

One of the most important means to understand and prevent the spread of AMR is to pinpoint the origin, sources, reservoirs and circulation patterns of the microbes responsible for the transmission of resistance. For many years this was difficult to work out because scientists lacked the appropriate tools to compare and discriminate between different microbial strains. For decades, bacteria were identified and classified by microscopic examination, staining, serotyping and phage typing.

The early techniques, however, were not sufficiently discriminatory for picking up on all the different lineages and strains, which is essential for tracking the evolution of resistance and increases in infection frequency and severity. For instance, although the techniques could identify bacterial species and some different subtypes they did not yield data on ancestry (phylogeny) or relationships within a sub-type. Thus, they were not suitable for studying the detailed global or local spread of bacterial strains, many of which share too many similarities genetically to be distinguished from each other (Spratt interview; Spratt autobiography; Ashton).

Many different pathogenic bacterial species had been described and sub-divided by the mid-twentieth century. This had been achieved largely on the basis of techniques that characterised observable traits, or phenotypes. Such traits, however, can be highly variable and can change very rapidly in response to specific selection pressures. Serotyping, for example, looks for a cell surface marker that has the potential to evolve rapidly, a change that can be driven by the immune system of those infected. Similarly, with phage typing, the marker evaluated is one that can change as part of the bacteria's evolutionary race to escape attack by a virus (Spratt interview; Spratt autobiography). In each case, although the marker is variable, each distinguishable form of it has to persist long enough to be useful for epidemiology.

One of the first developments towards a more discriminatory system was based on gel electrophoresis, a technique developed in the 1960s that involves passing an electric current through a gel to separate biological molecules based on their size or their charge. Separation is achieved because the electric current moves smaller or differently charged molecules through the gel at different speeds. Perhaps the first application of gel electrophoresis for bacterial typing was reported in 1963, when it was used to distinguish different pathogenic bacterial strains that infect insects (Norris, Burger).

Figure 6.1: Apparatus for gel electrophoresis (left) and typical readout (right). The technique involves loading samples into wells on one side of a gel and applying an electric current to pull them through the gel. The different molecules can be distinguished because they travel through the gel at different speeds. The readout shows a typical result of DNA electrophoresis, showing the movement of different fragments (credit: Wikipedia).

Gel electrophoresis moved towards a genetic solution in 1973 when Roger Milkman, a zoologist based at the University of Iowa, began to explore its use to test the neutral mutation-random drift hypothesis in bacteria. This hypothesis suggests that most shorter-term evolutionary change at the molecular level, and indeed most variability within species, is caused not by selection of the fittest, as proposed by Charles Darwin, but rather by the random drift of mutant genes that are selectively neutral (Spratt interview; Spratt autobiography; Rocha). To test the neutral theory Milkman needed to find genetically linked markers and hit upon looking at the variation in metabolic enzymes because these evolve naturally at a relatively slow rate. He first applied the method to the investigation of diversity in Escherichia coli, an intestinal bacterium (Milkman).

Called 'multilocus enzyme electrophoresis' or MLEE for short, Milkman's technique involved running a number of extracts from different bacterial strains on a gel and then soaking the gel with a chemical stain that highlighted where particular enzymes appeared on the gel. The stain made it possible to characterise different bacterial isolates based on the location patterns of the enzymes (Spratt interview). He developed this system to identify bacterial species, sub-species and biotypes.
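
The idea of characterising isolates by their enzyme patterns can be illustrated with a minimal Python sketch. It assumes that each isolate has already been scored for the mobility variant seen for each stained enzyme. The enzyme names, isolate labels and scores are invented for illustration and are not taken from Milkman's or Selander's data, and the grouping function is a hypothetical helper rather than a published procedure.

```python
from collections import defaultdict

# Hypothetical enzymes and scores (illustrative only): for each isolate,
# the mobility variant observed on the gel for each stained enzyme.
ENZYMES = ("enzyme_1", "enzyme_2", "enzyme_3")

isolates = {
    "isolate_A": {"enzyme_1": 1, "enzyme_2": 2, "enzyme_3": 1},
    "isolate_B": {"enzyme_1": 1, "enzyme_2": 2, "enzyme_3": 1},
    "isolate_C": {"enzyme_1": 3, "enzyme_2": 2, "enzyme_3": 1},
}


def group_by_profile(scores, enzymes=ENZYMES):
    """Group isolates whose mobility variants match at every enzyme."""
    groups = defaultdict(list)
    for name, variants in scores.items():
        # The profile is the ordered combination of variants across enzymes.
        profile = tuple(variants[enzyme] for enzyme in enzymes)
        groups[profile].append(name)
    return dict(groups)


groups = group_by_profile(isolates)
for profile, names in groups.items():
    print(profile, sorted(names))
```

In this toy data set isolates A and B share a profile and would be treated as the same type, while isolate C differs at one enzyme and falls into a group of its own.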

The method quickly grabbed the attention of Robert Selander, an evolutionary biologist at the University of Rochester. In 1981, he and colleagues published a paper outlining its use for determining the frequency of the transfer or exchange of genes encoding enzymes in E. coli hosted by humans. Their work provided some indirect evidence for a low level of gene transfer, a process also known as recombination, in natural populations of E. coli. When it did occur it was sometimes down to genes transferred by plasmids, or potentially other mobile genetic elements found within bacteria (such as transposons, integrons and bacteriophage elements) (Caugant, Levin, Selander). Selander's laboratory soon became a major centre for the use of MLEE to characterise other bacterial pathogen populations, helped by its preparation of a compendium of methods for enzyme extraction, gel electrophoresis and specific enzyme staining (Selander et al 1986).

Figure 6.2: Photograph of Robert K Selander (1927-2015). Credit: Penn State University. Born in Salt Lake City, Utah, Selander trained in zoology and initially focused on researching the behaviour and evolution of birds, particularly house sparrows, before turning his attention to the application of molecular genetic approaches to understanding bacterial population structures. Among the bacterial species he worked on were Salmonella, a pathogen linked to food-borne diseases, and Neisseria meningitidis, a common cause of meningitis and septicaemia.

Importantly, Selander's pioneering work opened up a new chapter in clinical microbiology and was instrumental in the foundation of bacterial molecular epidemiology. With MLEE, scientists now had a tool with which to characterise the isolates of bacterial pathogens and identify particular strains or subtypes associated with disease based on variation in several housekeeping enzymes. The method also made it possible to establish which strains were very common or very rare and to undertake some general geographic analysis of where they were found. It soon became clear that while bacterial pathogen isolates were genetically diverse, a large proportion of disease worldwide was caused by only a small number of strains (Spratt interview, Spratt Foreword; Selander et al 1987).

Based on his MLEE studies Selander concluded that bacterial populations were highly clonal - that is, they predominantly reproduced asexually, with each cell vertically passing on a single genome copied from its parent. However, gene exchange was occurring and these events were increasing diversity within populations, a factor that could facilitate more detailed discrimination within bacterial species. Selander's approach detected the overall conservation within bacterial clones but could also identify the few recombinant genes that were being exchanged between lineages. As these were relatively rare events, isolates tended to present with exactly the same MLEE profile even when they were collected in countries very far apart or many decades apart (Spratt interview; Spratt 2020). His system could pick out rare novel outliers, though not reliably, and it was not good at identifying ancestry or phylogeny.

By the late 1980s, however, research carried out by Brian Spratt, a microbiologist at Sussex University working on bacterial mechanisms of resistance to penicillin, suggested that recombination was much more common in bacteria than Selander's low-resolution approach had found. In 1987 he began investigating the penicillin sensitivity of Neisseria gonorrhoeae, a species of bacteria that causes sexually transmitted genitourinary infections as well as other forms of disease including inflammation of the eye in newborns, which if left untreated can cause blindness. Treated extensively with penicillin, several strains of the bacteria had developed resistance to the drug since the 1960s. Spratt wanted to know whether this resistance was caused by alterations in the genes coding for the proteins that penicillin targets. He had previously established that these proteins play a key role in the synthesis of the bacterial cell wall, and that penicillin binds to them to inhibit cell division and thereby kill the bacteria.

Figure 6.3: Photograph of Brian Spratt, 1993, credit: Royal Society, Brian Spratt. Born in Margate, Kent, in 1947, Spratt undertook both a bachelor's degree and a doctorate in microbiology at University College London. His doctorate focused on the genetics of DNA synthesis and cell division in Salmonella typhimurium. In 1973 Spratt joined Arthur Pardee's biochemistry laboratory at Princeton University for two years where he began investigating penicillin-binding proteins (PBPs) in bacteria as part of his work to understand cell division. During this research he pioneered a method to isolate multiple PBPs of E. coli which enabled him to dissect their role in the elongation, shape and division of the bacteria. Armed with this method, Spratt managed to identify the target of penicillin for the first time. His method also proved useful to pharmaceutical companies who used it to establish how well new beta-lactam antibiotics bound to individual killing targets. It also opened up a means to study the role of PBP mutations in the rise of resistance to antibiotics (Spratt 2012). Following his time in Princeton, Spratt continued working on PBPs at Leicester University with the support of an MRC grant and then became a lecturer in biochemistry at Sussex University in 1980.

To find out, Spratt compared the nucleotide sequences for the protein taken from penicillin-sensitive and penicillin-resistant strains. Expecting to find one or two mutations in the sequence taken from the resistant strain, he was surprised to discover that the first part of the sequence was largely similar to the one from the penicillin-sensitive strain but that there were then great blocks that were very different. Spratt was very puzzled as to how there could be so many changes in one part of the sequence and none in the rest. The only explanation he could come up with was that some sort of recombination event had happened whereby new pieces of DNA had been transferred into the sequence from a closely related Neisseria species and that this somehow rendered the protein less susceptible to being inhibited by penicillin (Spratt interview).

Initially unsure about his hypothesis, Spratt discussed it with John Maynard Smith, a theoretical evolutionary biologist and geneticist at Sussex University. Maynard Smith confirmed his interpretation and Spratt published his results in Nature in 1988. It was the first example of a bacterium that had become resistant to penicillin by recombination with a very close relative that was itself resistant (Spratt interview; Spratt 1988). The following year Spratt's team discovered the same phenomenon in clinical isolates of Neisseria meningitidis and Streptococcus pneumoniae that were resistant to penicillin (Dowson, Hutchinson, Spratt; Zhang, Spratt).

Figure 6.4: Photograph of John Maynard Smith, 1984, taken by David Streeter and Colin Atherton (credit: SxUOS1/3/23/19 University of Sussex Collection, University of Sussex Special Collections at The Keep). Born in London, John Maynard Smith (1920-2004) developed an interest in natural history early in his childhood when his family moved to Exmoor. Maynard Smith initially trained as an aeronautical engineer and then took a second degree in zoology at University College London, focusing on fruit fly genetics. Taught by the biologist J.B.S. Haldane at UCL, Maynard Smith developed a deep interest in population genetics and evolution, which became the focus of his subsequent career. He is best known for his application of mathematical tools to explain and predict evolutionary behaviour. Maynard Smith devoted the last ten years of his life to studying the evolution of bacteria and antibiotic resistance.

With the new findings challenging Selander's view that bacteria were highly clonal, Spratt and Maynard Smith decided to look more closely into the issue by examining all the MLEE data that had been collected on bacterial populations. It soon became clear that Selander had based his conclusions on isolates from bacterial strains that cause clinical disease, which are very rare and make up only a small minority of isolates compared to those taken from asymptomatic carriers. His interpretation that recombination was a rare event was therefore based on a biased population sample. In 1993 Spratt and Maynard Smith published a highly influential paper demonstrating that while some bacterial species were clonal, in others recombination was common. The fact that recombination was much more common than previously assumed had major implications for understanding antimicrobial resistance. This is because recombination of blocks of DNA makes it much easier for bacteria to develop resistance than in the case of simple mutations (Spratt interview; Maynard Smith et al).

Figure 6.5: Diagram showing the spectrum of bacterial population structures. Figure 1 from Maynard Smith et al. Copyright (1993) National Academy of Sciences, U.S.A.

While MLEE had helped to uncover the existence of recombination in bacteria, Spratt grasped early on that it was not an ideal tool for tracking the development of drug resistance in bacteria around the world. Although the principles behind MLEE were sound - indexing variation between different neutral markers - it was tedious to set up and involved many fiddly and time-consuming steps. Its results were also hard to compare between laboratories and it did not discriminate sufficiently between closely related strains (Spratt interview; Maiden et al).

In the early 1990s Spratt began discussing how to improve MLEE with Martin Maiden, a molecular microbiologist at the National Institute for Biological Standards and Control working on the characterisation of Neisseria meningitidis. They were keen to transform it into a sequence-based method. This was technically daunting on two fronts. Firstly, automated DNA sequencing was still in its infancy. Secondly, no one had yet managed to sequence the full genome of a microbe, which meant they had no data to draw on to validate a new method (Spratt autobiography).

Nothing happened for some time until, in 1994, Maiden managed to persuade a group of scientists in four different laboratories to sequence a number of housekeeping metabolic gene fragments from a reference set of N. meningitidis strains put together by Dominique Caugant (Norwegian Institute of Public Health) and Mark Achtman (Warwick University). N. meningitidis was ideal for validating the new method because it had already been studied extensively with MLEE and some data on relationships within the species was available. Caugant and Achtman's reference set contained 107 isolates of N. meningitidis collected from invasive disease and healthy carriers (Maiden et al).

Figure 6.6: Photomicrograph of Gram-stained Neisseria meningitidis in cerebrospinal fluid as seen under a microscope at 1000 times magnification. Credit: Wikipedia, Microman12345. Often referred to as meningococcus, N. meningitidis is carried by humans in their nasopharynx and can be spread by saliva and respiratory secretions, such as when coughing, sneezing and kissing. The bacterium is the main cause of bacterial meningitis in children and young adults. Initially it can cause symptoms like fatigue, fever and headache, which can then progress to a stiff neck, coma, developmental impairment and death in 10 percent of cases. Meningococcus can also cause septicaemia, which manifests as a rash of red or purple discoloured spots on the skin that do not disappear when pressure is applied.

By 1996 Jiaji Zhou, a Chinese-born microbiologist working with Spratt, had managed to sequence the first two DNA fragments from the reference set with encouraging results. Her achievement helped to galvanise the other laboratories to start sequencing additional housekeeping metabolic genes. Two years later the group reported they had successfully sequenced fragments of around 470 base pairs from 11 housekeeping genes in the reference set, on the back of which they had managed to develop a new molecular system for the characterisation of pathogenic microorganisms. They called their new system 'multilocus sequence typing' or MLST for short (Maiden et al).

Figure 6.7: Photograph of Jiaji Zhou (1959-2017), credit: Brian Spratt. Zhou completed a medical degree in Beijing and left China shortly after the Tiananmen Square protests in 1989, having secured a visa to study English in England. She ended up at Sussex University because she had a good family friend there, and it was through him that she first met Brian Spratt, who took her on as a post-doctoral researcher (Spratt interview).

The new MLST system had two major advantages. First, it detected much more variation than was possible with MLEE. Second, the sequence data was much easier to compare between laboratories. In addition, the team had developed a single expanding central multilocus sequence database to be used alongside the MLST technique. Hosted online, this database made it possible to upload and interrogate data relating to different species. With the database containing many thousands of strains, it was now possible for anyone from anywhere in the world to determine whether a strain had already been identified and where it had been seen before. It also provided a tool for assessing how many changes a strain had undergone, when and where this had happened, and how often these changes were caused by a mutation or a recombination event (Spratt 1988; Maiden et al). Indeed, this was the first step towards a phylogenetic method for typing and analysis based on ancestry. Additionally, bacteriologists were able to get a handle on likely mutation rates for the first time.
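
The way sequence data can be turned into a portable strain label and compared against previously seen strains can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method or database of Maiden et al: the locus names, isolate labels and short sequences are invented, and numbering alleles and sequence types in order of first appearance is a simplification chosen for the example.

```python
# Hypothetical loci and toy sequences (real fragments are far longer).
LOCI = ("locus_1", "locus_2", "locus_3")

isolates = {
    "isolate_A": {"locus_1": "ACGT", "locus_2": "GGCA", "locus_3": "TTAG"},
    "isolate_B": {"locus_1": "ACGT", "locus_2": "GGCA", "locus_3": "TTAG"},
    "isolate_C": {"locus_1": "ACGA", "locus_2": "GGCA", "locus_3": "TTAG"},
}


def allelic_profiles(isolates, loci=LOCI):
    """Give each distinct sequence at a locus a number; return one tuple per isolate."""
    allele_numbers = {locus: {} for locus in loci}
    profiles = {}
    for name, sequences in isolates.items():
        profile = []
        for locus in loci:
            known = allele_numbers[locus]
            sequence = sequences[locus]
            if sequence not in known:
                # Assumption: alleles are numbered in order of first appearance.
                known[sequence] = len(known) + 1
            profile.append(known[sequence])
        profiles[name] = tuple(profile)
    return profiles


def sequence_types(profiles):
    """Give each distinct allelic profile a sequence type number."""
    type_of_profile = {}
    types = {}
    for name, profile in profiles.items():
        if profile not in type_of_profile:
            type_of_profile[profile] = len(type_of_profile) + 1
        types[name] = type_of_profile[profile]
    return types


def loci_differing(profile_a, profile_b):
    """Count the loci at which two allelic profiles carry different alleles."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


profiles = allelic_profiles(isolates)
types = sequence_types(profiles)
print(profiles)
print(types)
print(loci_differing(profiles["isolate_A"], profiles["isolate_C"]))
```

Because the profile is just a short list of numbers derived from sequence, two laboratories that follow the same numbering can compare isolates without exchanging gels. In the toy data, isolates A and B receive the same type while isolate C differs from them at a single locus.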

By providing a genetic marker for the strain of pathogen that caused a disease, the MLST system opened up a new chapter for epidemiological surveillance and public health decisions. Importantly, it made it possible to determine whether isolates collected from a localised outbreak of disease were all the same or were different strains. It also provided a means to work out the extent to which strains in one geographic region were related to each other and to strains circulating elsewhere in the world. MLST's high degree of precision also made it possible to pinpoint whether strains shared a recent common ancestor or a more distant one (Maiden et al). Further, the system provided a simple digitised method for exchanging data accurately between laboratories.


Ashton, FE (1990) 'Multilocus enzyme electrophoresis - Application to the study of meningococcal meningitis and listeriosis', Canadian Journal of Infectious Diseases, 1/14, 146–48.

Caugant, DA, Levin, BR, Selander, RK (July 1981) 'Genetic diversity and temporal variation in the E. coli population of a human host', Genetics, 98/3, 467-90.

Dowson, CG, Hutchinson, A, Spratt, BG (25 Sept 1989) 'Nucleotide sequence of the penicillin-binding protein 2B gene of Streptococcus pneumoniae strain R6', Nucleic Acids Research, 17/18.

Maiden, MCJ, et al (17 March 1998) 'Multilocus sequence typing: A portable approach to the identification of clones within populations of pathogenic microorganisms', Proceedings of the National Academy of Sciences USA, 95/6, 3140-45.

Maynard Smith, J, Smith, NH, O'Rourke, M, Spratt, BG (May 1993) 'How clonal are bacteria?', Proceedings of the National Academy of Sciences USA, 90, 4384-88.

Milkman, R (1973) 'Electrophoretic variation in Escherichia coli from natural sources', Science, 182, 1024-26.

Norris, JR, Burger, HD (1963) 'Esterases of crystalliferous bacteria pathogenic for insects: epizootiological application', Journal of Insect Pathology, 5, 460-72.

Rocha, EPC (June 2018) 'Neutral Theory, Microbial Practice: Challenges in bacterial population genetics', Molecular Biology and Evolution, 35/6, 1338–47.

Selander, RK, et al (May 1986) 'Methods of multilocus enzyme electrophoresis for bacterial population genetics and systematics', Applied and Environmental Microbiology, 51/5, 873-84.

Selander, RK, et al (1987) 'Population genetics of pathogenic bacteria', Microbial Pathogenesis, 3, 1-7.

Spratt, B (10 March 1988) 'Hybrid penicillin-binding proteins in penicillin-resistant strains of Neisseria gonorrhoeae', Nature, 332, 173-76.

Spratt, B (unpublished) 'Autobiography'.

Spratt, B (2010) 'Foreword', in DA Robinson, D Falush, EJ Feil, eds, Bacterial Population Genetics in Infectious Diseases, xi-xiii.

Spratt, B (July 2012) 'The 2011 Garrod Lecture: From penicillin-binding proteins to molecular epidemiology', Journal of Antimicrobial Chemotherapy, 67/7, 1578–88.

Spratt, B (2020) 'Microbial Genomics: Standing on the shoulders of giants', Microbial Society.

Spratt, Brian, interview by Lara Marks, 2 March 2020.

Zhang, Q-Y, Spratt, BG (11 July 1989) 'Nucleotide sequence of the penicillin-binding protein 2 gene of Neisseria meningitidis', Nucleic Acids Research, 17/13, 5383.
