Appendix 8: CLIMB Data
The figures in this appendix provide a detailed overview of the volume of sequencing undertaken by different COG-UK partners between 1st March 2020 and 1st August 2021 as recorded in the CLIMB (Cloud Infrastructure for Big Data Microbial Bioinformatics) database.
Data on the numbers of genomes published by individual sequencing organisations were downloaded from the CLIMB database and provided by Tom Brier of the University of Birmingham. Daniel Power handled the data and generated the plots in R version 4.2.3 (2023-03-15), using the packages tidyverse_1.3.2, reshape_0.8.9, knitr_1.40, zoo_1.8-11, ggplot2_3.3.6 and ggrepel_0.9.2.
Cumulative numbers of genomes published by each collective institute type combined with individual publication numbers

Figure 1: The cumulative number of genomes published by each collective institution type is plotted against time, with the black line showing the consortium’s combined total. The points in the scatterplot show each individual publication of genomes and are coloured and shaped according to institution type. Counts are plotted on a logarithmic scale. Credit: Daniel Power.
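The running totals behind a plot like Figure 1 can be illustrated with a minimal sketch. This is not the code used for the figure, which was produced in R. The record layout `(institution_type, day, n_genomes)`, the function name `cumulative_counts`, and the institution labels in the example are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def cumulative_counts(records):
    """Return {institution_type: [(day, running_total), ...]}.

    A "combined" series is added to mirror the consortium-wide total.
    Each record is a hypothetical (institution_type, day, n_genomes) tuple.
    """
    series = defaultdict(list)
    running = defaultdict(int)
    # Process publications in date order so the totals accumulate over time.
    for inst_type, day, n_genomes in sorted(records, key=lambda r: r[1]):
        for key in (inst_type, "combined"):
            running[key] += n_genomes
            series[key].append((day, running[key]))
    return dict(series)


# Made-up example data; the labels are placeholders, not CLIMB categories.
example_records = [
    ("academic", date(2020, 3, 2), 10),
    ("public_health", date(2020, 3, 3), 5),
    ("academic", date(2020, 3, 4), 20),
]
example_series = cumulative_counts(example_records)
```

Each series holds one point per publication, so the combined series always ends at the sum of all genomes in the input.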
Numbers of genomes sequenced by each individual sequencing organisation

Figure 2: The number of genomes published by each organisation is shown by the bars, and the cumulative sum is plotted in black. The date of each organisation’s first publication is displayed in the top left corner of each plot, and its final publication total, as of August 2021, is displayed in the top right corner. Credit: Daniel Power.
Genome numbers published along with daily sequencing rates by institution type

Figure 3: Individual publication numbers are represented by dots, coloured by institution type. The line of best fit shows the daily sequencing rate for each institution type. It was generated by taking the 7-day rolling mean of the daily totals for each institution type and then applying the locally estimated scatterplot smoothing (loess) method of ggplot’s geom_smooth(), with a span of 0.42. Credit: Daniel Power.
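The first step described in the caption, a 7-day rolling mean of daily totals per institution type, can be sketched as follows. This is an illustrative sketch only: the figure itself was produced in R, and the loess smoothing step is not reproduced here. The record layout, the function names, the trailing (rather than centred) window, and the treatment of days without publications as zero are all assumptions, since the source does not state them.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_totals(records):
    """Sum genome counts per (institution_type, day).

    Each record is a hypothetical (institution_type, day, n_genomes) tuple.
    """
    totals = defaultdict(int)
    for inst_type, day, n_genomes in records:
        totals[(inst_type, day)] += n_genomes
    return totals


def rolling_mean_7d(totals, inst_type, start, end):
    """Trailing 7-day mean of the daily totals for one institution type.

    Assumption: days with no recorded publication count as zero, including
    days before `start` that fall inside the first windows.
    """
    means = {}
    day = start
    while day <= end:
        window = [
            totals.get((inst_type, day - timedelta(days=k)), 0)
            for k in range(7)
        ]
        means[day] = sum(window) / 7
        day += timedelta(days=1)
    return means


# Made-up example data; "academic" is a placeholder label.
example_records = [
    ("academic", date(2020, 3, 1), 7),
    ("academic", date(2020, 3, 1), 7),
    ("academic", date(2020, 3, 3), 14),
]
example_totals = daily_totals(example_records)
example_means = rolling_mean_7d(
    example_totals, "academic", date(2020, 3, 1), date(2020, 3, 10)
)
```

The resulting daily means would then be passed to a smoother; in the original analysis that role was played by geom_smooth() with the loess method.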
Made possible by the generous support of COVID-19 Genomics UK Consortium (COG-UK)

COG-UK was supported by funding from the Medical Research Council (MRC) part of UK Research & Innovation (UKRI), the National Institute of Health Research (NIHR) (grant code: MC_PC_19027), and Genome Research Limited, operating as the Wellcome Sanger Institute. This exhibition acknowledges use of data generated through the COVID-19 Genomics Programme funded by the Department of Health and Social Care. The views expressed are those of the author and not necessarily those of the Department of Health and Social Care or UKHSA.