Following previous efforts, BGI (formerly known as the Beijing Genomics Institute), based in Shenzhen, China, and its collaborators at the University Medical Centre Hamburg-Eppendorf, together with a growing number of researchers around the world who are "crowdsourcing" this data, are exploring the European disease outbreak in depth and helping to trace the origin and spread of the lethal E. coli strain. Different sources have reported that two strains, 01-09591, isolated in Germany in 2001, and 55989, isolated in Central Africa in 2002, are highly similar to the 2011 outbreak strain. Based on the most recently curated assembly publicly released by BGI, these strains share an identical multilocus sequence typing (MLST) profile, ST678, based on analysis of seven important "housekeeping" genes.*
BGI's latest analysis indicates that the two German strains (01-09591, originally isolated in 2001, and TY2482, from the 2011 outbreak) have identical profiles for all 12 virulence/fitness genes and all seven MLST housekeeping genes. However, at some point over this 10-year period the 2011 outbreak strain seems to have developed the ability to resist many additional types of antibiotics. The latest data now point to the 2001 German strain as the more likely candidate ancestor, as the African strain (55989) appears to be genetically more "distant": its Shiga-toxin-producing gene and tellurite-resistance genes were shown to be absent. The value of sharing initial data so quickly is further supported by the fact that the link to this original strain has already been independently verified by other groups.
The latest evidence suggests that the 2001 German strain is the most likely ancestor of the 2011 outbreak strain. This may imply that rapid evolution led to the gain of additional genes during the last 10 years. Further comparisons between the genomes of these bacteria will greatly help clarify why the latest outbreak has been so exceptionally pathogenic, and will also provide clues for tracing the origin, spread and source of the disease, which would significantly aid the frontline health care workers fighting to control this new global outbreak.
Unfortunately, the 2001 German strain currently has no publicly available genome sequence, although it was preliminarily analyzed during the original outbreak and stocks and samples are hopefully still stored. Given the great strides already made by the community in just a few days since the sharing of its original genomic data, BGI is appealing to any labs that hold isolates of this key strain to share samples and the corresponding data for sequencing and analysis. The idea seems to be welcomed by the scientific community.
"The origin (patient index) of the E. coli EHEC O104:H4 is not yet known. One may however remember that a strain with a similar surface antigen was isolated in Germany in 2001," says Antoine Danchin, PhD, DSci, microbiologist and founder of the HKU-Pasteur Research Centre. Danchin further stresses this in a recent posting: "Knowing the kinship between the present strain and the older one will be of the utmost importance." (http://www.normalesup.org/~adanchin/populus/journalist.html )
*Technical note: while there have been some reports that the MLST gene sequences of the 2001 German isolate and the 2011 outbreak strain may have minor differences, BGI bioinformaticians believe these discrepancies are likely due to sequencing and assembly errors. Using the most recently curated assembly from BGI, they have found that the three strains do have identical MLST profiles. There is also likely confusion arising from previous reports that made MLST comparisons with an earlier United States isolate of serotype O104. BGI has now found this U.S. isolate to be significantly different from the 2001 German isolate: all seven of its MLST genes differ from those of the outbreak strain, although it has the same serotype.