This week's edition of Nature has a brief paper (doi:10.1038/nature07390) reporting the identification of an HIV-positive tissue sample collected in 1960 in Leopoldville (now Kinshasa), in what was then the Belgian Congo and is now the Democratic Republic of the Congo. Sequence data derived from the tissue were used to investigate the chronology of the appearance of HIV from its likely simian origin.
This piece of research hit the news services (see for example this page at the BBC news website). It has a number of features which earmark it for media interest: an important virus, a serious disease with a global spread, and a simple take-home message as to the origin of the virus. This raised my interest and I looked at the paper. Incidentally, the paper raises issues to do with the complexity of statistical analysis: I imagine many readers like me, and the journos who wrote articles in the press, have little or no chance of understanding what an "unconstrained Bayesian Markov chain Monte Carlo method" is, and are correspondingly limited in any real critical analysis of conclusions reached by that means! I am forced to assume that all is above board in the statistical and computational aspects of this paper, and that the referees have done their job! In addition, it's always interesting in studies of ancient DNA (and in cases where samples were not originally preserved with nucleic acids in mind) to know what measures were taken to ensure that contamination with modern DNA did not happen.
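As a rough illustration of what a Markov chain Monte Carlo method does (and emphatically not the analysis in the paper, which works on sequence alignments and trees), here is a toy Metropolis sampler in Python. It estimates a single "substitution rate" from made-up counts of changes over made-up time spans. The data, the flat prior, the Poisson model and all the function names are my own assumptions for the sake of the sketch.

```python
import math
import random

# HYPOTHETICAL toy data, invented for illustration only:
# number of changes observed on four lineages, and the time span (years) of each.
counts = [4, 7, 5, 9]
durations = [10.0, 20.0, 15.0, 25.0]


def log_posterior(rate, counts, durations):
    """Log posterior (up to a constant) of a Poisson rate, flat prior on rate > 0."""
    if rate <= 0:
        return -math.inf
    return sum(c * math.log(rate * t) - rate * t for c, t in zip(counts, durations))


def metropolis(counts, durations, n_steps=20000, step=0.1, start=1.0, seed=1):
    """Random-walk Metropolis sampler; returns the full chain of sampled rates."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, counts, durations)
    samples = []
    for _ in range(n_steps):
        # Propose a small random move away from the current value.
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, counts, durations)
        # Accept with probability min(1, posterior ratio); otherwise stay put.
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


samples = metropolis(counts, durations)
kept = samples[2000:]  # discard the early "burn-in" part of the chain
posterior_mean = sum(kept) / len(kept)
print("estimated rate (changes per year):", round(posterior_mean, 3))
```

The point of the sketch is only that the method wanders through plausible parameter values, visiting each in proportion to how well it explains the data, so the spread of the chain gives an uncertainty estimate as well as a best guess. In this toy case the answer can also be worked out by hand (the posterior is a Gamma distribution with mean 26/70), which is a useful check that the sampler behaves.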
HIV is thought to have entered human populations relatively recently, with HIV-1 most likely derived from the chimpanzee virus SIVcpz. A lot of work's been done on the evolution of this virus: on the basis of sequence analysis (of cDNA derived from HIV samples) HIV-1 isolates fall into three groups, and one of these, the M group, is the one that's spread globally, being responsible for >95% of HIV infections. A phylogeny of HIV and related viruses is shown below (and another here).
Within Group M are a number of subtypes, designated by letters (A-D, F-H, J and K). There's a relationship between geographical distribution and each of these subtypes, which appear to have resulted from independent founder events. Subtype B is found in North America and Europe (and pretty much all the isolates from those areas fall into B). The novel isolate described here (DRC60) comes from subtype A, while a previous African archival sample (ZR59, dating from 1959) is of subtype D.
Twenty-seven archived samples of patient tissue were screened by RT-PCR, and one was identified as containing HIV RNA. I mentioned above that extreme care needs to be taken when attempting to recover ancient DNA (for example from museum or archaeological samples). There are several reasons for this:
- the Polymerase Chain Reaction (PCR) is extremely sensitive to contamination with extraneous DNA - for example modern DNA - from other work going on in the lab
- the DNA (or in this case RNA) won't have been preserved in ideal conditions - this makes PCR amplification of the ancient samples difficult (and makes the recovery of products from contaminating modern DNA more likely)
Most labs dealing with ancient DNA samples take a number of measures to ensure their work isn't compromised by contamination. Recent reviews of these criteria can be found in this article in Trends in Ecology and Evolution, and this article in PLoS One. I don't know whether this list of criteria is the canonical one used in the field: certainly several points would be relevant to any PCR experiment! Steps taken by Worobey et al. to avoid (and detect) contamination are listed in detail in the supplementary data. Chief among these are:
- work was carried out in labs experienced in recovering ancient nucleic acids
- results were reproducible across repeated independent extractions
- samples were analysed in two independent laboratories
- evaluation of mRNA quality using RT-PCR to amplify an endogenous human gene (this revealed mRNA quality to be poor - only short fragments could be recovered)
- as with the earlier ZR59 sample, the new sample appeared to be basal in the phylogeny
- a number of measures taken to avoid contamination are described in the Methods section
So, to return to the question as to why this hit the media. Without knowing what press releases were circulated (see my prior posting on my own BBSRC-mediated press release), I guess the answer is that there is a single, easily understood message - that the origins of HIV are a bit clearer with this new data point. Clearly the details of the analysis (which are beyond my capacity to critically review) are going to be lost in a news report! Where does the research take us beyond an interesting take on the history of a pandemic (and it's not unique - recall the studies on the 1918-19 flu virus isolates)? Well, the potential for new diseases to emerge from animal reservoirs is always there, particularly where human populations live in close proximity to wild animal populations, and an understanding of the dynamics of novel disease emergence will prove important.
Michael Worobey, Marlea Gemmel, Dirk E. Teuwen, Tamara Haselkorn, Kevin Kunstman, Michael Bunce, Jean-Jacques Muyembe, Jean-Marie M. Kabongo, Raphaël M. Kalengayi, Eric Van Marck, M. Thomas P. Gilbert, Steven M. Wolinsky (2008). Direct evidence of extensive diversity of HIV-1 in Kinshasa by 1960. Nature, 455 (7213), 661-664. DOI: 10.1038/nature07390