Comparative geneticists around the world just gave a collective gasp. The New York Times has posted an article about research in the journal PLoS One showing that about one-fifth of the nonhuman genome sequences in computerized databases around the world contain human DNA, mostly from the researchers who prepared the samples for sequencing. Says reporter Nicholas Wade:
The contamination may mislead researchers who assume that any genome sequence in a major databank is highly accurate. Dr. Rachel O’Neill said the problem is likely to become more serious in the future as individual human genomes are sequenced for medical reasons. Contamination of human samples by other human DNA is very hard to distinguish from normal variation, and could lead to erroneous medical decisions.
“The level of contamination found in these databases is significant and worrisome,” the researchers write.
Now I'm trying to remember whether I contributed any sequences as a graduate student to either of the two databases surveyed by the researchers (NCBI and Ensembl, for those of you in the know). It's probable that I did. Did I unintentionally immortalize bits of my own genetic sequence as well? I'm fascinated by the idea. But if so, I really hope those rogue nucleotides don't mess up anybody else's experiments. I did enough of that when I was actually in the lab!