A 1000 human genomes…and some mycoplasma too

The revolution in rapid, cost-effective, high-throughput sequencing technologies has set a new trend in large-scale biomedical research. With such vast amounts of data being produced, controlling basic sequence quality downstream presents several challenges, and sequence contamination is a key area of concern. Mycoplasma are among the most common contaminants of cell cultures. These minute bacteria lack cell walls and are particularly problematic because they are difficult to detect, even with a light microscope. But to what extent is contamination by mycoplasma corrupting downstream sequence databases? William Langdon, from the Department of Computer Science at University College London, UK, sought to address this question in his study in BioData Mining.

Langdon analysed the 1000 Genomes Project database – a large, highly respected, international study that has made its data publicly available to researchers worldwide and aims to produce a detailed catalogue of human DNA variation. By downloading and scanning a random sample of more than 50 billion DNA sequences from diverse data sources, and mapping them against published genomes, he found tens of thousands of sequences that may have come from mycoplasma contamination. While many matches were of low quality, NCBI BLAST searches confirmed that some high quality, low entropy sequences matched mycoplasma strains. Overall, these results suggested that at least seven percent of public data provided by the 1000 Genomes Project may be contaminated with mycoplasma.
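
For readers unfamiliar with the term, sequence entropy is a measure of how varied the bases in a read are: a run of a single repeated base scores zero, while an even mix of A, C, G and T scores the maximum of two bits per base. The Python sketch below shows one common way to compute it, per-base Shannon entropy. It is an illustration only: the function name and the example reads are invented here, and this summary does not say how entropy was actually calculated in Langdon's study.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Per-base Shannon entropy of a sequence, in bits (0.0 for an empty string)."""
    if not seq:
        return 0.0
    counts = Counter(seq.upper())
    total = len(seq)
    # Sum p * log2(1/p) over the observed base frequencies.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical reads, invented for illustration; not taken from any database.
example_reads = {
    "single repeated base": "AAAAAAAAAAAAAAAA",
    "two alternating bases": "ATATATATATATATAT",
    "all four bases": "ACGTTGCAGGCTAACG",
}

for label, read in example_reads.items():
    print(f"{label}: {shannon_entropy(read):.2f} bits per base")
```

A per-base measure like this ignores the order of the bases, so it cannot distinguish a short repeated motif from a genuinely complex sequence; real screening tools generally use more elaborate complexity measures.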

The results probably come as little surprise to those who already know how troublesome mycoplasma can be in molecular biology laboratories. While it is a cause for concern, cross-species contamination in single-species databases such as the 1000 Genomes Project is relatively easy to screen for compared with contamination from other members of the same species. However, as ever-increasing amounts of genomic data become available in the public domain and in silico research in biology grows, Langdon’s study highlights the need for further independent studies into sequence contamination of large databases.
