[Bioperl-l] Sorry in advance, not exactly a BIOPERL question, but sure you know about it...

Thomas Sharpton thomas.sharpton at gmail.com
Fri Mar 9 16:49:04 EST 2012

Hi Juan,

If I understand you correctly, you want to assemble viral genomes from
metagenomic reads. While assembly of metagenomic data can be
straightforward in some situations (e.g., low-complexity communities), it
is generally difficult and can often result in chimeras. By mapping
sequences to reference genomes (i.e., fragment recruitment), you can
effectively reduce the complexity of the community in silico and
subsequently reduce the possibility of chimeric errors.

That said, reference genomes frequently represent a relatively small subset
of the total diversity in a community, so you might have to adopt liberal
mapping parameters if you want to minimize the amount of bacterial,
archaeal and eukaryotic DNA in your metagenome. This could, of course,
result in the spurious filtering of viral reads that happen to share some
similarity with a reference genome. Personally, I would prefer to lose
some of the viral reads and produce incomplete assemblies if I were
confident that it would decrease the chance of chimeric assemblies. Then
again, I personally try to avoid assembly from metagenomic data when
possible, so I may be lending biased advice.
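
To make the trade-off concrete, here is a minimal sketch of filtering reads
by their best match to non-viral references. Everything in it is an
assumption for illustration: the Hit record, the split_reads helper, the
example reads and the threshold values are hypothetical and do not come from
any particular mapper or pipeline. It only shows how loosening the identity
and coverage cutoffs removes more contaminant reads while also discarding
reads that merely resemble a reference.

```python
# Hypothetical sketch: the hit format, names and thresholds are assumptions.
from typing import Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    read_id: str
    percent_identity: float  # identity of the alignment to a non-viral reference
    aligned_fraction: float  # fraction of the read covered by the alignment


def split_reads(read_ids: Iterable[str], hits: Iterable[Hit],
                min_identity: float,
                min_aligned: float) -> Tuple[List[str], List[str]]:
    """Return (kept, removed) read IDs.

    A read is removed if any of its hits meets both thresholds. Lowering
    the thresholds makes the mapping more liberal.
    """
    flagged = {h.read_id for h in hits
               if h.percent_identity >= min_identity
               and h.aligned_fraction >= min_aligned}
    kept, removed = [], []
    for rid in read_ids:
        (removed if rid in flagged else kept).append(rid)
    return kept, removed


reads = ["r1", "r2", "r3", "r4"]
hits = [
    Hit("r1", 99.0, 1.00),  # near-perfect match to a bacterial reference
    Hit("r2", 82.0, 0.60),  # divergent match, possibly a viral read
    Hit("r3", 70.0, 0.30),  # weak, partial match
]

# Strict parameters remove only the obvious contaminant.
strict = split_reads(reads, hits, min_identity=95.0, min_aligned=0.9)
# Liberal parameters also remove the divergent read.
liberal = split_reads(reads, hits, min_identity=80.0, min_aligned=0.5)

print("strict :", strict)
print("liberal:", liberal)
```

With the liberal settings, r2 is dropped along with r1; whether that is a
contaminant or a lost viral read is exactly the uncertainty described above.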

The DeRisi lab has done some great work on the subject of viral genome
assembly from metagenomic data. I recommend you take a look at PRICE, which
you can download from the link below:


I'm not sure I specifically answered your question. I do hope this helps
and would be happy to talk more, if you like. But I'm certainly no
metagenomics assembly expert. And we might move this conversation off the
list given that it isn't quite on topic.

Good luck,

On Fri, Mar 9, 2012 at 9:33 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Fri, Mar 9, 2012 at 5:24 PM, Juan Jovel <jovel_juan at hotmail.com> wrote:
> >
> > Hello All!
> > No clear to me what generally speaking is the advantage of filtering out reads
> > when we are interested in de novo assembly of specific taxonomic groups
> > (i.e. bacteria, viruses, fungi, etc). More specifically, my questions are:
> > 1. In a metagenomics library, if I am interested in de novo assembly of virus
> > genomes, should I remove bacterial and human reads (that's what I do), and
> > leave in phages and known viral sequences.
> That might work - but you may remove virus reads mapping onto integrated
> prophage inside the bacterial etc references you use. Be careful.
> Peter
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
