[Bioperl-l] Mummer parser and data analysis

shalabh sharma shalabh.sharma7 at gmail.com
Wed Oct 20 14:35:54 EDT 2010


I have used MEGAN a lot in the past, but again, it takes BLAST files as input,
and here I just want to avoid BLAST.
The problem is that it's an environmental sample, so I can't throw away any of
the data.

I know it's not a BioPerl question, but I was just wondering: is there any
program that can align my reads against a huge database at the amino acid level
(like RefSeq or a subset of RefSeq)?
Or is alignment even a good option?

I really appreciate your input.

Thanks
Shalabh


On Wed, Oct 20, 2010 at 2:25 PM, Chris Fields <cjfields at illinois.edu> wrote:

> I recall there being a lot of tools available for these.  In particular,
> one of my colleagues has used MEGAN with some success:
>
> http://www-ab.informatik.uni-tuebingen.de/software/megan
>
> If the sample is from a specific host (e.g. gut microbiome), you can
> set up initial short read runs that act to filter out sequences you might
> not be interested in (namely those that belong to the host), then run
> alignments against more focused databases (rRNA, for instance, if one is
> doing meta-transcriptomic analyses).  Beyond that, I agree that assembly
> should be included early in the analysis, if it isn't already the initial
> step.
>
> chris
>
> On Oct 20, 2010, at 11:35 AM, shalabh sharma wrote:
>
> > Hey Chris,
> >              Thanks for the reply, it was really useful.
> > Actually, you are right, it is a metagenomics sample. The thing is, I've
> > never worked with such a huge amount of data, so I am testing all the
> > available alignment programs to see if I can avoid blastx.
> >
> > Blasting 200 million reads doesn't seem like the right option (maybe I
> > will go with assembly and then BLAST the result).
> >
> > Thanks
> > Shalabh
> >
> >
> > On Wed, Oct 20, 2010 at 11:45 AM, Chris Fields <cjfields at illinois.edu> wrote:
> > On Oct 20, 2010, at 10:00 AM, shalabh sharma wrote:
> >
> > > Hi All,
> > >        Is there any module for mummer in Bioperl?
> > >
> > > Also, I need some suggestions and ideas (I think this is the best place
> > > to ask).
> > > I am working with huge data (around 200 million Illumina reads). Earlier
> > > I was using blastx and other similar approaches to annotate, but now I
> > > think that's not possible. I would be very grateful if anyone can give
> > > me some ideas regarding this.
> > >
> > > Thanks
> > > Shalabh
> >
> > Hard to say unless we know a little more about what you are attempting to
> > do.  Not sure why you are using mummer here, but...
> >
> > This is something fairly well-covered in the literature for most use
> > cases, and on places like SEQanswers.  If you are doing something like
> > aligning reads to reference genome(s) or a set of gene models, you should
> > be using something like bowtie/tophat, bwa, etc., with the output in SAM
> > (BioPerl has Perl wrappers for most of these programs).
> >
> > You can also do the same for metagenome analyses, but you may need to run
> > BLAST and convert to SAM (maybe that's what you are doing?).  The samtools
> > package comes with perl scripts to do that and can be further used to sort
> > the matches, convert to and index a BAM file for fast access, etc.  From
> > there you can then use tools like Bio::DB::SAM, R/Bioconductor/Rsamtools,
> > or similar to access the sequences, find coverage statistics, run SNP
> > calls, etc.
> >
> > And, for the record, we do have an experimental mummer parser, but I
> > believe it lies in a branch at the moment (don't think it has been merged
> > yet):
> >
> > http://github.com/bioperl/bioperl-live/tree/topic/bug-2701
> >
> > chris
> >
>
>
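For anyone following the SAM route Chris describes above, here is a rough sketch of the simplest kind of downstream tally: counting mapped reads per reference sequence. It is plain Python, not Bio::DB::SAM or samtools. The function name count_mapped_reads and the example records are made up for illustration, and it assumes uncompressed SAM text with the standard 11 mandatory tab-separated fields.

```python
from collections import Counter


def count_mapped_reads(sam_lines):
    """Tally mapped reads per reference name from SAM-format text lines.

    Hypothetical helper for illustration only; it reads just the FLAG and
    RNAME columns and ignores everything else in the record.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        rname = fields[2]
        if flag & 0x4 or rname == "*":
            continue  # bit 0x4 set (or no reference name): read is unmapped
        counts[rname] += 1
    return counts


# Made-up records, not real data: three mapped reads and one unmapped read.
example = [
    "@SQ\tSN:refA\tLN:1000",
    "@SQ\tSN:refB\tLN:500",
    "read1\t0\trefA\t10\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t16\trefA\t50\t60\t4M\t*\t0\t0\tTTGA\tIIII",
    "read3\t0\trefB\t5\t30\t4M\t*\t0\t0\tGGCC\tIIII",
    "read4\t4\t*\t0\t0\t*\t*\t0\t0\tAAAA\tIIII",
]

if __name__ == "__main__":
    for ref, n in sorted(count_mapped_reads(example).items()):
        print(ref, n)
```

With a real file, the same function can be fed an open file handle, since it only iterates over lines. For 200 million reads, a sorted and indexed BAM file with samtools or Bio::DB::SAM is likely to be far more practical than parsing text.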

