[Bioperl-l] fetching all alignments from a sam/bam by read header in perl
j_martin at lbl.gov
Sun Feb 26 11:39:16 EST 2012
Sort the BAM by name so that all hits for a read are adjacent. If you
subsequently need random lookups, you could add or alter tags on each read
with multiple hits to indicate where the other hits are, then re-sort the BAM by
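
A minimal sketch of the name-sorted approach, written in Python for brevity (the same streaming loop ports directly to Perl). It assumes the BAM has already been sorted by read name, for example with `samtools sort -n`, and is piped in as SAM text from `samtools view`. The function names `read_name` and `alignments_by_read` are made up for this illustration. Because hits for a read are adjacent, only one read's alignments are held in memory at a time, so no big hash is needed.

```python
import sys
from itertools import groupby


def read_name(sam_line):
    # QNAME is the first tab-separated field of a SAM record.
    return sam_line.split("\t", 1)[0]


def alignments_by_read(lines):
    """Yield (read_name, [sam_records]) from name-sorted SAM text lines.

    Assumes all records for a read are adjacent (name-sorted input).
    Header lines (starting with '@') and blank lines are skipped.
    """
    records = (
        ln.rstrip("\n")
        for ln in lines
        if ln.strip() and not ln.startswith("@")
    )
    for name, group in groupby(records, key=read_name):
        yield name, list(group)


if __name__ == "__main__":
    # Print each read name with its number of alignments.
    for name, hits in alignments_by_read(sys.stdin):
        print(name, len(hits), sep="\t")
```

Hypothetical usage, assuming the script is saved as `group_hits.py` and the name-sorted file is `by_name.bam`: `samtools view by_name.bam | python group_hits.py`.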
On Sat, Feb 25, 2012 at 10:24 PM, Abhishek Pratap <abhishek.vit at gmail.com> wrote:
> Hi Guys
> Reading the doc page for Bio::DB::SAM I see there is a way to fetch reads
> by name (read id), but the documentation also says this is slow (copied
> below). I need to do about 300-500 million lookups, and if each one is
> costly I wanted to know if there is another slick low-level way. For my
> application I would not have the feature location, just the read name.
> -name Filter on reads with the designated name. Note that
> this can be a slow operation unless accompanied by
> the feature location as well.
> On Fri, Feb 24, 2012 at 6:58 AM, Abhishek Pratap <abhishek.vit at gmail.com> wrote:
> > Hi Peter
> > You got it right.
> > Here is the link :
> > -A
> > On Fri, Feb 24, 2012 at 1:24 AM, Peter Cock <p.j.a.cock at googlemail.com>
> > wrote:
> > > On Fri, Feb 24, 2012 at 12:55 AM, Abhishek Pratap
> > > <abhishek.vit at gmail.com> wrote:
> > >> I am wondering if there is a slick way to access all the possible
> > >> alignments for a read present in sam or bam file given the read
> > >> header. Since the existing codebase is in perl I would prefer
> > >> something which can be done in/via perl.
> > >>
> > >> By default BAMs are indexed by location, so the inbuilt samtools
> > >> indexing won't work, I guess.
> > >>
> > >> I should also say the input bam file will have in the order of 500
> > >> million total alignments and many reads are expected to be aligned to
> > >> more than one place in the genome. Given the size of the data loading
> > >> it all in one big hash is not turning out to be memory friendly.
> > >
> > > Are you asking for SAM/BAM read lookup by read name?
> > >
> > >> PS: I also posted this earlier on Biostar.
> > >
> > > Link?
> > >
> > > Peter