[Bioperl-l] Getting sequences by base pair locations
Kevin.M.Brown at asu.edu
Tue Aug 1 18:43:00 EDT 2006
Perl's WWW::Mechanize module is a great way to submit web forms
repeatedly. I use it for things like MHC epitope prediction sites, and
also to grab journal articles matching certain keywords.
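For anyone who wants the general shape of repeated form submission, here is a minimal sketch in Python rather than Perl. The URL and the 'sequence' field name are hypothetical placeholders, not any real site's form, and the sketch only builds the POST requests without sending them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form endpoint; substitute the real form's action URL.
FORM_URL = "https://example.org/predict"


def build_form_request(fields, url=FORM_URL):
    """Return a POST request carrying `fields` as a URL-encoded form body."""
    body = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def build_batch(sequences, url=FORM_URL):
    """Build one request per sequence ('sequence' is an assumed field name)."""
    return [build_form_request({"sequence": s}, url) for s in sequences]
```

Each request could then be sent with urllib.request.urlopen(), ideally with a pause between calls so the server is not hammered.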
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> Cook, Malcolm
> Sent: Tuesday, August 01, 2006 8:12 AM
> To: Yuval Itan; bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Getting sequences by base pair locations
> Glad to help. Given that you are not running blat suite locally, but
> at ucsc, you should try this approach:
> upload/paste your blat results (in blat's native output format, psl)
> as a custom track in the genome browser, named, say, myhumanhits
> (i.e. just give the blat results a new first line like: `track
> name="myhumanhits" description="myhumanhits from my favorite human
> genes" visibility=2`)
> then go to the table browser and configure it
> group = 'custom tracks'
> track = 'myhumanhits'
> region = genome
> output format = sequence
> output file = myhumanhits.fasta
> submit it
> When prompted, Save the myhumanhits.fasta to your computer and take it
> from there.
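The "new first line" step described above can be scripted. This is a minimal sketch in Python that assumes the psl results are already in a string; the function name add_track_line is made up for illustration, and the track attributes are just the ones from the example above.

```python
def add_track_line(psl_text, name, description, visibility=2):
    """Prepend a custom-track definition line to psl-format BLAT output.

    Only name, description and visibility are set, mirroring the example
    track line in the message; nothing else about the psl text is changed.
    """
    track = (
        f'track name="{name}" description="{description}" '
        f"visibility={visibility}"
    )
    return track + "\n" + psl_text
```

The returned text is what would be pasted or uploaded as the custom track.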
> I'm not sure how many hits this will work for, but I just did this on
> a small track and it works just fine. Only problem, the first word in
> the fasta defline is always the same for all sequences. You'll have to
> 'uniquify' these names somehow, probably (depending of course on your
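One simple way to make the deflines unique is to append a running counter to the first word of each one. The sketch below does that in Python; the defline layout in the usage note is a hypothetical example, not necessarily what the table browser emits.

```python
def uniquify_deflines(fasta_text):
    """Append _1, _2, ... to the first word of every FASTA defline."""
    out = []
    count = 0
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            count += 1
            # Split the defline into its first word and the remainder.
            first, _, rest = line[1:].partition(" ")
            line = f">{first}_{count}" + (f" {rest}" if rest else "")
        out.append(line)
    return "\n".join(out) + "\n"
```

With two records that both start with ">hit", the deflines would become ">hit_1 ..." and ">hit_2 ...", and the sequence lines are left untouched.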
> Let us know how it goes, and good luck. For good email support on the
> UCSC genome browser, subscribe to
> Malcolm Cook
> Database Applications Manager, Bioinformatics
> Stowers Institute for Medical Research
> >-----Original Message-----
> >From: bioperl-l-bounces at lists.open-bio.org
> >[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Yuval Itan
> >Sent: Tuesday, August 01, 2006 8:36 AM
> >To: bioperl-l at lists.open-bio.org
> >Subject: Re: [Bioperl-l] Getting sequences by base pair locations
> >Thank you all for all the helpful answers!
> >Malcolm- I've used the UCSC server to do the BLAT search (because I
> >couldn't run it locally due to memory problems)- so I could not get
> >the chimp sequences in a convenient way. I have the results also in a
> >normal Blat output including all usual fields: chromosome number etc.
> >Wade- thanks a lot for your offer, that would be great. The chimp
> >genome is just one large fasta format file.
> >On 28 Jul 2006, at 14:30, Sean Davis wrote:
> >> Yuval Itan wrote:
> >>> Hello all,
> >>> I was BLATing a few hundred human genes against the chimp genome,
> >>> and kept the best chimp hits for every human gene.
> >>> I have the base pair start and end location for every chimp hit,
> >>> and I need to get the sequence for each of these chimp hits. Here
> >>> is an example for a few chimp hits bp locations:
> >>> Start   End
> >>> 142854  144504
> >>> 154479  155198
> >>> 153066  167370
> >>> 163146  163559
> >>> I have one chimp genome file (about 3GB) including all
> >>> but I could also get one file per chromosome if that would make
> >>> things easier. Does anyone have a script or a link for an
> >>> that can do the job?
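For the local route asked about here, this is a minimal sketch in plain Python (not BioPerl) that pulls subsequences out of a multi-record FASTA file by chromosome, start and end. It assumes 1-based inclusive coordinates and that the first word of each defline is the chromosome name used in the hit list; both are assumptions to check against the actual BLAT output. The names read_fasta and extract_hits are made up for illustration.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # The first word of the defline is taken as the record name.
            name, chunks = (line[1:].split() or [""])[0], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def extract_hits(handle, hits):
    """Return {(chrom, start, end): subsequence} for each requested hit.

    `hits` is a list of (chrom, start, end) tuples, assumed to be
    1-based and inclusive.
    """
    by_chrom = {}
    for chrom, start, end in hits:
        by_chrom.setdefault(chrom, []).append((start, end))
    results = {}
    for name, seq in read_fasta(handle):
        for start, end in by_chrom.get(name, []):
            results[(name, start, end)] = seq[start - 1:end]
    return results


# Small in-memory demonstration with made-up sequences.
demo = io.StringIO(">chr1\nACGTAC\nGTTT\n>chr2\nGGGCCC\n")
print(extract_hits(demo, [("chr1", 2, 7), ("chr2", 1, 3)]))
```

This holds one chromosome in memory at a time, so one file per chromosome is not strictly needed, though a whole chromosome as a single string is still large.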
> >Bioperl-l mailing list
> >Bioperl-l at lists.open-bio.org