[Bioperl-l] New to bioperl

Frank Schwach fs5 at sanger.ac.uk
Tue Jan 24 10:46:04 EST 2012

Hi Joel,

This is certainly possible with BioPerl, and it would be a great way of 
learning Perl and BioPerl with a real application in mind. Be aware, 
though, that the initial learning curve is steep and you will need to 
invest quite a bit of time to get going.
If you want to do it, start on the BioPerl HOWTO pages, e.g. to see how 
to run BLAST from a script:
and how to read the BLAST report with a script:

There are examples that you can run and use as a starting point for your 
own scripts.
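
As a rough sketch of what reading a report can look like, assuming you 
have saved a standard BLAST text report from tblastn: the sub name 
collect_hits, the field names and the e-value cut-off are made up for 
illustration, while Bio::SearchIO and its next_result/next_hit/next_hsp 
iterators are part of BioPerl. Treat it as a starting point, not a 
finished tool.

```perl
use strict;
use warnings;
use Bio::SearchIO;

# Collect the genome coordinates of every HSP below an e-value cut-off.
# With tblastn the query is the protein and the hit is the contig.
sub collect_hits {
    my ( $report_file, $max_evalue ) = @_;
    my $searchio = Bio::SearchIO->new(
        -format => 'blast',
        -file   => $report_file,
    );
    my @hits;
    while ( my $result = $searchio->next_result ) {    # one per query protein
        while ( my $hit = $result->next_hit ) {        # one per matching contig
            while ( my $hsp = $hit->next_hsp ) {       # one per aligned region
                next if $hsp->evalue > $max_evalue;
                push @hits, {
                    protein => $result->query_name,
                    product => $result->query_description,
                    contig  => $hit->name,
                    start   => $hsp->start('hit'),
                    end     => $hsp->end('hit'),
                    strand  => $hsp->strand('hit'),
                    evalue  => $hsp->evalue,
                };
            }
        }
    }
    return @hits;
}

# Usage (file name is a placeholder): perl parse_hits.pl tblastn_report.txt
if (@ARGV) {
    for my $h ( collect_hits( $ARGV[0], 1e-10 ) ) {
        print join( "\t", @{$h}{qw(protein contig start end strand evalue)} ), "\n";
    }
}
```

Each record keeps the protein name next to the contig coordinates, which 
is the link you need between the BLAST output and the annotation step.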
Once you have your annotation (e.g. you found a good hit to a region in 
the genome and want to annotate the region with the gene name and 
description from the hit), you could construct your genome with 
annotations using another (or even the same) BioPerl script, where you 
would build something called a Bio::Seq object, which is described here:

which you can finally write to a flat text file in EMBL or GENBANK 
format or as a GFF file. You could even decide to turn that into a 
database and run a local copy of a genome browser off that. There are 
plenty of options (which is often one of the main problems).
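
To give an idea of the annotation step, here is a minimal sketch that 
attaches one feature to a sequence object and prints it in GenBank 
format. The contig, the coordinates, the gene name 'epsA' and the sub 
name annotate are invented placeholders; in a real script they would 
come from your contig file and your parsed BLAST hits.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;
use Bio::SeqIO;

# Made-up contig and hit, standing in for real data.
my $contig = Bio::Seq->new(
    -id       => 'contig_1',
    -seq      => 'ATGGCTAAAGGTCTGACCGAAGCTTAA',
    -alphabet => 'dna',
);
my %hit = (
    start   => 1,
    end     => 27,
    strand  => 1,
    gene    => 'epsA',    # hypothetical gene name
    product => 'hypothetical exopolysaccharide biosynthesis protein',
);

# Turn one hit into a CDS feature and attach it to the sequence.
sub annotate {
    my ( $seq, $h ) = @_;
    my $feature = Bio::SeqFeature::Generic->new(
        -start       => $h->{start},
        -end         => $h->{end},
        -strand      => $h->{strand},
        -primary_tag => 'CDS',
        -tag         => { gene => $h->{gene}, product => $h->{product} },
    );
    $seq->add_SeqFeature($feature);
    return $feature;
}

annotate( $contig, \%hit );

# Write the annotated contig; use -format => 'embl' for EMBL output.
my $out = Bio::SeqIO->new( -format => 'genbank', -fh => \*STDOUT );
$out->write_seq($contig);
```

For a real run you would read the contigs with Bio::SeqIO, call the 
annotation sub once per accepted hit, and write to a file instead of 
STDOUT.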
The best way of getting help is to start scripting and, when it fails, 
post your script along with a specific question here. If it is a general 
Perl question rather than a BioPerl one, you should also check out the 
really helpful community on perlmonks.org.
Hope this helps. Good luck!


On 24/01/12 14:46, Bradyjoel wrote:
> Hi Roy,
> Thank you for your quick reply. I already tried xbase and it finds some of
> the exopolysaccharide biosynthesis genes that I’m looking for, but only the
> genes that are already annotated in the other organisms. I also tried to
> merge these results but it still misses some of the genes or the correct
> annotation, and merging also cost a lot of time. Since I am only looking for
> a certain set of genes, I thought it might be easier to use a script
> that can blast these protein queries and add the annotation at the
> locations where it finds similarity in my sequence. I already tried to make a
> script myself but I'm still unsure how to connect the output of a blast to
> the action of adding the gene information and writing it to a file.
> Joel
> Roy Chaudhuri-3 wrote:
>> Hi Joel,
>> This is possible using BioPerl, but it may be simpler to use an online
>> automated annotation service eg:
>> http://www.xbase.ac.uk/annotation
>> Roy.
>> On 24/01/2012 13:38, Bradyjoel wrote:
>>> Hi all,
>>> I'm somewhat new to bioinformatics but I need to annotate a newly
>>> sequenced bacterial genome based on some known protein amino acid
>>> sequences from a neighbouring organism. I've been doing this manually
>>> with tblastn and then searching for and annotating this in Artemis.
>>> However, I have an entire directory full of these protein sequences and
>>> was wondering if this could be automated in such a way that I have an
>>> input nucleotide sequence consisting of contigs which are automatically
>>> translated in frames and then aligned and annotated with the known
>>> protein amino acid sequences. If you could help me with writing such a
>>> script I would be very grateful.
>>> Joel

