Bioperl-guts: Heads up for 0.05

Ewan Birney birney@sanger.ac.uk
Mon, 8 Mar 1999 09:05:22 +0000


On Sun, 7 Mar 1999, Aaron J Mackey wrote:

> 
> I'll take care of writing tests for the DB stuff, in which I'll check for
> the ability to open a socket (i.e. use the internet).  I believe that the
> DB implementations currently throw if they can't get a socket to use, but
> I'll double check and make sure.

That is something to check.

Ideally, if there is no internet connection, the test script should
print out "Can't test DB/GenBank and DB/GenPept because of no internet
connection", *but still pass the test*.

From experience, I know that even if you say "test 12 will fail if no
internet connection", people will get frantic if any test fails, even if
they are told to expect it.

(users eh?)
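
A minimal sketch of a test script that behaves this way, assuming the
modern Test::More module and a plain TCP connect as the "is there an
internet connection" check. The helper name can_open_socket, the host,
and the port are assumptions for illustration, and the two pass() calls
are placeholders for the real DB/GenBank and DB/GenPept checks:

```perl
use strict;
use warnings;
use IO::Socket::INET;
use Test::More tests => 2;

# Hypothetical helper: true if a TCP connection to $host:$port can be opened.
sub can_open_socket {
    my ($host, $port, $timeout) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => $timeout,
    );
    return 0 unless $sock;
    close $sock;
    return 1;
}

# Host and port are assumptions chosen only for this sketch.
my $online = can_open_socket('www.ncbi.nlm.nih.gov', 80, 5);

SKIP: {
    unless ($online) {
        # Tell the user why, then mark both tests as skipped (which still passes).
        diag("Can't test DB/GenBank and DB/GenPept because of no internet connection");
        skip('no internet connection', 2);
    }
    # The real retrieval checks would go here; placeholders keep the sketch runnable.
    pass('DB/GenBank retrieval placeholder');
    pass('DB/GenPept retrieval placeholder');
}
```

Skipped tests are reported as "ok ... # skip", so the harness shows a
pass either way and nobody sees a failure they were told to expect.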

> 
> I'll also write a SeqIO::Genbank and SeqIO::ASN1.1, but with Seq.pm that's
> not much.  What's the status with SeqFeature/Entry/WhateverItsCalledNow,
> is someone working on it?  I think having SeqIO streams for fancy formats
> would be great, but only if we can actually get at the information
> contained within.  (Note:: currently DB::GenBank/Pept is only getting the
> fasta-formatted data from ncbi, and populating a Seq.pm object - this is
> an area where a SeqIO::Genbank -> Entry.pm object would be useful).
> 

I am working on it. Well, designing it (take a look at
Projects/SeqAnnot on the web page). This still needs quite a lot of
thought, although Steve and I reckon that we have quite a good angle on
the overall design.

It will not be in 0.05, sadly. I hope it will be in by 0.06. But I agree
it is a real necessity. Aaron - if you want to list your requirements
for the object, that would be great. I *don't* think that we should
necessarily just have everything that is in the ncbi/embl databases, but
that is a starting point. Functional requirements (I want to be able to
do xxx to the object) are also really helpful.


> On another note, we need to make sure that people don't use DB::Genbank to
> send multiple successive requests to ncbi (or any other web server), but
> rather to use batch-query mode (not currently implemented, but I'm working
> on it).  I got spanked earlier this week for stupidly slamming the ncbi
> server with >2000 hits/hour.  Mea culpa, mea culpa ...
> 

I don't know how we stop this. (Do you mean having a counter for each
request issued in a session, and starting to throw warnings after 100
get requests?)
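
The counter idea could look roughly like this. It is only a sketch: the
package name RequestCounter, the method note_request, and the limit
argument are all hypothetical and not part of any existing DB module.
A DB object would call note_request once per get request it sends:

```perl
use strict;
use warnings;

package RequestCounter;    # hypothetical name, not part of Bioperl

sub new {
    my ($class, %args) = @_;
    my $self = {
        count => 0,
        limit => defined $args{limit} ? $args{limit} : 100,
    };
    return bless $self, $class;
}

# Record one request; warn on every request beyond the limit.
sub note_request {
    my ($self) = @_;
    $self->{count}++;
    if ($self->{count} > $self->{limit}) {
        warn "$self->{count} requests issued in this session "
           . "(limit $self->{limit}); consider a batch query instead\n";
    }
    return $self->{count};
}

package main;

my $counter = RequestCounter->new(limit => 100);
$counter->note_request for 1 .. 3;    # well under the limit, so no warning
```

This only nags the user; it does not stop anyone from hammering the
server, so it complements batch-query mode rather than replacing it.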


> As far as interfacing DB-retrieved Seq's  with Index'ed databases, I'd
> suggest to leave it to the user:
> 
> IndexedDB->add_seq(DB::Genbank->get_Seq_by_id("FOO_HUMAN"));

The indexing doesn't work this way. It is more about whether this call
works:

  IndexDB->get_Seq_by_id("FOO_HUMAN");
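
To make the distinction concrete, here is a toy sketch of an index that
answers get_Seq_by_id by remembering where each entry starts in a FASTA
file, rather than by having sequences added to it one at a time. The
ToyIndex package is hypothetical and is not the Bioperl Index code; it
returns a plain sequence string instead of a Seq object and reads from
an in-memory file to stay self-contained:

```perl
use strict;
use warnings;

package ToyIndex;    # hypothetical; not the Bioperl Index module

# Scan the file once and record the byte offset of every ">ID" header.
sub new {
    my ($class, $fh) = @_;
    my %offset;
    while (1) {
        my $pos  = tell $fh;
        my $line = <$fh>;
        last unless defined $line;
        $offset{$1} = $pos if $line =~ /^>(\S+)/;
    }
    return bless { fh => $fh, offset => \%offset }, $class;
}

# Seek to the recorded offset and read sequence lines up to the next header.
sub get_Seq_by_id {
    my ($self, $id) = @_;
    my $pos = $self->{offset}{$id};
    return unless defined $pos;
    my $fh = $self->{fh};
    seek $fh, $pos, 0;
    my $header = <$fh>;    # discard the ">ID" line itself
    my $seq = '';
    while (defined(my $line = <$fh>)) {
        last if $line =~ /^>/;
        chomp $line;
        $seq .= $line;
    }
    return $seq;
}

package main;

my $fasta = ">FOO_HUMAN\nMKT\nAYI\n>BAR_MOUSE\nGGS\n";
open my $fh, '<', \$fasta or die "cannot open in-memory file: $!";
my $index = ToyIndex->new($fh);
print $index->get_Seq_by_id('FOO_HUMAN'), "\n";
```

The point of the sketch is that the index is built over an existing
flat file and then queried by id, the same call a remote DB object
would answer.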


> 
> (I haven't actually looked at the Index stuff yet, so forgive me if my
> syntax is inaccurate, but you get the idea).
> 
> DB is a one way street (stream?) -> ask for Seq, get Seq.  What you want
> to do with it is up to the user (crunch it, fragment it, index it, etc).
> 
> -Aaron
> 
>  o ~   ~   ~   ~   ~   ~  o
> / Aaron J Mackey           \
> \  Dr. Pearson Laboratory  / 
>  \ University of Virginia  \     
>  /  (804) 924-2821          \
>  \  amackey@virginia.edu    /
>   o ~   ~   ~   ~   ~   ~  o
> 
> 

Ewan Birney
<birney@sanger.ac.uk>
http://www.sanger.ac.uk/Users/birney/

=========== Bioperl Project Mailing List Message Footer =======
Project URL: http://bio.perl.org
For info about how to (un)subscribe, where messages are archived, etc:
http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl-guts.html
====================================================================