[Bioperl-l] arabidopsis + load_seqdatabase.pl

Angshu Kar angshu96 at gmail.com
Mon Dec 19 15:20:44 EST 2005


I've used the files from
ftp://ftp.ncbi.nih.gov/genomes/Arabidopsis_thaliana/CHR_V , but the
script cannot parse them in a way that fits the biosql-schema.
So I want some files that the script can parse correctly.
Otherwise, I have to load each and every file into the biodb and then check
whether it has been parsed correctly!
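
One way to avoid eyeballing every load is to ask the database directly how many entries ended up with the same value in all three columns. The following is only a rough sketch: it uses an in-memory SQLite table as a simplified stand-in for the BioSQL bioentry table (the real table has more columns and would normally live in MySQL or PostgreSQL), and the rows and the helper name count_collapsed_ids are made up for illustration.

```python
import sqlite3


def count_collapsed_ids(conn):
    """Return (collapsed, total): rows whose accession, name and
    identifier are all identical, and the total number of rows."""
    total = conn.execute("SELECT COUNT(*) FROM bioentry").fetchone()[0]
    collapsed = conn.execute(
        "SELECT COUNT(*) FROM bioentry "
        "WHERE accession = name AND name = identifier"
    ).fetchone()[0]
    return collapsed, total


# Simplified stand-in for the bioentry table (assumption: only the
# three columns under discussion plus a key are modelled here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bioentry ("
    "bioentry_id INTEGER PRIMARY KEY, "
    "accession TEXT, name TEXT, identifier TEXT)"
)
# Made-up rows: the first mimics the symptom, the second looks healthy.
conn.executemany(
    "INSERT INTO bioentry (accession, name, identifier) VALUES (?, ?, ?)",
    [("ACC_A", "ACC_A", "ACC_A"), ("ACC_B", "NAME_B", "0002")],
)

print(count_collapsed_ids(conn))  # (1, 2)
```

If nearly every row is "collapsed" after loading a given file, that points at the file (or the chosen format) rather than at individual records.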


On 12/19/05, Sean Davis <sdavis2 at mail.nih.gov> wrote:
> On 12/19/05 2:10 PM, "Angshu Kar" <angshu96 at gmail.com> wrote:
> > Hi Sean,
> >
> > What I need is precisely the latest arabidopsis files (peptide as well as
> > DNA) that have loaded the database successfully when used with the
> > load_seqdatabase.pl script.
> > I've tried some other files, but they don't load all the tables correctly
> > (e.g. they cannot distinguish between accession #, name and identifier, and
> > load the same data into all 3 columns).
>
> I might approach this in a different way.  I would seek to find the file or
> files that contain all the information that I want to store--this is the
> hard part in this case, perhaps.  If the data comes from TAIR (that looks to
> be a good source of genome information for arabidopsis), then you need to
> learn what files are there, what format they are in, what is in each of
> them, and what isn't.  Then, and only then, should you try to load the data
> into a database.  Only then can you determine what the problem is (if there
> is one) with loading data into bioperl-db.  Imagine, for example, that the
> datafile that you are trying to load includes only an accession.  In that
> case, bioperl-db can't load other information, because there isn't any to
> load.  So, you need to diagnose your own problem here, I think, and determine
> what is in the files that you have and why you have the situation in the
> database you have.
>
> So, what format file do you have right now and does bioperl support it?
> What is expected to be in that file?  Is everything that you need in the
> files that you have (you have to look at the files and understand them, not
> at the bioperl parsing of them)?
>
> Sean
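
Sean's suggestion to look at the files themselves, rather than at the parsed result, can be roughed out in a few lines. The sketch below is plain Python and does not use BioPerl. It guesses the flat-file format from the first non-blank line and, for FASTA, splits an NCBI-style pipe-delimited header into its labeled parts, so you can see which identifiers the file actually carries. The function names and the sample header are hypothetical, and the format checks are deliberately crude.

```python
import io


def sniff_format(handle):
    """Guess the flat-file format from the first non-blank line."""
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            return "fasta"
        if line.startswith("LOCUS"):
            return "genbank"
        if line.startswith("ID "):
            return "embl"
        return "unknown"
    return "empty"


def fasta_header_fields(header):
    """Split a FASTA header into labeled identifier parts.

    Assumes an NCBI-style 'tag|value|tag|value| description' layout;
    a header without '|' is reported as a single 'id'.
    """
    ident, _, description = header.lstrip(">").partition(" ")
    if "|" in ident:
        tokens = ident.split("|")
        # Pair up tag/value tokens; a trailing empty token is dropped.
        fields = dict(zip(tokens[0::2], tokens[1::2]))
    else:
        fields = {"id": ident}
    fields["description"] = description.strip()
    return fields


# Made-up example input, not taken from the NCBI files.
sample = io.StringIO(">gi|0000|ref|NC_EXAMPLE.1| made-up description\nACGT\n")
print(sniff_format(sample))
print(fasta_header_fields(">gi|0000|ref|NC_EXAMPLE.1| made-up description"))
```

If a header turns out to hold only one identifier, there is nothing distinct to put into accession, name and identifier, which would fit the symptom described above. A richer format such as GenBank may be the better input in that case.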
