[Bioperl-l] arabidopsis + load_seqdatabase.pl
osborne1 at optonline.net
Tue Dec 20 12:35:14 EST 2005
> I want them to be correctly parsed.
They have been parsed correctly, but you're looking in the wrong place. The
names and identifiers associated with things like "CDS" or "gene" will not
be found in the Bioentry table. The Bioentry is the entire NC_* record; the
genes, mRNAs, and proteins are called features and are stored separately.
Read the Feature-Annotation HOWTO and doc/schema-overview.txt in the biosql
package.
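
To make the bioentry/feature split concrete, here is a minimal sketch in
Python with an in-memory SQLite database. It assumes a heavily trimmed
version of the BioSQL tables (bioentry, seqfeature,
seqfeature_qualifier_value, term); the real DDL in the biosql package has
more columns and constraints and should be treated as authoritative. All row
values (NC_DEMO, DEMO_0001, and so on) and the helper names build_demo_db and
feature_qualifiers are made up for illustration. The "note"/"bar" row mirrors
the foo /note="bar" example quoted further down.

```python
import sqlite3

# Trimmed, illustrative subset of the BioSQL tables (an assumption, not the
# real DDL; see doc/schema-overview.txt in the biosql package).
SCHEMA = """
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    name TEXT, accession TEXT, identifier TEXT, version INTEGER
);
CREATE TABLE term (
    term_id INTEGER PRIMARY KEY,
    name TEXT
);
CREATE TABLE seqfeature (
    seqfeature_id INTEGER PRIMARY KEY,
    bioentry_id INTEGER REFERENCES bioentry(bioentry_id),
    type_term_id INTEGER REFERENCES term(term_id)
);
CREATE TABLE seqfeature_qualifier_value (
    seqfeature_id INTEGER REFERENCES seqfeature(seqfeature_id),
    term_id INTEGER REFERENCES term(term_id),
    rank INTEGER,
    value TEXT
);
"""

# Feature type, qualifier name, and qualifier value for every feature
# that belongs to one bioentry (looked up by accession).
QUERY = """
SELECT ft.name, qt.name, q.value
FROM seqfeature sf
JOIN bioentry be ON be.bioentry_id = sf.bioentry_id
JOIN term ft ON ft.term_id = sf.type_term_id
JOIN seqfeature_qualifier_value q ON q.seqfeature_id = sf.seqfeature_id
JOIN term qt ON qt.term_id = q.term_id
WHERE be.accession = ?
ORDER BY sf.seqfeature_id, q.rank
"""


def build_demo_db():
    """Create an in-memory database holding one made-up record."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # One bioentry row stands for the whole record.
    conn.execute("INSERT INTO bioentry VALUES (1, 'NC_DEMO', 'NC_DEMO', '0', 1)")
    # Terms name both feature types and qualifier keys.
    conn.executemany(
        "INSERT INTO term VALUES (?, ?)",
        [(1, "gene"), (2, "CDS"), (3, "locus_tag"), (4, "product"), (5, "note")],
    )
    # Two features on the record: a gene and its CDS.
    conn.executemany("INSERT INTO seqfeature VALUES (?, ?, ?)", [(10, 1, 1), (11, 1, 2)])
    # Qualifier values hang off the features, not off the bioentry.
    conn.executemany(
        "INSERT INTO seqfeature_qualifier_value VALUES (?, ?, ?, ?)",
        [
            (10, 3, 1, "DEMO_0001"),
            (11, 3, 1, "DEMO_0001"),
            (11, 4, 2, "hypothetical protein"),
            (11, 5, 3, "bar"),
        ],
    )
    return conn


def feature_qualifiers(conn, accession):
    """Return (feature_type, qualifier, value) tuples for one record."""
    return conn.execute(QUERY, (accession,)).fetchall()


if __name__ == "__main__":
    for row in feature_qualifiers(build_demo_db(), "NC_DEMO"):
        print(row)
```

The point of the sketch is only the shape of the join: per-gene names live in
the qualifier rows attached to each seqfeature, so a query restricted to the
bioentry table will never show them.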
On 12/19/05 5:39 PM, "Angshu Kar" <angshu96 at gmail.com> wrote:
> I've tried the .faa, .fna and .gbk files at the link mentioned below. After
> running the script, I looked at the loaded database and saw that in the
> bioentry table the 3 fields accession, identifier and name contain the
> same data. Also, the version column was not populated. I want them to be
> correctly parsed. So I want an arabidopsis data file that "goes well" with
> the load_seqdatabase.pl script.
> On 12/19/05, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>> On 12/19/05 3:20 PM, "Angshu Kar" <angshu96 at gmail.com> wrote:
>>> I've used files from
>>> ftp://ftp.ncbi.nih.gov/genomes/Arabidopsis_thaliana/CHR_V . But the
>>> script cannot parse them according to the biosql schema.
>>> So, I want some files that the script can parse correctly.
>>> Otherwise, I have to load each and every file into the biodb and then
>>> check whether it has been parsed correctly!
>> Which file are you trying to load? What format is it in? What values are
>> you expecting to be loaded that aren't? For the answer to the last
>> question, it will likely help folks to see exactly what line of the input
>> file isn't being loaded as you think it should be. For example, if there
>> is a line in a file that contains
>> foo /note="bar"
>> Then you can point out that you would like to know where, if at all, the
>> annotation associated with the foo tag is stored.