[Bioperl-l] gff_string on an HSPI object is not Bio::DB::GFF
cain at cshl.org
Fri Jan 9 10:40:20 EST 2004
On Fri, 2004-01-09 at 10:38, Jason Stajich wrote:
> Remember an HSP object is a combination of two SeqFeature objects (which
> are Bio::SeqFeature::Similarity objects).
> So when you call $hsp->gff_string you are calling $hsp->query->gff_string.
> If you want to see the gff for the target you do $hsp->hit->gff_string.
And that fixes the counterintuitive thing I just mentioned--should have
waited two minutes to hit send :-)
> See my search2gff.PLS script in scripts/utilities/search2gff.PLS for
> example usage of the object and production of Bio::DB::GFF appropriate GFF
> from a SearchIO parseable report.
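For anyone who only has the query-centred line in hand, here is a minimal
sketch of the flip in plain Perl. It is not taken from search2gff.PLS or
from BioPerl; flip_gff_line is a hypothetical helper, and it assumes nine
tab-separated columns with a group column of the form
"Target <name> <start> <end>". The strand is carried over unchanged, which
is a simplification that minus-strand hits would need more care with.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): swap the reference sequence
# and the Target of one GFF2 similarity line.
# Assumes nine tab-separated columns and a group column that looks like
# "Target <name> <start> <end>".
sub flip_gff_line {
    my ($line) = @_;
    chomp $line;

    my ($ref, $source, $type, $start, $end, $score, $strand, $phase, $group)
        = split /\t/, $line, 9;

    # Pull the target name and its coordinates out of the group column.
    my ($target, $tstart, $tend) = $group =~ /^Target\s+(\S+)\s+(\d+)\s+(\d+)/
        or die "no Target group in line: $line\n";

    # The old target becomes the reference; the old reference and its
    # coordinates move into the Target group. Strand is left as-is.
    return join "\t", $target, $source, $type, $tstart, $tend,
                      $score, $strand, $phase, "Target $ref $start $end";
}

# The query-centred line from Mark's example, rebuilt with tabs.
my $query_view = join "\t", 'a101', 'BLASTN', 'similarity', 138, 160, 23,
                            '+', 0, 'Target gi|12329259 125209 125231';

print flip_gff_line($query_view), "\n";
```

With a parsed report in hand, $hsp->hit->gff_string is the simpler route;
the sketch only covers the case where the GFF text already exists.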
> On Fri, 9 Jan 2004, Mark Wilkinson wrote:
> > Hi all,
> > I'm wondering if the gff_string call on an HSPI object is perhaps
> > backwards (or if it is Bio::DB::GFF that is backwards). It certainly
> > appears that I get "mirror image" data from that call compared to what I
> > need for Gbrowse.
> > e.g. I blast an EST (a101) against genbank. I then take the blast
> > report and parse it until I have an HSP object in my hand. Now...
> > If I do ->gff_string on that HSP object I get this:
> > DB<14> p $hsp->gff_string
> > a101 BLASTN similarity 138 160 23 + 0 Target gi|12329259 125209 125231
> > But by Gbrowse GFF standards what I expect to see (I think) is this:
> > gi|12329259 BLASTN similarity 138 160 23 + 0 Target a101 1 200
> > I know that Gbrowse GFF is a bit weird, but before I go coding something
> > new to deal with this problem I want to make sure that my interpretation
> > of the problem is correct, and that nobody has actually coded a solution
> > already (other than my GbrowseGFF ResultWriterI, which is what I am
> > working on updating right now).
> > One possibility is to modulate the output by passing an argument like
> > gff_string('query') or gff_string('hit') to indicate which of the
> > sequences you consider to be the "reference" sequence. I tried calling
> > gff_string on $hsp->query and $hsp->hit, but they have lost all
> > information about each other, so that doesn't help.
> > If anyone has a preference on how this should behave please say so. It
> > may be that we don't want BioPerl to exhibit Gbrowse GFF behaviour under
> > any circumstances, because it really is quite peculiar in the case of
> > alignment features. My opinion is that the current bioperl output is
> > more comprehensible than what Gbrowse is expecting ("Target" surely
> > means what you hit with your query, rather than your query itself...??),
> > but since Gbrowse & Bio::DB::GFF are so tightly integrated with BioPerl
> > it would probably be better to have some BioPerl way to generate the
> > output format expected by Bio::DB::GFF.
> > Also, what is the "correct" way to represent alignment features in
> > GFF3? Does ->gff_string output HSPs correctly in GFF3 format? If not,
> > then we should probably revisit this issue in its entirety.
> > Scott/Lincoln, is there a compelling reason for Gbrowse to require its
> > input in the format that it does, or could it be "flipped"?
> > Mark
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
Scott Cain, Ph. D. cain at cshl.org
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory