[Bioperl-l] [How to add features in genbank flat file]

Sebastien Moretti sebastien.moretti at igs.cnrs-mrs.fr
Thu Mar 24 06:05:27 EST 2005

No one seems to have a solution to this problem I posted a month ago.

So I changed my approach and now use 'wget' to fetch the GenBank sequences.
That way I get the full GenBank entry, with most of the features.
It also sidesteps another bug: with the BioPerl script I used, COMMENT
lines are not formatted the way they are at NCBI, and blank lines are
removed.

	#!/usr/bin/perl -w
	use strict;
	use diagnostics;
	use File::Cat;
	my $acc=$ARGV[0] or die "\n\tThe accession number you seek for is missing.\n\tTry something like: $0 NM_178432\n\n";
	`wget -O output_file.tmp 
	cat ("output_file.tmp", \*STDOUT);
	# wget -O output_file 

Sorry, I no longer use BioPerl to query GenBank (I still use it for other
applications), but BioPerl 1.5 has fixed neither the COMMENT bug nor the
missing features.
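
For anyone who wants to try the same wget workaround without a temporary file or File::Cat, here is a minimal sketch. It rests on assumptions: the URL in the script above did not survive in this message, so the sketch reads a URL template from a hypothetical GENBANK_URL_TEMPLATE environment variable, with {ACC} marking where the accession number goes. The names build_url and wget_command are illustrative only and are not part of BioPerl or wget.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: the real URL is not known here. Supply your own template,
# with {ACC} where the accession number belongs.
sub build_url {
    my ($template, $acc) = @_;
    # Refuse anything that does not look like an accession number.
    die "Unexpected characters in accession: $acc\n"
        unless $acc =~ /^[A-Za-z0-9_.]+$/;
    (my $url = $template) =~ s/\{ACC\}/$acc/g;
    return $url;
}

# Build the wget call as a list so that no shell ever parses the URL.
sub wget_command {
    my ($url) = @_;
    return ('wget', '-q', '-O', '-', $url);   # "-O -" writes to STDOUT
}

sub main {
    my ($acc) = @_;
    my $template = $ENV{GENBANK_URL_TEMPLATE}
        or die "Set GENBANK_URL_TEMPLATE first (with {ACC} as placeholder).\n";
    my @cmd = wget_command(build_url($template, $acc));
    system(@cmd) == 0
        or die "wget failed with exit status ", $? >> 8, "\n";
}

# Only fetch when run as a script with an accession number.
main($ARGV[0]) if !caller && @ARGV;
```

Called as `perl fetch_gb.pl NM_178432 > NM_178432.gb` (the script name is arbitrary), this prints the entry exactly as the server sends it, so COMMENT lines and blank lines are left untouched.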

> Hello,
> I saw that the GenBank web site has changed:
> features like 'SNPs' are no longer included in the EST flat files.
> At the NCBI web site, we must click on 'features: SNP' to add them to our flat 
> file.
> With BioPerl 1.4 or 1.5 it's the same: the variation features are no longer 
> included in the EST flat files that I download.
> Here is the script I use:
> 	#!/usr/bin/perl -w
> 	use strict;
> 	use Bio::DB::GenBank;
> 	use Bio::DB::Query::GenBank;
> 	use Bio::SeqIO;
> 	my $acc=$ARGV[0] or die "\n\tThe accession number you seek for is missing.\n\tTry something like: $0 NM_178432\n\n";
> 	$acc=$acc."[Accession]";
> 	my $query_string = "$acc";
> 	my $query = Bio::DB::Query::GenBank->new(-db=>'nucleotide',
> 	                                                 -query=>$query_string);
> 	my $gb = new Bio::DB::GenBank;
> 	my $stream = $gb->get_Stream_by_query($query);
> 	my $out=Bio::SeqIO->new(-format=>'genbank');
> 	my $seq = $stream->next_seq();
> 	my $result=$out->write_seq($seq);
> 	$result =~ s/^1.*$//;
> 	#print $out->write_seq($seq);
> 	print $result;
> 	exit;
> How can I get most of the features into my nucleotide flat files?
> Thanks

Sébastien Moretti
31 chemin Joseph Aiguier
13402 Marseille cedex
