[Bioperl-guts-l] bioperl commit

Brian Osborne bosborne at pub.open-bio.org
Thu May 20 08:44:20 EDT 2004


bosborne
Thu May 20 08:44:20 EDT 2004
Update of /home/repository/bioperl/bioperl-live/Bio/DB/Flat
In directory pub.open-bio.org:/tmp/cvs-serv4373

Modified Files:
	BinarySearch.pm 
Log Message:
Using single quotes simplifies use of regex, some other POD edits

bioperl-live/Bio/DB/Flat BinarySearch.pm,1.11,1.12
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/DB/Flat/BinarySearch.pm,v
retrieving revision 1.11
retrieving revision 1.12
diff -u -r1.11 -r1.12
--- /home/repository/bioperl/bioperl-live/Bio/DB/Flat/BinarySearch.pm	2004/05/20 02:56:47	1.11
+++ /home/repository/bioperl/bioperl-live/Bio/DB/Flat/BinarySearch.pm	2004/05/20 12:44:20	1.12
@@ -35,8 +35,8 @@
 Patterns have to be entered to define where the keys are to be indexed
 and also where the start of each record.  E.g. for fasta
 
-    my $start_pattern   = "^>";
-    my $primary_pattern = "^>(\\S+)";
+    my $start_pattern   = '^>';
+    my $primary_pattern = '^>(\S+)';
 
 So the start of a record is a line starting with a E<gt> and the
 primary key is all characters up to the first space after the E<gt>
@@ -60,7 +60,10 @@
 
 The index is now ready to use.  For large sequence files the perl way
 of indexing takes a *long* time and a *huge* amount of memory.  For
-indexing things like dbEST I recommend using the C indexer.
+indexing things like dbEST I recommend using the DB_File indexer, BDB.
+
+The formats currently supported by this module are fasta, Swissprot,
+and EMBL.
 
 =head2 Creating indices with secondary keys
 
@@ -92,13 +95,13 @@
 
     my %secondary_patterns;
 
-    my $start_pattern   = "^ID   (\\S+)";
-    my $primary_pattern = "^AC   (\\S+)\;";
+    my $start_pattern   = '^ID   (\S+)';
+    my $primary_pattern = '^AC   (\S+)\;';
 
-    $secondary_patterns{"ID"} = "^ID   (\\S+)";
+    $secondary_patterns{"ID"} = '^ID   (\S+)';
 
     my $index = new Bio::DB::Flat::BinarySearch(
-                -directory          => ".",
+                -directory          => $index_directory,
 		-dbname             => "ppp",
 		-write_flag         => 1,
                 -verbose            => 1,
@@ -109,8 +112,8 @@
 
     $index->build_index($seqfile);
 
-Of course having secondary indices makes indexing slower and more 
-of a memory hog.
+Of course having secondary indices makes indexing slower and use more
+memory.
 
 =head2 Index reading
 
@@ -147,9 +150,12 @@
 
     $index->secondary_namespaces("ID");
 
-Then the following calls can be used
+Then the following call can be used
 
     my $seq   = $index->get_Seq_by_secondary('ID','1433_CAEEL');
+
+These calls are not yet implemented
+
     my $fh    = $index->get_stream_by_secondary('ID','1433_CAEEL');
     my $entry = $index->get_entry_by_secondary('ID','1433_CAEEL');
 
@@ -237,7 +243,8 @@
  Function: create a new Bio::DB::Flat::BinarySearch object
  Returns : new Bio::DB::Flat::BinarySearch
  Args    : -directory          Root directory for index files
-           -dbname             Name of subdirectory containing indices for named database
+           -dbname             Name of subdirectory containing indices 
+                               for named database
            -write_flag         Allow building index
            -primary_pattern    Regexp defining the primary id
            -secondary_patterns A hash ref containing the secondary
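
A note on the quoting change in this commit, as a minimal sketch rather than code from BinarySearch.pm: in a double-quoted Perl string the backslash must be doubled to reach the regex engine, while a single-quoted string passes it through unchanged. The sample header and accession lines below are made-up inputs for illustration only.

```perl
use strict;
use warnings;

# The fasta primary-key pattern, written both ways as in the diff.
my $double = "^>(\\S+)";   # double quotes: backslash has to be escaped
my $single = '^>(\S+)';    # single quotes: backslash is kept literally

# Both literals yield the same pattern string.
print "fasta patterns are identical strings\n" if $double eq $single;

# Hypothetical fasta header; the key is everything up to the first space.
my $header = '>seq1 example description';
my ($id) = $header =~ /$single/;
print "primary key: $id\n";

# The AC pattern is a slightly different case: inside double quotes "\;"
# collapses to ";", so the two strings differ, yet as regexes they match
# the same text because "\;" and ";" both match a literal semicolon.
my $ac_double = "^AC   (\\S+)\;";
my $ac_single = '^AC   (\S+)\;';

# Hypothetical accession line in the "AC   <accession>;" layout.
my $ac_line = 'AC   P12345;';
my ($acc_d) = $ac_line =~ /$ac_double/;
my ($acc_s) = $ac_line =~ /$ac_single/;
print "same accession captured\n" if $acc_d eq $acc_s;
```

The single-quoted form keeps the pattern readable as a regex, which is the simplification the log message refers to.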


