[Bioperl-guts-l] Notification: incoming/830

bioperl-bug-admin@bioperl.org
Wed, 9 Aug 2000 11:45:53 -0400


JitterBug notification

new message incoming/830

Message summary for PR#830
	From: schattner@alum.mit.edu
	Subject: No minimal/optional matching in SeqPattern
	Date: Wed, 9 Aug 2000 11:45:52 -0400
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From schattner@alum.mit.edu  Wed Aug  9 11:45:52 2000
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.9.3/8.9.3) with ESMTP id LAA05780
	for <bioperl-bugs@pw600a.bioperl.org>; Wed, 9 Aug 2000 11:45:52 -0400
Date: Wed, 9 Aug 2000 11:45:52 -0400
From: schattner@alum.mit.edu
Message-Id: <200008091545.LAA05780@pw600a.bioperl.org>
To: bioperl-bugs@bioperl.org
Subject: No minimal/optional matching in SeqPattern

Full_Name: Peter Schattner
Module: SeqPattern.pm
Version: 1.2.6.2
OS: Linux
Submission from: c263571-a.smateo1.sfba.home.com (24.176.147.38)



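SeqPattern.pm does not handle "?" quantifiers.
Consequently, it cannot correctly reverse-complement
patterns that use minimal ("non-greedy") or optional
matching. In the example below, the "?" ends up in front
of the reversed pattern instead of after the base it
modifies: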

[peter@pschattner examples]$ perl -w seq_pattern.pl -n 'acc?'

Nucleotide Pattern:
-----------------------
              Type: Dna
          Original: acc?
          Expanded: acc?
      Reverse-Comp: ?ggt
 Rev-Comp+Expanded: ?ggt
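
For comparison, here is a minimal sketch of the result one would expect,
obtained by reversing the pattern one token at a time so that each
quantifier stays attached to its base. The helper name revcom_by_token is
hypothetical and is not part of SeqPattern.pm; it assumes flat (unnested)
groups and plain a/c/g/t bases only.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; not part of SeqPattern.pm.
sub revcom_by_token {
    my ($pat) = @_;
    # One token = an atom (character class, flat group, or single base)
    # followed by an optional quantifier and an optional '?'.
    my @tokens = $pat =~ /((?:\[\w+\]|\(\w+\)|\w)(?:\{[\d,]+\}|[*+])?\??)/g;
    my @out;
    for my $tok (reverse @tokens) {
        my ($atom, $quant) = $tok =~ /^(\[\w+\]|\(\w+\)|\w)(.*)$/;
        # Bases inside a flat group have to be reversed as well.
        $atom = '(' . scalar(reverse $1) . ')' if $atom =~ /^\((\w+)\)$/;
        # Complement the bases; quantifier characters are left untouched.
        $atom =~ tr/acgtACGT/tgcaTGCA/;
        push @out, $atom . $quant;
    }
    return join '', @out;
}

print revcom_by_token('acc?'), "\n";   # prints g?gt
```

With this approach the "?" stays behind the "c" it modifies, so 'acc?'
becomes 'g?gt' rather than '?ggt'.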


Adding the following subroutine around line 806
(after the subroutine _fixpat_5) enables the
program to handle minimal matching.

############################
#
#  PS: Added 8/7/00 to allow non-greedy matching patterns
#
######################################

=head1 _fixpat_6

 Title     : _fixpat_6
 Usage     : n/a; called automatically by revcom()
 Purpose   : Utility method for revcom()
           : Converts all ?Y{5,7}  ---> Y{5,7}?
           :          and ?(XXX){5,7}  ---> (XXX){5,7}?
           :          and ?[XYZ]{5,7}  ---> [XYZ]{5,7}?
 Returns   : String (the new, partially reversed pattern)
 Argument  : String (the expanded, partially reversed pattern)
 Throws    : n/a

See Also   : L<revcom>()

=cut

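#--------------
sub _fixpat_6 {
#--------------
    my $pat = shift;
    my (@done, @parts);

    @done = ();
    while (1) {
        # Find the rightmost '?' that precedes a unit: a character class,
        # a parenthesized group, or a single character, optionally
        # followed by a {m,n} quantifier. If there is none, we are done.
        $pat =~ /(.*)\?(\[\w+\]|\(\w+\)|\w)(\{\S+?\})?(.*)/
            or do { push @done, $pat; last; };
        my $quantifier = $3 ? $3 : "";   # Shut up warning
        # Move the '?' behind the unit; '#' marks where the fixed tail starts.
        $pat = $1 . '#' . $2 . $quantifier . '?' . $4;
        @parts = split '#', $pat;
        push @done, $parts[1];           # fixed tail
        $pat = $parts[0];                # head still to be processed
        last if not $pat;
    }
    return join('', reverse @done);
}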
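
The routine can be tried outside SeqPattern.pm with a small standalone
script. This is only a sketch for checking the behaviour: it carries its
own copy of the routine above, and the input patterns are made-up examples
of the partially reversed strings that revcom() would pass in.

```perl
use strict;
use warnings;

# Copy of the proposed routine so that the script runs on its own.
sub _fixpat_6 {
    my $pat = shift;
    my (@done, @parts);
    while (1) {
        $pat =~ /(.*)\?(\[\w+\]|\(\w+\)|\w)(\{\S+?\})?(.*)/
            or do { push @done, $pat; last; };
        my $quantifier = $3 ? $3 : "";
        $pat = $1 . '#' . $2 . $quantifier . '?' . $4;
        @parts = split '#', $pat;
        push @done, $parts[1];
        $pat = $parts[0];
        last if not $pat;
    }
    return join('', reverse @done);
}

# Made-up inputs: each '?' sits in front of the unit it should follow.
for my $in ('?ggt', '?g{2,3}t', '?[ag]t', '?ga?c', 'ggt') {
    printf "%-10s -> %s\n", $in, _fixpat_6($in);
}
```

For the pattern from the report, '?ggt' comes back as 'g?gt', and a
pattern without any '?' is returned unchanged.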

--------------------

In addition, the following line is needed at 
line 536 to call the routine:

$fixrev = _fixpat_6($fixrev);  #ps Handle minimal matching