Mark Nadel | 27 Nov 21:38 2013

Possible bug in Bio::Restriction::Analysis

I have been using Bio::Restriction::Analysis and have noticed that the cut
positions it reports differ from those reported by some other tools.
Checking the results by hand led me to believe that there is an error
in Bio::Restriction::Analysis.
(In the case of a restriction enzyme that cuts both strands, but at
slightly different bases, the specification of which position should be
reported is not clear, but I will try to avoid that issue in this report.)

I will use the E. coli sequence U00096.3 as the test sequence.
The first example uses the restriction enzyme Bpu10I, whose recognition
sequence is CCTNAGC(-5/-2). This indicates a cut on the + strand after the
second C, and a cut on the - strand between the T and the C that lie
opposite the A and G.

When I run this, one of the positions I get is 3449. (I've selected a case
in which the pattern appears on the + strand.) Here is the sequence from
3447 to 3465:

CCCGTCAGCCTGAGCTTGC

As you can see, the recognition pattern begins at 3455, so the cut on the
+ strand would be at 3456, which is what is reported by, for example,
CLCSequenceViewer. The - strand cut site would be at 3459.

Here is another example, using the nicking enzyme Nt.Bpu10I, which nicks
only the + strand. The corresponding recognition sequence is CCTNAGC(-5/?).
When I run this I get 3454, which corresponds to the beginning of the
recognition sequence (it starts at 3455) rather than to the cut site, which
should again be at 3456.
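
For reference, the hand calculation for both examples can be sketched in a
few lines of Python. This is independent of BioPerl and only illustrates the
arithmetic on the 19-base excerpt quoted above. The function name find_cuts
is made up for this sketch, and it assumes that the offsets in (-5/-2) are
counted back from the 3' end of the recognition sequence and that a cut is
reported as the position of the base immediately 5' of it.

```python
import re

# Excerpt quoted above; its first base is position 3447 (1-based).
EXCERPT = "CCCGTCAGCCTGAGCTTGC"
EXCERPT_START = 3447

# Only the IUPAC codes needed for this recognition site (N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def find_cuts(seq, seq_start, site, plus_offset, minus_offset=None):
    """Return (site_start, plus_cut, minus_cut) for each + strand match.

    Assumption: offsets are negative and counted back from the last base
    of the recognition site. minus_offset=None models a nicking enzyme
    that leaves the - strand intact.
    """
    # A lookahead keeps overlapping matches from being skipped.
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in site) + "))")
    hits = []
    for m in pattern.finditer(seq):
        site_start = seq_start + m.start()
        site_end = site_start + len(site) - 1
        plus_cut = site_end + plus_offset
        minus_cut = None if minus_offset is None else site_end + minus_offset
        hits.append((site_start, plus_cut, minus_cut))
    return hits


# Bpu10I, CCTNAGC(-5/-2): both strands are cut.
print(find_cuts(EXCERPT, EXCERPT_START, "CCTNAGC", -5, -2))
# Nt.Bpu10I, CCTNAGC(-5/?): only the + strand is nicked.
print(find_cuts(EXCERPT, EXCERPT_START, "CCTNAGC", -5))
```

Under those assumptions the first call gives a site starting at 3455 with
cuts at 3456 (+) and 3459 (-), and the second gives the same site with a
single nick at 3456. Neither 3449 nor 3454 appears.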

I hope the guardians of this module will be motivated to have a look. I
will be happy to provide any additional information that may be of use. I
am using version 1.006901.

Thanks in advance,

Mark

-- 
*Mark Nadel*

*Principal Scientist*
Nabsys Inc.
60 Clifford Street
Providence, RI  02903

Phone   401-276-9100 x204
Fax 401-276-9122
