Téletchéa Stéphane | 23 Apr 18:10 2013
Picon

Problem using column_from_residue_number

Dear bioperlers,

I am facing a problem I used to encounter in the past and failed to 
report it properly.

I am either mis-using it or there is a bug, I would like your feedback.

Consider the alignment between sequence A and B in test.ali,
and the bioperl_test.pl as examples (attached to this mail).

I would like to know at position i what is the amino acid of seq A,
and what is the amino acid in position i in sequence B (of course these 
residues may not be aligned).

The problem is:

a) I cannot parse the first residue of the alignment, it has to start at 
1, otherwise I have:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Second argument residue number missing
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:472
STACK: Bio::SimpleAlign::column_from_residue_number 
/usr/share/perl5/Bio/SimpleAlign.pm:2626
STACK: ./bioperl_test.pl:19
-----------------------------------------------------------

b) Although, if I understand correctly, this function should return only 
the residues,
sometimes I have a gap inserted, so the function is not working properly:

./bioperl_test.pl |grep '-'
P:11 for seq 1 and -:30 for seq 2
S:76 for seq 1 and -:96 for seq 2
-:124 for seq 1 and L:146 for seq 2
-:150 for seq 1 and K:170 for seq 2
-:175 for seq 1 and I:190 for seq 2
-:202 for seq 1 and L:215 for seq 2
-:224 for seq 1 and G:233 for seq 2
W:264 for seq 1 and -:263 for seq 2
S:296 for seq 1 and -:297 for seq 2
-:343 for seq 1 and S:345 for seq 2
-:369 for seq 1 and H:361 for seq 2

Can we discuss it here or I have to open a bug report?

Thanks in advance,

Stéphane

-- 
Equipe DSIMB - Dynamique des Structures et
des Interactions des Macromolécules Biologiques
INTS, INSERM-Paris-Diderot UMR-S665
6 rue Alexandre Cabanel - 75739 Paris cedex 15- France
Tél : +33 144 493 057
Fax : +33 147 347 431
http://www.dsimb.inserm.fr / http://steletch.free.fr

CLUSTAL 2.1 multiple sequence alignment

sp|Q16570|DUFFY_HUMAN      MGNCLHRAELSPSTENSSQLDFEDVWNSSYGVNDSFPDGDYGANLEAAAP
sp|P61073|CXCR4_HUMAN      -------------------MEGISIYTSDN-YTEEMGSGDYDSMKEPCFR
                                              ::  .::.*.   .:.: .***.:  *..  

sp|Q16570|DUFFY_HUMAN      CHSCNLLDDSALPFFILTSVLGILASSTVLFMLFRPLFRWQLCPGWPVLA
sp|P61073|CXCR4_HUMAN      EENANFNKIFLPTIYSIIFLTGIVGNGLVILVMGYQKKLRSMTDKY--RL
                            ...*: .    .:: :  : **:... *::::       .:   :    

sp|Q16570|DUFFY_HUMAN      QLAVGSALFSIVVPVLAPGLGSTR--SSALCSLGYCVWYGSAFAQALLLG
sp|P61073|CXCR4_HUMAN      HLSVADLLFVITLPFWAVDAVANWYFGNFLCKAVHVIYTVNLYSSVLILA
                           :*:*.. ** *.:*. * .  :.   .. **.  : ::  . ::..*:*.

sp|Q16570|DUFFY_HUMAN      -----CHASLGHRLGAGQVPGLTLG--LTVGIWGVAALLTLPVTLASGAS
sp|P61073|CXCR4_HUMAN      FISLDRYLAIVHATNSQRPRKLLAEKVVYVGVWIPALLLTIPDFIFANVS
                                 : :: *  .: :   *     : **:*  * ***:*  : :..*

sp|Q16570|DUFFY_HUMAN      GG----LCTLIYSTELKALQATHT----------VACLAIFVLLPLGLFG
sp|P61073|CXCR4_HUMAN      EADDRYICDRFYPNDLWVVVFQFQHIMVGLILPGIVILSCYCIIISKLSH
                            .    :*  :*..:* .:   .           :. *: : ::   *  

sp|Q16570|DUFFY_HUMAN      AKGLKKALGMGPGPWMNILWAWFIFWWPHGVVLGLDFLVRSKLLLLSTCL
sp|P61073|CXCR4_HUMAN      SKGHQKRKALKTT--VILILAFFACWLPYYIGISIDSFILLEIIKQG-CE
                           :** :*  .: .   : :: *:*  * *: : :.:* ::  :::  . * 

sp|Q16570|DUFFY_HUMAN      AQQALDLLLNLAEALAILHCVATPLLLALFCHQATRTLLPSLP-------
sp|P61073|CXCR4_HUMAN      FENTVHKWISITEALAFFHCCLNPILYAFLGAKFKTSAQHALTSVSRGSS
                            ::::.  :.::****::**  .*:* *::  : . :   :*.       

sp|Q16570|DUFFY_HUMAN      ---LPEGWSSHLDTLGSKS--------
sp|P61073|CXCR4_HUMAN      LKILSKGKRGGHSSVSTESESSSFHSS
                              *.:*  .  .::.::*        

Attachment (bioperl_test.pl): application/x-perl, 803 bytes
_______________________________________________
Bioperl-l mailing list
Bioperl-l <at> lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l

Gmane