1 Mar 2004 02:20
Re: Hard Times using File Inputs for HMM Package
<mark.schreiber <at> group.novartis.com>
2004-03-01 01:20:21 GMT
Hi -

Possible guesses about what might be wrong:

1) You haven't created Symbols for your Alphabet
2) You haven't added said Symbols to your Alphabet

This page http://www.biojava.org/docs/bj_in_anger/customAlpha.htm shows how to make a custom Alphabet. It may be useful.

Hope this helps,

- Mark

ps Just wondering, why do you need a custom Alphabet for Protein??? There is a perfectly good one in ProteinTools.getAlphabet().

sacoca <at> mcb.mcgill.ca
Sent by: biojava-l-bounces <at> portal.open-bio.org
02/29/2004 01:08 AM

To: biojava-l <at> biojava.org
cc:
Subject: [Biojava-l] Hard Times using File Inputs for HMM Package

Hey all,

I built a Markov model using the Biojava package and am having an incredibly hard time using it on sequences that I have stored in fasta format in a file. The problem is that I specified my own SimpleAlphabet for protein sequences, using the one-letter amino acid code, much like the dishonest casino example that you have on the tutorial page for dynamic programming. Each time I try reading the sequence, all I get is:

org.biojava.bio.symbol.IllegalSymbolException: Symbol G not found in alphabet ProtAlphabet
    at org.biojava.bio.symbol.AbstractAlphabet.validate(AbstractAlphabet.java:278)
    at org.biojava.bio.symbol.LinearAlphabetIndex.indexForSymbol(LinearAlphabetIndex.java:117)
    at org.biojava.bio.dist.SimpleDistribution.getWeightImpl(SimpleDistribution.java:131)
    at org.biojava.bio.dist.AbstractDistribution.getWeight(AbstractDistribution.java:197)
    at org.biojava.bio.dp.ScoreType$Probability.calculateScore(ScoreType.java:48)
    at org.biojava.bio.dp.onehead.SingleDP.getEmission(SingleDP.java:100)
    at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:553)
    at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:488)

I've tried building a parser with CharacterTokenization such as

    Parser = new CharacterTokenization(ProtAlphabet,false)

and then binding each symbol to the proper character

    for(int i=0; i<Protein.length;i++)
        Parser.bindSymbol(Protein[i], AAC[i]);

and then building a symbol list

    SymbolList Bcl2SequenceList = new SimpleSymbolList(Parser,ProtSequence);

but nothing works. By the way, I've also tried using the SeqIOTools to read the file, but the same kind of error was generated: Symbol X was not found in alphabet ProtAlphabet.

HELP!!!!!!

_______________________________________________
Biojava-l mailing list - Biojava-l <at> biojava.org
http://biojava.org/mailman/listinfo/biojava-l
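
Editor's illustration of the two guesses in the reply: the sketch below is plain Java, not BioJava code. ToySymbol and ToyAlphabet are hypothetical stand-ins invented for this example, and it assumes (not confirmed in the thread) that alphabet membership is decided by object identity rather than by the symbol's name. Under that assumption, a symbol called "G" that was created elsewhere, or never added, is rejected even though the custom alphabet holds its own "G".

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a symbol. It does not override equals(), so two
// instances with the same name are still two different symbols.
final class ToySymbol {
    private final String name;
    ToySymbol(String name) { this.name = name; }
    String getName() { return name; }
}

// Hypothetical stand-in for a finite alphabet. Membership is by object
// identity, because HashSet falls back to Object.equals()/hashCode().
final class ToyAlphabet {
    private final String name;
    private final Set<ToySymbol> symbols = new HashSet<>();
    ToyAlphabet(String name) { this.name = name; }
    void addSymbol(ToySymbol s) { symbols.add(s); }
    boolean contains(ToySymbol s) { return symbols.contains(s); }
    void validate(ToySymbol s) {
        if (!contains(s)) {
            throw new IllegalArgumentException(
                "Symbol " + s.getName() + " not found in alphabet " + name);
        }
    }
}

final class AlphabetIdentityDemo {
    public static void main(String[] args) {
        // A symbol that was created and added to the custom alphabet.
        ToyAlphabet custom = new ToyAlphabet("ProtAlphabet");
        ToySymbol customGly = new ToySymbol("G");
        custom.addSymbol(customGly);

        // A symbol with the same name that came from somewhere else,
        // for example from a parser bound to a different alphabet.
        ToySymbol foreignGly = new ToySymbol("G");

        custom.validate(customGly); // accepted: same object that was added
        try {
            custom.validate(foreignGly); // rejected: same name, different object
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the real library behaves the same way, the sequence has to be tokenized into the very Symbol objects that were added to the alphabet the model's distributions are defined over, or the model should simply be built over the alphabet returned by ProteinTools.getAlphabet().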