From: Scott Cain <scott <at> scottcain.net>
Subject: Re: problem with bp_genbank2gff.pl
Newsgroups: gmane.comp.lang.perl.bio.general
Date: Saturday 30th November 2013 05:07:50 UTC
If all you are doing is converting GenBank files to GFF, you don't need GD
or GraphViz, and if you find out later that you do want them, you can
install them then. They are optional, not required.
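
If you do decide later that you want the optional modules, the build failures quoted below come from missing system tools rather than from Perl itself. The following is a minimal sketch in Python (not part of BioPerl) that checks whether the helper programs are on the PATH before you retry. The helper name `missing_tools` is made up for illustration, and the assumption that the GraphViz module needs the `dot` executable from Graphviz is mine; only `gdlib-config` is named in the error output.

```python
import shutil


def missing_tools(names):
    """Return the executable names from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # gdlib-config is named in the GD error message quoted in this thread;
    # "dot" is an assumption about what the GraphViz module looks for.
    missing = missing_tools(["gdlib-config", "dot"])
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All helper programs were found on PATH.")
```

If a tool is reported as missing, installing the corresponding system library (libgd or Graphviz) first is the usual next step before rerunning the CPAN install.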

Scott


Sent from my iPhone

On Nov 30, 2013, at 12:04 AM, LI ZHOU wrote:

> Hi,
> Thank you for your answers.
> I am now trying to install the latest version (BioPerl-1.6.922) on another computer.
> I need to install some missing modules, but there are some errors when I try to install GD and GraphViz.
> The following is the error message:
> 
> >perl Build installdeps
> Checking optional dependencies:
> Install GD? [y] y
> Install GraphViz? [y] y
> CPAN: Storable loaded ok (v2.41)
> Reading '/Users/zhouli/.cpan/Metadata'
>   Database was generated on Wed, 13 Nov 2013 01:17:02 GMT
> Running install for module 'GD'
> Running make for L/LD/LDS/GD-2.50.tar.gz
> CPAN: Digest::SHA loaded ok (v5.85)
> CPAN: Compress::Zlib loaded ok (v2.063)
> Checksum for /Users/zhouli/.cpan/sources/authors/id/L/LD/LDS/GD-2.50.tar.gz ok
> CPAN: File::Temp loaded ok (v0.2304)
> CPAN: Parse::CPAN::Meta loaded ok (v1.4409)
> CPAN: CPAN::Meta loaded ok (v2.132830)
> CPAN: Module::CoreList loaded ok (v2.96)
> 
>   CPAN.pm: Building L/LD/LDS/GD-2.50.tar.gz
> 
> Notice: Type perl Makefile.PL -h for command-line option summary.
> 
> **UNRECOVERABLE ERROR**
> Could not find gdlib-config in the search path. Please install libgd 2.0.28 or higher.
> If you want to try to compile anyway, please rerun this script with the option --ignore_missing_gd.
> Warning: No success on command[/Users/zhouli/perl5/perlbrew/perls/perl-5.18.1/bin/perl Makefile.PL]
> CPAN: YAML loaded ok (v0.84)
>   LDS/GD-2.50.tar.gz
>   /Users/zhouli/perl5/perlbrew/perls/perl-5.18.1/bin/perl Makefile.PL -- NOT OK
> Running make test
>   Make had some problems, won't test
> Running make install
>   Make had some problems, won't install
> Could not read metadata file. Falling back to other methods to determine prerequisites
> Running install for module 'GraphViz'
> Running make for R/RS/RSAVAGE/GraphViz-2.14.tgz
> Checksum for /Users/zhouli/.cpan/sources/authors/id/R/RS/RSAVAGE/GraphViz-2.14.tgz ok
> CPAN: Module::Build loaded ok (v0.42)
> 
>   CPAN.pm: Building R/RS/RSAVAGE/GraphViz-2.14.tgz
> 
> Please install Graphviz from http://www.graphviz.org/.
> Warning: No success on command[/Users/zhouli/perl5/perlbrew/perls/perl-5.18.1/bin/perl Build.PL]
>   RSAVAGE/GraphViz-2.14.tgz
>   /Users/zhouli/perl5/perlbrew/perls/perl-5.18.1/bin/perl Build.PL -- NOT OK
> Running Build test
>   Make had some problems, won't test
> Running Build install
>   Make had some problems, won't install
> 
> I am new to computational biology. I would appreciate it if you could offer any help.
> Thank you very much!
> Regards,
> Zhou Li
> 
> 
> On 30 Nov, 2013, at 12:55 PM, Scott Cain wrote:
> 
>> Also, about your questions (which were buried very deep at the end of the email):
>> 
>> 1. The script supports a --split option to split GFF from fasta. 
>> 
>> 2. I don't think there is an option to specifically skip attributes like translation (it shows up in the GFF because it's in the GenBank file), but it probably wouldn't be too hard to add that functionality. Patches welcome.
>> 
>> 3. I don't know anything about IGV.
>> 
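
Since the script has no option to skip the translation attribute, one workaround is to post-process the GFF3 output. The sketch below is in Python, is not part of BioPerl, and is not the patch suggested above. It assumes the real output file is tab-delimited with nine columns, as GFF3 requires (the archive shows the columns separated by spaces), and the function names `strip_attribute` and `filter_gff` are made up for illustration.

```python
def strip_attribute(line, key="translation"):
    """Return a GFF3 line with the attribute `key` removed from column 9.

    Comment/directive lines and anything that is not a nine-column
    feature line (for example FASTA sequence lines) are returned unchanged.
    """
    if line.startswith("#"):
        return line
    body = line.rstrip("\r\n")
    line_ending = line[len(body):]
    columns = body.split("\t")
    if len(columns) != 9:
        return line
    kept = [
        attribute
        for attribute in columns[8].split(";")
        if attribute and not attribute.startswith(key + "=")
    ]
    # GFF3 uses "." for an empty column.
    columns[8] = ";".join(kept) if kept else "."
    return "\t".join(columns) + line_ending


def filter_gff(in_path, out_path, key="translation"):
    """Copy a GFF3 file from in_path to out_path, dropping one attribute."""
    with open(in_path) as source, open(out_path, "w") as target:
        for line in source:
            target.write(strip_attribute(line, key))
```

For example, `filter_gff("WSSV-AF369029-GenBank.gff3", "WSSV-no-translation.gff3")` would write a copy of the file without the translation attributes; the output file name here is only an example.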
>> Scott
>> 
>> 
>> Sent from my iPhone
>> 
>> On Nov 29, 2013, at 3:34 PM, "Fields, Christopher J" wrote:
>> 
>>> The latest CPAN release (v 1.6.922) fixes these issues.
>>> 
>>> chris
>>> 
>>> On Nov 28, 2013, at 1:08 AM, zhou li <[email protected]> wrote:
>>> 
>>> Dear Bioperl people,
>>> I am using BioPerl-1.6.1, and the operating system is Mac OS X version 10.8.5.
>>> I am trying to convert a local GenBank file to a GFF file using bp_genbank2gff.pl, with the following command:
>>> $ bp_genbank2gff.pl M21017.gb --stdout > M21017.gff3
>>> I got the following messages, and I am not sure whether they are errors:
>>> Replacement list is longer than search list at /Users/zhouli/perl5/perlbrew/perls/perl-5.18.1/lib/site_perl/5.18.1/Bio/Range.pm line 251.
>>> UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/zhouli/perl5/perlbrew/perls/perl-5.18.1/lib/site_perl/5.18.1/Bio/Tree/TreeFunctionsI.pm line 94.
>>> # working on region:M21017, Drosophila melanogaster, 09-MAY-1994, D.melanogaster 18S, 5.8S 2S and 28S rRNA genes, complete, and 18S rRNA gene, 5' end, clone pDm238.
>>> 
>>>
***************************************************************************
>>> And the output file M21017.gff3 is attached.
>>> 
>>> $head M21017.gff3
>>> ##gff-version 3
>>> M21017 Genbank region 1 12026 . . . ID=M21017;Note=D.melanogaster%2018S%2C%205.8S%202S%20and%2028S%20rRNA%20genes%2C%20complete%2C%20and%2018S%20rRNA%20gene%2C%205%27%20end%2C%20clone%20pDm238.;Alias=M29800
>>> M21017 Genbank region 1 12026 . + . ID=Drosophila%20melanogaster;db_xref=taxon%3A7227;mol_type=genomic%20DNA
>>> M21017 Genbank gene 1 12026 . + . ID=18S%20rRNA
>>> M21017 Genbank RNA 1 7232 . + . ID=18S%20rRNA;note=rRNA%20primary%20transcript
>>> M21017 Genbank rRNA 1 1995 . + . ID=18S%20rRNA;product=18S%20ribosomal%20RNA
>>> M21017 Genbank gene 2722 2844 . + . ID=5.8S%20rRNA
>>> M21017 Genbank rRNA 2722 2844 . + . ID=5.8S%20rRNA;product=5.8S%20ribosomal%20RNA
>>> M21017 Genbank gene 2873 2902 . + . ID=2S%20rRNA
>>> M21017 Genbank rRNA 2873 2902 . + . ID=2S%20rRNA;product=2S%20ribosomal%20RNA
>>> 
>>> 
>>> When I tested another GenBank file:
>>> $ bp_genbank2gff.pl WSSV-AF369029-GenBank.gb --stdout > WSSV-AF369029-GenBank.gff3
>>> I got the same messages:
>>> Replacement list is longer than search list at /Users/zhouli/perl5/perlbrew/perls/perl-5.18.1/lib/site_perl/5.18.1/Bio/Range.pm line 251.
>>> UNIVERSAL->import is deprecated and will be removed in a future perl at /Users/zhouli/perl5/perlbrew/perls/perl-5.18.1/lib/site_perl/5.18.1/Bio/Tree/TreeFunctionsI.pm line 94.
>>> $ head WSSV-AF369029-GenBank.gff3
>>> ##gff-version 3
>>> AF369029 Genbank region 1 292967 . . . ID=AF369029;Alias=AY864671;Note=White%20spot%20syndrome%20virus%2C%20complete%20genome.
>>> AF369029 Genbank region 1 292967 . + . ID=White%20spot%20syndrome%20virus;mol_type=genomic%20DNA;isolate=WSSV-TH;country=Thailand;db_xref=taxon%3A342409
>>> AF369029 Genbank gene 1 615 . + . ID=VP28;experiment=experimental%20evidence%2C%20no%20additional%20details%20recorded;note=envelope%20protein
>>> AF369029 Genbank CDS 1 615 . + .
Parent=VP28.t00;translation=MDLSFTLSVVSAILAITAVIAVFIVIFRYHNTVTKTIETHTDNIETNMDENLRIPVTAEVGSGYFKMTDVSFDSDTLGKIKIRNGKSDAQMKEEDADLVITPVEGRALEVTVGQNLTFEGTFKVWNNTSRKINITGMQMVPKINPSKAFVGSSNTSSFTPVSIDEDEVGTFVCGTTFGAPIAATAGGNLFDMYVHVTYSGTETE;db_xref=GI%3A15021393;protein_id=AAK77670.1;product=ORF1%2C%20VP28%2C%20gene%20family%201;note=envelope%20protein;codon_start=1
>>> AF369029 Genbank CDS 710 2902 . - .
Parent=AAK77671.1.t00;translation=MEGGDQRTKLTPATVMGLYQSKTPGEGEGGEGGGQFKIPSAIAVKSCCSKNATRRSPPSDSPYSLRPMKRLKKNNGEVGGKAPPPVTLRLREDYESTPYNFNRNKKKRPITIDENQFATLNPTYATDIIKKQQLPSVSAASVLRKHRANADTQYRKRFSHPNCAKFSTVNLKARDYTPLSVLRSHVKGPKHLKSSCDTVTETNVVKRNFSSIDKWVKLEKPPCYFAVAEADTNIAAGLESPFHLIRQAAKLGLISDVQDVSSNYETIKQSCIDAKEKASKFLWSNNRTKQPPSSWWPVGFGSKNLSVLDTSPLLNWNRLCKNNGKGWIKTMSIDHMAKNVFKLSPGACESILEKKTTLLGEVTAQCKKWESYRRNIPVPAHVQPEYASQVVMIGPSELYLEVKVGVYYMLETGKVIKFMTDKEMYCEFVFETVFSHALEGRMKGAVGVRKMCVEGFCVEMDFAGISVIDVLNGDLKCKMDENVVQQPNPSTTSSKPAAELMQDHGSLCRMRDTLYGVRMLQATGRLPEGLQSKCKKPITDSISAIAIVGKMRERMLNQLPFVLVEIVNIVTRLSQQGLVNPDIKSDNIVIDGITGQPKMIDFGLIVPCKKYYNFKCWGTDERFFSNHPHTAPEFINSELCSETAMTFGLAYLLIDMLSILIKRTADLSANSIYTNIPFLSIVSKMYDQEKTNRPRAYEIAPVIGACFPFKDNIAKLFQSPKHSLYS
 KKVK;db_xref=GI%3A15021394;codon_start=1;product=ORF2%2C%20putative%20serine%2Fthreonine%20protein%20kinase%20%28PK1%29%2C%20gene%20family%202
>>> AF369029 Genbank CDS 3118 4989 . - .
Parent=AAK77672.1.t00;codon_start=1;product=ORF3;db_xref=GI%3A15021395;translation=MAWTVMALKDAFTERLVVNKVGSGTDMAPVVEDDRQKSLFQKVENLYRVLVVEQKNSAITLSGNKNTNKRQCRQVEEDKVIFEGEDRTVSNLPQAVKETIAANAESILDYWYKNVIPLLDTKKERSGKSDTFLRTAVICLVRCCVSYKDMKTCSLIYEFEHKILNKSTLDPLLKDILDNKQELLHMDSKYGSKTTSPELAKETIEALYTTVYNHWTNAFKLYQASLTHKPVTGKKYASVIHFIRTWRKIVKAYVSKHNNVERDLSLKNIMKNESADNANVLTIEKMYKKIGNSVKNTNNNSAHQMSDSEDDDDDDDDDCEGMDVCDEASEREKKHQESLYPINTPVTTITGDYIFKVLLELVLSPHIHPEWKIPMCDFVNRNIPKLMKAMETDISNAVIEVRASKVNPVQILPIAANFWDFCKSGKPPSDVKFCMMFNEPSSNETLSSGAGVFGRFIGGPFSHKSKELDIISNCLRSLLLNKEADNLSTRIWREGGSVVCFNYCPITARGAVLGYGEQLSERSIKALWAKKIQDAVTESVKRQRNAADKNSRNCDLLGDEGVVSMKTVTFGCANMLKTQNGMGKFNVVVSFEDSIQANKEGAARQYMSQQVFTHSFPALDQGK
>>> 
>>> The output file is very verbose: the full translation is shown for every CDS, but I do not need it.
>>> 1. Is there any way to make the output file more succinct by leaving out the translation?
>>> 2. Also, is there any way to split the output into two files, one the GFF3 file and the other a DNA FASTA sequence file?
>>> 3. When I import the WSSV-AF369029-GenBank.gff3 file into IGV, it displays the protein ID if there is no gene name for the sequence, e.g. features of type "CDS" display the protein ID, and features of type "gene" display the gene ID. Is this the way it works? I want to display the ORF ID; what should I do?
>>> 
>>> 
>>> 
>>> Your help is greatly appreciated.
>>> Thank you very much!
>>> Regards,
>>> Zhou Li
>>> 
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> [email protected]
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
 