George Hartzell | 2 Oct 02:29

dpalign, local for one sequence, global for the other?


I need to produce an alignment between a hunk of genomic sequence (in
the sense that it hasn't had introns edited out or anything) that's on
the order of 1000 bases and a genome/chromosome/fragment of similar
genomic sequence.  In an ideal situation they'll be the same;
differences will come from variations in the sources (e.g. the hunk
might have been clipped out of genome revision X and the current
genome might be X+i, or the hunk might have come from a paper (who
knows where it came from...)).  Nothing across species or across
evolutionary time or anything fun.  I'm happy to narrow the region of
the genome hunk down using some/an/... heuristic first to avoid
running dp against an entire chromosome.

I need the alignment to account for all of the bases in the hunk.  In
dynamic programming terms, if the hunk is along the vertical axis, the
path through the matrix would have to run from the top to the bottom
(or vice versa).  The projection of the path onto the
horizontal/genomic axis can start/end wherever.
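
For what it's worth, the shape of alignment described above (global in
the hunk, local in the genome; often called a "fitting" or semi-global
alignment) can be sketched in plain Perl.  This is only an illustration
of the recurrence, not what dpalign or ENDSFREE does: the fit_align
name, the linear gap cost and the default scores are all assumptions
made up for the example, and it needs Perl 5.10+ for the // operator.

```perl
use strict;
use warnings;

# Fit ALL of $query into some stretch of $target (linear gap cost).
# Returns (score, target_start, target_end, aligned_query, aligned_target);
# target coordinates are 0-based, end exclusive.
sub fit_align {
    my ($query, $target, %opt) = @_;
    my $match    = $opt{match}    // 1;    # hypothetical default scores
    my $mismatch = $opt{mismatch} // -1;
    my $gap      = $opt{gap}      // -2;

    my @q = split //, $query;
    my @t = split //, $target;
    my ($m, $n) = (scalar @q, scalar @t);

    # Row 0 is all zeros: the path may start at any target column for free.
    # Column 0 pays gap penalties: every query base must be accounted for.
    my @S;
    $S[0][$_] = 0 for 0 .. $n;
    $S[$_][0] = $_ * $gap for 1 .. $m;

    for my $i (1 .. $m) {
        for my $j (1 .. $n) {
            my $diag = $S[$i-1][$j-1]
                     + ($q[$i-1] eq $t[$j-1] ? $match : $mismatch);
            my $up   = $S[$i-1][$j] + $gap;    # query base against a gap
            my $left = $S[$i][$j-1] + $gap;    # target base against a gap
            my $best = $diag;
            $best = $up   if $up   > $best;
            $best = $left if $left > $best;
            $S[$i][$j] = $best;
        }
    }

    # The path may end at any column of the last row.
    my $end = 0;
    for my $j (1 .. $n) { $end = $j if $S[$m][$j] > $S[$m][$end] }

    # Trace back from (m, end) until row 0 is reached.
    my ($i, $j) = ($m, $end);
    my ($aq, $at) = ('', '');
    while ($i > 0) {
        if ($j > 0 && $S[$i][$j] == $S[$i-1][$j-1]
                + ($q[$i-1] eq $t[$j-1] ? $match : $mismatch)) {
            $aq = $q[$i-1] . $aq; $at = $t[$j-1] . $at; $i--; $j--;
        }
        elsif ($S[$i][$j] == $S[$i-1][$j] + $gap) {
            $aq = $q[$i-1] . $aq; $at = '-' . $at; $i--;
        }
        else {
            $aq = '-' . $aq; $at = $t[$j-1] . $at; $j--;
        }
    }
    return ($S[$m][$end], $j, $end, $aq, $at);
}
```

Calling fit_align('ACGT', 'TTACGTTT') aligns the whole query against
target positions 2..6 (0-based, end exclusive).  The matrix is O(m*n),
so this only makes sense once the genome region has been narrowed down.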

I'd like to not write this [again] and was hoping to use the bio-ext
dpalign stuff.  I'm hopeful that "ENDSFREE" is just what I need, but
from the docs I'm not convinced that it is.  A more pessimistic
reading makes it sound a lot like a local alignment.

Can anyone out there who's familiar with the dpalign code tell me
whether it can do what I need?  Out of the box?  With modifications?

g.

Test coverage for BioPerl now available

Hi all,

Daily-updated test coverage reports are now available for those BioPerl 
packages which make use of the Build.PL mechanism (except bioperl-db):

http://bioperl.org/test-coverage/bioperl-live/
http://bioperl.org/test-coverage/bioperl-network/
http://bioperl.org/test-coverage/bioperl-run/

These reports will help us to know the current 'quality' of the code in 
SVN for most of the BioPerl modules. This idea was started by Nathan 
Haigh and Sendu a long time ago, and it was my fault for not implementing 
the script needed to run the process on a daily basis in time, so 
apologies for that.

There are still a few things to be done in order to have this working as 
it should:

- Nathan, the current Devel::Cover module from CPAN doesn't include the JS 
modifications to make table columns sortable. Do you know what happened 
to the code you contributed to the author for that?

- Reports could be generated for the rest of the BioPerl packages as 
soon as they're migrated to the Build.PL infrastructure. Anyone up for that?

- bioperl-db tests require BioSQL to be set up on the webserver machine, 
and the same goes for bioperl-run's tests with ALL of its dependencies. 
The bioperl.org site is co-hosted with all of the other OBF projects and 
that machine also takes care of other things (mailing lists, etc), so I 
would like your feedback on possible workarounds to not overload the 
(Continue reading)

Don Gilbert | 30 Sep 19:21

Re: [Gmod-gbrowse] exporting contigs with CDSes, stored via Bio::DB::GFF, into individual GenBank records?


Eric,

If the Bio::DB::GFF database to Genbank submission route doesn't get you
where you want, you can also look at storing your data in a GMOD Chado
database, then using Bulkfiles to produce the Genbank Submission file set.

- Don Gilbert

Find a GenBank Submit output from Chado dbs in this tool release
http://eugenes.org/gmod/GMODTools/
  GMODTools-1.2b.zip      20-Jun-2008 

- adding (in progress) Genbank Submission table writer, 
  'bulkfiles -format=genbanktbl', with output suited
  to submit to NCBI as per these specifications
  http://www.ncbi.nlm.nih.gov/Genbank/eukaryotic_genome_submission.html

see also http://gmod.org/wiki/GMODTools
and this test case with genbank-submit output
 http://gmod.org/wiki/GMODTools_TestCase

Erich Schwarz | 29 Sep 07:56

exporting contigs with CDSes, stored via Bio::DB::GFF, into individual GenBank records?

Hi all,

    I have newly sequenced contigs, with CDS predictions, loaded 
into a Bio::DB::GFF-readable format (i.e., loaded into a MySQL 
database via Bio::DB::GFF).  I'd like to export each contig, with 
its annotated CDSes, into a single GenBank-formatted record for each 
contig (in order to be able to submit this stuff to GenBank, without 
having to waste time with Sequin).  Is there some straightforward 
way of getting Bio::DB::GFF to do that?

    Some time ago, when I last had to decipher BioPerl, I came up 
with code that would let me export protein translations of the 
contigs' CDSes in GenBank format:

-------------------------------------------------------------------

#!/usr/bin/env perl

use strict;
use warnings;
use Bio::Seq;
use Bio::SeqIO;
use Bio::DB::GFF;

my $query_database = $ARGV[0];
my $dna = q{};
my $db = Bio::DB::GFF->new( -dsn => $query_database);

my $gb_file = 'example.gb';
my $seq_out = Bio::SeqIO->new( -file => ">$gb_file", -format => 'genbank', );
(Continue reading)

Matthew Schultz | 26 Sep 00:04

BioPerl Installation Help


Hi Bioperl,

I've been trying to install bioperl on my Ubuntu machine.  I'm new to Bioperl
and don't have much experience in unix, so I'm at a loss for what to do
next.  I tried to follow the instructions on using CPAN to install Bioperl,
but the installation failed.  At first I thought it was because of the few
warnings I received, but after the "force install" command also failed
I'm not so sure.  I was about to try the alternate option
using Build.PL, but am not sure where the Bioperl installation should go
(or will the Build.PL script place it in the right folder?).  Any help or
advice you could give would be appreciated.  Thanks for your time.

-Matt Schultz

P.S.  Here are the failed test results:

Failed Test        Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/BioFetch_DB.t                  27    4  14.81%  8 20-21 27
t/DB.t                           78    2   2.56%  30-31
t/EMBL_DB.t                      15    3  20.00%  6 13-14
t/GuessSeqFormat.t               46    1   2.17%  11
t/InterProParser.t    2   512    47    1   2.13%  2
t/tutorial.t          2   512    21    6  28.57%  19-21
18 subtests skipped.
Failed 6/179 test scripts, 96.65% okay. 14/8122 subtests failed, 99.83% okay.
make: *** [test_dynamic] Error 255
  /usr/bin/make test -- NOT OK
Running make install
(Continue reading)

john paul | 25 Sep 17:23

Can't locate object method "get_dbxrefs"

Hello guys,

I need to pick your brain on this. I was trying to load some sequences in a
fresh RH build using bioperl-db and got the following error:

[tatedger@localhost biosql]$ perl load_seqdatabase.pl --host localhost
--dbuser root --dbname biosql --namespace swissprot --format swiss
/home/tatedger/tmp/uniprot_sprot.dat  --testonly
Loading /home/tatedger/tmp/uniprot_sprot.dat ...
Could not store Q4U9M9: Can't locate object method "get_dbxrefs" via package
"Bio::Ontology::Term" at
/usr/lib/perl5/site_perl/5.8.5/Bio/DB/Persistent/PersistentObject.pm line
552, <GEN0> line 70.

I have seen some posts in this regard (
http://bioperl.org/pipermail/bioperl-l/2008-April/027544.html) but it wasn't
clear to me what the solution would be.

My configuration:
- mysql version 4.1.7
- Red Hat Enterprise Linux ES release 4 (Nahant)

To install bioperl and bioperl-db I followed the help found on the
website:
- cpan>install S/SE/SENDU/bioperl-1.5.2_102.tar.gz
- svn co svn://code.open-bio.org/bioperl/bioperl-db/trunk bioperl-db
- biosql schema is loaded and load_ncbi_taxonomy.pl worked fine.

The bioperl-db test 04 shows the same error.

(Continue reading)

Scott Cain | 24 Sep 23:58

genbank2gff.pl choking on CONTIG sections

Hi all,

The BioPerl script bp_genbank2gff.pl, which will either convert a
Genbank record to GFF or load it directly to a Bio::DB::GFF database,
is choking on GenBank records with CONTIG sections.  Since I don't
think these would ever be useful for generating GFF or loading into a
database (ie, the user will want to get all of the features on the
parts, not know what the parts are), is there a way to force a
Bio::DB::WebDBSeqI/Bio::DB::BioFetch to get the full record (like
specifying view=gbwithparts in the url at ncbi)?

Thanks,
Scott

-- 
------------------------------------------------------------------------
Scott Cain, Ph. D. cain.cshl <at> gmail.com
GMOD Coordinator (http://gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory

Johnston, Caroline | 24 Sep 16:17

Using Storable with SeqFeatures

Hello.

I'm trying to use Storable to save a Bio::Seq object and Storable seems to be having a weird problem dealing
with freezing and thawing the code ref to Bio::SeqFeature::Generic cleanup_generic. If I change one
line (931) in that function from

    foreach my $t ( keys %{$self->{'_gsf_tag_hash'} || {} } ) {
                                                    ------
 to

    foreach my $t ( keys %{$self->{'_gsf_tag_hash'} } ) {

it works fine. I've pasted an example script at http://sial.org/pbot/32320.

Any ideas why this syntax would break Storable? Would wrapping the foreach in an if(defined
$self->{'_gsf_tag_hash'}) serve to replace the || {} ?

I get the same problem using Bio::Root::Storable.
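
As background, here is a minimal sketch of how Storable itself treats code refs, using only core Perl. It doesn't reproduce or explain the || {} behaviour above; the $data hash and its cleanup sub are made up for illustration. By default Storable refuses CODE items, and it only round-trips them (via B::Deparse on freeze and eval on thaw) when $Storable::Deparse and $Storable::Eval are set.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical structure holding a code ref, standing in for an object
# that carries a cleanup callback.
my $data = { name => 'feature', cleanup => sub { return 'cleaned' } };

# Default behaviour: freeze() dies on CODE refs, so the eval returns undef.
my $default_ok = defined( eval { freeze($data) } ) ? 1 : 0;
print "default freeze ", ( $default_ok ? "worked" : "failed: $@" ), "\n";

# Opt in: deparse the sub to source on freeze, eval the source on thaw.
my $result;
{
    local $Storable::Deparse = 1;
    local $Storable::Eval    = 1;
    my $copy = thaw( freeze($data) );
    $result = $copy->{cleanup}->();
}
print "round-tripped sub returned: $result\n";
```

Because the sub is rebuilt from deparsed source, anything B::Deparse renders differently from the original text can behave differently after thawing, which may be worth checking here.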

Cxx 

George Hartzell | 23 Sep 23:13

confused about version numbers.


I recently realized that I needed a feature from
Bio::Search::Hit::GenericHit that exists in the svn trunk but isn't in
the CPAN version, bioperl-1.5.2_102.

I thought I'd just specify a version number in my Build.PL and that
way I could sleep soundly.

But it looks like $VERSION in CPAN's Bio::Root::Version is
1.005002_102 while in the trunk it's 1.005002_100.

What's the proper way for a Build.PL to be sure that it has a current
bioperl?

g.

Spiros Denaxas | 19 Sep 12:16

code coverage metrics

Hello,

I recently sent Chris an email about an idea I had, quantifying (and
improving) the test coverage that we currently have on the bioperl
core. I know there's a list on the wiki [1] of modules that have low
coverage or no tests at all. My idea was to
somehow standardize and automate the generation of this list,
preferably on a weekly basis initially. We can then more easily see
where help is needed and possibly assign individual tasks to tackle
the most-suffering modules.

There are currently several very good CPAN modules that do this, like
Devel::Cover [2]. Is there any objection if I kick this off and start
doing some work, aiming to create a more detailed report on code
coverage using the current HEAD and test suite?

Spiros

[1] http://www.bioperl.org/wiki/Untested_Modules_in_BioPerl
[2] http://search.cpan.org/dist/Devel-Cover/lib/Devel/Cover.pm

ANJAN PURKAYASTHA | 18 Sep 23:24

Retrieving taxonomy information from a GenBank file

Hi,
I'm using the following code to access the "ORGANISM" tag value for the
record NC_002526.
I get a value of 0 even though the ORGANISM tag has a value.  Any idea how
this might be corrected?
Thanks.
Anjan

my $gb= new Bio::DB::GenBank;

my $seq = $gb->get_Seq_by_acc('NC_002526');

my $des= $seq->get_tag_values("ORGANISM");

-- 
=============================
anjan purkayastha, phd
bioinformatics analyst
whitehead institute for biomedical research
nine cambridge center
cambridge, ma 02142

purkayas [at] wi [dot] mit [dot] edu
703.740.6939
