Carson Holt | 4 Dec 04:35 2012

Re: Bio::DB::Fasta and threads

Bio::DB::Fasta is working for MAKER now.  The previous issues have been
fixed, but since Florent has gone out of his way to build a number of
improvements into Bio::DB::Fasta over the past few weeks, serialization
support seemed like a useful one as well, so I suggested it.  One of the
big uses of
Bio::DB::Fasta is the Bio::PrimarySeq::Fasta features it creates.  They
are great for manipulating the sequence without actually having to ever
keep it in memory.  It's nice because the sequence is made available on
demand, but when you try to pass these objects between threads, your
program falls apart.  There are creative workarounds, but simply adding a
serialization hook to Bio::DB::Fasta that disconnects the database on
freeze and reconnects on thaw also fixes it.  The objects then work as
expected with serialization, which makes them extremely useful for
multi-threaded applications without any other workarounds.
Previously I had created my own module and inherited from Bio::DB::Fasta
so I could implement the Storable hooks.  Because Storable looks for the
hooks in anything it serializes, the Bio::DB::Fasta object can even be
well down inside of a complex object and you don't have to worry about it.
Previously I've used Storable hooks to pass the Bio::PrimarySeq::Fasta
features across the network using MPI; as long as the database is on an
NFS mount, it just reconnects on the other node with no issue.  If the
indexed file isn't available after deserialization over a network, you
could just throw an error when the thaw hook is called.  I'll look over
Florent's changes soon and pass along any suggestions.
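
To make the idea concrete, here is a minimal sketch of the freeze/thaw
pattern.  It does not use Bio::DB::Fasta itself: My::IndexedFile, _connect
and subseq are hypothetical stand-ins for an object that holds an open
handle on a file.  Only STORABLE_freeze, STORABLE_thaw, freeze() and thaw()
are real Storable names.  The hook serializes just the path (the
"disconnect"), and the thaw hook reopens the file (the "reconnect"), dying
if the file is not reachable.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);
use File::Temp qw(tempfile);

package My::IndexedFile;    # hypothetical stand-in, not the Bio::DB::Fasta API

sub new {
    my ($class, $path) = @_;
    my $self = bless { path => $path }, $class;
    $self->_connect;
    return $self;
}

# Open (or reopen) the underlying file; dies if it is not reachable.
sub _connect {
    my ($self) = @_;
    open(my $fh, '<', $self->{path})
        or die "Cannot open $self->{path}: $!";
    $self->{fh} = $fh;
    return;
}

# Read $length bytes starting at byte $offset, straight from disk.
sub subseq {
    my ($self, $offset, $length) = @_;
    seek($self->{fh}, $offset, 0) or die "seek failed: $!";
    read($self->{fh}, my $buf, $length);
    return $buf;
}

# Storable hook: serialize only the path and leave the open handle behind.
sub STORABLE_freeze {
    my ($self, $cloning) = @_;
    return $self->{path};
}

# Storable hook: $self arrives as an empty blessed hash; restore the path
# and reconnect.  An unreachable file raises the error here.
sub STORABLE_thaw {
    my ($self, $cloning, $serialized) = @_;
    $self->{path} = $serialized;
    $self->_connect;
    return;
}

package main;

# Small throwaway file to play the role of the indexed database.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "ACGTACGTAC";
close $tmp_fh;

# The object can sit well down inside a larger structure; Storable still
# finds the hooks.
my $db     = My::IndexedFile->new($tmp_path);
my $nested = { sample => 'demo', sources => [ { db => $db } ] };
my $copy   = thaw(freeze($nested));

my $result = $copy->{sources}[0]{db}->subseq(2, 4);
print "$result\n";    # prints GTAC
```

Without the hooks, freeze() would fail on the open file handle stored in
the object, which is why the hook returns only the path.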


On 12-12-03 10:23 PM, "Fields, Christopher J" <cjfields <at>>

>On Dec 3, 2012, at 6:29 PM, Leon Timmermans
><l.m.timmermans <at>> wrote:
>> On Mon, Dec 3, 2012 at 3:36 AM, Florent Angly <florent.angly <at>>
>>> The first issue is the serialization of Bio::DB::IndexedBase-inheriting
>>> (e.g. Bio::DB::Fasta and Bio::DB::Qual) objects, which is needed for
>>> threading (for example when using Thread::Queue::Any). I implemented
>>> that make it transparent to serialize using Storable freeze() and
>> I don't think serializing a magical thingie makes much sense. Storable
>> is commonly used for a lot more things than interthread communication
>> (e.g. network communication), this would often not work under such
>> circumstances.
>> Leon
>Leon, any suggestions on alternatives?  I know this particular bit is a
>sore spot with MAKER at the moment, so any help would be greatly