4 Dec 2012 04:35
Re: Bio::DB::Fasta and threads
Carson Holt <carsonhh <at> gmail.com>
2012-12-04 03:35:50 GMT
Bio::DB::Fasta is working for MAKER now. The previous issues have been fixed, but since Florent has gone out of his way to build a number of improvements into Bio::DB::Fasta over the past few weeks, this seemed like a useful one as well, so I suggested it.

One of the big uses of Bio::DB::Fasta is the Bio::PrimarySeq::Fasta features it creates. They are great for manipulating the sequence without ever having to keep it in memory. It's nice because the sequence is made available on demand, but when you try to pass them between threads, your program falls apart. There are creative workarounds, but simply adding a serialization hook to Bio::DB::Fasta that disconnects the database on freeze and reconnects on thaw also fixes it. That makes them extremely useful for multi-threaded applications without other kinds of workarounds (they just work as expected with serialization).

Previously I had created my own module that inherited from Bio::DB::Fasta so I could implement the Storable hooks. Because Storable looks for the hooks in anything it serializes, the Bio::DB::Fasta object can even be well down inside a complex object and you don't have to worry about it.

I've also used Storable hooks to pass the Bio::PrimarySeq::Fasta features across the network using MPI. As long as the database is on an NFS mount, it just reconnects on the other node with no issue. If the indexed file isn't available after deserialization over a network, you could just throw an error when the thaw hook is called.

I'll look over Florent's changes soon and pass along any suggestions.

Thanks,
Carson

On 12-12-03 10:23 PM, "Fields, Christopher J" <cjfields <at> illinois.edu> wrote:

>On Dec 3, 2012, at 6:29 PM, Leon Timmermans
><l.m.timmermans <at> students.uu.nl> wrote:
>
>> On Mon, Dec 3, 2012 at 3:36 AM, Florent Angly
>> <florent.angly <at> gmail.com> wrote:
>>> The first issue is the serialization of Bio::DB::IndexedBase-inheriting
>>> (e.g. Bio::DB::Fasta and Bio::DB::Qual) objects, which is needed for
>>> threading (for example when using Thread::Queue::Any). I implemented
>>> hooks that make it transparent to serialize using Storable freeze()
>>> and thaw().
>>
>> I don't think serializing a magical thingie makes much sense. Storable
>> is commonly used for a lot more things than interthread communication
>> (e.g. network communication), this would often not work under such
>> circumstances.
>>
>> Leon
>
>Leon, any suggestions on alternatives? I know this particular bit is a
>sore spot with MAKER at the moment, so any help would be greatly
>appreciated.
>
>chris
>
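The freeze/thaw hook approach discussed in this thread can be sketched roughly as follows. This is a minimal illustration and not the code that went into Bio::DB::IndexedBase: the class My::FileBackedDB, its subseq() method, and the temporary file are hypothetical stand-ins for a file-backed database object. Only STORABLE_freeze and STORABLE_thaw are actual Storable hook names. The sketch serializes the file path alone and reopens the handle on thaw, dying if the file cannot be reached.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a file-backed database object (NOT Bio::DB::Fasta).
package My::FileBackedDB;

sub new {
    my ($class, $path) = @_;
    my $self = bless { path => $path }, $class;
    $self->_connect;
    return $self;
}

# Open the backing file; the handle is the part that cannot be serialized.
sub _connect {
    my ($self) = @_;
    open my $fh, '<', $self->{path}
        or die "Cannot open '$self->{path}': $!";
    $self->{fh} = $fh;
    return;
}

# Read $length bytes starting at $offset straight from disk, on demand.
sub subseq {
    my ($self, $offset, $length) = @_;
    my $fh = $self->{fh};
    seek($fh, $offset, 0) or die "seek failed: $!";
    my $buf = '';
    read($fh, $buf, $length);
    return $buf;
}

# Called by Storable on freeze: emit only the path, never the handle.
sub STORABLE_freeze {
    my ($self, $cloning) = @_;
    return $self->{path};
}

# Called by Storable on thaw with an empty blessed object: restore the
# path and reconnect, dying if the file is not reachable from here.
sub STORABLE_thaw {
    my ($self, $cloning, $serialized) = @_;
    $self->{path} = $serialized;
    $self->_connect;
    return;
}

package main;

use File::Temp qw(tempfile);
use Storable qw(freeze thaw);

# Small throwaway file standing in for an indexed sequence file.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "ACGTACGTAC";
close $tmp_fh;

my $db = My::FileBackedDB->new($tmp_path);

# The object can sit well down inside a larger structure; Storable
# still finds the hooks.
my $payload = { job => 1, data => [ { db => $db } ] };
my $copy    = thaw(freeze($payload));

print $copy->{data}[0]{db}->subseq(4, 4), "\n";    # prints "ACGT"
```

Because the hooks travel with the class, the same round trip applies whether the frozen data crosses a thread queue or a network link, provided the path resolves on the receiving side. That caveat is the substance of Leon's objection above.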