Fields, Christopher J | 13 Sep 15:01 2013

Re: Standaloneblastplus: update

If you want to include this as an edit to the code, you could simply fork the code on GitHub, make and commit the
changes to the fork, then submit a pull request.  We do request that you test the code or add new tests
(e.g. using the BioPerl test suite) prior to submitting it, just to make sure everything works fine.

chris

On Sep 13, 2013, at 3:30 AM, dimitark <at> bii.a-star.edu.sg wrote:

> Hi guys,
> I managed to solve my problem by modifying StandAloneBlastPlus.pm and BlastMethods.pm.
> 
> In 'sub run()' in BlastMethods I added a TEMPDIR option:
> 
> sub run {
>    my $self = shift;
>    my @args = @_;
>    # DIMITAR: added $tempdir so I can pass a tempdir for each thread I create
>    my ($method, $query, $outfile, $outformat, $method_args, $tempdir) = $self->_rearrange( [qw(
>                                         METHOD
>                                         QUERY
>                                         OUTFILE
>                                         OUTFORMAT
>                                         METHOD_ARGS
>                                         TEMPDIR
>                                         )], @args);
> 
> Then line 261 in BlastMethods, passing the tempdir:
> 
>    $blast_args{-query} = $self->_fastize($query,$tempdir);
> 
> Then in StandAloneBlastPlus, in _fastize():
> 
>   sub _fastize {
>    my $self = shift;
>    my $data = shift;
>    my $tempdir=shift; # <--- ADDED THIS
> 
> 
> And further changed here:
> 
>   		my $fh = File::Temp->new(TEMPLATE => 'DBDXXXXXXXXXX',
> 					 UNLINK => 0,
> 					 DIR => $tempdir, # <--- CHANGED HERE
> 					 SUFFIX => '.fas');
> 
> 
> Well, it's quite a dirty workaround, but it works fine. Now I can do the following:
> in my script I can start several threads that share the same factory, and for each thread I create a
> separate TEMPDIR in which the temp .fas file (holding the query) is created. That way I can make better
> use of my CPU threads.
> For example: instead of running a single blast with 40 CPU threads processing a fasta file with 250K seqs,
> I can now start 5 instances of blast processing 50K seqs each, with each instance using 8 CPU threads.
> 
> I did this because:
> 
> a) When I ran several instances of blast, they all created their temp files in the same directory. Even
> though the temp file names use a random mix of characters, some weird errors still happened and some
> blast instances broke.
> 
> b) I noticed that when I process a large fasta file, blast starts well but gets slower over time,
> i.e. slower with each seq being blasted: the further down the fasta file, the slower the blast.
> 
> If someone else is interested in this kind of functionality, I suppose I can edit the files further so
> that the change is cleaner and consistent throughout. Also, right now the tempdir must be given
> explicitly; I can make it optional like this:
> 
> if(! $tempdir){
>   $tempdir=$self->db_dir;
> }
> 
> which will default it to DB_DIR, as before. Or some other way that achieves the same.
> 
> Cheers
> D.
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l <at> lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
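
For anyone who wants to try the per-thread temp directory idea from the quoted message outside of BioPerl, here is a minimal sketch using only File::Temp from core Perl. The helper name write_query_fasta and the sample sequences are hypothetical and not part of BioPerl, and a plain directory stands in for $self->db_dir as the fallback. It only shows the temp file handling, not the blast call itself.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (NOT a BioPerl method): write query FASTA text into
# $tempdir, falling back to $default_dir when no tempdir is given. The
# File::Temp arguments mirror the ones shown in the quoted _fastize() change.
sub write_query_fasta {
    my ($fasta_text, $tempdir, $default_dir) = @_;
    $tempdir = $default_dir unless defined $tempdir && length $tempdir;

    my $fh = File::Temp->new(
        TEMPLATE => 'DBDXXXXXXXXXX',
        UNLINK   => 0,            # keep the file after the handle goes away
        DIR      => $tempdir,
        SUFFIX   => '.fas',
    );
    print {$fh} $fasta_text;
    close $fh or die "Cannot close " . $fh->filename . ": $!";
    return $fh->filename;
}

# Assumption: this directory stands in for $self->db_dir.
my $default_dir = tempdir(CLEANUP => 1);

# One private directory per worker; CLEANUP removes them at program exit.
my @worker_dirs = map { tempdir(CLEANUP => 1) } 1 .. 5;

# Each worker writes its query file into its own directory.
my @files = map {
    write_query_fasta(">seq$_\nACGT\n", $worker_dirs[$_], $default_dir)
} 0 .. 4;

# Without an explicit tempdir the file lands in the default directory.
my $fallback = write_query_fasta(">seq\nACGT\n", undef, $default_dir);
```

Keeping the fallback inside the helper means existing callers that never pass a tempdir behave as before, which is the point of the proposed default.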
