Jason Stajich | 31 Oct 06:33 2013

Re: Perl script uses all my cpu

A couple of things: if speed is your primary concern, I think you want to go lower level and avoid using the modules
unless necessary.

a) SearchIO parsing is slow -- if you want speed, dump the data to tabular format (-m 8) and just parse the columns
with split.
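
As a rough sketch of point (a), the tabular output can be parsed with split alone. This assumes the -m 8 output follows the standard 12-column BLAST tabular layout; the column names and the sample line are made up for illustration, and parse_hit_line is a hypothetical helper, not part of BioPerl.

```perl
use strict;
use warnings;

# Assumed column order: the standard 12 fields of BLAST-style tabular output.
my @COLS = qw(query subject pct_id aln_len mismatches gap_opens
              q_start q_end s_start s_end evalue bit_score);

# Parse one tab-delimited hit line into a hash reference.
# Returns nothing (undef in scalar context) for blank lines,
# '#' comment lines, and lines with too few fields.
sub parse_hit_line {
    my ($line) = @_;
    chomp $line;
    return if $line =~ /^\s*(?:#|$)/;
    my @f = split /\t/, $line;
    return unless @f >= @COLS;
    my %hit;
    @hit{@COLS} = @f[0 .. $#COLS];
    return \%hit;
}

# Made-up example line, for illustration only.
my $example = join "\t", qw(kmer1 chr1 100.00 25 0 0 1 25 1001 1025 1e-08 50.1);
my $hit = parse_hit_line($example);
print "$hit->{query} hits $hit->{subject} (E=$hit->{evalue})\n";
```

In a real run the lines would come from the outfile handle, for example while (my $line = <$infh>) { my $hit = parse_hit_line($line) or next; ... }.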

b) If you specify the input filename as - you can pass in the sequence data as an input string on STDIN instead of having
to create the kmer file. It will also be faster to pipeline your analysis of many kmers at once rather than
creating one file at a time.

You can do it through STDIN:

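open(my $fh, '|-', "fasta36 -T 1 -E 1e-5 - databasename > outfile") || die $!;
print $fh ">kmer$l\n",$seq,"\n";
close($fh) || die "fasta36 failed: $! $?";  # closing the pipe waits for fasta36 to finish writing outfile

open(my $infh, '<', "outfile") || die $!;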
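
To illustrate sending many kmers through one pipe instead of one run per kmer, here is a minimal sketch. write_kmers_fasta is a hypothetical helper name, the kmer naming scheme is an assumption, and the demo writes to STDOUT so it runs without fasta36 installed.

```perl
use strict;
use warnings;

# Write every k-mer of $seq to $fh as a FASTA record named kmer<offset>.
# Returns the number of records written.
sub write_kmers_fasta {
    my ($fh, $seq, $k) = @_;
    my $n = 0;
    for my $i (0 .. length($seq) - $k) {
        print {$fh} ">kmer$i\n", substr($seq, $i, $k), "\n";
        $n++;
    }
    return $n;
}

# Demonstration on STDOUT with a toy sequence.
write_kmers_fasta(\*STDOUT, "ACGTACGT", 5);

# With fasta36 installed, the same helper could feed a pipe
# (untested sketch, same command line as in the message):
# open(my $fh, '|-', 'fasta36 -T 1 -E 1e-5 - databasename > outfile') or die $!;
# write_kmers_fasta($fh, $seq, 25);
# close($fh) or die "fasta36 failed: $! $?";
```

This way the search program is started once for the whole batch rather than once per kmer.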

c) If you wanted to be even more clever and never create files, try IPC::Open2 - http://perldoc.perl.org/IPC/Open2.html
You could push seq data in through STDIN and get the SearchIO output from STDOUT - you would just print to the
child's input handle and initialize a Bio::SearchIO object reading from its output handle -- though there is some buffering that
has to happen while waiting on the running program.
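
A minimal sketch of the IPC::Open2 idea follows. run_with_stdin is a hypothetical helper, and cat stands in for the search program so the example runs anywhere with a Unix shell. Writing everything before reading is only safe when the input and output are small or the child reads all of its input before printing; otherwise the two pipes can deadlock, which is the buffering caveat mentioned above.

```perl
use strict;
use warnings;
use IPC::Open2;

# Run @cmd, feed $input to its STDIN, and return the lines it prints on STDOUT.
sub run_with_stdin {
    my ($input, @cmd) = @_;
    my $pid = open2(my $from_child, my $to_child, @cmd);
    print {$to_child} $input;
    close $to_child;              # send EOF so the child can finish
    my @lines = <$from_child>;    # read everything the child wrote
    close $from_child;
    waitpid($pid, 0);             # reap the child and set $?
    die "command exited with status ", $? >> 8, "\n" if $?;
    return @lines;
}

# 'cat' is a stand-in; a real call would pass the fasta36 command and arguments.
my @out = run_with_stdin(">kmer1\nACGTACGT\n", 'cat');
print @out;
```

Instead of slurping the lines, the read handle could be handed to Bio::SearchIO through its -fh argument.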

But it would be simpler to write to /dev/shm/outfile, a fast SSD drive, or /tmp and read back from that
file. Could also keep the filehandle open and rewind too if you wanted to.  I would 
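
One way to sketch the scratch-file approach with a rewound handle is below. scratch_file and write_and_reread are hypothetical helpers, and the fallback to the system temp directory when /dev/shm is missing is an assumption about the machine, not something from the thread.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempfile);
use IO::Handle;

# Create a read/write scratch file, preferring RAM-backed /dev/shm when available.
sub scratch_file {
    my $dir = (-d '/dev/shm' && -w '/dev/shm') ? '/dev/shm' : File::Spec->tmpdir;
    return tempfile('kmer_XXXXXX', DIR => $dir, UNLINK => 1);
}

# Replace the file contents with $text, then rewind and read the lines back,
# all through the same open filehandle.
sub write_and_reread {
    my ($fh, $text) = @_;
    seek($fh, 0, 0)  or die "seek: $!";
    truncate($fh, 0) or die "truncate: $!";
    print {$fh} $text;
    $fh->flush;
    seek($fh, 0, 0)  or die "seek: $!";   # rewind before reading
    return <$fh>;
}

my ($fh, $path) = scratch_file();
for my $round (1 .. 2) {
    my @lines = write_and_reread($fh, "result for round $round\n");
    print "read back from $path: @lines";
}
```

The file is removed when the program exits because of UNLINK => 1.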

Jason

On Oct 29, 2013, at 10:24 PM, Jason Stajich <jason.stajich <at> gmail.com> wrote:

> Are you sure it is bioperl that is causing this? If you run top, I am sure you will see it is the fasta command that is causing this.
> 
> Also not sure why you initialize searchIO twice. Just initialize it in the loop where you use it.
> 
> 
> Is the CPU just from running the application fasta36? You can specify the number of threads with -T -- so to
> ask for 1 processor add "-T 1" to your fasta cmd.
> 
> On Oct 29, 2013, at 2:05 PM, Antony03 <antony.vincent.1 <at> ulaval.ca> wrote:
> 
>> Hi,
>> 
>> I wrote this perl script http://pastebin.com/PWVKvcQ6 and it uses bioperl
>> modules. It works well (I think) but it uses all my cpu (8)...i don't
>> understand why.
>> 
>> Does someone know how to execute my code on only one cpu?
>> 
>> Thanks!
>> 
>> Antony
>> 
>> 
>> 
>> --
>> View this message in context: http://bioperl.996286.n3.nabble.com/Perl-script-uses-all-my-cpu-tp17189.html
> 
> Jason Stajich
> jason.stajich <at> gmail.com
> jason <at> bioperl.org
> 

Jason Stajich
jason.stajich <at> gmail.com
jason <at> bioperl.org
