From: Jason Stajich <jason.stajich <at> gmail.com>
Subject: Re: Perl script uses all my cpu
Newsgroups: gmane.comp.lang.perl.bio.general
Date: Thursday 31st October 2013 05:33:57 UTC (over 3 years ago)
A couple of things: if speed is your primary concern, I think you want to
go lower level and avoid using the modules unless necessary.

a) SearchIO parsing is slow. If you want speed, dump the data to tabular
format (-m 8) and just parse the columns with split.
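
As a rough sketch of what "parse the columns with split" could look like, assuming the usual 12-column BLAST-style layout for -m 8 output. The sub name parse_m8_line, the hash keys, and the sample line are made up for illustration:

```perl
use strict;
use warnings;

# Parse one line of BLAST-style tabular output (-m 8).
# Assumes 12 tab-separated columns; the key names below are my own labels.
sub parse_m8_line {
    my ($line) = @_;
    chomp $line;
    return if $line =~ /^#/ || $line !~ /\S/;   # skip comment and blank lines
    my @f = split /\t/, $line;
    return unless @f >= 12;                     # not a complete hit line
    my %hit;
    @hit{qw(query subject pct_id aln_len mismatches gaps
            q_start q_end s_start s_end evalue bitscore)} = @f[0 .. 11];
    return \%hit;
}

# Made-up sample line, for illustration only.
my $sample = join("\t", qw(kmer1 seqA 100.00 21 0 0 1 21 105 125 1.2e-06 42.1)) . "\n";
my $hit = parse_m8_line($sample);
print "$hit->{query} hits $hit->{subject} (E=$hit->{evalue})\n" if $hit;
```

In a real run you would call this on each line read from the output file, with no SearchIO object involved.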

b) If you specify the input filename as - you can pass the sequence data in
on STDIN instead of having to create the kmer file. It will also be faster
to pipeline your analysis of many kmers at once rather than creating one
file at a time.

You can do it through STDIN:

open(my $fh, "|-", "fasta36 -T 1 -E 1e-5 - databasename > outfile") || die $!;
print $fh ">kmer$l\n", $seq, "\n";
close($fh) || die "fasta36 failed: $! $?";  # wait for fasta36 to finish

open(my $infh, "<", "outfile") || die $!;
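
To pipeline many kmers at once, something along these lines might work. It is only a sketch: the helper names kmers_to_fasta and search_kmers are made up, and it assumes fasta36 will take several query sequences from STDIN in one run, so the actual search call is left commented out:

```perl
use strict;
use warnings;

# Format a list of k-mers as one multi-FASTA string (kmer1, kmer2, ...).
sub kmers_to_fasta {
    my @kmers = @_;
    my $i = 0;
    return join '', map { ">kmer" . ++$i . "\n$_\n" } @kmers;
}

# Start one search process and feed it every k-mer through STDIN.
# $cmd is a shell command that reads FASTA from STDIN (query given as "-").
sub search_kmers {
    my ($cmd, @kmers) = @_;
    open(my $fh, '|-', $cmd) or die "cannot start '$cmd': $!";
    print {$fh} kmers_to_fasta(@kmers);
    # close() waits for the command to exit, so the output file is complete.
    close($fh) or die "search command failed: " . ($! || "exit status $?") . "\n";
    return 1;
}

print kmers_to_fasta(qw(ACGTACGT TTGACCAA));

# Hypothetical call, same command line as above plus -m 8:
# search_kmers("fasta36 -T 1 -E 1e-5 -m 8 - databasename > outfile", @kmers);
```

That way the program is started once for the whole batch instead of once per kmer.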

c) If you wanted to be even cleverer and never create files, try
IPC::Open2 - http://perldoc.perl.org/IPC/Open2.html
You could push the seq data in through the child's STDIN and read the
results from its STDOUT: you would just print to the write handle and
initialize a Bio::SearchIO object reading from the read handle, though
some buffering has to happen while you wait on the running process.
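
A minimal IPC::Open2 sketch of that idea. The sub name run_filter is made up, and sort is used only as a harmless stand-in command so the example runs without fasta36 installed:

```perl
use strict;
use warnings;
use IPC::Open2;

# Send $input to a command's STDIN and return the lines it prints to STDOUT.
# Writes everything first, then reads; fine for small inputs, but a child
# that emits a lot of output before it has read all of its input could
# block, which is the buffering caveat mentioned above.
sub run_filter {
    my ($input, @cmd) = @_;
    my $pid = open2(my $from_child, my $to_child, @cmd);
    print {$to_child} $input;
    close $to_child;              # signal EOF so the child can finish
    my @lines = <$from_child>;
    close $from_child;
    waitpid($pid, 0);             # reap the child process
    die "command exited with status " . ($? >> 8) . "\n" if $?;
    return @lines;
}

# Stand-in command for illustration.
print run_filter("b\na\n", 'sort');

# Hypothetical real use:
# my @hits = run_filter($fasta, qw(fasta36 -T 1 -E 1e-5 -m 8 - databasename));
```

If you do want SearchIO rather than raw lines, the read handle could be handed to Bio::SearchIO->new with -fh instead of slurping it into a list.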

But it would be simpler to write to /dev/shm/outfile, a fast SSD drive,
or /tmp and read back from that file. You could also keep the filehandle
open and rewind it if you wanted to.
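
For the scratch-file route, a sketch using File::Temp and seek. The sub name rewind_and_read and the file name template are made up, and a print stands in for the search step so the example runs on its own:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Pick a scratch directory: RAM-backed /dev/shm if usable, else /tmp.
my $dir = (-d '/dev/shm' && -w '/dev/shm') ? '/dev/shm' : '/tmp';

# tempfile() returns a read/write handle plus the path;
# UNLINK => 1 removes the file when the program exits.
my ($fh, $path) = tempfile('kmer_hits_XXXXXX', DIR => $dir, UNLINK => 1);

# Rewind an open handle and return all of its lines.
sub rewind_and_read {
    my ($handle) = @_;
    seek($handle, 0, 0) or die "seek failed: $!";
    return <$handle>;
}

# Stand-in for the search step. In practice the external command would
# write to $path (hypothetical: system("fasta36 ... > $path")).
print {$fh} "kmer1\tseqA\n";

my @lines = rewind_and_read($fh);
print "read ", scalar(@lines), " line(s) back from $path\n";
```

The handle stays open across searches, so each round only costs a seek rather than a fresh open.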

Jason

On Oct 29, 2013, at 10:24 PM, Jason Stajich <[email protected]>
wrote:

> are you sure it is bioperl that is causing this - if you run top I am
> sure it is the fasta command that is causing this:
> 
> Also not sure why you initialize searchIO twice. Just initialize it in
> the loop where you use it.
> 
> 
> Is the CPU just from running the application fasta36 ? you can specify
> the number of threads with -T  -- so to ask for 1 processor add "-T 1" to
> your fasta cmd
> 
> On Oct 29, 2013, at 2:05 PM, Antony03  wrote:
> 
>> Hi,
>> 
>> I wrote this perl script http://pastebin.com/PWVKvcQ6 and it
>> uses bioperl
>> modules. It works well (I think) but it uses all my cpu (8)...i don't
>> understand why.
>> 
>> Is someone know how execute my code on only one cpu?
>> 
>> Thanks!
>> 
>> Antony
>> 
>> 
>> 
>> --
>> View this message in context: http://bioperl.996286.n3.nabble.com/Perl-script-uses-all-my-cpu-tp17189.html
>> Sent from the Bioperl-L mailing list archive at Nabble.com.
>> _______________________________________________
>> Bioperl-l mailing list
>> [email protected]
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> Jason Stajich
> [email protected]
> [email protected]
> 

Jason Stajich
[email protected]
[email protected]
 