18 May 14:54
Re: Re: Another optimization question
From: anton muhin <antonmuhin <at> gmail.com>
Subject: Re: Re: Another optimization question
Newsgroups: gmane.comp.lang.haskell.cafe
Date: 2008-05-18 12:54:12 GMT
Subject: Re: Re: Another optimization question
Newsgroups: gmane.comp.lang.haskell.cafe
Date: 2008-05-18 12:54:12 GMT
Any chances you're using Integer instead of Int? On my box switch to Integer is quite expensive (as one could expect). yours, anton. On Sun, May 18, 2008 at 9:45 AM, Jeroen <yrn001 <at> gmail.com> wrote: > Daniel Fischer <daniel.is.fischer <at> web.de> writes: >> >> Am Samstag, 17. Mai 2008 22:48 schrieb anton muhin: >> > >> > Why not -O3? >> >> As far as I know - and Brandon S.Allbery said so, too - -O3 isn't better than >> -O2. >> > >> > > Using a real list of primes, >> > >> > What's the size of the real list? >> >> arbitrary, however, since it's [Int], it will actually be at most 105097565 >> primes long, but since only numbers up to 5000 are checked, only 670 primes >> will ever be considered - When I check numbers 1 to 50000 (5133 primes, so >> 5134 primes evaluated), with -O0 / -O2, it's >> Jeroen 1 : 14.5 s / 2.4 s >> Jeroen 2 : 52.5 s / 49.7 s >> (== x) . head . dropWhile (< x) : 8.2 s /4.1 s >> go primes : 5.5 s / 2.5 s >> >> So Jeroen 1 is the best to be optimised :) >> >> > >> but another >> > >> implementation of isPrime: >> > >> >> > >> isPrime x = (== x) $ head $ dropWhile (< x) primes >> > > >> > > With -O2, this is about 20% slower than the Jeroen's first version, >> > > without optimisations 50% faster. >> > > Strange. >> > >> > Well, head has its overhead as well. Cf. two variants: >> > >> > firstNotLess :: Int -> [Int] -> Int >> > firstNotLess s (x:xs) = if x < s then firstNotLess s xs else x >> > >> > dropLess :: Int -> [Int] -> [Int] >> > dropLess s l@(x:xs) = if x < s then dropLess s xs else l >> > >> > isPrime :: Int -> Bool >> > isPrime x = x == (firstNotLess x primes) >> > >> > isPrime' :: Int -> Bool >> > isPrime' x = x == (head $ dropLess x primes) >> > >> > On my box firstNotLess gives numbers pretty close (if not better) than >> >> for primes up to 50000, that's >> 6.8 s / 2.1 s with -O0 / -O2 respectively on mine >> >> > Jeroen's first variant, while head $ dropLess notably worse. >> >> 5.8 s / 2.4 s here. >> > >> > > isPrime :: Int -> Bool >> > > isPrime x = go primes >> > > where >> > > go (p:ps) = case compare x p of >> > > LT -> False >> > > EQ -> True >> > > GT -> go ps >> > > >> > > does best (on my box), with and without optimisations (very very slightly >> > > with -O2) for a list of real primes, but not for [1 .. ]. >> > >> > And what happens for [1..]? >> >> With -O2, Jeroen 1 was best (1.62), nex firstNotLess (1.63), head . dropLess >> (1.74), then in close succesion Jeroen 2 (1.92), go primes (1.96) and head . >> dropWhile (1.99), >> with -O0, it's >> head . dropWhile (1.7 s, YES, that actually is faster with -O0 than with >> -O2!), head . dropLess (2.0), Jeroen 2 and firstNotLess (2.1 s), go primes >> (2.3 s), Jeroen 1 (3.2 s). >> >> Weirder and weirder. >> >> > >> > > However, more than can be squished out of fiddling with these versions >> > > could be gained from a better algorithm. >> > >> > Definitely. >> > >> > yours, >> > anton. >> > > Thanks for the responses so far! > > I only tested with -O2 and my primes implementation > is the Sieve of Eratosthenes and has signature > > primes :: Integral a => [a] > > What's also quite strange is that experiment2 is about > 10 times time slower than experiment1 when > using primes (with the Eratosthenes formula) instead of [1..]. > > I redid the experiments with -prof and the output was quite revealing: > > experiment1 (fastest): > total time = 2.64 secs (132 ticks @ 20 ms) > total alloc = 323,356 bytes (excludes profiling overheads) > > individual inherited > COST CENTRE entries %time %alloc %time %alloc > > MAIN 0 0.0 0.0 100.0 100.0 > main 1 0.0 0.5 0.0 0.5 > CAF 4 0.0 0.0 100.0 99.0 > primes 1 9.8 61.9 9.8 61.9 > main 0 0.0 37.1 90.2 37.1 > isPrime 5000 90.2 0.0 90.2 0.0 > CAF 4 0.0 0.4 0.0 0.4 > > > experiment2 (slowest): > total time = 6.12 secs (306 ticks @ 20 ms) > total alloc = 350,473,356 bytes (excludes profiling overheads) > > individual inherited > COST CENTRE entries %time %alloc %time %alloc > > MAIN 0 0.0 0.0 100.0 100.0 > main 1 0.0 0.0 0.0 0.0 > CAF 4 0.0 0.0 100.0 100.0 > primes 1 0.0 0.1 0.0 0.1 > main 0 0.0 0.0 100.0 99.9 > isPrime 5000 100.0 99.9 100.0 99.9 > CAF 4 0.0 0.0 0.0 0.0 > > > Would this be only because isPrime of experiment 2 builds > this temporary list (takeWhile) all the time? > > Jeroen Baekelandt > > > > > > > _______________________________________________ > Haskell-Cafe mailing list > Haskell-Cafe <at> haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe >
RSS Feed