Re: String += vs <<

On Thu, Jun 18, 2009 at 3:08 AM, Robert Dober<robert.dober <at> gmail.com> wrote:
> ---------------------------------------------------------
> 512/19 > cat strings.rb
>
> N = 10_000
> b = "Wassitmean"
> require 'benchmark'
> Benchmark.bmbm do | bench |
>  a = "Ruby Rules Re Rowld"
>  bench.report "+=" do
>    N.times do
>      a += b
>    end
>  end
>  a = "Ruby Rules Re Rowld"
>  bench.report "<<" do
>    N.times do
>      a += b
>    end
>  end
> end

Someone else already noted the += in the << section, but there's another
issue: the "a" string is initialized only *once* for both the rehearsal
and the actual runs, because the body of the bmbm block executes only
once, to register the reports. If you move the "a" initialization into
the report blocks, the benchmark behaves more like you'd expect. Here's
a run with JRuby, using the bmbm script above with the "a" init fix, the
"<<" fix, and 5 iterations (only the last iteration shown):

Rehearsal --------------------------------------
+=   0.343000   0.000000   0.343000 (  0.343000)
<<   0.001000   0.000000   0.001000 (  0.001000)
----------------------------- total: 0.344000sec

         user     system      total        real
+=   0.343000   0.000000   0.343000 (  0.343000)
<<   0.001000   0.000000   0.001000 (  0.001000)

Here's JRuby all interpreted (no JIT compilation to bytecode):

Rehearsal --------------------------------------
+=   0.345000   0.000000   0.345000 (  0.345000)
<<   0.002000   0.000000   0.002000 (  0.002000)
----------------------------- total: 0.347000sec

         user     system      total        real
+=   0.356000   0.000000   0.356000 (  0.356000)
<<   0.002000   0.000000   0.002000 (  0.002000)

The compiled and interpreted numbers are basically the same because this
bench is almost completely limited by object allocation/GC and, to a
lesser extent, by String performance for the two operations. And << is
clearly faster because it grows the backing buffer of a single String in
place, whereas += creates a new String each time and copies the contents
of the previous string into it.
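
To make the difference concrete, here is a minimal sketch (not part of the
benchmark itself) that checks object identity after each operation. The
variable names are only for illustration, and String.new is used so the
strings are mutable even where literals are frozen:

```ruby
b = "Wassitmean"

# += is shorthand for a = a + b: it allocates a new String and rebinds a.
a = String.new("Ruby Rules Re Rowld")
original = a                         # second reference to the same object
a += b
plus_same_object = a.equal?(original)

# << appends to the receiver in place, so the object stays the same.
c = String.new("Ruby Rules Re Rowld")
alias_of_c = c
c << b
append_same_object = c.equal?(alias_of_c)

puts "+= kept the same object? #{plus_same_object}"    # false
puts "<< kept the same object? #{append_same_object}"  # true
puts "string left behind by +=: #{original.inspect}"
```

After the +=, the old object is untouched and becomes garbage once nothing
references it, which is where the allocation/GC pressure comes from.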

Here's the same in Ruby 1.9:

Rehearsal --------------------------------------
+=   0.260000   0.510000   0.770000 (  0.766618)
<<   0.000000   0.000000   0.000000 (  0.002294)
----------------------------- total: 0.770000sec

         user     system      total        real
+=   0.250000   0.510000   0.760000 (  0.771757)
<<   0.000000   0.000000   0.000000 (  0.002235)

The JRuby runs above were JRuby 1.4.0dev on current Apple Java 6.

> 513/20 > jruby -v strings.rb
> jruby 1.3.0 (ruby 1.8.6p287) (2009-06-06 6586) (OpenJDK Client VM
> 1.6.0_0) [i386-java]
> Rehearsal --------------------------------------
> +=   1.256000   0.000000   1.256000 (  1.191000)
> <<   9.384000   0.000000   9.384000 (  9.384000)
> ---------------------------- total: 10.640000sec
>
>         user     system      total        real
> +=  23.397000   0.000000  23.397000 ( 23.397000)
> <<  52.953000   0.000000  52.953000 ( 52.953000)

The server VM would perform a lot better here than the client VM, but I
suspect the main reason for this peculiar result is that the "a" string
was never re-initialized and just kept getting bigger.
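
A small sketch of that effect, assuming the standard library's
Benchmark.bmbm: the counters and the tiny 1000-iteration loop are made up
for illustration. It counts how often the bmbm body and a report block
run, and shows a string set up in the body carrying over from the
rehearsal into the real run:

```ruby
require 'benchmark'

body_runs = 0
report_runs = 0
a = nil

Benchmark.bmbm do |bench|
  body_runs += 1                 # the bmbm body itself
  a = String.new("x")            # initialized here, so only once
  bench.report "<<" do
    report_runs += 1             # runs for the rehearsal and the real pass
    1000.times { a << "y" }
  end
end

puts "bmbm body ran #{body_runs} time(s)"
puts "report block ran #{report_runs} time(s)"
puts "final length of a: #{a.length}"   # 1 + 2 * 1000 = 2001
```

The real pass starts with whatever string the rehearsal left behind, so
every report after the first works on an ever larger "a".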

> ruby 1.9.1p129 (2009-05-12 revision 23412) [i686-linux]
> Rehearsal --------------------------------------
> +=   0.360000   0.020000   0.380000 (  0.406038)
> <<   1.040000   0.130000   1.170000 (  1.209839)
> ----------------------------- total: 1.550000sec
>
>         user     system      total        real
> +=   1.770000   0.230000   2.000000 (  2.056577)
> <<   2.410000   0.240000   2.650000 (  3.456429)

I'm not sure why Ruby 1.9 did better here, but it could be that we
grow strings at different rates and so our strings get larger faster.
At any rate, in the fixed benchmark things look a lot better.
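
For reference, a sketch of what the fixed benchmark could look like. This
is a reconstruction from the two fixes described above, not the exact
script that produced the numbers. The "lengths" hash is an extra sanity
check added here so both variants can be seen to build the same string:

```ruby
require 'benchmark'

N = 10_000
b = "Wassitmean"
lengths = {}

results = Benchmark.bmbm do |bench|
  bench.report "+=" do
    a = String.new("Ruby Rules Re Rowld")   # fresh string for every run
    N.times do
      a += b                                # new String each iteration
    end
    lengths["+="] = a.length
  end
  bench.report "<<" do
    a = String.new("Ruby Rules Re Rowld")   # fresh string for every run
    N.times do
      a << b                                # append in place
    end
    lengths["<<"] = a.length
  end
end

puts "final lengths: #{lengths.inspect}"
```

With "a" created inside each report block, the rehearsal and the real run
both start from the same short string, so the two passes are comparable.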

- Charlie

