DJ Jazzy Linefeed | 16 May 03:53

Differing output for function moved from Ruby 1.8 to 1.9

Sup, fools?

This is the Levenshtein function I'm gankin' for my file comparison
project (see "40 million comparison..." thread):

# Levenshtein calculator
# Author: Paul Battley (pbattley <at> gmail.com)
# Modified slightly by John Perkins:
# -- removed $KCODE call

def distance(str1, str2)

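  unpack_rule = 'C*' # 'C*' unpacks every byte as an unsigned 8-bit integer
  s = str1.unpack(unpack_rule)
  t = str2.unpack(unpack_rule)
  n = s.length # lengths are counted in bytes, not characters
  m = t.length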

  return m if (0 == n) # stop the madness if either string is empty
  return n if (0 == m)

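  d = (0..m).to_a # previous row of the distance table, seeded with 0..m
  x = nil         # last computed cell; declared here so it outlives the blocks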

  (0...n).each do |i|
    e = i + 1
    (0...m).each do |j|
      cost = (s[i] == t[j]) ? 0 : 1
      x = [
        d[j + 1] + 1,   # insertion
        e + 1,          # deletion
        d[j] + cost     # substitution
      ].min
      d[j] = e
      e = x
    end
    d[m] = x
  end

  return x
end
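
To rule out the function itself, here is a minimal cross-check sketch. reference_distance is a hypothetical helper of mine, not part of Paul Battley's code: a plain full-matrix Levenshtein over bytes. It should agree with distance on small inputs under either Ruby version. It keeps the whole table in memory, so it is only meant for short strings.

```ruby
# Hypothetical helper, not part of the original code: a straightforward
# full-matrix Levenshtein over bytes, used only as a cross-check.
def reference_distance(str1, str2)
  s = str1.unpack('C*')
  t = str2.unpack('C*')

  # table[i][j] = edit distance between the first i bytes of s
  # and the first j bytes of t
  table = Array.new(s.length + 1) { Array.new(t.length + 1, 0) }
  (0..s.length).each { |i| table[i][0] = i }
  (0..t.length).each { |j| table[0][j] = j }

  (1..s.length).each do |i|
    (1..t.length).each do |j|
      cost = (s[i - 1] == t[j - 1]) ? 0 : 1
      table[i][j] = [
        table[i - 1][j] + 1,       # drop a byte from s
        table[i][j - 1] + 1,       # add a byte from t
        table[i - 1][j - 1] + cost # substitute (or match)
      ].min
    end
  end

  table[s.length][t.length]
end

# Well-known pairs with their expected distances.
[['kitten', 'sitting', 3], ['flaw', 'lawn', 2], ['', 'abc', 3]].each do |a, b, expected|
  line = "#{a.inspect} -> #{b.inspect}: reference=#{reference_distance(a, b)} expected=#{expected}"
  # Only call distance if it has been defined in the same session.
  line += " distance=#{distance(a, b)}" if respond_to?(:distance, true)
  puts line
end
```

If both versions print matching numbers here, the function is probably not where the two installs disagree.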

When I ran this on my test data under Ruby 1.8 the output was 969, but
on a 1.9 install it was 1011. I'm aware that some of the rules have
changed, especially with arrays. Does anyone see where the discrepancy
comes from? Because I sure as heck don't. The files didn't change, so
the distance shouldn't either. Thanks in advance for your help.
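
One check I can run on both installs is whether the strings reaching distance are byte-for-byte the same. This is a sketch under that assumption only; I haven't confirmed the inputs are where the difference comes from. byte_summary is a hypothetical helper, and the two sample strings are stand-ins for the real file contents.

```ruby
# Hypothetical helper (not from the original code): report how many bytes
# a string holds, plus a simple additive checksum over those bytes.
def byte_summary(str)
  bytes = str.unpack('C*')
  checksum = bytes.inject(0) { |sum, b| (sum + b) % 65_536 }
  { :bytes => bytes.length, :checksum => checksum }
end

# Stand-in data; in practice these would be the strings read from each file.
sample_crlf = "line one\r\nline two\r\n"
sample_lf   = "line one\nline two\n"

p byte_summary(sample_crlf)
p byte_summary(sample_lf)
```

If the summaries for the same file differ between 1.8 and 1.9, the two versions are handing different bytes to the function, and the distance would be expected to change with them.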

