David Gowers | 2 Sep 2010 05:12

Re: Progress?

On Thu, Sep 2, 2010 at 1:53 AM, Ralph Versteegen <teeemcee <at> gmail.com> wrote:
> On 1 September 2010 15:20, David Gowers <00ai99 <at> gmail.com> wrote:
>> On Tue, Aug 31, 2010 at 11:43 PM, Ralph Versteegen <teeemcee <at> gmail.com> wrote:
>>> On 31 August 2010 02:34, David Gowers <00ai99 <at> gmail.com> wrote:
>>>> On Mon, Aug 30, 2010 at 9:52 PM, Ralph Versteegen <teeemcee <at> gmail.com> wrote:
>>>>> I tried it out, discovered it was broken, and eventually discovered
>>>>> that this was due (only) to errors in the RELOAD documentation. Nice
>>>>> work! Patch attached.
>>>>
>>>> Thanks! Pushed to git master + fixed the doctests that had broken due
>>>> to the stringtable changes.
>>>
>>> Ah, since I don't have nose I forgot about the doctests.
>>
>> Hehe, you don't need nose to do doctesting. but okay.
>
> Well, I did manually copy the lines into a python shell to try them
> out, if that's what you mean.

Ohh. No, doctesting is built into Python.
http://docs.python.org/library/doctest.html

-> see section 25.2.2

"python -m doctest -v doc/reload.rst" would be appropriate in this context.

(it's rather more verbose, but essentially the same)
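
For reference, the same kind of check can also be run from Python code rather than the command line. This is only a minimal sketch: the sample text below is a made-up stand-in for a file like doc/reload.rst, and nothing in it is specific to nohrio.

```python
import doctest
import os
import tempfile

# A tiny, made-up stand-in for a documentation file such as doc/reload.rst.
SAMPLE_RST = """\
Example document
================

Interactive examples in the text are executed and their output checked:

>>> sorted({'b': 2, 'a': 1})
['a', 'b']
>>> 6 * 7
42
"""

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.rst")
    with open(path, "w") as handle:
        handle.write(SAMPLE_RST)
    # module_relative=False makes doctest treat `path` as an ordinary OS path.
    results = doctest.testfile(path, module_relative=False, verbose=False)

# `results` carries the number of failed and attempted examples.
print(results)
```

Passing verbose=True instead prints every example as it runs, much like the -v flag.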

>
>> ----
>>
>> import yaml
>> f = open (myfile,'rb')
>> y = yaml.safe_load(f)
>> f.close()
>> from nohrio.reload import reload_from_dict
>> r = reload_from_dict (y, 'root')
>> f = open (myoutfile,'wb')
>> r.write_root(f)
>> f.close()
>
> Very nice.
Of course, given your example, I now realize there is no way to have
more than one node with the same name.
I'll fix that as well (reload_from_tuples())
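
To illustrate the limitation: a mapping holds one value per key, while a sequence of tuples can repeat a name. The (name, value, children) shape and the count_named helper below are hypothetical, for illustration only; they are not what reload_from_tuples() actually accepts.

```python
# Hypothetical data shapes, for illustration only; not nohrio's actual API.

# A mapping keeps one value per key, so the second 'zone' replaces the first.
as_dict = {'zone': 1, 'zone': 2}

# (name, value, children) tuples keep every occurrence, in document order.
as_tuples = ('root', None, [
    ('zone', 1, []),
    ('zone', 2, []),
])


def count_named(node, name):
    """Count the direct children of `node` that are called `name`."""
    _, _, children = node
    return sum(1 for child in children if child[0] == name)


print(len(as_dict))                    # how many 'zone' entries survived
print(count_named(as_tuples, 'zone'))  # how many 'zone' children exist
```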

>>>>> It now correctly reads and writes all RELOAD
>>>>> documents that I threw at it, including the unittest.rld that
>>>>> reloadtest produces. Also, I reimplemented reload2xml in a couple
>>>>> lines (MUCH nicer than the 'real' thing :) ),
>>>> lxml2 is pretty nice :) I had its model in mind vaguely when
>>>> implementing my system.
>>>
>>> I might take a look at lxml2. The official reload interface still
>>> feels unfinished to me.
>>
>> Oh, k.. just had assumed you used that, since it's available in many
>> python installations.
>
> What, used XML? Of course not!
(that = lxml2. I suppose you could have just written the XML by hand
using sys.stdout.write() etc.)

>
> Actually, to be honest I don't use python for more than 20-liners,
> although I really wish I did. The problem is that all the projects I'm
> involved in are written in nasty languages like FB!

>
> I meant that you compared {'a':0, 'a':1} and {'a':1, 'a':0} as unequal.
oh. I didn't consider repeated nodes.
In fact, my module explicitly fails on those >_< ('a child already
exists with the name "%s"')

Fixed.

>
> But, Mike was quite right that order does matter, I had confused
> myself. However in practice, order doesn't matter for most of our
> RELOAD-based file formats. For example, the zone file format right now
> contains the list of zones ordered as they appear in a hash table -
> essentially randomly. And all these zones are stored in nodes named
> "zone" (with null values) indistinguishable to a first degree.
>
> So, could you please add the previous elements_equal function back?
> Even better, this function, which further allows identically named
> children to be out of order:

>
> def fuzzy_equal (x, y):
>    if x.name != y.name:
>        return False
>    if x.data != y.data:
>        return False
>    if len(x.children) != len (y.children):
>        return False
>    xh = [hash(v) for v in x.children]
>    yd = dict((hash(v),v) for v in y.children)
>    if set(xh) != set(yd.keys()):
>        return False
>    return all(fuzzy_equal(c, yd[hash(c)]) for c in x.children)
hm.
I can probably implement this by modifying the hashing function slightly.
So I'll do that and then write some doctests for it.
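
Roughly what I have in mind, as a sketch only: the Node class, fuzzy_hash and fuzzy_eq below are hypothetical stand-ins, not the code in nohrio. The idea is to sort the child hashes before combining them, so sibling order stops mattering while repeated children still count. It assumes the node data is hashable.

```python
class Node:
    """Hypothetical stand-in for a RELOAD element (not nohrio's class)."""

    def __init__(self, name, data=None, children=()):
        self.name = name
        self.data = data          # assumed hashable in this sketch
        self.children = list(children)

    def fuzzy_hash(self):
        # Sorting the child hashes discards sibling order but keeps
        # duplicates, so two identical children are not collapsed into one.
        child_hashes = tuple(sorted(c.fuzzy_hash() for c in self.children))
        return hash((self.name, self.data, child_hashes))


def fuzzy_eq(x, y):
    """Compare two trees, ignoring the order of siblings.

    This trusts the hash: two different trees whose hashes happen to
    collide would be reported as equal.
    """
    return x.fuzzy_hash() == y.fuzzy_hash()


a = Node('zones', None, [Node('zone', 1), Node('zone', 2)])
b = Node('zones', None, [Node('zone', 2), Node('zone', 1)])
c = Node('zones', None, [Node('zone', 1), Node('zone', 1)])
print(fuzzy_eq(a, b), fuzzy_eq(a, c))
```

Here a and b differ only in sibling order, while c repeats one child, so it has different content.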

> I don't know whether those are correct, because python isn't installed
> on this laptop :(. I also wonder whether there is a more efficient
> method that doesn't require temporary sets, and whether it can be made
> to not assume equally hashing nodes are equal.

That's acceptable for now: only about 1 in 2^32 comparisons of that kind
(1 in 2^64 with 64-bit hashes) will give a false positive.
See the current fuzzy_eq implementation :)

David
