Martijn Faassen | 9 Nov 2004 18:12

Re: vlibxml2 and lxml

Hey,

[Philipp (hi, I'm back!), please read to the bottom for where you come in]

Victor Ng wrote:
> Actually - I started working on the vlibxml2 stuff prior to the 
> announcement of the lxml project. 

Oh, just to make it clear: I announced it back at EuroPython in early 
June, but it's easy to miss such announcements. The SVN repository and 
the rest of the setup here came after that; the original code is in the 
Infrae CVS, here:

http://cvs.infrae.com/packages/lxml/

Then again, you might've been at it for a while too, for all I know. I'm 
glad I caught you on the Pyrex list and we can work together; it's 
already been beneficial to both of us, I hope.

> I was also negotiating an agreement 
> at my workplace so that I can work on the XML library during office 
> hours as long as the license was a BSD license.

Cool! Infrae has a 'BSD everything by default' policy, but that's 
because I co-own the company. :)

> All of that has been sorted out now - and I've really got no time to 
> set up everything you folks have already done at codespeak.

Of course that's mostly the work of Holger Krekel, helped by Philipp von 
Weitershausen; I can't really take much credit for it, just be glad that 
they're my friends.

> I'd really like to keep vlibxml2 and lxml separate.  This is mostly for 
> technical reasons as the libxml2 library has some really quirky behavior 
> in its API.

So what about a source distribution that contains both libraries (if 
they're done at all; vlibxml2 is obviously much further along in that 
department)? vlibxml2 contains a lot of very important foundational work 
concerning memory management that I hope we can get the higher level 
lxml stuff to use as well after a bit of refactoring.

So, what I'm proposing is merging vlibxml2 into lxml's 'src' 
directory, but as its own package. What do you think? Of course we'd 
also need to figure out what to do with the extensions package; I'm not 
familiar enough with vlibxml2's source layout to know where everything 
goes. Ideas?

If you'd like, and can come up with a good name, we could rename the 
whole 'lxml' distribution to something more neutral. 
We could then call it all <foo>.libxml2, <foo>.libxslt, <foo>.dom and 
<foo>.elementtree. I.e. one top level package to make the namespaces 
clear, with sub-modules/packages that offer particular functionalities. 
vlibxml2 would become 'libxml2'. Though perhaps this all promises *too* 
much API compatibility with the original libxml2/elementtree/etc for us 
to feel comfortable about?

> I'd actually like to rewrite a lot of the vlibxml2 code 
> now that I understand the idioms in libxml2 a little better.  I guess 
> you really do need to do it 87 times before you get it right.  :)

Sounds very familiar indeed. :) I rewrote parts of lxml a few times 
already, still haven't gotten it right.

> So - in the interest of playing nice with everyone - can I get checkin 
> privs to the lxml SVN repository?

Sure! I'll get Philipp to contact you about it; I'm cc'ing him here. Hi 
Philipp!

Regards,

Martijn
