24 Jun 2012 11:09
Re: Support integration with other tree changing libxml2 based libraries
Dieter Maurer <dieter <at> handshake.de>
2012-06-24 09:09:19 GMT
Stefan Behnel <stefan_ml <at> behnel.de> writes:
> Dieter Maurer, 23.06.2012 12:02:
> ...
>> I propose that future `lxml` versions should include a public
>> `safe_release` function for such purposes.
>
> Maybe a new "removeNodeFromDocument()" API function could first check for
> proxies, and then either deallocate or fix up the tree to be stand-alone.
That would be ideal.
> ...
>> Another, but less serious problem: some `libxmlsec` functions
>> replace a node inside the tree (e.g. a node is replaced by an
>> `EncryptedData` node representing the node in an encrypted form).
>> It would be nice if I could "retarget" an `lxml` proxy referencing
>> the replaced node to point to the replacing node. This way,
>> `lxml` objects with references to the proxy would see the new
>> state rather than the confusing picture resulting from the proxy
>> now referring to an unlinked node.
> ...
>> Of course, the "retarget"ing is not trivial. It is not sufficient
>> to give the proxy a new "_c_node"; its class, too, might need to
>> be adapted. This would be possible as long as the two classes
>> had the same "C" layout for their objects. Is `lxml` supposed
>> to support proxy classes with differing "C" layout? (I expect "yes"
>> as the answer.)
>
> From the POV of lxml the proxy is just a reference to an object of type (or
> subtype of) _Element. The problem is that the user most likely holds
> another reference to it
This means that one cannot replace the proxy object with a new one,
but one could change the proxy object's content (e.g. set a new "_c_node",
set a new "__class__").
As I understand it, "lxml" ensures that there is at most one proxy
for any given "c_node" (by putting a proxy reference into the
"_private" of the "c_node"). Thereby, changing the proxy content
changes all "views" of the "lxml" application on the respective
"c_node".
> and there is no way we can exchange the object (or
> even its class) that that reference points to. These things are a lot less
> trivial at the C level than in Python (and even there they can have
> surprising side effects).
I am not sure that I understand your argument (though I fully
appreciate your reluctance to provide a public API).
In my case, I am not inside a complicated `lxml` context where
`lxml` code could hold direct references to internal attributes
of the proxy I want to retarget. The only such references
are in my binding function -- and of course, I must ensure that
they do not get confused.
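
The at-most-one-proxy behaviour mentioned above can be observed from
Python. The following is only a minimal sketch of the visible effect, not
of lxml's internal mechanism. The element names are made up for
illustration, and the fallback import exists only so that the snippet also
runs where lxml is not installed (with the standard library the check holds
trivially, because its elements are not proxies).

```python
try:
    from lxml import etree
except ImportError:  # fallback, only so that the sketch runs without lxml
    import xml.etree.ElementTree as etree

root = etree.fromstring("<a><b/></a>")

# Two different access paths to the same underlying node ...
first = root[0]
second = root.find("b")

# ... yield the very same Python object while a reference to it is alive.
same_proxy = first is second

# Consequently, a change made through one reference is seen through the other.
first.set("changed", "yes")
seen_elsewhere = second.get("changed")
```

This is why changing the content of the proxy, rather than replacing the
proxy, would reach every holder of a reference.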
>> For the moment, I will tell the user of my `libxmlsec` binding:
>> forget any `lxml` reference into an encrypted or decrypted document,
>> including a reference to its root tree and always rebuild
>> references from the operation's return value.
>
> Basically, what this means is that Elements that the user holds a reference
> to won't change during the transformation but may no longer be at their
> original place afterwards.
The worst behaviour I have observed:

    doc = parse(StringIO("<?...><!-- ... --><Envelope>...</Envelope>"))
    encrypt(..., doc.getroot())
    print tostring(doc)
    # output: <Envelope>...</Envelope>

That means that encrypting the root node of an "_ElementTree"
has stripped this tree of its processing instruction and its comment.
I understand why this happens but from a user perspective, it can
be really surprising.
> Perfectly reasonable if you ask me, because
> changing the tree is the whole point of doing that transformation. The same
> happens in XInclude, for example. Or even just when you change the tag name
> of an Element. None of those cases replaces the implementation of an
> Element that the user holds. After all, he or she could still need the
> original Element for some reason.
As the example above shows, the user sees neither the original nor
the new element.
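
The simpler case discussed above, where a held reference keeps pointing to
a node that has been replaced in the tree, can be sketched as follows.
This is an illustration under assumptions: the "EncryptedData" element is
created by hand as a stand-in for what an encryption routine would put
into the tree (it is not the `libxmlsec` API), the document content is made
up, and the fallback import exists only so that the snippet runs without
lxml.

```python
try:
    from lxml import etree
except ImportError:  # fallback, only so that the sketch runs without lxml
    import xml.etree.ElementTree as etree

root = etree.fromstring("<Envelope><Body>secret</Body></Envelope>")
body = root[0]  # the reference the user keeps

# Hand-made stand-in for the node an encryption routine would insert.
encrypted = etree.Element("EncryptedData")
root[0] = encrypted  # replace the node inside the tree

# The held reference still describes the old, now unlinked element ...
held_tag = body.tag
held_text = body.text

# ... while the tree itself contains only the replacing node.
tags_in_tree = [child.tag for child in root]
still_in_tree = any(child is body for child in root)
```

Here `body` keeps its tag and text but is no longer a child of `root`; to
see the new state, the reference has to be rebuilt from the tree.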
_________________________________________________________________
Mailing list for the lxml Python XML toolkit - http://lxml.de/
lxml <at> lxml.de
https://mailman-mail5.webfaction.com/listinfo/lxml