29 Jan 19:14
Many thanks to...
From: Gilles Lenfant <gilles.lenfant <at> gmail.com>
Subject: Many thanks to...
Newsgroups: gmane.comp.python.lxml.devel
Date: 2008-01-29 18:16:50 GMT
...the great team of lxml developers.

A few days ago I released openxmllib, a Python library that extracts text and metadata from OpenXML documents (MS Office 2007, Apple iWork, and some others) for full-text indexing. Perhaps more features will come in the future.

http://code.google.com/p/openxmllib/

I got headaches reading and understanding the OpenXML docs. Fortunately, lxml is so easy to work with and so fast... The words of a 60-page Word .docx document are now extracted in 0.2 seconds instead of 8 seconds on my MacBook, and I removed 60% of the code volume when I switched from the standard XML libs that come with Python 2.4.

lxml rocks and grooves

--
Gilles Lenfant
gilles.lenfant <at> gmail.com
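
For readers curious what this kind of extraction looks like, here is a minimal sketch. It is not openxmllib's actual code. It assumes a .docx file is a zip archive whose main text lives in `word/document.xml`, with text runs stored in `w:t` elements, and the function name `extract_words` is made up for illustration. It uses lxml when installed and falls back to the standard library's ElementTree, which offers the same calls used here.

```python
import zipfile

try:
    from lxml import etree  # the library praised in this post
except ImportError:
    # Standard-library fallback; fromstring() and iter() behave the same here
    import xml.etree.ElementTree as etree

# WordprocessingML namespace used by the main part of a .docx file
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_words(docx_file):
    """Return the words in the body of a .docx file (path or file object).

    Hypothetical helper for illustration, not part of openxmllib.
    """
    with zipfile.ZipFile(docx_file) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = etree.fromstring(xml_bytes)
    words = []
    # Each <w:p> is a paragraph; its <w:t> descendants hold the text runs.
    # Runs are joined per paragraph so a word split across runs stays whole.
    for paragraph in root.iter("{%s}p" % W_NS):
        text = "".join(t.text or "" for t in paragraph.iter("{%s}t" % W_NS))
        words.extend(text.split())
    return words
```

Calling `extract_words("report.docx")` (with a file name of your own) returns a plain list of words that can be handed to an indexer. Headers, footers, and metadata live in other parts of the archive and are not covered by this sketch.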