20 Sep 2009 17:18
Re: Load an XML document w/ internal DTD
Normen Müller wrote:
> Any ideas?

ConstructingParser is non-validating. My first idea was the code at the end of this mail, but that ran afoul of some broken SYSTEM id resolution logic:

  hars@st11:~/tmp$ scala parseFromURL
  java.io.FileNotFoundException: /home/hars/tmp/book.xml./dtd/docbook-4.4/docbookx.dtd (No such file or directory)

Then I tried to use this file instead:

  <?xml version="1.0"?>
  <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
    "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
  <!ENTITY foreword SYSTEM "foreword.xml" >
  ]>
  <book id="foo">
  &foreword;
  </book>

and ended with:

  http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd:81:1: markupdecl: unexpected character ']' #93
  ]]>
  ^
  http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd:81:2: markupdecl: unexpected character ']' #93
  ]]>
   ^
  http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd:81:3: markupdecl: unexpected character '>' #62
  ]]>
    ^
  http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd:113:30: ']' expected instead of '"'
  <!ENTITY euro SDATA "[euro ]"><!-- euro sign -->
                               ^
  java.lang.RuntimeException: FATAL

With the 4.1 DTD that error is gone, but the entity reference is turned into a comment instead of being expanded. The resulting output is:

  <book id="foo"><!-- foreword; --></book>

This bit from the MarkupHandler documentation may be relevant:

  Todo
    can we ignore more entity declarations (i.e. those with extIDs)?
    expanding entity references

- Florian

import scala.xml.parsing._
import scala.xml._
import java.io.File
import scala.io.Source

class ConstructingValidatingParser(val input: Source, val preserveWS: Boolean)
    extends ValidatingMarkupHandler with ExternalSources with MarkupParser {

  /* Copy & Paste, since ConstructingHandler is an abstract class, not a trait */
  def elem(pos: Int, pre: String, label: String, attrs: MetaData,
           pscope: NamespaceBinding, nodes: NodeSeq): NodeSeq =
    Elem(pre, label, attrs, pscope, nodes:_*)

  def procInstr(pos: Int, target: String, txt: String) = ProcInstr(target, txt)
  def comment(pos: Int, txt: String) = Comment(txt)
  def entityRef(pos: Int, n: String) = EntityRef(n)
  def text(pos: Int, txt: String) = Text(txt)
}

object parseFromURL {
  def main(args: Array[String]): Unit = {
    val src = scala.io.Source.fromURL("file:book.xml");
    val cpa = new ConstructingValidatingParser(src, false);
    cpa.nextch
    val doc = cpa.document();

    // let's see what it is
    val ppr = new scala.xml.PrettyPrinter(80,5);
    val ele = doc.docElem;
    Console.println("finished parsing");
    val out = ppr.format(ele);
    Console.println(out);
  }
}
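
For comparison, here is a minimal sketch of a different route: instead of scala.xml's MarkupParser, it hands the document to the JDK's own DOM parser (javax.xml.parsers), which expands entities declared in the internal DTD subset. This is not the approach from the code above, and several things in it are assumptions made for illustration: the object name InternalDtdSketch, the in-memory stand-in for foreword.xml, and the simplified DOCTYPE that omits the DocBook public identifier so that nothing is fetched over the network.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Document
import org.xml.sax.{EntityResolver, InputSource}

object InternalDtdSketch {
  // Hypothetical stand-in for the contents of foreword.xml, kept in memory
  // so that the sketch does not depend on files on disk.
  val forewordXml = "<preface><title>Foreword</title></preface>"

  // Simplified variant of the test document: internal DTD subset only,
  // no reference to the external DocBook DTD (an assumption of this sketch).
  val bookXml =
    """<?xml version="1.0"?>
      |<!DOCTYPE book [
      |  <!ENTITY foreword SYSTEM "foreword.xml">
      |]>
      |<book id="foo">&foreword;</book>""".stripMargin

  def parse(xml: String): Document = {
    val factory = DocumentBuilderFactory.newInstance()
    // Ask the parser to replace entity references by their content
    // (this is also the documented default).
    factory.setExpandEntityReferences(true)
    val builder = factory.newDocumentBuilder()
    // Serve the external entity from memory; returning null for any other
    // system id falls back to the parser's default resolution.
    builder.setEntityResolver(new EntityResolver {
      def resolveEntity(publicId: String, systemId: String): InputSource =
        if (systemId != null && systemId.endsWith("foreword.xml"))
          new InputSource(new StringReader(forewordXml))
        else null
    })
    builder.parse(new InputSource(new StringReader(xml)))
  }

  def main(args: Array[String]): Unit = {
    val doc = parse(bookXml)
    // Print the text of the <title> element pulled in through &foreword;
    println(doc.getElementsByTagName("title").item(0).getTextContent)
  }
}
```

The result is a org.w3c.dom.Document rather than a scala.xml.Node, so this only helps if a DOM tree is acceptable or is converted afterwards. Whether the same resolver trick copes with the full DocBook 4.4 DTD has not been checked here.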