Hi Richard,
Sorry for the long delay. I had a chance to look at your problem. Indeed, this
is a problem in ElementScanner and your analysis is correct.
> To fix, I removed the if (this.activeRules.size() != 0) test that contained
> the startElement() call to XMLScanner, so that it always propagates the
> event to the SAXHandler.
Your proposed fix, always propagating the startElement events to SAXHandler,
is quite dangerous: it forces SAXHandler to build a full JDOM document from
the parser output, which is precisely what ElementScanner aims to avoid.
Thus, I think we should keep the "if (this.activeRules.size() != 0)" test so
that we can still extract a few nodes from huge documents while using as
little memory as possible.
Attached is another patch proposal: instead of using SAXHandler directly, it
relies on a subclass (FragmentHandler, borrowed from JDOMResult) that inserts
a dummy root element into SAXHandler's document.
This guarantees that, whatever your matching rules are, SAXHandler will always
see a single root element.
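
To illustrate the idea without depending on JDOM, here is a minimal,
self-contained sketch that uses only the JDK SAX API. All the names in it
(FragmentSketch, Node, TreeBuilder, FragmentBuilder, Scanner) are hypothetical
stand-ins of mine; it is not the code from the attached patch, and TreeBuilder
only imitates SAXHandler's single-root check.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class FragmentSketch {
    /** Minimal tree node; a hypothetical stand-in for a JDOM Element. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
        @Override public String toString() {
            return children.isEmpty() ? name : name + children;
        }
    }

    /** Imitates a document builder that accepts exactly one top-level element. */
    static class TreeBuilder {
        protected final Deque<Node> open = new ArrayDeque<>();
        protected Node root;

        void startElement(String name) throws SAXException {
            Node node = new Node(name);
            if (open.isEmpty()) {
                if (root != null) {
                    throw new SAXException("multiple root elements detected");
                }
                root = node;
            } else {
                open.peek().children.add(node);
            }
            open.push(node);
        }

        void endElement() { open.pop(); }
    }

    /** Pre-opens a dummy root so that every matched fragment becomes its child. */
    static final class FragmentBuilder extends TreeBuilder {
        FragmentBuilder() {
            root = new Node("root");
            open.push(root);
        }
    }

    /** Forwards SAX events to the builder only while inside a matched element. */
    static final class Scanner extends DefaultHandler {
        private final Set<String> rules;
        private final TreeBuilder builder;
        private final StringBuilder path = new StringBuilder();
        private int activeDepth = 0; // > 0 while inside a matched element

        Scanner(Set<String> rules, TreeBuilder builder) {
            this.rules = rules;
            this.builder = builder;
        }

        @Override
        public void startElement(String uri, String local, String qName,
                                 Attributes atts) throws SAXException {
            path.append('/').append(qName);
            if (activeDepth > 0 || rules.contains(path.toString())) {
                activeDepth++;
                builder.startElement(qName);
            }
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if (activeDepth > 0) {
                activeDepth--;
                builder.endElement();
            }
            path.setLength(path.length() - qName.length() - 1);
        }
    }

    /** Parses xml with the two listener paths from the report and prints the tree. */
    static String collect(String xml, TreeBuilder builder) throws Exception {
        Scanner scanner = new Scanner(Set.of("/blah/huh", "/blah/blam"), builder);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), scanner);
        return builder.root.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<blah><huh>1234</huh><blam><yay>woohoo</yay></blam>"
                + "<blam><yay>mwuhahaha</yay></blam><nah>5678</nah></blah>";
        System.out.println(collect(xml, new FragmentBuilder()));
        try {
            collect(xml, new TreeBuilder());
        } catch (SAXException e) {
            System.out.println("plain builder failed: " + e.getMessage());
        }
    }
}
```

With the dummy root in place, the three matched fragments (huh and the two
blam elements) end up as siblings under one parent, while the plain builder
rejects the second fragment as a second root. The unmatched elements (blah,
nah) are never forwarded, so the memory benefit of the activeRules test is
kept.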
What do you think?
Laurent
Richard Allen wrote:
> Hi All,
>
> With the following XML:
> <blah>
>   <huh>1234</huh>
>   <blam>
>     <yay>woohoo</yay>
>   </blam>
>   <blam>
>     <yay>mwuhahaha</yay>
>   </blam>
>   <nah>5678</nah>
> </blah>
>
> And listeners on the following:
> /blah/huh
> /blah/blam
>
> The /blah/huh element is processed sweet as..
> But when the /blah/blam element is being processed, the
> SAXHandler.startElement() throws the following exception:
>
> org.xml.sax.SAXException: Ill-formed XML document (multiple root elements
> detected)
>     at org.jdom.input.SAXHandler.getCurrentElement(SAXHandler.java:906)
>     at org.jdom.input.SAXHandler.startElement(SAXHandler.java:553)
>     at org.jdom.contrib.input.scanner.ElementScanner.startElement(ElementScanner.java:554)
>
> This is a bit weird, given that the //blam element isn't the root element
> ;-)
>
> The problem is that the XMLScanner is not being notified until after the
> first element that contains active rules has been found.
> This causes SAXHandler to think that the /blah/huh element is actually the
> root.
> When the ElementScanner notifies SAXHandler of the /blah/blam element it
> throws a hissy fit as it has already ended what it thinks is the root
> element ;-)
>
> To fix, I removed the if (this.activeRules.size() != 0) test that contained
> the startElement() call to XMLScanner, so that it always propagates the
> event to the SAXHandler.
>
> Comments appreciated as to whether this fix is the ideal fix, or if there
> is a better way to fix this problem.
> cheers,
> Rich