Java Mailing List Archive

http://www.junlu.com/

Re: [jdom-interest] Facing problem reading comments data, need help

Tatu Saloranta

2007-07-06


Alternatively, perhaps the simplest way would be to let TagSoup and JDOM add the comment nodes, then post-process the document and remove them. Using XPath, that would be half a dozen lines of code or less.
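
A rough sketch of that post-processing step, for illustration only. It uses the JDK's built-in W3C DOM and javax.xml.xpath classes as stand-ins so that it runs without extra jars; it is not JDOM code. The class and method names (CommentStripper, stripComments, parse) are made up for this example. With JDOM the idea would be the same: select the comment nodes with the XPath expression //comment() and detach each one from its parent.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: JDK DOM stand-in for the JDOM document in the thread.
class CommentStripper {

    /** Removes every comment node from doc and returns how many were removed. */
    static int stripComments(Document doc) throws Exception {
        // "//comment()" selects all comment nodes anywhere in the document.
        NodeList comments = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//comment()", doc, XPathConstants.NODESET);
        // The node set is a snapshot, so removing nodes while iterating is safe.
        for (int i = 0; i < comments.getLength(); i++) {
            Node comment = comments.item(i);
            comment.getParentNode().removeChild(comment);
        }
        return comments.getLength();
    }

    /** Parses a well-formed XML string into a W3C DOM document. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }
}
```

Calling CommentStripper.stripComments(doc) on a parsed document leaves only elements and text behind, which is all a structural comparison needs.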

-+ Tatu +-

----- Original Message ----
From: Paul Libbrecht <paul@(protected)>
To: Robin Kwek <robin_rspvh@(protected)>
Cc: jdom-interest@(protected)
Sent: Friday, July 6, 2007 3:03:12 PM
Subject: Re: [jdom-interest] Facing problem reading comments data, need help

For the first approach, you could try the following:
- subclass SAXBuilder
- override createContentHandler
- have it return an instance of a class that extends org.jdom.input.SAXHandler
- in that class, override the comment method (or similarly named) so that
it does not pass the comment to the parent class
I agree it sounds convoluted, but it is fairly easy. If in doubt, you
can see such an extension at:
 http://klein.activemath.org/svn/activemath-svn/src/org/activemath/omdocjdom/OJSAXBuilder.java
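
The steps above boil down to swallowing the SAX comment event before it reaches the code that builds the tree. The sketch below shows that idea with JDK classes only, as a stand-in and not as JDOM code: a LexicalHandler wrapper drops comment() calls and forwards everything else to the JDK's own SAX-to-DOM handler. The names CommentSwallowingBuilder and NoCommentHandler are invented for this example. In JDOM, the equivalent override would sit in the SAXHandler subclass returned from createContentHandler; check the exact signatures against the JDOM javadoc.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;

// Hypothetical stand-in for a SAXBuilder subclass; uses only JDK classes.
class CommentSwallowingBuilder {

    /** Forwards every lexical event to the parent except comments. */
    static class NoCommentHandler implements LexicalHandler {
        private final LexicalHandler parent;
        NoCommentHandler(LexicalHandler parent) { this.parent = parent; }

        public void comment(char[] ch, int start, int length) {
            // Intentionally empty: the comment never reaches the tree builder.
        }
        public void startDTD(String n, String p, String s) throws SAXException { parent.startDTD(n, p, s); }
        public void endDTD() throws SAXException { parent.endDTD(); }
        public void startEntity(String n) throws SAXException { parent.startEntity(n); }
        public void endEntity(String n) throws SAXException { parent.endEntity(n); }
        public void startCDATA() throws SAXException { parent.startCDATA(); }
        public void endCDATA() throws SAXException { parent.endCDATA(); }
    }

    /** Parses xml into a DOM tree that contains no comment nodes. */
    static Document build(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        SAXTransformerFactory tf = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler treeBuilder = tf.newTransformerHandler(); // identity: SAX events in, DOM out
        treeBuilder.setResult(new DOMResult(doc));

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(treeBuilder);
        reader.setProperty("http://xml.org/sax/properties/lexical-handler",
                new NoCommentHandler(treeBuilder));
        reader.parse(new InputSource(new StringReader(xml)));
        return doc;
    }
}
```

Because the comment text is discarded at the event level, no comment object is ever constructed, so no check on comment data gets a chance to run.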

hope that helps

paul


On 6 Jul 2007, at 23:44, Robin Kwek wrote:

> Hi fellow members,
>
> I'm working on a program to analyze the structural similarity of web pages.
> The parser I have works with JDOM and has been able to
> read HTML files and convert them into their respective DOM tree structures.
>
> But some web pages use "<!---", and JDOM complains
> that the data is not legal for a JDOM comment ("Comment data
> cannot start with a hyphen"), throwing an IllegalDataException.
>
> I actually do not want comments to be read in, as I'm primarily
> concerned with the structure of the web page. I tried searching through
> the SAX features and properties, but I can't find a way to prevent the
> parser or JDOM from reading in comments.
>
> So I'm posting this to ask whether anyone has a way to do this.
> Another approach I'm considering is to turn off the verifier so that the
> illegal comments can be read in and then filtered out
> later, but I can't seem to find the method to turn it off. Does
> anyone know where it is in the javadoc?
>
> Thanks in advance.
>
>
> _______________________________________________
> To control your jdom-interest membership:
> http://www.jdom.org/mailman/options/jdom-interest/youraddr@(protected)


©2008 junlu.com - Jax Systems, LLC, U.S.A.