skipping a huge text node 2006-06-20 - By Tobias Thierer
Hi,
I am trying to parse a very large XML document, 99% of which consists of one huge text node:
<sequence>ACGGAAAT[...]</sequence>
which is too large to fit into memory. Instead of having the parser return the whole String, I'd like to get just the length of the text and its offset in the XML file, so that whenever I want to access part of the sequence, I can seek to the correct position and read only the substring I am interested in.
Is it somehow possible to tell jdom to consume the text node and report its offset in the file and its length, rather than storing it in memory?
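To make concrete what I mean by "offset and length", here is a rough sketch that bypasses the XML parser entirely and scans the raw bytes instead. It is not a JDOM feature: the class SequenceLocator and its methods locate() and read() are names I made up for illustration. It assumes a single-byte encoding such as ASCII, an open tag without attributes, and no entities, comments or CDATA sections inside the sequence text.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of JDOM): finds an element's text by byte offset. */
final class SequenceLocator {

    /**
     * Streams through the file once and returns {offset, length} in bytes of the
     * text between the first openTag and the following closeTag, or null if the
     * pair is not found. Only one byte at a time is held in memory.
     */
    static long[] locate(String path, String openTag, String closeTag) throws IOException {
        byte[] open = openTag.getBytes(StandardCharsets.US_ASCII);
        byte[] close = closeTag.getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            long pos = 0;        // number of bytes consumed so far
            long start = -1;     // offset of the first text byte, once known
            byte[] target = open;
            int matched = 0;     // how many bytes of target match so far
            int b;
            while ((b = in.read()) != -1) {
                pos++;
                if (b == target[matched]) {
                    matched++;
                } else {
                    // Simple restart is sufficient here because '<' occurs
                    // only as the first character of a tag pattern.
                    matched = (b == target[0]) ? 1 : 0;
                }
                if (matched == target.length) {
                    if (start < 0) {
                        start = pos;      // text begins right after the open tag
                        target = close;
                        matched = 0;
                    } else {
                        return new long[] { start, pos - close.length - start };
                    }
                }
            }
        }
        return null;
    }

    /** Reads count bytes starting at position from within the located text. */
    static String read(String path, long textOffset, long from, int count) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            byte[] buf = new byte[count];
            raf.seek(textOffset + from);
            raf.readFully(buf);
            return new String(buf, StandardCharsets.US_ASCII);
        }
    }
}
```

With something like this I could call locate(path, "<sequence>", "</sequence>") once, keep the two numbers, and later call read() for any slice. I would still prefer to get the same information from a real parser, since the byte scan above knows nothing about encodings or XML syntax.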
I've looked at jdom-contrib, which provides an ElementListener interface, but its elementMatched() method is only called *after* the element (including the close tag) has been fully read. Classes like SAXBuilder only seem to handle events that come from the parser, whereas what I want to do is change the events that the parser reports.
Is there any chance to do this with jdom(-contrib)? If not, do you know of any other XML parser with which I could do that?
Cheers,
Tobias