Java Mailing List Archive

[jdom-interest] Resolving Entities...when no DTD is assigned (no
 DOCTYPE declaration) in XML

Vish D.

2005-08-31

Hello all,

I am having trouble figuring out how to resolve entities when an XML file has no DOCTYPE declaration (no DTD attached to it) but contains 'non-standard' entities (such as &nbsp;). I need to do this without changing the XML file itself (that is, without adding a DOCTYPE declaration to it).

The code that runs into this is as follows:

SAXBuilder builder = new SAXBuilder();
....
fulltextXML = builder.build(new FileInputStream(filename));

-- fails with an exception ---

C:\HTMLs\00063185_200_1_67\00063185_200_1_67_Document.xml is not well-formed.
org.jdom.input.JDOMParseException: Error on line 5: The entity "nbsp" was referenced, but not declared.
Error on line 5: The entity "nbsp" was referenced, but not declared.


Is there a way to resolve such entities, without having to declare the DOCTYPE in the XML file?



Thanks in advance!

Vish


Sample XML file:

<?xml version="1.0" encoding="UTF-8"?>
<object_document>
    <art_title>        Muscular Alteration of Gill Geometry in vitro: Implications for Bivalve Pumping Processes -- Medler and Silverman 200 (1): 77 -- The Biological Bulletin</art_title>
    <converted_from type='HTML'>BiolBull V 200 I 1 P 77 Fulltext 00063185.htm</converted_from>
    <fulltext>&nbsp;Biol. Bull.  200: 77-86. (February 2001)&#169; 2001 Marine Biological LaboratoryMuscular Alteration of Gill Geometry in vitro: Implications for Bivalve Pumping ProcessesScott Medler* and Harold SilvermanLouisiana State University, Baton Rouge, Louisiana 70803* Author to whom correspondence should be addressed. Current address: Department of Biology, Colorado State University, Ft. Collins, CO 80523. E-mail: Skmedler{at}aol.com<!-- var u = "Skmedler", d = "aol.com"; document.getElementById("em0").innerHTML = "" + u + "@" + d + ""//-->
&nbsp;Received 23 March 2000; accepted 19 October 2000.
</fulltext>
    <jrnl_title>BiolBull</jrnl_title>
    <issn>00063185</issn>
    <volume>200</volume>
    <issue>1</issue>
    <fpage>77</fpage>
</object_document>
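
One possible workaround, sketched here under assumptions rather than offered as the established JDOM answer: leave the file on disk untouched, read its text into memory, and insert a DOCTYPE with an internal subset that declares the missing entities before handing the text to the parser. The class name EntityPatch and the helpers withDoctype and parse are hypothetical names made up for this illustration, and the list of declared entities (only nbsp here) is an assumption that would need extending for other entities. To stay self-contained the sketch uses the JDK's built-in DOM parser instead of JDOM.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EntityPatch {
    // Entity declarations for the internal DTD subset (assumption: only nbsp is needed).
    static final String ENTITY_DECLS = "<!ENTITY nbsp \"&#160;\">";

    /** Returns the XML text with a DOCTYPE inserted after the XML declaration, if one is present. */
    static String withDoctype(String xml, String rootName) {
        String doctype = "<!DOCTYPE " + rootName + " [" + ENTITY_DECLS + "]>";
        int insertAt = 0;
        int declEnd = xml.indexOf("?>");
        if (xml.startsWith("<?xml") && declEnd >= 0) {
            // The DOCTYPE must come after the XML declaration, not before it.
            insertAt = declEnd + 2;
        }
        return xml.substring(0, insertAt) + doctype + xml.substring(insertAt);
    }

    /** Parses the in-memory, patched copy of the XML text; the original source is never modified. */
    static Document parse(String xml, String rootName) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(withDoctype(xml, rootName))));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                   + "<object_document><fulltext>&nbsp;Received 23 March 2000.</fulltext></object_document>";
        Document doc = parse(xml, "object_document");
        String text = doc.getDocumentElement().getTextContent();
        // The entity reference has been resolved to U+00A0 (code point 160).
        System.out.println((int) text.charAt(0));
    }
}
```

With JDOM, the same patched string could presumably be passed to the builder as builder.build(new StringReader(patched)) in place of the FileInputStream, after reading the file's text into a string using the file's encoding. Note that the root element name given in the DOCTYPE (object_document in the sample) only matters to a validating parser.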





