Verbose XHTML 1.1 Doctype

2004-03-24 - By David Dorward


I have a number of XHTML 1.1 documents, all conforming to the same
template, which I want to extract some data from and then insert that
data into different XHTML 1.1 documents.

As a first step I am trying to read in a document and then print it out
again without any modification. I've run into two issues:

1. It appears to be downloading the DTD from the W3C website, which
takes time and bandwidth.

2. It seems to be expanding the Doctype line (example below).

Is there any way to stop this? I'd like to leave the Doctype alone and
save time on reading the DTD (I don't care about validation - that is
handled elsewhere). I couldn't find anything looking at the docs, but I
suspect this is due to not knowing what to look for.

My code:

import org.jdom.*;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;
import org.jdom.output.XMLOutputter;
import java.io.IOException;

public class Parse {

  public static void main(String[] args) {

    String path = "/path/to/about.xhtml";
    SAXBuilder builder = new SAXBuilder();
    XMLOutputter outputter = new XMLOutputter();

    try {
      // build() throws JDOMException if the document is not well formed
      Document doc = builder.build(path);
      System.out.println(path + " is well formed.");
      try {
        // write the document back out unmodified
        outputter.output(doc, System.out);
      } catch (IOException e) {
        System.err.println(e);
      }
    } catch (JDOMException e) {
      // indicates a well-formedness or other error
      System.out.println(path + " is not well formed: " + e.getMessage());
    } catch (IOException e) {
      System.out.println("Could not check " + path);
      System.out.println(" because " + e.getMessage());
    }
  }
}



Examples:
For input of:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"><html
xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>About</title>
etc

It outputs:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd" [
 <!NOTATION w3c-xml PUBLIC "ISO 8879//NOTATION Extensible Markup
Language (XML) 1.0//EN">
 <!NOTATION cdata PUBLIC "-//W3C//NOTATION XML 1.0: CDATA//EN">
 <!NOTATION fpi PUBLIC "ISO 8879:1986//NOTATION Formal Public
Identifier//EN">
 <!NOTATION length PUBLIC "-//W3C//NOTATION XHTML Datatype:
Length//EN">
 <!NOTATION linkTypes PUBLIC "-//W3C//NOTATION XHTML Datatype:
LinkTypes//EN">
 <!NOTATION mediaDesc PUBLIC "-//W3C//NOTATION XHTML Datatype:
MediaDesc//EN">
 <!NOTATION multiLength PUBLIC "-//W3C//NOTATION XHTML Datatype:
MultiLength//EN">
 <!NOTATION number PUBLIC "-//W3C//NOTATION XHTML Datatype:
Number//EN">
 <!NOTATION pixels PUBLIC "-//W3C//NOTATION XHTML Datatype:
Pixels//EN">
 <!NOTATION script PUBLIC "-//W3C//NOTATION XHTML Datatype:
Script//EN">
 <!NOTATION text PUBLIC "-//W3C//NOTATION XHTML Datatype: Text//EN">
 <!NOTATION character PUBLIC "-//W3C//NOTATION XHTML Datatype:
Character//EN">
 <!NOTATION charset PUBLIC "-//W3C//NOTATION XHTML Datatype:
Charset//EN">
 <!NOTATION charsets PUBLIC "-//W3C//NOTATION XHTML Datatype:
Charsets//EN">
 <!NOTATION contentType PUBLIC "-//W3C//NOTATION XHTML Datatype:
ContentType//EN">
 <!NOTATION contentTypes PUBLIC "-//W3C//NOTATION XHTML Datatype:
ContentTypes//EN">
 <!NOTATION datetime PUBLIC "-//W3C//NOTATION XHTML Datatype:
Datetime//EN">
 <!NOTATION languageCode PUBLIC "-//W3C//NOTATION XHTML Datatype:
LanguageCode//EN">
 <!NOTATION uri PUBLIC "-//W3C//NOTATION XHTML Datatype: URI//EN">
 <!NOTATION uris PUBLIC "-//W3C//NOTATION XHTML Datatype: URIs//EN">
]>
<?doc type="doctype" role="title" { XHTML 1.1 } ?><html
xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" version="-//W3C//DTD
XHTML 1.1//EN">
<head profile="">
<title>About</title>

etc

--
David Dorward                                 <http://dorward.me.uk/>
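
A possible approach to the two issues in the post, offered as a sketch rather than a confirmed fix: install an EntityResolver that answers the request for the external DTD with an empty stream, so the parser never contacts the W3C site. The example uses only the JDK's SAX parser so that it runs without JDOM; the class name SkipDtdSketch and the helper rootElementName are made up for illustration. JDOM's SAXBuilder has a setEntityResolver method that takes the same interface, so the resolver should carry over unchanged.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SkipDtdSketch {

    // Answers every external entity request (including the XHTML DTD)
    // with an empty stream, so nothing is fetched over the network.
    static final EntityResolver EMPTY_DTD =
        (publicId, systemId) -> new InputSource(new StringReader(""));

    // Hypothetical helper: parses the XML text without loading the real
    // DTD and returns the local name of the root element.
    static String rootElementName(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setEntityResolver(EMPTY_DTD);

        final String[] root = new String[1];
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // remember only the first element seen, i.e. the root
                if (root[0] == null) {
                    root[0] = localName;
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return root[0];
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.1//EN\"\n"
            + "\"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd\">\n"
            + "<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\">"
            + "<head><title>About</title></head><body/></html>";
        System.out.println(rootElementName(xml));
    }
}
```

With JDOM, the equivalent would presumably be a call to builder.setEntityResolver(...) before builder.build(...). Because the parser then sees no declarations from the DTD, the NOTATION lines and the defaulted attributes such as version and profile should no longer appear in the output; this is an assumption and has not been checked against the JDOM version used in the post. One caveat: a document that uses named entities defined by the DTD (such as &nbsp;) will fail to parse against an empty DTD, in which case the resolver could return a local copy of the DTD instead.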