Java Mailing List Archive

[jdom-interest] Kana symbols and UTF-8? (was Re: Kana characters?)

Alan Deikman

2007-05-22

OK, now I'm a little confused. I guess this is an XML question and not really a JDOM question, but perhaps someone can explain it.

Angela Amoateng wrote:

This is the code in my XML document (by the way, romaji is romanised Japanese):

<?xml version="1.0" encoding="UTF-8"?>

<dictionary>
    <word>
        <noun>
            <english>book</english>
            <romaji>hon</romaji>
            <hiraganaSym>ほん</hiraganaSym>
            <hiraganaNum>&#x307B;&#x3093;</hiraganaNum>
        </noun>

Where I get lost is in the <hiraganaSym> tag. Those characters inside are not part of any 8-bit code (ASCII, UTF-8 or whatever). Java has no problem with it because all String objects are built on Unicode, but what does the encoding="UTF-8" mean in the header if these symbols can show up in the document?
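
One way to look at this question is the minimal Java sketch below. It assumes the two characters inside <hiraganaSym> are the same ones written as &#x307B;&#x3093; in <hiraganaNum>. UTF-8 is a variable-length encoding of Unicode built from 8-bit units, so encoding="UTF-8" in the header describes how each character is stored as bytes in the file and does not restrict the document to 256 symbols. The class name KanaUtf8Demo and the helper utf8Hex are illustrative names only and do not come from the original post.

```java
import java.nio.charset.StandardCharsets;

final class KanaUtf8Demo {

    // Returns the UTF-8 bytes of the given text as space-separated uppercase hex.
    static String utf8Hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative byte values print as two hex digits.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: these are the characters from <hiraganaNum> (U+307B, U+3093).
        String hon = "\u307B\u3093";

        System.out.println(hon.length());    // 2 (two UTF-16 code units in the String)
        System.out.println(utf8Hex(hon));    // E3 81 BB E3 82 93 (three bytes per character)
        System.out.println(utf8Hex("hon"));  // 68 6F 6E (ASCII characters take one byte each)
    }
}
```

Under this reading, each kana in the file occupies three bytes when saved as UTF-8, and an XML parser decodes those bytes back into the same Unicode characters that the numeric references in <hiraganaNum> name.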

-- 
Alan Deikman
ZNYX Networks