----- Original Message -----
Sent: Thursday, July 20, 2006 8:43 AM
Subject: Re: [iText-questions] Slightly OT: Does anyone have a way to determine the language or even codepage of a PDF?
At 10:40 PM 7/19/2006, Aaron J Weber wrote:
I basically am trying to filter PDFs to see if they're in a
non-Latin-based language (Japanese, Korean, Chinese, to name a few).
Thanks for any hints/tips/suggestions.
If I were trying to tackle this problem, I would simply find all fonts in the
document and examine their glyph sets and encodings to find any non-Roman
ones. Keep in mind that looking at the fonts only tells you about POSSIBLE
usage: a font can be present in the file without any of its CJK glyphs ever
being drawn. You'd need to examine the actual page contents to determine
REAL usage.
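
A rough sketch of those two checks, under stated assumptions. It assumes some
PDF library has already given you (a) the /Ordering strings from each CID
font's CIDSystemInfo dictionary and (b) the extracted text as Unicode; neither
step is shown. The helper names possible_languages and script_counts are
hypothetical, and the range table covers only a few Unicode blocks.

```python
# Hypothetical helpers; names and tables are illustrative assumptions.

# Adobe CID character collections, keyed by the /Ordering value.
CJK_ORDERINGS = {
    "Japan1": "Japanese",
    "GB1": "Simplified Chinese",
    "CNS1": "Traditional Chinese",
    "Korea1": "Korean",
}

# A few Unicode blocks (inclusive code point ranges), not an exhaustive list.
SCRIPT_RANGES = [
    (0x3040, 0x309F, "Hiragana"),
    (0x30A0, 0x30FF, "Katakana"),
    (0x4E00, 0x9FFF, "Han"),
    (0xAC00, 0xD7AF, "Hangul"),
]


def possible_languages(orderings):
    """POSSIBLE usage: languages implied by the fonts' character collections."""
    return {CJK_ORDERINGS[o] for o in orderings if o in CJK_ORDERINGS}


def script_counts(text):
    """REAL usage: count characters of the extracted text per CJK script."""
    counts = {}
    for ch in text:
        cp = ord(ch)
        for lo, hi, name in SCRIPT_RANGES:
            if lo <= cp <= hi:
                counts[name] = counts.get(name, 0) + 1
                break
    return counts
```

Two caveats: a font whose ordering is "Identity" tells you nothing about the
language, and Han characters are shared between Chinese and Japanese, so a
document with Han but no kana or Hangul is only probably Chinese.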
Leonard
---------------------------------------------------------------------------
PDF Sages, Inc.
215-938-7080 (voice)
215-938-0880 (fax)