Java Mailing List Archive

http://www.junlu.com/

Re: [iText-questions] xml -> pdf or html

Steve Appling

2005-06-24

A few suggestions - YMMV:

My main use for iText has been in the context of a Web Application. We
chose to take our raw data, process it using a SAX parser, and render it
into a layout format as XHTML. We can then
present the XHTML document to the user in a browser for preview. The
temporary XHTML layout document is then parsed as XML (again using a SAX
parser) and converted into PDF using iText.
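
To make the first stage of that pipeline concrete, here is a minimal sketch of
a SAX handler that streams raw data into an XHTML table. The input format
(records / record / name / amount) and the class name RawDataToXhtml are
made up for illustration; they are not from our application. The second stage
(parsing the XHTML and feeding it to iText) is left out, since the exact iText
calls depend on the version you use.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/**
 * Sketch only: converts a hypothetical flat format
 * <records><record><name>..</name><amount>..</amount></record></records>
 * into an XHTML table fragment while the SAX parser streams the input.
 */
final class RawDataToXhtml extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // discard whitespace seen between elements
        if (qName.equals("records")) {
            out.append("<table>");
        } else if (qName.equals("record")) {
            out.append("<tr>");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one text node in several chunks, so accumulate.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        switch (qName) {
            case "records": out.append("</table>"); break;
            case "record":  out.append("</tr>"); break;
            default: // any other element is treated as a field and becomes a cell
                out.append("<td>").append(escape(text.toString().trim())).append("</td>");
        }
        text.setLength(0);
    }

    /** Re-escapes the characters that are significant in XHTML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Parses the given XML string and returns the generated XHTML fragment. */
    static String convert(String xml) {
        try {
            RawDataToXhtml handler = new RawDataToXhtml();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }
}
```

Calling RawDataToXhtml.convert(...) on a small document yields one table row
per record. Because the handler never holds the whole document in memory, the
same approach keeps working as the input grows.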

We use XSL extensively in our application, but chose not to in this
case. A few cautions. XSL transform based solutions don't scale well
to very large documents. For simple tasks, XSL can be very
straightforward to write and maintain, but very complicated processing
tasks are perhaps better handled in a more traditional programming
language. If your input is XML and your output is some form of text,
then XSL can probably be made to do it, but it's not always the best
solution.
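
For comparison, this is roughly what the XSL route looks like for a simple
task, using the javax.xml.transform API that ships with the JDK. The
stylesheet, the input format, and the class name XslSketch are invented for
this sketch. Note that XSLT 1.0 processors generally hold the whole input tree
in memory, which is one reason this approach gets harder with very large
documents.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch only: turns hypothetical <records>/<record>/<name> data into a list. */
final class XslSketch {
    // Hypothetical stylesheet: one template per element of interest.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/records'>"
        + "<ul><xsl:apply-templates select='record'/></ul>"
        + "</xsl:template>"
        + "<xsl:template match='record'>"
        + "<li><xsl:value-of select='name'/></li>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies STYLESHEET to the given XML string and returns the result as text. */
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transform failed", e);
        }
    }
}
```

For a mapping this direct the stylesheet is shorter than the equivalent
handler code. Once the output depends on accumulated state or on layout
decisions, the balance tends to shift the other way.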

When presenting information in both PDF and HTML, be aware of some differences
of intention in the two formats. PDF is intended to accurately describe
a page oriented layout. It expects all decisions about what fits in a
particular space to have already been made. HTML is intentionally much
more forgiving. An HTML viewer (browser) is expected to handle all of
the reflow issues encountered when presenting HTML to different sized
containers. In PDF, these layout decisions need to be made at the time
the PDF is generated, not when it is viewed. If you are trying to have
HTML and PDF that look exactly alike, then you need to be aware of
this. For example, by default, columns in HTML tables will expand if
you add block content that won't fit (even if you have set a
fixed column size). Fonts of the same point size may not take up the
same amount of space in your PDF and HTML versions. Defaults for padding
and spacing in tables will differ, and the way cell borders are
drawn also differed (until we added some features to iText).

We also had to deal with the fact that PDF lays out content one
page at a time. You don't want to just clip tables at arbitrary points
when you encounter a page boundary, so you need to be concerned about
whether a logical chunk of information will fit in a table and how to
draw borders for the top and bottom of continued pieces of big tables.
In HTML you don't have to (can't) worry about page boundaries.
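
The page-boundary decision can be sketched independently of any PDF library.
The code below is an illustration under assumptions, not what we actually
shipped: it takes the measured height of each logical chunk of rows (how you
measure them is up to your layout code) and assigns chunks to pages so that no
chunk is split. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: greedy assignment of unsplittable table chunks to pages. */
final class TablePaginator {
    /**
     * @param pageHeight   usable height of one page, in points
     * @param chunkHeights measured height of each logical chunk, in points
     * @return for each page, the indexes of the chunks placed on it
     */
    static List<List<Integer>> paginate(float pageHeight, float[] chunkHeights) {
        List<List<Integer>> pages = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        float used = 0f;
        for (int i = 0; i < chunkHeights.length; i++) {
            float h = chunkHeights[i];
            // Start a new page when this chunk would cross the page boundary.
            // A chunk taller than a whole page still gets a page to itself;
            // real code would have to split it or shrink it.
            if (!current.isEmpty() && used + h > pageHeight) {
                pages.add(current);
                current = new ArrayList<>();
                used = 0f;
            }
            current.add(i);
            used += h;
        }
        if (!current.isEmpty()) {
            pages.add(current);
        }
        return pages;
    }
}
```

Each page break in the result is also the place where you would close the
table with a bottom border and reopen it with a top border (and possibly a
repeated header) on the next page.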

It can be done, though. I think we ended up with results that look
very similar (if not exactly identical) in HTML and PDF.

Good luck


Doug James wrote:

> * Request for anyone's comments / suggestions / concerns. *
>
> There is a chance I could be asked to take a home grown XML document
> and create either html or pdf from it. Any pitfalls to watch out for
> or any suggestions on the best way to do the request?
>
> Thanks for any and all comments!
>
> Doug James



_______________________________________________
iText-questions mailing list
iText-questions@(protected)
https://lists.sourceforge.net/lists/listinfo/itext-questions