Hi,
I have been using iTextSharp for a year now
in my company to generate PDF documents without any problems.
Now I would like to include in these
“home-made” PDF documents some comments that come from HTML pages.
After some investigation I found HTMLWorker, which seems to do this (HTMLParser is
not suitable for me, as it seems to produce an independent PDF file).
I tried it very simply:
Document document = new Document();
PdfWriter writer = PdfWriter.GetInstance(document,
    new FileStream(@"E:\test.pdf", FileMode.Create));
document.Open();
StyleSheet styles = new StyleSheet();
// _postIt.Contents is a string that contains my HTML
ArrayList list = HTMLWorker.ParseToList(
    new StringReader(_postIt.Contents), styles);
for (int k = 0; k < list.Count; ++k)
    document.Add((IElement)list[k]);
document.Close();
You can find the resulting PDF here: http://petoulachi.coldwire.net/datas/PDF/test.pdf
And the HTML page here: http://petoulachi.coldwire.net/datas/PDF/page.html
This works well, but I have a couple of small
problems:
- The HTML contains a <STYLE> tag in
the <HEAD> section, and unfortunately its text appears in the PDF (for
instance, if I have a <STYLE type=text/css>body{font-family: tahoma;
font-size=.8em}p{margin:0; padding:0}</STYLE> tag in the HTML, I
find the string ‘body{font-family: tahoma; font-size=.8em}p{margin:0;
padding:0}’ in my PDF document).
- The <DIV> tag seems to confuse HTMLWorker (see the PDF file linked above; I am not comfortable
explaining it in English).
Also, I have seen the StyleSheet class. Can I use it to ignore some tags in my HTML page (something
like a <DIV class="ignoreMe">)?
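To make the <STYLE> problem concrete, here is a minimal sketch of the kind of workaround I could fall back on: removing the <STYLE> blocks from the HTML string before it is handed to HTMLWorker. This is plain .NET string processing, not an iTextSharp feature, and the class and method names (HtmlCleanup, StripStyleBlocks) are made up for illustration. I am assuming a regular expression is good enough for my simple pages; it would not be a robust way to handle arbitrary HTML.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlCleanup
{
    // Hypothetical helper (not part of iTextSharp): removes every
    // <style>...</style> block so its CSS text never reaches the parser.
    public static string StripStyleBlocks(string html)
    {
        return Regex.Replace(
            html,
            @"<style\b[^>]*>.*?</style\s*>",
            string.Empty,
            // IgnoreCase matches <STYLE> as well as <style>;
            // Singleline lets ".*?" span line breaks inside the block.
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }

    public static void Main()
    {
        string html = "<html><head><STYLE type=text/css>p{margin:0}</STYLE></head>"
                    + "<body><p>Hello</p></body></html>";
        Console.WriteLine(StripStyleBlocks(html));
    }
}
```

With this, I would pass StripStyleBlocks(_postIt.Contents) to the StringReader instead of the raw string, but I would prefer a solution inside iTextSharp if one exists.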
Thanks for the help!