Subject: Re: Help with copying a page without text 2007-10-08 - By Sarath Dorbala
Thank you for all your help.
The idea with the Vim plugin and editor worked with the PDFs created with iText; those are readable. However, I have a few PDF documents which do not show any human-readable text in Vim. When I tried PdfReader.getPageContent() on those documents, I could read the text much better. But can I just edit that content and set it on a blank page? I don't know if this is feasible.
Thank you, Sarath Dorbala.
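A likely reason some PDFs show no readable text in Vim is that their page content streams are stored compressed (commonly with the Flate filter), which is what the decompression trick mentioned in the thread works around. The sketch below illustrates this with Python's standard `zlib` module on a made-up content stream. The sample bytes and the helper name `looks_readable` are assumptions for illustration, not taken from the documents discussed here.

```python
import zlib

# A made-up page content stream: one rectangle (e.g. a table cell border)
# and one text object. Real streams are longer but use the same operators.
content = b"0 0 100 20 re S\nBT /F1 12 Tf 5 5 Td (cell test2) Tj ET\n"


def looks_readable(data: bytes) -> bool:
    """Hypothetical helper: True if every byte is printable ASCII or whitespace."""
    return all(32 <= b < 127 or b in (9, 10, 13) for b in data)


compressed = zlib.compress(content)     # roughly what a Flate-compressed stream holds
restored = zlib.decompress(compressed)  # what a text editor needs to see

print(looks_readable(compressed))
print(looks_readable(restored))
print(restored == content)
```

The compressed bytes are what a plain text editor shows when nothing decompresses the stream first, which would explain why only some files look readable.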
On 10/8/07, Leonard Rosenthol <leonardr@(protected)> wrote:
>
> Not being a vim user, I was unable to validate this.
>
> So that's quite a trick that it is using to decompress page content
> streams for editing...
>
> I stand corrected about it creating invalid PDFs - though I still think it
> sets a bad precedent ;).
>
> Leonard
>
> On Oct 7, 2007, at 8:32 PM, William A. Segraves wrote:
>
> Leonard, your advice is usually much better quality.
>
> I regret you did not try the procedure I outlined before you passed
> judgement on it. If you had tried the procedure, I expect you would have
> found it works just fine for the example that was used.
>
> Readers interested in the procedure outlined below, despite Leonard's
> advice, can consult Sid Steward's excellent Pdftk web site,
> www.accesspdf.com, for the inspiration for the below procedure.
>
> I hasten to add that Leonard is correct about editing a PDF damaging it.
> That's why the Pdftk plugin for Vim is used, i.e., to repair the damaged
> PDF. It's not a new idea, as it's been on Sid's site since 11Feb2005,
> http://www.accesspdf.com/index.php?page=2.
>
> Best regards,
> Bill Segraves
>
> ----- Original Message -----
> From: Leonard Rosenthol <leonardr@(protected)>
> To: Post all your questions about iText here <itext-questions@(protected).sourceforge.net>
> Sent: Sunday, October 07, 2007 6:58 PM
> Subject: Re: [iText-questions] Help with copying a page without text
>
> Note that doing so will create an invalid PDF :(.
>
> Please do NOT try to edit a PDF using text editors...
>
> Leonard
>
> On Oct 7, 2007, at 11:36 AM, wasegraves@(protected) wrote:
>
> > Not exactly. It's not possible, except when it's possible.
> >
> > Here's an approach that works for the MyFirstTable example in the
> > iText tutorial.
> >
> > 1. Download and install Vim (This author used Vim 6.3).
> >
> > 2. Download and install Pdftk (This author used Pdftk 1.12),
> > i.e., copy pdftk.exe into the Vim plugin directory.
> >
> > 3. Download and install the Pdftk plugin in the Vim plugins
> > directory, i.e., copy pdftk.vim into the Vim plugin directory.
> >
> > 4. Open MyFirstTable.pdf with Vim.
> >
> > 5. Find the text "cell test2" in the second cell in the fourth
> > row of the table and delete the text.
> >
> > 6. Save the PDF as MyFirstTable_revised.pdf.
> >
> > 7. Open MyFirstTable_revised.pdf with Reader and see that the
> > text "cell test2" is now gone.
> >
> > Note that this procedure uses iText indirectly, as Pdftk is based
> > on iText.
> >
> > Cheers,
> > Bill Segraves
> >
> > ----- Original message from Leonard Rosenthol
> > <leonardr@(protected)>: -----
> >
> > > No, it's not possible.
> > >
> > > Nor will you find any PDF library that will offer that due to the
> > > way that PDF is structured. See the other current discussion about
> > > "how to read text on a page"...
> > >
> > > Leonard
> > >
> > > On Oct 6, 2007, at 4:28 PM, Sarath Dorbala wrote:
> > >
> > > > Hello,
> > > >
> > > > I am pretty new to PDF and iText. I have a situation where I need
> > > > to copy a page to another (new) page with no text from the first
> > > > page. For example, if I have a table on page 1, I need to copy
> > > > that to a new page, but not the text in its cells. I don't know
> > > > if this is possible in iText.
> > > >
> > > > Thank you.
> > <snip>
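For the original question (keeping a table's lines but dropping its text), one possible approach on an already decompressed content stream is to remove the text objects, i.e. everything between the `BT` and `ET` operators. The Python sketch below is only an illustration under strong assumptions. It assumes the borders are drawn outside the text objects and that `BT`/`ET` never occur inside strings or inline image data, and the helper name `strip_text_objects` and the sample bytes are made up. It does not contradict the caution in the thread: many real PDFs will not be this simple.

```python
import re

# Matches a whole text object, from the BT operator to the next ET.
# Naive by design: assumes BT/ET never appear inside strings or inline images.
TEXT_OBJECT = re.compile(rb"\bBT\b.*?\bET\b\s*", re.DOTALL)


def strip_text_objects(content: bytes) -> bytes:
    """Hypothetical helper: drop BT ... ET blocks, keep path-drawing operators."""
    return TEXT_OBJECT.sub(b"", content)


# Made-up content stream: two cell borders with one text object between them.
page = (
    b"0 0 100 20 re S\n"
    b"BT /F1 12 Tf 5 5 Td (cell test2) Tj ET\n"
    b"0 20 100 20 re S\n"
)

stripped = strip_text_objects(page)
print(stripped.decode("ascii"))
```

The same idea would apply to the bytes obtained from a page's content, but whether the result is a usable page depends on how the producing application laid out the stream.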
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
iText-questions mailing list
iText-questions@(protected)
https://lists.sourceforge.net/lists/listinfo/itext-questions
Buy the iText book: http://itext.ugent.be/itext-in-action/