PDF made easy with iText 7
What’s new in iText and iTextSharp?
Benoit Lagae, Developer, iText Software
Bruno Lowagie, Chief Strategy Officer, iText Group
Why did we write iText?
• Specific problems that needed to be solved
  – Emancipate PDF from the desktop to the server
    • Solved in 1998 with a first PDF library
    • Deep knowledge of PDF required
  – Make PDF creation easier for developers
    • Solved in 2000 with the release of iText
    • Concept: PdfWriter and Document
    • Add high-level objects (e.g. paragraph, list, table)
History
• First release: 2000
• iText 1: 2003
• iText 2: 2007
• iText 5: 2009; upgrade to Java 5
• iText 7: 2016; upgrade to Java 7
iText is available for Java and .NET
Why iText 7?
iText 5 was approaching the limits of its architecture.
iText 7 overcomes these limits and enables further user-driven feature development and more efficient support.
• Complete revision of all classes and interfaces, based on experience with iText 5.
• Completely new layout module, which resolves some inconsistencies in iText 5 and enables generation of complex layouts.
• Complete rewrite of font support, enabling advanced typography.
iText 7: modular approach
Basic design principle
    OutputStream fos = new FileOutputStream(dest);
    PdfWriter writer = new PdfWriter(fos);
    PdfDocument pdf = new PdfDocument(writer);
    // PDF knowledge needed to add content
    pdf.close();

    OutputStream fos = new FileOutputStream(dest);
    PdfWriter writer = new PdfWriter(fos);
    PdfDocument pdf = new PdfDocument(writer);
    Document document = new Document(pdf);
    // No PDF knowledge needed to add content
    document.close();
iText’s basic building blocks: examples
Hello world: code
    OutputStream fos = new FileOutputStream(dest);
    PdfWriter writer = new PdfWriter(fos);
    PdfDocument pdf = new PdfDocument(writer);
    Document document = new Document(pdf);
    document.add(new Paragraph("Hello World!"));
    document.close();
Hello world: result
Hello world: the hard way
    FileOutputStream fos = new FileOutputStream(dest);
    PdfWriter writer = new PdfWriter(fos);
    PdfDocument pdf = new PdfDocument(writer);
    PageSize ps = PageSize.A4;
    PdfPage page = pdf.addNewPage(ps);
    PdfCanvas canvas = new PdfCanvas(page);
    canvas.beginText()
        .setFontAndSize(
            PdfFontFactory.createFont(FontConstants.HELVETICA), 12)
        .moveText(36, 790)
        .showText("Hello World!")
        .endText();
    pdf.close();
List example: code
    // Create a PdfFont
    PdfFont font = PdfFontFactory.createFont(FontConstants.TIMES_ROMAN);
    // Add a Paragraph
    document.add(new Paragraph("iText is:").setFont(font));
    // Create a List
    List list = new List()
        .setSymbolIndent(12)
        .setListSymbol("\u2022")
        .setFont(font);
    // Add ListItem objects
    list.add(new ListItem("Never gonna give you up"))
        .add(new ListItem("Never gonna let you down"))
        .add(new ListItem("Never gonna run around and desert you"))
        .add(new ListItem("Never gonna make you cry"))
        .add(new ListItem("Never gonna say goodbye"))
        .add(new ListItem("Never gonna tell a lie and hurt you"));
    // Add the list
    document.add(list);
List example: result
Image example
    Image fox = new Image(ImageFactory.getImage(FOX));
    Image dog = new Image(ImageFactory.getImage(DOG));
    Paragraph p = new Paragraph("Quick brown ").add(fox)
        .add(" jumps over the lazy ").add(dog);
    document.add(p);
New in iText 7: improved typography and support for Indic scripts
iText 5: missing links
Indic scripts:
• Only unsupported major script family
• Feature request #1
• Huge opportunity
  • Limited support in most other PDF libraries
Other features:
• Optional ligatures in Latin script
• Vowel diacritics in Arabic
Indic scripts: problems
• Lack of expertise
  • Unicode encodes 49 Indic scripts
  • Complex scripts with unique features
    • Glyph repositioning: ह + ि = हि
    • Glyph substitution: ம + ு = மு
    • Half-characters: त + ् + य = त्य
• Unsolvable issues for iText 5 font engine
  • No dedicated Unicode points for half-characters
  • No font lookups past ‘\uFFFF’
  • Ligaturization is context-dependent (virama)
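Two of the issues above can be seen with nothing but the Java standard library. The following is a minimal sketch, not iText code: the class name CodePointDemo and its helper are made up for illustration, and the code point U+11083 (from the Kaithi block) is picked only as an example of a character above U+FFFF.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointDemo {

    /** Lists the Unicode code points of a string as U+XXXX labels. */
    static List<String> codePoints(String text) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            result.add(String.format("U+%04X", cp));
            i += Character.charCount(cp); // 2 for code points above U+FFFF
        }
        return result;
    }

    public static void main(String[] args) {
        // The conjunct "tya" is encoded as ta + virama + ya. The half-form of
        // "ta" has no code point of its own; the font must supply that glyph.
        String tya = "\u0924\u094D\u092F";
        System.out.println(codePoints(tya));

        // A code point above U+FFFF does not fit in a single Java char:
        // it is stored as a surrogate pair, so length() reports 2.
        String supplementary = new String(Character.toChars(0x11083));
        System.out.println(supplementary.length());
        System.out.println(codePoints(supplementary));
    }
}
```

A font engine that indexes glyphs by a single char value cannot address the second case at all, and cannot express the first without substitution rules.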
Indic scripts: solutions
Writing a new font engine
• Automatic script recognition
  • Based on Unicode ranges
• Flexibility = extensibility
  • Generic Shaper class
  • Separate module, only called when necessary
• Glyph replacement rules
  • Different per writing system
  • Alternate glyphs are font-dependent
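Script recognition based on Unicode ranges can be illustrated with the JDK's own Character.UnicodeScript lookup. This is only a sketch of the general idea, assuming nothing about how iText's font engine does it internally; ScriptDetector and detect are hypothetical names.

```java
import java.lang.Character.UnicodeScript;
import java.util.LinkedHashSet;
import java.util.Set;

class ScriptDetector {

    /** Returns the scripts found in the text, in order of first appearance. */
    static Set<UnicodeScript> detect(String text) {
        Set<UnicodeScript> scripts = new LinkedHashSet<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            UnicodeScript script = UnicodeScript.of(cp);
            // Skip spaces, punctuation and marks that are shared across scripts.
            if (script != UnicodeScript.COMMON && script != UnicodeScript.INHERITED) {
                scripts.add(script);
            }
            i += Character.charCount(cp);
        }
        return scripts;
    }

    public static void main(String[] args) {
        // Devanagari and Tamil sample strings, plus mixed Latin and Arabic text.
        System.out.println(detect("\u0938\u093E\u0939\u093F\u0924\u094D\u092F\u0915\u093E\u0930"));
        System.out.println(detect("\u0B8E\u0BB4\u0BC1\u0BA4\u0BCD\u0BA4\u0BBE\u0BB3\u0BB0\u0BCD"));
        System.out.println(detect("writer \u0627\u0644\u0643\u0627\u062A\u0628"));
    }
}
```

Once the script of a run of text is known, a shaper can pick the glyph replacement rules for that writing system and skip the extra work for plain Latin text.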
Indic scripts: examples
    PdfFont font = PdfFontFactory.createFont(arial, PdfEncodings.IDENTITY_H, true);
    String txt = "\u0938\u093E\u0939\u093F\u0924\u094D\u092F\u0915\u093E\u0930"; // saahityakaar
    document.add(new Paragraph(txt).setFont(font));

    String txt = "\u0B8E\u0BB4\u0BC1\u0BA4\u0BCD\u0BA4\u0BBE\u0BB3\u0BB0\u0BCD"; // eluttaalar
    document.add(new Paragraph(txt).setFont(font));
Other scripts: examples
    PdfFont font = PdfFontFactory.createFont(arial, PdfEncodings.IDENTITY_H, true);
    String txt = "\u0627\u0644\u0643\u0627\u062A\u0628"; // al-katibu
    document.add(new Paragraph(txt).setFont(font));

    String txt = "writer";
    GlyphLine glyphLine = font.createGlyphLine(txt);
    Shaper.applyLigaFeature(foglihtenNo07, glyphLine, null);
    canvas.showText(glyphLine);
Status of advanced typography in iText 7
• Indic scripts
  • We already support:
    • Devanagari
    • Tamil
  • Coming soon:
    • Telugu
    • Others: based on customer demand
• Arabic
  • Support for vocalized Arabic (diacritics) is in development
• Latin
  • Optional ligatures are fully supported
Real-world use: Publishing a database
CSV example
Imagine a series of records
Parse CSV line by line
    OutputStream fos = new FileOutputStream(dest);
    PdfWriter writer = new PdfWriter(fos);
    PdfDocument pdf = new PdfDocument(writer);
    Document document = new Document(pdf, PageSize.A4.rotate());
    document.setMargins(20, 20, 20, 20);
    PdfFont font = PdfFontFactory.createFont(FontConstants.HELVETICA);
    PdfFont bold = PdfFontFactory.createFont(FontConstants.HELVETICA_BOLD);
    Table table = new Table(new float[]{4, 1, 3, 4, 3, 3, 3, 3, 1});
    table.setWidthPercent(100);
    BufferedReader br = new BufferedReader(new FileReader(DATA));
    // The first line holds the column headers
    String line = br.readLine();
    process(table, line, bold, true);
    while ((line = br.readLine()) != null) {
        process(table, line, font, false);
    }
    br.close();
    document.add(table);
    document.close();
Process each line
    public void process(Table table, String line, PdfFont font, boolean isHeader) {
        StringTokenizer tokenizer = new StringTokenizer(line, ";");
        while (tokenizer.hasMoreTokens()) {
            if (isHeader) {
                table.addHeaderCell(
                    new Cell().add(
                        new Paragraph(tokenizer.nextToken()).setFont(font)));
            } else {
                table.addCell(
                    new Cell().add(
                        new Paragraph(tokenizer.nextToken()).setFont(font)));
            }
        }
    }
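One thing to watch when adapting process() to other data: StringTokenizer treats consecutive delimiters as one, so an empty field silently disappears and the remaining cells shift one column to the left. The sketch below compares it with String.split using only the standard library; the class CsvFieldDemo and the sample record are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class CsvFieldDemo {

    /** Splits the way process() does: empty fields vanish. */
    static List<String> withTokenizer(String line) {
        List<String> fields = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line, ";");
        while (tokenizer.hasMoreTokens()) {
            fields.add(tokenizer.nextToken());
        }
        return fields;
    }

    /** Splits on every semicolon; the -1 limit keeps trailing empty fields. */
    static List<String> withSplit(String line) {
        List<String> fields = new ArrayList<>();
        for (String field : line.split(";", -1)) {
            fields.add(field);
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "Alice;;Brussels;"; // hypothetical record, two empty fields
        System.out.println(withTokenizer(line).size()); // 2
        System.out.println(withSplit(line).size());     // 4
    }
}
```

If every record in the data is guaranteed to have all fields filled in, the tokenizer version is fine as it stands.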
CSV: resulting report
Form filling
Form flattening
Example form
Look inside your PDF
Fill the form
    PdfReader reader = new PdfReader(src);
    PdfWriter writer = new PdfWriter(dest);
    PdfDocument pdf = new PdfDocument(reader, writer);
    PdfAcroForm form = PdfAcroForm.getAcroForm(pdf, true);
    Map<String, PdfFormField> fields = form.getFormFields();
    fields.get("name").setValue("James Bond");
    fields.get("language").setValue("English");
    fields.get("experience1").setValue("Off");
    fields.get("experience2").setValue("Yes");
    fields.get("experience3").setValue("Yes");
    fields.get("shift").setValue("Any");
    fields.get("info").setValue("I was 38 years old when I became an MI6 agent.");
    pdf.close();
Result after filling
Flatten the form
    PdfReader reader = new PdfReader(src);
    PdfWriter writer = new PdfWriter(dest);
    PdfDocument pdf = new PdfDocument(reader, writer);
    PdfAcroForm form = PdfAcroForm.getAcroForm(pdf, true);
    Map<String, PdfFormField> fields = form.getFormFields();
    fields.get("name").setValue("James Bond");
    fields.get("language").setValue("English");
    fields.get("experience1").setValue("Off");
    fields.get("experience2").setValue("Yes");
    fields.get("experience3").setValue("Yes");
    fields.get("shift").setValue("Any");
    fields.get("info").setValue("I was 38 years old when I became an MI6 agent.");
    // Replace the interactive fields with their static appearance
    form.flattenFields();
    pdf.close();
Result after flattening
Form flattening
Merging
United States: Example form
Flatten and merge
    PdfDocument destPdfDocument = new PdfDocument(new PdfWriter(dest));
    BufferedReader bufferedReader = new BufferedReader(new FileReader(DATA));
    String line;
    while ((line = bufferedReader.readLine()) != null) {
        // Fill and flatten one copy of the form in memory
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        PdfDocument sourcePdfDocument = new PdfDocument(
            new PdfReader(SRC), new PdfWriter(baos));
        PdfAcroForm form = PdfAcroForm.getAcroForm(sourcePdfDocument, true);
        StringTokenizer tokenizer = new StringTokenizer(line, ";");
        Map<String, PdfFormField> fields = form.getFormFields();
        fields.get("name").setValue(tokenizer.nextToken());
        form.flattenFields();
        sourcePdfDocument.close();
        // Reopen the flattened copy and append its pages to the result
        sourcePdfDocument = new PdfDocument(
            new PdfReader(new ByteArrayInputStream(baos.toByteArray())));
        sourcePdfDocument.copyPagesTo(1,
            sourcePdfDocument.getNumberOfPages(), destPdfDocument, null);
        sourcePdfDocument.close();
    }
    bufferedReader.close();
    destPdfDocument.close();
The result (and why we don’t like it)
Flatten and merge
    // Same code as before, except that smart mode is switched on
    PdfWriter writer = new PdfWriter(dest).setSmartMode(true);
    PdfDocument destPdfDocument = new PdfDocument(writer);
    BufferedReader bufferedReader = new BufferedReader(new FileReader(DATA));
    String line;
    while ((line = bufferedReader.readLine()) != null) {
        // Fill and flatten one copy of the form in memory
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        PdfDocument sourcePdfDocument = new PdfDocument(
            new PdfReader(SRC), new PdfWriter(baos));
        PdfAcroForm form = PdfAcroForm.getAcroForm(sourcePdfDocument, true);
        StringTokenizer tokenizer = new StringTokenizer(line, ";");
        Map<String, PdfFormField> fields = form.getFormFields();
        fields.get("name").setValue(tokenizer.nextToken());
        form.flattenFields();
        sourcePdfDocument.close();
        // Reopen the flattened copy and append its pages to the result
        sourcePdfDocument = new PdfDocument(
            new PdfReader(new ByteArrayInputStream(baos.toByteArray())));
        sourcePdfDocument.copyPagesTo(1,
            sourcePdfDocument.getNumberOfPages(), destPdfDocument, null);
        sourcePdfDocument.close();
    }
    bufferedReader.close();
    destPdfDocument.close();
The result (much better than before)
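The two merge listings differ only in setSmartMode(true). The slides do not spell out what changes; as far as the iText documentation describes it, smart mode lets the writer recognise resources it has already written (such as fonts) and reuse them, at the cost of some memory, which keeps the merged file smaller. The sketch below shows that general idea with plain Java collections. It is not iText's implementation, and ReuseSketch, add and storedCount are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReuseSketch {
    private final List<byte[]> stored = new ArrayList<>();
    private final Map<String, Integer> seen = new HashMap<>();

    /** Stores the content once and returns its index; repeats reuse the index. */
    int add(byte[] content) {
        // ISO-8859-1 maps each byte to exactly one char, so the key is lossless.
        String key = new String(content, StandardCharsets.ISO_8859_1);
        Integer existing = seen.get(key);
        if (existing != null) {
            return existing; // identical content was stored before: reuse it
        }
        stored.add(content.clone());
        seen.put(key, stored.size() - 1);
        return stored.size() - 1;
    }

    /** Number of distinct objects actually kept. */
    int storedCount() {
        return stored.size();
    }

    public static void main(String[] args) {
        ReuseSketch sketch = new ReuseSketch();
        byte[] font = "shared font data".getBytes(StandardCharsets.UTF_8);
        // Simulate merging 100 flattened forms that all embed the same font.
        for (int i = 0; i < 100; i++) {
            sketch.add(font.clone());
        }
        System.out.println(sketch.storedCount()); // 1
    }
}
```

Without the lookup, each of the 100 copies would be stored separately, which mirrors what happens when every flattened form brings its own copy of the same resources into the merged document.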