Eliot Jones edited this page Jan 12, 2019 · 36 revisions

PdfPig Wiki

This wiki contains more detail on various aspects of the public API and the PDF document format.

Release Schedule

0.0.1 - Very Stable Genius

Released - first version.

0.0.2

Released - small fixes for TrueType fonts.

0.0.3 - Red Cake with Great Big Red Cherries

November 2018:

  • Reworks the public API of Letter to provide height information. See the Letters page on the wiki.
  • Adds support for Type 1 fonts that use Compact Font Format (CFF) font programs, including retrieving height information.
  • Bug fixes, stability improvements and performance improvements.
  • PdfDocument now has a Structure property. This is an UglyToad.PdfPig.Structure object which provides access to the tokenized content of the PDF file and the merged Cross Reference Table in the document. Any objects in the PDF file may be accessed by object reference number allowing consumers to work around missing functionality. All tokens used internally when interpreting PDF documents are available on the public API.
  • Page now has an IEnumerable<Word> GetWords() method, which uses a default word extractor to attempt to merge letters into words based on heuristics using letter positions. Consumers may pass their own IWordExtractor to the method to improve on the very basic approach used in this release, or continue using the raw letters.
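
A minimal sketch of reading words and raw letters from a page, based on the API described in the notes above. The file path is a placeholder, and the `Text` and `Value` property names follow the library's README; exact names and namespaces may differ between releases.

```csharp
using System;
using UglyToad.PdfPig;
using UglyToad.PdfPig.Content;

public static class WordsExample
{
    public static void Main()
    {
        // Placeholder path: point this at any PDF on disk.
        using (PdfDocument document = PdfDocument.Open(@"C:\path\to\file.pdf"))
        {
            // Pages are numbered from 1.
            Page page = document.GetPage(1);

            // Default extractor: letters merged into words using position heuristics.
            foreach (Word word in page.GetWords())
            {
                Console.WriteLine(word.Text);
            }

            // The raw letters remain available if the heuristics do not suit the document.
            foreach (Letter letter in page.Letters)
            {
                Console.WriteLine(letter.Value);
            }
        }
    }
}
```

Comparing the two outputs on a sample page is a quick way to judge whether the default extractor is good enough or whether a custom IWordExtractor is worth writing.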

0.0.5 - Cows In The South

December 2018:

  • Adds the ability to create new PDF documents. Supports custom fonts, text, geometry and colors.
  • Enables retrieval of annotation objects from within the PDF document.
  • Supports writing any tokens to streams using TokenWriter, so users can write their own document writer code if their use case is unsupported.
  • Supports correct retrieval of system fonts (fonts not included in the document but expected to be present on the user's operating system) for calculating letter sizes.
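
The document creation feature can be sketched roughly as follows, modelled on the library's README. The `using` directives are an assumption taken from more recent releases (types such as PdfPoint have moved between namespaces over time), and the output path and text position are placeholders.

```csharp
using System.IO;
using UglyToad.PdfPig.Content;                  // PageSize (assumed namespace)
using UglyToad.PdfPig.Core;                     // PdfPoint (assumed namespace)
using UglyToad.PdfPig.Fonts.Standard14Fonts;    // Standard14Font (assumed namespace)
using UglyToad.PdfPig.Writer;                   // PdfDocumentBuilder

public static class CreateDocumentExample
{
    public static void Main()
    {
        PdfDocumentBuilder builder = new PdfDocumentBuilder();

        // Add a single A4 page to the new document.
        PdfPageBuilder page = builder.AddPage(PageSize.A4);

        // Fonts are registered with the document builder before use.
        PdfDocumentBuilder.AddedFont font = builder.AddStandard14Font(Standard14Font.Helvetica);

        // Draw text at 12pt; the position is in PDF user space, measured from the bottom-left.
        page.AddText("Hello World!", 12, new PdfPoint(25, 700), font);

        // Build() returns the finished document as bytes.
        byte[] documentBytes = builder.Build();
        File.WriteAllBytes(@"C:\path\to\new.pdf", documentBytes);
    }
}
```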

0.0.6 - Cows In The North

TBC (~Feb 2019):

  • Many more stability fixes for TrueType, Type 1 and Compact Font Format fonts and other bugs with reading document content. Now correctly handles composite glyphs in a TrueType font.
  • Makes the entire content stream of the page public. This can be accessed using page.Operations, which is the ordered list of the operations forming the page's content stream. There are roughly 70 operators defined as of PDF 1.7; consult the specification for details on the behavior of each one.
  • Makes the content stream writable for the PdfPageBuilder. This is accessed using builder.Advanced.Operations supporting the creation of any content in the page's content stream.
  • Adds access to the document's AcroForm. These are forms with checkboxes, dropdowns and textboxes which support custom user input. Supports access to the individual fields of AcroForms and their values. Use PdfDocument.GetForm() to access the form object. This will be null if the document does not have a form; each document contains at most 1 form.
  • Adds support for .NET 4.5, .NET 4.5.1, .NET 4.5.2, .NET 4.6, .NET 4.6.1, .NET 4.6.2 and .NET 4.7 in addition to the current .NET Standard 2.0 support.
  • Fixes a bug where documents using Standard 14 fonts with a custom encoding couldn't be opened.
  • Fixes a bug where a document with a font dictionary that does not specify /Type /Font could not be opened.
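
A rough sketch of how the form and content stream access described above might be used together. `GetForm()` and `page.Operations` are the names given in these notes; the `Operator` property on each operation is an assumption about the operation interface, the file path is a placeholder, and later releases may expose these members differently.

```csharp
using System;
using UglyToad.PdfPig;

public static class ContentStreamExample
{
    public static void Main()
    {
        // Placeholder path: point this at any PDF on disk.
        using (var document = PdfDocument.Open(@"C:\path\to\file.pdf"))
        {
            // GetForm() returns null when the document has no AcroForm.
            var form = document.GetForm();
            Console.WriteLine(form == null
                ? "No AcroForm present."
                : "Document contains an AcroForm.");

            var page = document.GetPage(1);

            // Operations are listed in the order they appear in the content stream.
            foreach (var operation in page.Operations)
            {
                // Operator is assumed to hold the operator symbol, e.g. "Tj" or "re".
                Console.WriteLine(operation.Operator);
            }
        }
    }
}
```

Printing the operator symbols is only a starting point; the meaning of each symbol and its operands is defined in the PDF specification.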