Home
Eliot Jones edited this page Jan 12, 2019
This wiki contains more detail on various aspects of the public API and the PDF document format.
Released - first version.
Released - small fixes for TrueType fonts.
November 2018:
- Reworks the public API of Letter to provide height information. See the Letters page on the wiki.
- Adds support for Type 1 fonts that use the Compact Font Format, and for retrieving their height information.
- Bug fixes, stability improvements and performance improvements.
- PdfDocument now has a Structure property. This is an UglyToad.PdfPig.Structure object which provides access to the tokenized content of the PDF file and the merged Cross Reference Table in the document. Any objects in the PDF file may be accessed by object reference number, allowing consumers to work around missing functionality. All tokens used internally when interpreting PDF documents are available on the public API.
- Page now has an IEnumerable<Word> GetWords() method which uses a default word extractor to attempt merging letters into words based on heuristics using letter positions. Consumers may provide their own IWordExtractor to the method to improve on the very basic approach used in this release, or continue using the raw letters.
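
As a rough illustration of the letter and word access described above, the following sketch prints the raw letters and the heuristically merged words of the first page. It assumes the PdfPig NuGet package is referenced and that the path to a PDF file is passed as the first command-line argument; property names such as Letter.Value and Word.Text follow the library as commonly documented and may differ between releases.

```csharp
using System;
using UglyToad.PdfPig;
using UglyToad.PdfPig.Content;

public static class WordsExample
{
    public static void Main(string[] args)
    {
        // args[0] is assumed to be the path to an existing PDF file.
        using (PdfDocument document = PdfDocument.Open(args[0]))
        {
            // Page numbers are 1-indexed.
            Page page = document.GetPage(1);

            // Raw letters, in the order they were read from the content stream.
            foreach (Letter letter in page.Letters)
            {
                Console.WriteLine($"letter '{letter.Value}'");
            }

            // Words produced by the default word extractor.
            foreach (Word word in page.GetWords())
            {
                Console.WriteLine($"word '{word.Text}' at {word.BoundingBox}");
            }
        }
    }
}
```

If the default heuristics merge or split words incorrectly for a given document, the raw letters remain available as a fallback.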
December 2018:
- Adds the ability to create new PDF documents. Supports custom fonts, text, geometry and colors.
- Enables retrieval of annotation objects from within the PDF document.
- Supports writing any tokens to streams using TokenWriter, enabling users to create their own document writer code if their use cases are unsupported.
- Supports correct retrieval of system fonts (fonts not included in the document but expected to be present on the user's operating system) for calculating letter sizes.
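
A minimal sketch of document creation is shown below. It assumes the PdfPig NuGet package is referenced; the namespaces used here match recent package versions and may differ in the release described above. The output file name is a placeholder chosen for illustration.

```csharp
using System.IO;
using UglyToad.PdfPig.Content;
using UglyToad.PdfPig.Core;
using UglyToad.PdfPig.Fonts.Standard14Fonts;
using UglyToad.PdfPig.Writer;

public static class CreateExample
{
    public static void Main()
    {
        PdfDocumentBuilder builder = new PdfDocumentBuilder();

        // Add a single A4 page to the new document.
        PdfPageBuilder page = builder.AddPage(PageSize.A4);

        // Fonts are registered with the document builder before use.
        PdfDocumentBuilder.AddedFont font = builder.AddStandard14Font(Standard14Font.Helvetica);

        // Draw text at size 12 with its origin at (25, 700) in PDF page coordinates.
        page.AddText("Hello World!", 12, new PdfPoint(25, 700), font);

        // Build returns the bytes of the finished PDF file.
        byte[] documentBytes = builder.Build();

        // "hello.pdf" is a placeholder output path.
        File.WriteAllBytes("hello.pdf", documentBytes);
    }
}
```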
TBC (~Feb 2019):
- Many more stability fixes for TrueType, Type 1 and Compact Font Format fonts and other bugs with reading document content. Now correctly handles composite glyphs in a TrueType font.
- Makes the entire content stream of the page public. This can be accessed using page.Operations, which is the ordered list of the operations forming the page's content stream. There are roughly 70 operators defined as of PDF 1.7; consult the specification for details on the behavior of each one.
- Makes the content stream writable for the PdfPageBuilder. This is accessed using builder.Advanced.Operations, supporting the creation of any content in the page's content stream.
- Adds access to the document's AcroForm, a form with checkboxes, dropdowns and textboxes which supports custom user input. Supports access to the individual fields of AcroForms and their values. Use PdfDocument.GetForm() to access the form object. This will be null if the document does not have a form; each document contains at most one form.
- Adds support for .NET Framework 4.5, 4.5.1, 4.5.2, 4.6, 4.6.1, 4.6.2 and 4.7 in addition to the current .NET Standard 2.0 support.
- Fixes a bug where documents using Standard 14 fonts with a custom encoding couldn't be opened.
- Fixes a bug where a document with a font dictionary that does not specify /Type /Font could not be opened.
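
To illustrate the public content stream mentioned above, this sketch counts how often each operator appears on the first page. It assumes the PdfPig NuGet package is referenced, that the path to a PDF file is passed as the first command-line argument, and that each operation exposes its operator symbol through an Operator property; treat that property name as an assumption to verify against the installed version.

```csharp
using System;
using System.Collections.Generic;
using UglyToad.PdfPig;

public static class OperationsExample
{
    public static void Main(string[] args)
    {
        // args[0] is assumed to be the path to an existing PDF file.
        using (var document = PdfDocument.Open(args[0]))
        {
            var page = document.GetPage(1);

            // Tally operations by operator symbol (for example "Tj" or "re").
            var counts = new Dictionary<string, int>();
            foreach (var operation in page.Operations)
            {
                counts.TryGetValue(operation.Operator, out int seen);
                counts[operation.Operator] = seen + 1;
            }

            foreach (var pair in counts)
            {
                Console.WriteLine($"{pair.Key}: {pair.Value}");
            }
        }
    }
}
```

A tally like this can be a quick way to see which kinds of content (text, paths, images) a page relies on before writing code that interprets individual operators.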