TrueType subsets landed in SetaPDF
2018-01-31
The first release of 2018 is a big one. Over the last few months we put a great deal of work into a development branch named "font-subsetting". This branch has finally been merged back into master, bringing a fast and memory-efficient TrueType font subsetting engine written in pure PHP.
This engine allows you to use any kind of TrueType font in any SetaPDF component when dealing with text. This was already possible before, but only by embedding the whole font file; now the font program is automatically subset to the glyphs that are actually used. You, the developer, don't have to care about this at all: the whole process happens seamlessly and silently in the background.
With this engine you can use a wide range of languages and scripts without worrying about the resulting file size. In general, the engine lets you pick and use every character of the font program, but the internal rendering process is currently limited to left-to-right languages and scripts that need no further (pre-)processing, which excludes scripts such as Arabic or Hebrew.
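To make the idea of subsetting concrete, here is a minimal sketch of its first step: working out which glyphs a document actually needs. It is not SetaPDF's implementation. The function names `collectUsedCharacters()` and `selectGlyphIds()` and the character-to-glyph map are hypothetical and only illustrate the principle.

```php
<?php
/**
 * Collect the distinct characters used across several UTF-8 strings.
 * (Hypothetical helper, not part of SetaPDF.)
 *
 * @param string[] $texts
 * @return string[] unique characters, sorted by byte value
 */
function collectUsedCharacters(array $texts): array
{
    $used = [];
    foreach ($texts as $text) {
        // The "u" modifier makes PCRE split on UTF-8 characters, not bytes.
        $parts = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
        if ($parts === false) {
            throw new InvalidArgumentException('Text is not valid UTF-8.');
        }
        foreach ($parts as $char) {
            $used[$char] = true;
        }
    }

    // PHP turns numeric string keys such as "1" into integers, so cast back.
    $chars = array_map('strval', array_keys($used));
    sort($chars, SORT_STRING);

    return $chars;
}

/**
 * Map the used characters to the glyph IDs a subset has to keep.
 * (Hypothetical helper; $charToGlyph stands in for the font's "cmap" data.)
 *
 * @param string[] $chars
 * @param array<string,int> $charToGlyph
 * @return int[] sorted glyph IDs
 */
function selectGlyphIds(array $chars, array $charToGlyph): array
{
    // Glyph 0 (.notdef) is always kept in a TrueType font.
    $glyphIds = [0 => true];
    foreach ($chars as $char) {
        if (isset($charToGlyph[$char])) {
            $glyphIds[$charToGlyph[$char]] = true;
        }
    }

    $ids = array_keys($glyphIds);
    sort($ids);

    return $ids;
}

// Example with made-up glyph IDs: only three glyphs have to be kept.
$chars = collectUsedCharacters(['Hello', 'Hole']);
$glyphIds = selectGlyphIds(['H', 'e'], ['H' => 43, 'e' => 72, 'x' => 91]);
```

A real subsetter additionally has to rewrite the font tables so that they describe only the selected glyphs. That part is omitted here.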
Nearly all our online demos are now up to date and use TrueType fonts throughout (we use the free DejaVu font for demonstration purposes), so that text input is no longer limited to the default available encoding. The usage of TrueType fonts is documented here.
Besides this feature, this release comes with several bug fixes and tweaks, as you will see in the release notes below.
Check the release notes of the components below.
Log in to download the latest version of the related packages!
Release date: 2018-01-29
- Implemented re-calculation of font bounding box for Type0 fonts with a TrueType font program.
- Added SetaPDF_Core_Font_Type0_Subset class.
- Added ToUnicode creation class.
- Added SetaPDF_Core_Font_TrueType_Subset class.
- Increased the default number of bytes in which the component searches for the initial "startxref" keyword from 1024 to 5500.
- Introduced SetaPDF_Core_Font_FontInterface and updated all related type hints accordingly.
- Removed getGlyphsWidthByCharCodes() method and internally used property from all font classes.
- Ignore broken indirect object references when resolving terminal fields in the Fields array of the AcroForm dictionary.
- Fixed bug in TrueType "cmap" (segmented coverage/format 12) reading.
- Fixed bug in TrueType "name" table reading.
- Prevent warning if document metadata package is empty.
- Ensured encoding object type in simple font class.
- Handle reading of direct objects without a valid PDF value (throw an exception).
- Cache calculated font bounding boxes for TrueType and Type0 fonts.
- Jump to a more logical byte offset if parsing of a cross-reference table fails.
- Removed creation of name objects by string type from SetaPDF_Core_Type_AbstractType::writePdfString().
- Optimized SetaPDF_Core_Encoding::utf16BeToUnicodePoint().
- Ignore invalid range values in CMAP parser.
- Forward document instance in SetaPDF_Core_Type_Dictionary::_handlePdfStringCallback() calls.
- Optimized word (or rather, space) counting in the text graphic state class.
- Optimized exception messages in horizontal metrics table class of the TrueType parser.
- Ignore hybrid cross-references if /XRefStm points to an invalid byte offset.
- Optimized handling of corrupted documents.
- String reader uses PHP streams internally now.
- Refactored PNG file handling through PHP streams instead of strings to reduce memory usage.
- Fixed incorrect return type hints in doc blocks.
- Added support for reading of malformed XMP metadata packages.
- Added "Accept-Ranges: none" header to both HTTP writer classes.
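As an illustration of the "startxref" search window mentioned in the notes above, the following sketch shows how a parser might look for the keyword in only the last bytes of a file. It is not the component's actual code. The function name `findStartXrefOffset()` and its parameters are assumptions made for this example; only the keyword and the values 1024 and 5500 come from the release notes.

```php
<?php
/**
 * Find the byte offset stored after the last "startxref" keyword, looking
 * only at the final $searchLength bytes of the PDF data.
 * (Hypothetical helper, not part of SetaPDF; $searchLength must be positive.)
 *
 * @return int|null the cross-reference offset, or null if none was found
 */
function findStartXrefOffset(string $pdf, int $searchLength = 5500): ?int
{
    // A negative start makes substr() return the end of the string.
    $tail = substr($pdf, -$searchLength);

    // Use the last occurrence: it belongs to the most recent update.
    $pos = strrpos($tail, 'startxref');
    if ($pos === false) {
        return null;
    }

    // The keyword is followed by whitespace and a decimal byte offset.
    // \G anchors the match at the position passed as the offset argument.
    if (preg_match('/\Gstartxref\s+(\d+)/', $tail, $match, 0, $pos) !== 1) {
        return null;
    }

    return (int) $match[1];
}

// A file with 2000 bytes of trailing padding after the "%%EOF" marker.
$padded = "%PDF-1.4\nstartxref\n9\n%%EOF\n" . str_repeat("\n", 2000);

$small = findStartXrefOffset($padded, 1024); // keyword is outside the window
$large = findStartXrefOffset($padded, 5500); // keyword is inside the window
```

In this sketch, a larger window simply tolerates more trailing data after the keyword, at the cost of scanning a few more bytes.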