The Evolution of Typography: From Bitmap Fonts to Vector Outlines and PDF Subsetting
By Abdullah Taha
In the early days of personal computing, fonts were stored as fixed grids of pixels (bitmap fonts). Each grid was designed for a single size, so enlarging a character simply magnified its pixels and left it blocky and hard to read. In the mid-1980s, Adobe introduced Type 1 fonts alongside PostScript, and Apple, later joined by Microsoft, countered with TrueType (.ttf) in 1991, moving typography from static pixels to vector mathematics. In a vector font, each character (glyph) is described as an outline made of straight lines and Bézier curves (cubic in Type 1, quadratic in TrueType). Because the outline is independent of resolution, the renderer can scale it to any size without loss and rasterize it into pixels at the target size.
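To make the scaling idea concrete, here is a minimal sketch of how one quadratic Bézier segment of an outline could be evaluated at two different font sizes. The control points, the `units_per_em` value of 1000, and the helper names are assumptions chosen for illustration; a real rasterizer also handles hinting, fill rules, and anti-aliasing.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def quadratic_bezier(p0: Point, p1: Point, p2: Point, t: float) -> Point:
    """Evaluate a quadratic Bézier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    x = u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0]
    y = u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1]
    return (x, y)


def scale_points(points: List[Point], font_size: float,
                 units_per_em: int = 1000) -> List[Point]:
    """Map font-unit coordinates to output units for the requested size."""
    s = font_size / units_per_em
    return [(x * s, y * s) for x, y in points]


def flatten(p0: Point, p1: Point, p2: Point, steps: int = 8) -> List[Point]:
    """Approximate the curve with steps + 1 sample points."""
    return [quadratic_bezier(p0, p1, p2, i / steps) for i in range(steps + 1)]


# Hypothetical control points for one arch-shaped segment, in font units.
curve = [(0.0, 0.0), (500.0, 1000.0), (1000.0, 0.0)]

# The same three control points produce a smooth curve at any size.
small = flatten(*scale_points(curve, font_size=12))
large = flatten(*scale_points(curve, font_size=96))
```

The control points are stored once; only the scale factor changes between the 12-unit and 96-unit versions, which is why no detail is lost when the size changes.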
When a PDF document is generated, embedding entire font files (which can contain thousands of glyphs and take up megabytes of space) is highly inefficient. To solve this, PDF producers use a technique called font subsetting. A subsetting algorithm scans the document, collects the exact characters used, and generates a stripped-down version of the font containing only the outlines for those characters; if only 'A', 'e', and 't' appear, every other glyph is discarded. The subset is then embedded in the PDF as a font program stream, which the font dictionary (listed under `/Font` in the page resources) references through its font descriptor.
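The subsetting step can be sketched with a toy model in which a font is just a mapping from characters to outlines. This is an illustration under simplifying assumptions, not how any particular PDF producer works: real subsetters operate on glyph IDs and font tables, and the names `subset_font` and `used_characters` are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Simplified stand-in for a glyph outline: a list of points.
Outline = List[Tuple[float, float]]


def used_characters(text_runs: Iterable[str]) -> Set[str]:
    """Collect every distinct character that appears in the document text."""
    used: Set[str] = set()
    for run in text_runs:
        used.update(run)
    return used


def subset_font(glyphs: Dict[str, Outline],
                text_runs: Iterable[str]) -> Dict[str, Outline]:
    """Keep only the glyphs whose characters are actually used."""
    used = used_characters(text_runs)
    return {ch: outline for ch, outline in glyphs.items() if ch in used}


# Toy "full font" with 52 glyphs, each holding a placeholder outline.
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
full_font: Dict[str, Outline] = {ch: [(0.0, 0.0)] for ch in alphabet}

subset = subset_font(full_font, ["Ate", "tea"])
print(sorted(subset))  # ['A', 'a', 'e', 't']
```

In this toy example the embedded font shrinks from 52 glyphs to 4, and a later request for 'z' would find nothing to draw.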
Subsetting is crucial for keeping PDF files small, but it introduces a major challenge for editing. If you modify a PDF that uses a subsetted font and add a character that wasn't in the original text (like 'z'), the embedded font has no outline for that glyph, so the viewer shows a blank space or a broken symbol. In TellPDF, our client-side editor analyzes the embedded font dictionaries and automatically falls back to standard system fonts or downloads the required vector subsets as needed, so that edits render correctly without corrupting the file's layout.
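As a rough illustration of the check an editor needs before accepting new text, the sketch below compares the edited string against the characters available in the subset and picks a font accordingly. It is not TellPDF's actual implementation; the function names, the returned dictionary shape, and the `"Helvetica"` fallback are all assumptions made for the example.

```python
from typing import Dict, List, Set


def missing_glyphs(subset_chars: Set[str], new_text: str) -> List[str]:
    """Return the characters in new_text that the subset cannot draw,
    in first-seen order and without duplicates."""
    seen: Set[str] = set()
    missing: List[str] = []
    for ch in new_text:
        if ch not in subset_chars and ch not in seen:
            seen.add(ch)
            missing.append(ch)
    return missing


def plan_edit(subset_chars: Set[str], new_text: str,
              fallback_font: str = "Helvetica") -> Dict[str, object]:
    """Decide whether the embedded subset can render the edit.

    The fallback font name is a hypothetical default for this sketch.
    """
    missing = missing_glyphs(subset_chars, new_text)
    if not missing:
        return {"font": "embedded", "missing": []}
    return {"font": fallback_font, "missing": missing}


embedded = {"A", "e", "t"}
print(plan_edit(embedded, "tee"))   # every glyph is available in the subset
print(plan_edit(embedded, "zest"))  # 'z' and 's' are missing, so fall back
```

An editor could use the `missing` list either to switch the whole run to the fallback font or to request just those glyphs from a fuller copy of the original font.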