DivX Plus MKV extension: World Fonts
One of the benefits of authoring content in MKV is the ability to include multiple subtitle tracks so viewers in different geographies can enjoy your work. Multilingual subtitle tracks raise some uncertainties, however: will viewers have suitable fonts installed to view the content, will the subtitles appear stylistically correct on every system, and what happens when someone plays the file on a device that cannot display all the languages you've included?
DivX Plus offers an extension to the MKV container called World Fonts that solves these problems by authoring appropriate TrueType fonts into the file in a manner such that all players can reliably load them and display subtitles for every language. A special optimization technique is used to significantly reduce the size of the font data so fonts can be loaded not only by desktop players but also by DivX Plus Web Player and hardware devices with limited memory.
See it in action
We encoded Blender Foundation's Sintel trailer with four subtitle languages, including complex East Asian languages like Chinese, each version with and without the World Fonts extension. The subtitles that include World Fonts will be displayed correctly even if your local system or player does not support them. Note that DivX Web Player 2.0.2 contains preliminary support for this feature, which will improve in future releases.
The available subtitles are:
- French with World Fonts
- Chinese (Not yet natively supported)
- Chinese with World Fonts
- Japanese (Not yet natively supported)
- Japanese with World Fonts
- Russian with World Fonts
Select a subtitle by right-clicking the video as it plays. Notice that even though Chinese and Japanese are not yet natively supported by DivX Plus Web Player, they display correctly when World Fonts are applied.
You can also download this file and play it in the DivX Plus Player.
How it works
The World Fonts extension allows the content author to associate a specific TrueType font with each individual subtitle track. The font data is then optimized by scanning each subtitle track, identifying all of the characters and symbols that need to be displayed, and removing anything unnecessary. Each font is then embedded in the .mkv file and associated with the subtitle track for which it has been optimized. The reduced data size means that World Fonts can be used with DivX Plus Web Player without causing lengthy buffering, and by devices with limited memory.
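The scanning step described above can be illustrated with a minimal Python sketch. It is not the DivXMKVMux implementation: the function name `required_characters` and the simple SRT parsing rules are assumptions made for illustration, and the sketch only finds the characters a subtitle track uses. Removing the unused glyphs from the font itself is a separate step that is not shown.

```python
import re

# Matches SRT timing lines such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2}:\d{2}:\d{2},\d{3}")


def required_characters(srt_text):
    """Return the set of characters used by the subtitle text of an SRT document.

    Hypothetical helper: cue counters and timing lines are skipped, so a
    subtitle line made up only of digits would also be skipped.
    """
    chars = set()
    for line in srt_text.splitlines():
        stripped = line.strip()
        # Skip blank lines, cue counters and timing lines.
        if not stripped or stripped.isdigit() or TIMING.match(stripped):
            continue
        chars.update(stripped)
    return chars


sample = (
    "1\n00:00:01,000 --> 00:00:02,000\nHello\n\n"
    "2\n00:00:02,500 --> 00:00:04,000\nHola\n"
)
print(sorted(required_characters(sample)))
```

Every character in the resulting set needs a glyph in the optimized font; anything else in the original font is a candidate for removal.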
How effective is optimization?
Subtitles in most Western European languages that use a Latin alphabet (e.g. English, French, German, Italian and Spanish) typically require between 60 and 110 glyphs after optimization. One font often used by desktop software to render subtitles is Arial, which by comparison contains well over 1600 glyphs. Optimizing Arial for a Western European language subtitle typically yields data savings of 86-92%.
Asian languages use a far larger number of glyphs, and fonts supporting these languages can be massive. For example, Arphic's "PL KaitiM" for Traditional Chinese is nearly 10MB in size, with over 14,000 glyphs. Fonts this large simply can't be loaded by many devices. However, only a fraction of these glyphs are necessary to display subtitle tracks for languages like Chinese and Japanese.
Complete Unicode fonts, i.e. those that support all languages, are larger still. For example, the well-known Arial Unicode contains more than 50,000 glyphs comprising over 22MB of data. In these cases optimization for many languages will yield savings of more than 99%.
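As a rough check on the scale involved, the glyph counts quoted above can be turned into a reduction percentage. This sketch counts glyphs only. The data savings quoted in this article are measured in bytes and may differ, since glyphs vary in size and a font file also carries tables that do not shrink in proportion to the glyph count. The function name and the round figure of 1600 glyphs for Arial are assumptions for illustration.

```python
def glyph_reduction_percent(original_glyphs, remaining_glyphs):
    """Percentage of glyphs removed when a font is cut down to remaining_glyphs."""
    if original_glyphs <= 0 or not 0 <= remaining_glyphs <= original_glyphs:
        raise ValueError("glyph counts out of range")
    return 100.0 * (1 - remaining_glyphs / original_glyphs)


# Arial (taken here as roughly 1600 glyphs) reduced to 60 or 110 glyphs.
print(glyph_reduction_percent(1600, 60))
print(glyph_reduction_percent(1600, 110))
```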
Adding World Fonts to a .mkv file
To add World Fonts to an .mkv file you need a utility to perform the font optimization and a file writer that fully implements the requirements of the extension. DivX provides a reference implementation in the DivXMKVMux sample tool. Begin by downloading and installing the DivXMKVMux package.
For easy access to DivXMKVMux, launch a command console via the shortcut provided in the DivX Plus programs group on your Start menu. The console starts in the installation directory, and that directory is automatically added to the PATH environment variable for the console session, so you can type "DivXMKVMux" from any other directory.
You can access the full list of arguments for DivXMKVMux using:
The help output is lengthy because DivXMKVMux is fairly flexible, but the required syntax is actually quite straightforward. There are several different ways DivXMKVMux can be used to add an optimized font to an .mkv file. If you haven't already added subtitles to your .mkv file it's easy to add the tracks and optimized fonts in one straightforward operation. Use -r to indicate you're remuxing an existing .mkv file and then specify subtitle files and associated font files with -s:
Specifying names for the subtitle tracks is optional, but you must specify the correct language code or the tracks may not be listed correctly in some players. You can look up valid alpha-3 language codes in the ISO 639-2 code list. Remember to quote ("") any filenames containing spaces.
If you are starting with a .mkv file that already contains subtitle tracks, first list the existing tracks so that you can see the track numbers as interpreted by DivXMKVMux:
Vid: 1 video stream(s)
0 - master video track
Aud: 1 audio streams(s)
0 - eng
Sub: 3 subtitle stream(s)
0 - eng
1 - fre
2 - ger
Then use -r to remux the file, specifying an appropriate font for each subtitle track with -f:
For each subtitle track in the input .mkv file DivXMKVMux will optimize the specified font and then add it to the output .mkv file.
It is strongly recommended to pass only UTF-8 encoded .srt files to DivXMKVMux, especially for non-Latin languages, because this avoids the risk of other encodings being converted to UTF-8 incorrectly. If you pass .srt files that don't have a Unicode BOM, they are assumed to be ANSI, and the active system codepage is used to convert them to UTF-8 unless you explicitly override the codepage number using the -s codepage: option. Generally, if the SRT does not display correctly when opened in Notepad, the file is probably not UTF-8 and the active codepage is unsuitable for the conversion.
You can easily convert the encoding of any input .srt file to UTF-8 using Notepad++, a free text editor, by performing these steps:
- Open your .srt file.
- Check if the highlighted option in the "Encoding" menu is "Encode in UTF-8". If it is then the file is already UTF-8.
- From the Encoding menu choose an appropriate character set, verify the text is displayed correctly, then choose "Convert To UTF-8".
- Save your .srt file.
You can also avoid this process by creating your .srt files with UTF-8 encoding originally.
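The conversion can also be scripted. The following Python sketch is one possible approach, not part of the DivX tools: it detects a Unicode BOM, falls back to a caller-supplied codepage when none is present, and writes the result as UTF-8 with a BOM, since files without a BOM are assumed to be ANSI as noted above. The function name `to_utf8_with_bom` and the default fallback of cp1252 are assumptions; choose the codepage that matches your source file.

```python
import codecs

# UTF-32 must be checked before UTF-16 because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def to_utf8_with_bom(raw, fallback_encoding="cp1252"):
    """Return subtitle bytes re-encoded as UTF-8 with a leading BOM."""
    for bom, encoding in BOMS:
        if raw.startswith(bom):
            text = raw[len(bom):].decode(encoding)
            break
    else:
        # No BOM found: treat the file as a legacy codepage (an assumption).
        text = raw.decode(fallback_encoding)
    return codecs.BOM_UTF8 + text.encode("utf-8")


legacy = "Café".encode("cp1252")
converted = to_utf8_with_bom(legacy)
print(converted.decode("utf-8-sig"))
```

A wrong fallback codepage will not usually raise an error; it silently produces the wrong characters, so check the output in a text editor before muxing.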
Support for World Fonts
Initial support for the World Fonts extension is included in DivX Plus Player and DivX Plus Web Player, with this known issue:
- In DivX Plus Player 8.1 an attached font may not be loaded if its face style differs from the style set in the player preferences. For example, this can occur when your preferences specify an italic face but the attached font has a regular face.
World Fonts are supported in all DivX Plus HD devices.
Let us know what you think
Please send us your comments and questions on this feature via the DivX Plus (.mkv) forum (requires sign-in).
This section of the article describes the technical implementation of the World Fonts feature and assumes the reader is familiar with MKV EBML structure. Less technical readers may wish to skip this section.
The following diagrams illustrate how World Fonts are represented in DivX Plus HD .mkv files. At a high level:
- The extension uses various pre-existing MKV EBML element IDs.
- Some of these elements must take on specific values. Consult the syntax table below for further detail.
- Fonts are stored as attachments, and must be explicitly associated with the specific subtitle TrackEntry for which they have been optimized by use of an AttachmentLink element in the TrackEntry that references the Attachment FileUID. Only once this reference is successfully resolved is a relationship established.
Additionally, the following restrictions apply to fonts:
- The font must be a TrueType (.ttf) font type.
- The font must contain a table with a Microsoft Unicode character map ("cmap" table entry with Platform ID=3, Encoding ID=1).
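The second restriction can be checked before muxing. The sketch below reads the table directory of a TrueType file and lists the (platform ID, encoding ID) pairs in its "cmap" table. In the OpenType specification the Microsoft Unicode BMP character map is platform ID 3 with encoding ID 1. The function names are hypothetical, and the code performs only minimal validation.

```python
import struct


def cmap_encodings(font):
    """Return the (platform_id, encoding_id) pairs found in a font's cmap table."""
    sfnt_version, num_tables = struct.unpack_from(">IH", font, 0)
    # 0x00010000 and 'true' both identify TrueType outlines.
    if sfnt_version not in (0x00010000, 0x74727565):
        raise ValueError("not a TrueType font")
    for i in range(num_tables):
        # Each table record is 16 bytes: tag, checksum, offset, length.
        tag, _checksum, offset, _length = struct.unpack_from(">4sIII", font, 12 + 16 * i)
        if tag == b"cmap":
            _version, count = struct.unpack_from(">HH", font, offset)
            # Each encoding record is 8 bytes; only the first two fields are needed.
            return [
                struct.unpack_from(">HH", font, offset + 4 + 8 * j)
                for j in range(count)
            ]
    return []


def has_microsoft_unicode_cmap(font):
    """True if the font has a cmap subtable for platform 3, encoding 1."""
    return (3, 1) in cmap_encodings(font)
```

To check a real font, read the file in binary mode and pass the bytes to `has_microsoft_unicode_cmap`.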
The EBML elements for World Fonts are:
Key:
|Element Name||The friendly name for the EBML ID.|
|Lvl||The level in the container hierarchy that this element appears, with Segment at level 0.|
|EBML ID||The unique EBML ID for the element.|
|Ma||Whether or not the element is mandatory in an .mkv file if its parent element is present. This is different than whether or not the element must be present for this extension to work.|
|Mu||Whether or not the element may appear multiple times within a single instance of its parent.|
|Rng||Any restrictions of the range of values for the element payload.|
|Def||The default value for the element. Elements with default values are interpreted as present and assume their default value if their parent element is read but they are not present in the file.|
|Type||The element type. Types in this table are: u (unsigned integer), 8 (UTF-8 string), s (ASCII string), b (binary).|
|D+||Whether or not the element is new in this DivX Plus extension.|
|Description||A description of the element.|
|Element Name||Lvl||EBML ID||Ma||Mu||Rng||Def||Type||D+||Description|
|TrackType||3||[83]||*||-||1-254||-||u||-||The type of this track. Value must be 0x11 for subtitles.|
|AttachmentLink||3||[74][46]||-||-||>0||-||u||-||The FileUID of the AttachedFile that contains the TrueType font that has been optimized for this subtitle track.|
|FileDescription||3||[46][7E]||-||-||-||-||8||-||For this extension must be "true type font".|
|FileName||3||[46][6E]||*||-||-||-||8||-||The file name of the original TTF font file.|
|FileMimeType||3||[46][60]||*||-||-||-||s||-||For this extension must be "application/x-truetype-font".|
|FileData||3||[46][5C]||*||-||-||-||b||-||The binary contents of the optimized TrueType font.|
|FileUID||3||[46][AE]||*||-||>0||-||u||-||A unique ID for this attachment. This ID must be referenced by the AttachmentLink element of the subtitle TrackEntry for which this TrueType font file has been optimized.|
|FileUsedStartTime||3||[46][61]||-||-||-||-||u||*||The timecode at which this optimized font attachment comes into context, based on the Segment TimecodeScale. This element is reserved for future use and if written must be the segment start time.|
|FileUsedEndTime||3||[46][62]||-||-||-||-||u||*||The timecode at which this optimized font attachment goes out of context, based on the Segment TimecodeScale. This element is reserved for future use and if written must be the segment end time.|
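The relationship between AttachmentLink and FileUID in the table above can be sketched in Python. This is an illustration under stated assumptions, not DivX player code: the `TrackEntry` and `AttachedFile` classes are hypothetical stand-ins for already-parsed EBML elements, and the rule applied is the one described in this section, namely that a font belongs to a subtitle track only when the track's AttachmentLink resolves to the FileUID of an attachment with the required MIME type.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

SUBTITLE_TRACK_TYPE = 0x11
FONT_MIME_TYPE = "application/x-truetype-font"


@dataclass
class AttachedFile:
    file_uid: int
    file_name: str
    mime_type: str
    data: bytes


@dataclass
class TrackEntry:
    track_type: int
    attachment_link: Optional[int] = None


def resolve_fonts(tracks: List[TrackEntry],
                  attachments: List[AttachedFile]) -> Dict[int, AttachedFile]:
    """Map the index of each subtitle track to the font attachment it links to."""
    by_uid = {a.file_uid: a for a in attachments}
    resolved = {}
    for index, track in enumerate(tracks):
        if track.track_type != SUBTITLE_TRACK_TYPE or track.attachment_link is None:
            continue
        font = by_uid.get(track.attachment_link)
        # An unresolved link or a non-font attachment establishes no relationship.
        if font is not None and font.mime_type == FONT_MIME_TYPE:
            resolved[index] = font
    return resolved
```

Tracks whose link does not resolve are simply left out of the result, so a caller can fall back to its default font for them.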