Configuring MathFlow for Arbortext

Behind the scenes, MathFlow consists of executable components integrated with Arbortext Editor via ACL script code. The executable components and most of the script code do not depend on the specific document type with which MathFlow is being used. However, some of the script code must deal with the particular way in which MathML elements have been incorporated into a document-level markup language. This script code must be customized to work with specific document types.

Setting up a document type for use with MathFlow consists of the following general tasks:

  • Integrating the MathML DTD into your DTD for the document-level markup. The sample document type (doctype for short) included with MathFlow integrates the MathML DTD into the Axdocbook DTD.
  • Modifying the stylesheet used for screen formatting your doctype in Arbortext to recognize MathML.
  • Adding a file with script code that makes reference to the parent elements that can contain MathML. The sample Axdocbook + MathML doctype allows MathML to be contained only in <equation> and <inlineequation> elements.
  • Optionally modifying the script code that adds the MathFlow user-interface elements to the Arbortext UI to customize it for your doctype.
  • If you are using the MathFlow Composer, you will need to modify any XSL stylesheets for producing XSL-FO output and XHTML output so that they pass through MathML code unchanged. You will also have to modify the definition of your composition pipelines to include the MathFlow Composer filters. If you use Styler stylesheets for composition, you need to add the MathML Styler module (in <custom>/stylermodules/mathml.style) to your stylesheet. To do this, see Customizing composition.
  • If you are using MathFlow Import/Export, you will need to modify your map files to wrap equation markup with the correct parent elements in the document-level markup.
  • Finally, if you are using numerical codes for some entities that do not agree with MathFlow's, you may want to set up a character mapping in a character map file. This allows you to open your existing documents and have MathFlow translate your codes into the ones MathFlow uses, and vice versa.

We will take a closer look at each of these steps in the next pages. However, before going into details, it is useful to have a broad view of the MathFlow architecture.

Overview — MathFlow architecture

MathFlow installs into the Arbortext custom directory structure. Files under custom/doctypes apply to a specific doctype (MathFlow installs a sample doctype, axdocbook_math). The remaining files under the custom directory apply to all doctypes. Customizing MathFlow consists of editing certain MathFlow configuration files in the custom directory.

Arbortext looks for the custom directory in its installation directory by default. However, you can use the APTCUSTOM environment variable to instruct Arbortext to work in another location. MathFlow works when installed either way, but you have to remember to set the APTCUSTOM variable if choosing a non-default location.

Also, you need to correct the path to Arbortext's standard stylesheets in the sample stylesheets that include them (these are axdocbook_math-html.xsl and axdocbook_math-fo.xsl in the <custom-path>/doctypes/axdocbook_math directory). From here on, we will refer to the location of your custom directory simply as custom.

Editing architecture

The basic editing cycle starts with a call to the MathFlow ACL function mathml::insert_math. Arbortext generates this call in response to events such as mouse clicks and menu actions. The binding between Arbortext events and MathFlow callback functions is largely set up in custom/init/mathml_init.acl. The code in custom/init/mathml_init.acl is executed when Arbortext Editor starts; it then reads and executes the commands in custom/scripts/mathml_user.acl. The file mathml_user.acl does not ship with Arbortext and must therefore be created by the user. Most user additions to the MathFlow code should be made in mathml_user.acl.

The mathml::insert_math function opens the MathFlow Editor, which is implemented as a Java dialog containing an equation editor. Closing this dialog with the OK button returns MathML markup to mathml::insert_math, which handles inserting it into the document.

Editing an existing equation is similar. A call to mathml::edit_math is generated, which reads the old markup, passes the MathML to the equation editor, and, if the user modifies it, replaces the old markup with the new.

Whenever math markup is inserted or modified, MathFlow generates a preview image of the equation for display in the Arbortext editing window. This is accomplished by declaring the m:math element as a "graphic" element in the DCF file for a doctype (custom/doctypes/axdocbook_math/axdocbook_math.dcf in the sample doctype), or by adding the MathML formatting module (located in <custom>/stylermodules/mathml.style) to your Styler stylesheet. Whenever Arbortext renders a document containing graphic elements, it generates callbacks to the ACL "graphicpathhook" function registered for each such element. The ACL function mathml::mathml_render is the MathFlow "graphicpathhook" function for m:math elements.

The mathml::mathml_render function generates a preview image when needed and sets the 'altimg' and 'baseline' attributes on the m:math element, specifying the image file name and the vertical offset of the equation baseline. It determines whether a preview is needed by checking whether the 'altimg' attribute of the m:math tag is set to "x-ati:DesignScience". This special value signals that a new image is needed. A few lines in the screen stylesheet (axdocbook_math.fos in the sample doctype) tell Arbortext how to use the 'altimg' and 'baseline' attributes to display the preview image properly on screen.

When a document is opened in Arbortext, initial preview images must also be generated. To accomplish this, MathFlow registers the ACL function mathml::doc_resolve as a callback. It finds all the m:math tags and sets the 'altimg' attribute to the special value "x-ati:DesignScience". This in turn generates a graphicpathhook callback to mathml::mathml_render, which, upon seeing the special value, generates a preview file and updates the 'baseline' and 'altimg' attributes.
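The sentinel-driven preview cycle described above can be modeled in a few lines. The following is only an illustrative Python sketch, not MathFlow's ACL code: each m:math element is represented as a plain dictionary of attributes, and make_fake_renderer is a hypothetical stand-in for the real image generator.

```python
# Illustrative model only; MathFlow implements this logic in ACL.
SENTINEL = "x-ati:DesignScience"  # the special 'altimg' value named in the text


def doc_resolve(math_elements):
    """Flag every m:math element so the render hook generates a preview."""
    for attrs in math_elements:
        attrs["altimg"] = SENTINEL


def mathml_render(attrs, make_preview):
    """Model of the graphicpathhook: render only when the sentinel is present."""
    if attrs.get("altimg") != SENTINEL:
        return False  # no new image requested
    filename, baseline = make_preview(attrs)
    attrs["altimg"] = filename
    attrs["baseline"] = baseline
    return True


def make_fake_renderer():
    """Hypothetical image generator: returns numbered file names and a fixed baseline."""
    counter = {"n": 0}

    def render(attrs):
        counter["n"] += 1
        return "preview_%d.png" % counter["n"], "3"

    return render


if __name__ == "__main__":
    doc = [{}, {}]                     # two m:math elements with no attributes yet
    doc_resolve(doc)                   # document opened: every element is flagged
    renderer = make_fake_renderer()
    print(mathml_render(doc[0], renderer), doc[0])
    print(mathml_render(doc[0], renderer))  # second call finds no sentinel
```

Because the hook overwrites the sentinel with the generated file name, a later render of the same element does no work until something sets the sentinel again.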

The MathFlow ACL code also handles a good deal of auxiliary bookkeeping we won't go into here. For example, when a document is saved to disk, the 'altimg' and 'baseline' attributes must be removed from the MathML markup. Similarly, preview images are deleted at the end of an Arbortext session. Drag and drop, and cut and paste also require special handling. The motivated programmer can ferret out the details by studying custom/init/mathml_init.acl in conjunction with the Arbortext Programmer's Guide.

Callbacks

The files custom/editinit/mathml_editinit.acl and custom/init/mathml_init.acl are where all the callbacks are set for the various events MathFlow handles, via the ACL function doc_add_callback.

The two primary callbacks are create and graphicpathhook:

  • create: calls on_open, which initializes everything once, and then calls doc_resolve when reading a new document, to generate the equation images.
  • graphicpathhook: calls mathml_render whenever a 'graphic' tag is found (remember that m:math tags are treated as graphic elements).

If you are using your own functions as callbacks or hooks, it is possible they are in conflict with the ones defined in MathFlow.

The following list includes all those MathFlow uses.

Callbacks:

  • 'destroy'
  • 'paste'
  • 'insert_tag'
  • 'save'
  • 'saveas'
  • 'cut'
  • 'quit' (in mathml_init.acl)

Hooks:

  • menuloadhook (in mathml_init.acl)

Composition architecture

Arbortext uses SAX pipelines for some kinds of composition, while others are accomplished via a combination of native code and ACL code. MathFlow Composer hooks into both kinds of composition processes to add math functionality.

For SAX-based composition (primarily for XSL-FO, PDF or HTML output generated by XSL stylesheets), MathFlow defines two new SAX pipeline filters, MathMLFilter and MathPageFilter. MathMLFilter is used mainly for PDF or screen output (Print Preview, etc.); MathPageFilter produces the different types of HTML output (when choosing "Save as HTML" or "Compose -> HTML file"). These filters are integrated into the pipelines defined in pdf.ccf, htmlfile.ccf, xsl.ccf, and xsl-so.ccf, which are located in the custom/composer directory.

The MathMLFilter leaves the rest of the process as is for axdocbook; it just listens for math tags, creates the appropriate images, and passes the rest through untouched. The MathPageFilter has to also write some code in the head of the document (depending on the type of HTML output chosen).
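To illustrate the "listen for math tags, pass everything else through" pattern, here is a minimal sketch using Python's standard xml.sax filter classes. It is not the MathFlow filter: the class name MathListeningFilter and the helper run_filter are hypothetical, and the sketch only counts m:math elements where a real filter would create images.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class MathListeningFilter(XMLFilterBase):
    """Forward every SAX event unchanged, noting each math element seen."""

    def __init__(self, parent, math_qname="m:math"):
        super().__init__(parent)
        self.math_qname = math_qname
        self.math_count = 0

    def startElement(self, name, attrs):
        if name == self.math_qname:
            # A real filter would create the equation image at this point.
            self.math_count += 1
        super().startElement(name, attrs)  # pass the event downstream


def run_filter(xml_bytes):
    """Run the filter over a document; return (math count, serialized output)."""
    out = io.StringIO()
    filt = MathListeningFilter(xml.sax.make_parser())
    filt.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    filt.parse(io.BytesIO(xml_bytes))
    return filt.math_count, out.getvalue()


if __name__ == "__main__":
    sample = (b'<doc xmlns:m="http://www.w3.org/1998/Math/MathML">'
              b'<para>a</para><m:math><m:mi>x</m:mi></m:math></doc>')
    count, text = run_filter(sample)
    print(count)
    print(text)
```

The filter sits between the parser and the serializer, which is why the rest of the pipeline needs no changes: downstream handlers receive the same events they would have received without it.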

For output generated using FOSI stylesheets, composition is accomplished via native and ACL code. MathFlow provides modified versions of the standard Arbortext ACL scripts that properly process embedded MathML markup. These scripts can be found in the custom/scripts directory.

Import/Export architecture

MathFlow extends the functionality of Arbortext Import/Export to facilitate importing Microsoft Word documents with embedded MathType equations. MathFlow provides Word macros and script code to pre-convert MathType equations in Word documents to MathML. The resulting MathML markup is marked with special Word styles which can then be mapped to the appropriate elements for wrapping equation markup in the converted output. The pre-converted MathML markup itself is passed through the conversion process unchanged. Once the appropriate mappings are set up, the conversion is handled transparently for the user, who only has to choose Import a Document... from Arbortext's menus and select the desired Word file.

In earlier versions of Arbortext Editor, import of Word documents was provided by a separate Arbortext Interchange product. While the details of mapping styles to elements differ in Interchange, the basic architecture is the same.


Here are some of the most important MathFlow files and their contents:

init/mathml_init.acl:

This file holds the definitions of the global variables, the menu customizations for the doctype, and most of the ACL code that MathFlow uses. Some of the variables are:

  • global img_attr (= "altimg"): the name of the <math> attribute that will hold the name of the temporary preview image file to display. This can be changed if it is necessary to use a different name. See Configuring the Preview Image File Name Attribute for details.
  • global graphic_uri (= "x-ati:DesignScience"): value of "altimg" (or whatever name is specified by the img_attr variable) attribute to signal to the graphicpathhook that it should generate an image for the tag. This should not be changed.
  • global mathml_doctypes[]: an array containing the names of the doctypes to which all of this applies (the MathML-aware doctypes). Initially, it contains only one entry (mathml_doctypes[1] = "axdocbook_math"). Add a similar entry for each new MathML-aware doctype you create, in the file scripts/mathml_user.acl.
  • global display_parents[]: an array with the names of elements that can contain MathML markup in display style. Initially it contains only the value "equation". Add your own display element names in the file scripts/mathml_user.acl.
  • global inline_parents[]: an array with the names of elements that can contain MathML markup in inline style. Initially it contains only the value "inlineequation". Add your own inline element names in the file scripts/mathml_user.acl.
  • global paralike_parents[]: an array with the names of elements that, when they contain only MathML markup, cause that markup to be interpreted as display style. In the sample doctype, the value is "para", so that MathML markup all by itself in a para element is shown in display style. Add your own para-like element names in the file scripts/mathml_user.acl.

Some of the methods include the following:

  • init_menus: adds entries, shortcuts and modified toolbar buttons for inserting, editing, etc. MathML tags.
  • is_math_display: determines if a formula should be shown in display or inline style. The default behavior is to use the display_parents, inline_parents and paralike_parents arrays, so that customizing can be done by just adding entries into these arrays; but if your doctype has a more complicated structure for display vs. inline formulas, this is the method that you will need to modify.
  • doc_resolve: scans a doc for m:math tags that do not have their altimg attribute set, and sets it to the value $graphic_uri (to alert mathml_render).
  • mathml_render: is the graphicpathhook (see the notes for editinit/mathml_editinit.acl below.) If the graphic element is a math element it generates the image and sets altimg (or whatever name is specified by the img_attr variable) to its filename.
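The default display-versus-inline decision made from the three parent arrays can be sketched as follows. This is a Python model rather than the ACL implementation; the order of the checks and the inline fallback for unlisted parents are assumptions, since the text does not specify them.

```python
# Initial values taken from the sample doctype described above.
display_parents = ["equation"]
inline_parents = ["inlineequation"]
paralike_parents = ["para"]


def is_math_display(parent_name, parent_has_only_math):
    """Return True for display style, False for inline style."""
    if parent_name in display_parents:
        return True
    if parent_name in inline_parents:
        return False
    if parent_name in paralike_parents:
        # Para-like parents give display style only when the math stands alone.
        return parent_has_only_math
    return False  # assumption: parents not listed anywhere are treated as inline


if __name__ == "__main__":
    print(is_math_display("equation", False))
    print(is_math_display("para", True))
    print(is_math_display("para", False))
```

Supporting a new doctype with a simple structure then amounts to appending element names to the arrays, which mirrors the customization route recommended for mathml_user.acl.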

scripts/mathml_user.acl:

This file should contain your own entries for MathML-aware doctypes, and display or inline style parents. See notes for init/mathml_init.acl above.

editinit/mathml_editinit.acl:

This is where all the callbacks are set for the various events that MathFlow handles, via the ACL function doc_add_callback.

init/ipmt5.acl:

This file includes the definition of pre- and post-import hooks for use with MathFlow Import/Export, when importing a Microsoft Word document containing MathType equations.

Certain MathFlow settings can be configured by adding any of the following lines, in any order, to <custom-path>/scripts/mathml_user.acl (create the file if you haven't done so already).

Note: In a string, a backslash character must be escaped by preceding it with another backslash. This comes up most often in Windows file paths: to specify the directory C:\tmp in a string, write C:\\tmp. Alternatively, specify paths with forward slashes (C:/tmp), as in the examples below.
  • To use Direct PDF for Arbortext's Print Composer (instead of Adobe Distiller): $mathml::prefer_distiller = 0;
  • By default the license file should be placed at <custom-path>/lib/mathflow/dessci.lic. To choose a different location for the license file, use the path of your choice in the following setting: $mathml::licfile = "C:/licenses/dessci.lic";
  • To set the location of a failover license file, set its path with the following configuration option: $mathml::failover_licfile = "D:/licenses/dessci.lic";
  • To change the location where the preview images are temporarily stored, use the path of your choice in the following setting: $mathml::tmpdir = "C:/tmp/";
  • To change the location the preference file is written to and read from, set the new location with the following option: $mathml::prefdir="C:/mathflow/";
  • To change the location of MathFlow.dll: $mathml::java_lib_path="C:/mathflow/MathFlow.dll";
  • To increase the size of the equation images created by MathFlow Composer for PDF output, use the following setting (175 would increase the size of the images by 75%; this number can be modified to your preference): $mathml::zoomPercentage = 175;
  • To open an XML file without MathML preview images being generated: $mathml::start_with_imgs = 0;
  • To set MathFlow's HTML output when using Arbortext Publishing Engine's Web/Wireless Composer (details on the differences can be found in the section Save As HTML): $mathml::e3HTMLOutput = "images"; # or "mathplayer", or "umss"
  • MathFlow has two rendering engines that can be used to generate the preview images on Windows. By default, MathFlow uses the native code engine, which is faster than the Java rendering engine. However, since the MathFlow Editor uses the Java rendering engine, differences may be discovered between editing the equation and viewing the preview image. To eliminate these differences: $mathml::use_java_rendering = 1;
  • To specify the location of a user-defined operator dictionary file: $mathml::op_dict_file = "path";
  • To suppress most MathType error dialogs during Import/Export (only available in MathType 6.9 or later): $mathml::set_quiet_conversion = 1;

Overview, font style management

MathFlow allows users to style characters and expressions by requesting particular font faces, weights and slants. While users can specify any font, weight or slant in the underlying MathML markup, what MathFlow actually renders is limited by the availability of fonts on the system. On Windows systems, MathFlow uses two rendering engines, a Java-based engine for the Editor, and a Win32-based engine for preview images and PDF composition. The font and style behavior of both engines is similar, but there are some differences in details as described below.

There are three cases that must be distinguished to understand how MathFlow selects fonts:

Default rendering

When no explicit style information is specified, the MathFlow rendering engines apply internal algorithms to select fonts. The Java engine first attempts to locate characters in a set of preferred system fonts: Times New Roman for alphanumeric characters, Euclid Symbol for common symbol characters, and Code 2000 for rarer symbol characters. Note that Euclid Symbol is included with MathFlow, but Code 2000 must be obtained separately. The choice of preferred system fonts is configurable. See Configuring preferred fonts.

When a character cannot be located in MathFlow's preferred system fonts, the Java engine falls back to its built-in fonts, which contain around 500 symbol characters. Characters that cannot be located in either the preferred system fonts or the internal fonts render as missing glyphs, generally represented by an empty box. In these cases, explicitly specifying a system font known to contain the character is generally the best solution.

The Win32 rendering engine employs a similar, but more complex model. Characters are first sought in a collection of preferred system fonts, primarily Times New Roman, Symbol and MT Extra. For certain classes of characters, Mathematica fonts, Euclid fonts, and/or Adobe's Mathematical Pi fonts may also be utilized. When characters cannot be located in preferred fonts, the rendering engine searches all available system fonts. At this point, there is no control over the order in which fonts are searched, which can lead to unpredictable results. As in the Java case, explicitly specifying a system font known to contain the character is generally the best solution.

Pre-defined MathML styles

MathML contains twelve pre-defined logical styles, such as "sans-serif" and "script." When one of these styles is selected, both the Java and Win32 engines reformulate these requests in terms of specific fonts. As above, the renderers first look for characters in preferred fonts, and then fall back to either internal fonts or other system fonts, depending on the rendering engine.

User-defined custom styles

For user-defined custom styles, both engines will attempt to literally honor the request. If the specified font is not available, an attempt is made to render the character in some font containing the character. If no font can be found, the character is rendered with a missing glyph indicator.

Styles in MathML markup

MathFlow enables the use of user-defined custom style definitions as described in Customizing font styles. Custom styles are represented in the MathML markup by attributes specifying style names. Here we give more details about how the association between the style name and the actual font being used is managed.

All character data in a MathML expression must be contained in a token element (mi, mo, mn, mtext, etc.). The font used to render the character data is determined by the values of the "mathvariant", "class", "fontfamily", "fontweight" and "fontstyle" attributes. The last three attributes are deprecated by MathML 3.0 and are not generated by MathFlow (though they are still honored by the MathFlow rendering engine). The values for any of these attributes may either be set explicitly on token elements or inherited from a parent element, usually an mstyle element.

When several of these attributes are used, MathFlow uses the following precedences in determining which of them controls the font:

  1. Explicit "mathvariant" values
  2. Inherited "mathvariant" values
  3. Explicit "class" values
  4. Inherited "class" values
  5. Explicit "fontfamily", "fontweight" and "fontstyle" values
  6. Inherited "fontfamily", "fontweight" and "fontstyle" values

Both the "mathvariant" and "class" values are interpreted as specifying all three components of the fonts specification -- family, weight and slant. Thus, the "mathvariant" value "bold" actually specifies a serif font face, in bold weight, with normal (upright) slant.

To illustrate the above precedence rules, consider an example. If we define the style "myArial" to have family="Arial", weight="normal" and slant="automatic", then the following will display the x in an Arial, normal, upright font, and the y in serif, bold, upright (since mathvariant="bold" takes precedence over fontstyle="italic"):

<math>
  <mstyle mathvariant="bold" fontstyle="italic">
    <mi class="myArial">x</mi><mo>+</mo><mi>y</mi>
  </mstyle>
</math>

Whereas this will display the x in a serif, bold, upright font, and the y in Arial, normal, italic:

<math>
  <mstyle class="myArial" fontstyle="italic">
    <mi mathvariant="bold" >x</mi><mo>+</mo><mi>y</mi>
  </mstyle>
</math>

Custom font styles

MathFlow stores the custom font styles you have defined in a file, so that they can be reused from session to session. This file can also be edited directly, which is a more convenient way of defining a large number of styles to be shared among many users.

The file in question is named .mfstyles and is stored in the same directory as your MathFlow preferences file, usually the following on Windows systems: C:\Users\<username>.

Font style language

The font style definition file has an XML syntax with seven elements:

<styles>

The root element that can contain any number of <style> elements.

<style>

Each <style> element must contain one <label> element and one <fontspec> element. It is an error if either of these is missing or appears more than once.

<label>

The name of the style.

If the inner text of this element matches the value of the class attribute in a MathML element, then that MathML element will be stylized according to the <fontspec> element that is a sibling of this <label>.

<fontspec>

This can contain zero to three of the following elements, which describe the font characteristics of the style. If any of these elements is omitted, the corresponding font characteristic takes the default value listed below.

  • <fontfamily> - Accepted values are any legal font family name, or "inherited". Default is "inherited".
  • <fontweight> - Accepted values are "normal", "bold" or "automatic". Default is "automatic".
  • <fontslant> - Accepted values are "normal", "italic" or "automatic". Default is "automatic".

For example, the following code defines three font styles named myArial, myArial2, and myEmphasis:

<styles>
  <style>
    <label>myArial</label>
    <fontspec>
      <fontfamily>Arial</fontfamily>
      <fontweight>normal</fontweight>
      <fontslant>automatic</fontslant>
    </fontspec>
  </style>
  <style>
    <label>myArial2</label>
    <fontspec>
      <fontfamily>Arial</fontfamily>
      <fontweight>bold</fontweight>
      <fontslant>normal</fontslant>
    </fontspec>
  </style>
  <style>
    <label>myEmphasis</label>
    <fontspec>
      <fontfamily>inherited</fontfamily>
      <fontweight>bold</fontweight>
      <fontslant>italic</fontslant>
    </fontspec>
  </style>
</styles>
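When a style file is edited by hand, it can help to check it against the rules above before sharing it. The following Python sketch is not part of MathFlow; load_styles is a hypothetical helper that parses a styles document, enforces the one-label and one-fontspec rule, and fills in the documented defaults.

```python
import xml.etree.ElementTree as ET

# Defaults as documented for the three <fontspec> children.
DEFAULTS = {
    "fontfamily": "inherited",
    "fontweight": "automatic",
    "fontslant": "automatic",
}


def load_styles(xml_text):
    """Return {label: {fontfamily, fontweight, fontslant}} for a styles document."""
    root = ET.fromstring(xml_text)
    if root.tag != "styles":
        raise ValueError("root element must be <styles>")
    styles = {}
    for style in root.findall("style"):
        labels = style.findall("label")
        specs = style.findall("fontspec")
        if len(labels) != 1 or len(specs) != 1:
            raise ValueError("each <style> needs exactly one <label> and one <fontspec>")
        spec = dict(DEFAULTS)
        for child in specs[0]:
            if child.tag in DEFAULTS:
                spec[child.tag] = (child.text or "").strip()
        styles[(labels[0].text or "").strip()] = spec
    return styles


if __name__ == "__main__":
    sample = """<styles>
      <style>
        <label>myArial</label>
        <fontspec><fontfamily>Arial</fontfamily></fontspec>
      </style>
    </styles>"""
    print(load_styles(sample))
```

In the usage example, the omitted <fontweight> and <fontslant> elements come back as "automatic", matching the defaults listed for <fontspec>.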

The MathFlow Java rendering engine first attempts to render characters using a collection of four preferred system fonts. The specific system fonts to be used may be identified through the use of a configuration file as described below. Internally, MathFlow considers all characters as belonging in one of the following four broad classes:

  • ALPHA_NUMERIC
  • GREEK
  • SYMBOL
  • EXTRA_SYMBOL

The SYMBOL class corresponds to characters found in the standard Adobe Symbol encoding. EXTRA_SYMBOL is a catch-all for everything else. MathFlow associates the following numerical values with these four classes of characters:

ALPHA_NUMERIC = 100
SYMBOL = 101
EXTRA_SYMBOL = 102
GREEK = 103

The default font associations are as follows:

100=Times New Roman
101=Euclid Symbol
102=Code2000
103=Euclid Symbol

To modify the font associations that will be used by MathFlow Editor, create a file named "FontMapping.opt" and place it in this folder:

Windows: C:\Users\<username>
Mac: your user home directory

Write four lines in the file, one for each character class, of the form NNN=<font name>, as shown above. Once you restart MathFlow, these fonts will become the preferred system fonts for character display.

Example

Suppose you want all alphanumeric characters to use the system font "Arial". To do this, change the font for the ALPHA_NUMERIC class (ID 100 in the FontMapping.opt file) from Times New Roman to Arial by using the following FontMapping.opt file:

100=Arial
101=Euclid Symbol
102=Code2000
103=Euclid Symbol

This mapping causes every character MathFlow identifies as an "alpha numeric" character to be displayed in the specified font. Be aware that if you use the wrong font name in FontMapping.opt, you will not see the intended font: MathFlow uses its internal font if it cannot find the system font listed in the file. You can pick any system font, but most system fonts do not contain glyphs for non-alphanumeric characters. Euclid Symbol is a high-quality font for common symbol characters, and you will generally not need to change it. Code 2000, however, is a lower-quality shareware font whose main appeal is its exceptionally comprehensive coverage of Unicode; depending on the symbols you need, you may wish to replace it with a higher-quality font. The "GREEK" font defaults to Euclid Symbol, since it has high-quality Greek characters, but high-quality fonts with Greek characters in other styles are not uncommon. In general, you may need to experiment to make sure that the system fonts you pick contain the proper glyphs for the Unicode characters you plan to use.
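The NNN=<font name> format is simple enough to check with a short script before restarting MathFlow. The following Python sketch is illustrative only; parse_font_mapping is a hypothetical helper, and treating a missing line as "keep the default" is an assumption, since the text asks for all four lines to be present.

```python
# Class IDs and default fonts as listed above.
CLASS_IDS = {100: "ALPHA_NUMERIC", 101: "SYMBOL", 102: "EXTRA_SYMBOL", 103: "GREEK"}
DEFAULT_FONTS = {
    100: "Times New Roman",
    101: "Euclid Symbol",
    102: "Code2000",
    103: "Euclid Symbol",
}


def parse_font_mapping(text):
    """Parse FontMapping.opt content into {class name: font name}."""
    mapping = dict(DEFAULT_FONTS)  # assumption: omitted classes keep their defaults
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, font = line.partition("=")
        key = key.strip()
        if not sep or not key.isdigit() or int(key) not in CLASS_IDS:
            raise ValueError("unrecognized line: %r" % line)
        mapping[int(key)] = font.strip()
    return {CLASS_IDS[k]: v for k, v in mapping.items()}


if __name__ == "__main__":
    content = "100=Arial\n101=Euclid Symbol\n102=Code2000\n103=Euclid Symbol\n"
    for name, font in parse_font_mapping(content).items():
        print(name, "->", font)
```

A check like this catches malformed lines and unknown class IDs, but not misspelled font names; those still have to be verified against the fonts installed on the system.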

Overview, MathFlow Composer (EPS)

MathFlow Composer works together with the Arbortext Publishing Engine to produce PDF output for document types containing mathematics. MathFlow Composer produces encapsulated PostScript (EPS) graphics files for each mathematics expression, which are then incorporated into the final PDF output by Arbortext Publishing Engine.

The use of EPS is the key to producing high-quality output, since it is a vector graphics format that scales cleanly to the very high resolutions required for print output. However, a drawback of EPS is that it is an older graphics format designed primarily for use with PostScript fonts. While EPS technology has improved to allow the use of TrueType and, more recently, OpenType fonts, not all TrueType and OpenType fonts can be used natively within EPS files.

Both the TrueType and OpenType standards provide a number of alternative ways for accessing the characters within fonts and optional mechanisms for discovering what characters a font contains. Thus, there is wide variation between "TrueType" fonts as to what information is available from the font itself, and how low-level software must access the characters within that font. In particular, only certain types of OpenType and TrueType fonts contain the necessary information in the appropriate format for direct use within EPS files.

Even with PostScript fonts, problems can arise within EPS files. In general, PostScript fonts do not contain enough information within them for external software to automatically, definitively determine what characters are contained within the font. This is a particular problem with mathematical symbols since they are much less standard than alphabetic characters.

Some, though not all, problems with fonts in EPS can be addressed via a complex technique known as "dynamic font subsetting". This technique involves dynamically generating PostScript fonts containing the characters actually appearing in an EPS file from the regular fonts installed on the system, and embedding them within the EPS file. While MathFlow Composer does not yet have this capability, Design Science is working toward this goal for future versions.

Consequently, font management is a significant issue for MathFlow Composer. Care must be taken to ensure that MathFlow Composer is configured so that characters are taken only from EPS-compatible fonts, and that MathFlow Composer has access to accurate information about which characters are available to it within those fonts. Furthermore, it is desirable for MathFlow Editor and MathFlow Composer to be in sync, so that the screen display of characters reflects the PDF rendering with reasonable accuracy, and users of MathFlow Editor are protected from inadvertently styling equations with fonts that cannot be used in EPS.

MathFlow Composer introduces two powerful new font configuration facilities that largely achieve these goals:

  • A mechanism for configuring the exact list of fonts which MathFlow Composer is allowed to use, in order of preference, and
  • A mechanism for extending MathFlow Composer's knowledge of what characters are available in particular fonts.

Taken together with the existing mechanisms for configuring the preferred fonts for use within MathFlow Editor, and the mechanism for defining the list of font styles which can be applied to equations within MathFlow Editor, it is now possible to ensure consistent use of fonts and characters from editing through composition to PDF.

Font configuration

Configuration file

To configure the list of fonts MathFlow Composer is allowed to use, create a font configuration file with file name "FontConfig.txt". Save the file to the <custom-path>/lib/dll/ folder.

If this file is not present when MathFlow Composer is initialized, MathFlow Composer will consider all TrueType, OpenType and PostScript fonts installed on the system as eligible for use within EPS files. This is not recommended, except for compatibility with older versions of MathFlow. See Font style management overview for more information.

The font configuration file should contain a list of font face names, one per line. The '#' character can be used to include comments. Any text on a line following a '#' character will be ignored.

The following example illustrates the format of the font configuration file:

# A sample FontConfig file
#
Times New Roman
Euclid Symbol
MT Extra
Mathematica7

Important: the font face name given in the font configuration file must exactly match the font face name as defined in the font. To determine the font face name under Windows, locate the font in the Fonts control panel and double click it to open the font preview panel. The font face name will be given on the first line. It does not include the parenthesized font technology label which is sometimes present. For example, the title bar of the font preview panel for the Mathematica7 Regular font reads "Mathematica7 (TrueType)" and the font face name is Mathematica7.
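The file format described above (one face name per line, with '#' starting a comment) can be read with a few lines of code. This Python sketch is only an illustration of the format; parse_font_config is a hypothetical helper, not a MathFlow function.

```python
def parse_font_config(text):
    """Return the font face names in a FontConfig.txt body, in file order."""
    faces = []
    for line in text.splitlines():
        # Everything after '#' is a comment; surrounding whitespace is dropped.
        face = line.split("#", 1)[0].strip()
        if face:
            faces.append(face)
    return faces


if __name__ == "__main__":
    sample = """# A sample FontConfig file
#
Times New Roman
Euclid Symbol
MT Extra
Mathematica7
"""
    print(parse_font_config(sample))
    # prints ['Times New Roman', 'Euclid Symbol', 'MT Extra', 'Mathematica7']
```

File order is preserved because it is significant: it determines the order of preference among the listed fonts.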

Required entries

MathFlow Composer requires access to three special fonts to function properly: a main font, a symbol font, and an "extra" font. The main font must contain the ISO Latin 1 alphabetic characters and is the default font for Latin 1 variables and numbers when no explicit style is given in the MathML markup for an expression. Currently, MathFlow Composer requires Times New Roman to be available as a default choice of main font, and thus Times New Roman must always be listed in any font configuration file.

The symbol font should be a font with the Adobe symbol encoding, and glyphs for most characters in the encoding. There are two main choices, the standard Adobe Symbol font and the Euclid Symbol font installed with MathFlow. Either can be used, but one must be listed in font configuration files. If both are listed, MathFlow Composer will designate the first one encountered in the listed fonts as the symbol font, and treat the other one as an optional entry.

The "extra" font supplies the pieces from which brackets, braces, parentheses, and similar symbols are assembled when they are required in large sizes. Because the notion of an "extra" font is particular to Design Science products, there are again only two choices: MT Extra and Euclid Extra. Either font is acceptable, but one must be listed in every font configuration file. As with the symbol font, if both fonts are listed, the first one encountered will be designated as the "extra" font, and the other will be treated as an optional entry.
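
The rules for required entries can be expressed as a small validation step. The Python sketch below is an illustration under stated assumptions, not MathFlow's own logic: the function name classify_fonts is hypothetical, and the face name "Symbol" is assumed to be how the standard Adobe Symbol font appears on the system.

```python
MAIN_FONT = "Times New Roman"
SYMBOL_FONTS = ("Symbol", "Euclid Symbol")   # "Symbol" face name is an assumption
EXTRA_FONTS = ("MT Extra", "Euclid Extra")


def classify_fonts(faces):
    """Split listed faces into required roles, optional entries, and problems."""
    problems = []
    main = MAIN_FONT if MAIN_FONT in faces else None
    # The first symbol/extra font encountered in the list takes the role.
    symbol = next((f for f in faces if f in SYMBOL_FONTS), None)
    extra = next((f for f in faces if f in EXTRA_FONTS), None)
    if main is None:
        problems.append("Times New Roman must be listed")
    if symbol is None:
        problems.append("one symbol font must be listed")
    if extra is None:
        problems.append("one extra font must be listed")
    required = {"main": main, "symbol": symbol, "extra": extra}
    # Everything not holding a required role is an optional entry.
    optional = [f for f in faces if f not in required.values()]
    return required, optional, problems


faces = ["Times New Roman", "Euclid Symbol", "Symbol", "MT Extra", "Mathematica7"]
print(classify_fonts(faces))
```

In this example both symbol fonts are listed, so Euclid Symbol (listed first) takes the symbol role and Symbol is treated as an optional entry.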

Optional entries

Apart from the required font entries, the font configuration file may list any number of additional fonts. When MathFlow Composer encounters a character without a style indicating an explicit font face, it will first try to locate the character in one of the required fonts. Failing that, it will proceed through the list of optional font entries in the order in which they occur, trying to locate a font containing the character. The first font containing the character will be used. If no font is located containing the character, MathFlow Composer will render a red question mark in place of the character.

When MathFlow Composer requires a character explicitly styled with a particular font face, it will consult the listed fonts, and if that font face is present, it will check if the font contains the character, and if it does, it will be used. Otherwise, the font face request will be ignored and the procedure used for unstyled characters will be employed, as described in the preceding paragraph.
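
The lookup order described in the two preceding paragraphs can be modeled as follows. This Python sketch is only a model of the documented behavior; the function choose_font and the glyphs table (a mapping from face name to the set of characters that font contains) are hypothetical, and the order in which the three required fonts are searched is an assumption.

```python
def choose_font(char, required, optional, glyphs, requested=None):
    """Return the face used for char, or None if no listed font contains it."""
    listed = list(required) + list(optional)
    # An explicitly requested face is used only if it is listed
    # and actually contains the character.
    if requested in listed and char in glyphs.get(requested, set()):
        return requested
    # Otherwise: required fonts first, then optional fonts in file order.
    for face in listed:
        if char in glyphs.get(face, set()):
            return face
    return None  # Composer would render a red question mark here


required = ["Times New Roman", "Euclid Symbol", "MT Extra"]
optional = ["Mathematica7"]
glyphs = {
    "Times New Roman": {"a"},
    "Euclid Symbol": {"\u03b1"},
    "MT Extra": set(),
    "Mathematica7": {"\u03b1", "\u222e"},
}

print(choose_font("\u03b1", required, optional, glyphs))
print(choose_font("\u03b1", required, optional, glyphs, requested="Mathematica7"))
```

With this made-up glyph table, an unstyled alpha resolves to the symbol font, while the same character explicitly styled as Mathematica7 resolves to Mathematica7.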

Good and bad fonts

MathFlow Composer contains a built-in database of characters useful for math typesetting and of the fonts that contain them. The following fonts and font families are in the database and are known to work well in EPS:

Euclid fonts
Mathematical Pi fonts
Mathematica fonts
Math fonts
Lucida New Math fonts
Lucida Bright Math fonts
Lucida Sans Unicode
WP Math fonts
cm (TeX) fonts
msam and msbm (AMS fonts)
Fences

In addition, all PostScript Type 1 fonts and most OpenType versions of PostScript fonts can be used. For these fonts, however, you may need to provide MathFlow Composer with detailed information about the encoding and characters available in these fonts in a FontInfo.ini file. See The FontInfo.ini file for more information.

By contrast, TrueType fonts of the variety known as CID fonts will not work with the current version of MathFlow Composer. These include most large fonts for Asian languages. The most significant example of a CID font that cannot be used is the shareware Code2000 font.

In general, the easiest way to determine whether a TrueType font is compatible with MathFlow Composer is to experiment. List it in your font configuration file, create an equation with a custom style explicitly setting it as the font, and try it. If it works for any characters, the font is compatible. If MathFlow Composer can find some characters but cannot find others known to be present in the font, use a FontInfo.ini file to provide detailed information about the encoding and characters available. See The FontInfo.ini file for more information.

Font knowledge

Extending font knowledge

In some cases it may be necessary to add to the font knowledge of MathFlow. MathFlow can use a configuration text file called FontInfo.ini. If this file is placed in the same directory as the DLL, MathFlow will automatically try to use it when the DLL is loaded. This section explains how to use this feature. Using the techniques shown here, a user can assign a character set (encoding) to a font, create a new encoding, and define the PostScript font name to be used in EPS file generation.

MathFlow contains knowledge of the fonts and characters it works with, which results in improved formatting and translation into EPS. Most of this knowledge is in the form of tables built into the code. However, this information can be extended via the FontInfo.ini file external to the program, allowing it to be expanded and corrected without having to change the application itself.

Encodings

An encoding is a one-to-one correspondence between character meanings and integers. For example, ASCII is an encoding mapping characters onto numbers between 0 and 127, and "a" is assigned the number 97. Fonts are said to use or have a specific encoding. The font's encoding determines what character gets displayed when we pass a given number to the operating system. Note that character style and shape play no part in the encoding concept: a Times-Roman "a" has the same value as a Bookman-Italic "a" (assuming the fonts use the same encoding). A code-point is a particular value in an encoding. For example, "a" has the code-point 97 in the ASCII encoding.

The MTCode encoding

Central to our font information is the MTCode encoding. MTCode assigns a 16-bit constant to every different character our software works with. It is a superset of Unicode, a standard encoding that attempts to assign a unique number to each of the characters used in the world's languages. Unicode covers a lot of math, but not all the math characters we need. For this reason, MTCode uses Unicode's Private Use Area (PUA), a range of 6400 code points (0xE000 to 0xF8FF), for its additional math characters. We use MTCode values as the key to all of the per-character information, such as human-readable character descriptions and token types (variable, operator, etc.).
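
As a quick check of the range quoted above, the following Python sketch computes the size of the Private Use Area and tests whether a given code point falls inside it. The helper name in_private_use_area is hypothetical; the two sample values are taken from the encoding examples later in this section.

```python
PUA_START, PUA_END = 0xE000, 0xF8FF


def in_private_use_area(code_point):
    """Return True if code_point lies in Unicode's Private Use Area."""
    return PUA_START <= code_point <= PUA_END


print(PUA_END - PUA_START + 1)        # size of the range: 6400
print(in_private_use_area(0x226E))    # standard Unicode character: False
print(in_private_use_area(0xF700))    # Private Use Area value: True
```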

For more information on Unicode, see the Unicode Consortium. To find out about how we use the Unicode's Private Use Area, see the MTCode Encoding Tables.

Font encodings

Every font is the expression of some character set. In fact, many fonts share the same character set. We use the term "font encoding" to represent a character set that might be shared by one or more fonts. Many applications (e.g. word processors) don't have to know a font's encoding: the user hits a key, a code is sent to the application, and the code is sent back to the operating system to select a character from a font for display. Our software needs to know more.

A font encoding can be thought of as a table with two columns: the position within the font (a numerical index) and an MTCode code point value (the number from our own master character list, MTCode, uniquely identifying the character). We give each font encoding a name (e.g. WindowsANSI, MacStd, Symbol). Some encodings are used by a single font and are named after it. Others are shared by many fonts. For example, standard ISO Latin-1 fonts on Windows all have the WindowsANSI encoding.

Our software represents every encoding other than MTCode as a mapping onto MTCode. That is, for every code-point in a given encoding it indicates a unique MTCode code-point. Using this mapping, we can get at all the per-character information for any code-point in the encoding and, therefore, for any character in fonts with that encoding. For this reason, knowing a font's encoding is very important.

Unfortunately, the computer's operating system tells us very little about the encodings of fonts, least of all for those containing math symbols. So, we have to keep our own knowledge of which encoding each font has. Of course, as people can create their own fonts, the set of font encodings is open-ended.

Extension scenarios

Font information may be extended in the following ways:

  • Define that a font has a given existing encoding.
  • Define (or override) the PostScript font name of a font.
  • Define a new encoding and specify certain font(s) have that encoding.
  • Define new MTCode values and their attributes (description, default style, token type) and use them in a new encoding.

Determining what MathFlow knows

Since the font information extension mechanism we are talking about also applies to MathType, it can be very useful to do the setup with MathType first to verify the proper syntax has been used in the configuration file. When the file is finished, you can then copy the file and place it in the same directory as the DLL.

If you have a font installed on your computer that our software does not seem to know anything about, the first thing to do is to verify that this is the case. The easiest way to do this is via MathType's Insert Symbol dialog. Follow these steps:

  1. Choose Insert Symbol from the Edit menu.
  2. Choose Font in the View by menu.
  3. Choose the font in question from the font menu just to the right of the View by menu.
  4. Look at the Encoding name displayed directly under the character grid.

If the encoding name is "Unknown", it means our database of fonts and characters has no information for that font.

Assigning an encoding

Defining the encoding for a font falls into two cases:

  • The font's character set matches that of another font for which the software already knows the encoding. In this case, all you need to do is assign the same encoding to your font. This is described in the rest of this section.
  • The font's character set is completely unique. In this case, you will have to create a new encoding first, and then assign it to your font. See Creating a New Encoding.

Once you have decided to assign an encoding to a font, the next step is to determine if its character set matches that of a font for which our software already has an encoding. This is easy if you designed your own font as a perfect substitute for another font that our software already knows about. For example, if you created your own version of the Symbol font, its encoding would be the same as the Symbol font's — "Symbol". If you aren't in this lucky situation, you'll have to work a bit harder. There are a couple of ways to do this:

  • If you have access to the Internet, you can view our font encoding tables and try to find an exact match in terms of characters and their positions in the font. It helps to display the font in question in MathType's Insert Symbol dialog, as described in the previous section.
  • Display the font in question in the Windows Character Map utility. Use MathFlow's Insert Symbol dialog to view other math fonts on your system, looking for an exact match in terms of characters and their positions in the font. If you find a match, write down the encoding name for the font.

If you found an encoding that matches your new font, you have to tell MathFlow about it by adding to the Fonts section of FontInfo.ini. See The FontInfo.ini file and Font Sections for details. If you assign an encoding to a font, you should consider letting the Design Science tech support department know so we can add it to the built-in font knowledge of the next version of our software. Just send an email to support@wiris.com and mention the font name and the encoding name (please be precise). If you can, send us a copy of the font and any other information associated with it. If it is a commercially available font, let us know who makes it.

Defining PostScript name

As stated earlier, our software produces Encapsulated PostScript (EPS) files. These must refer to fonts using the PostScript names of fonts, not their operating system names (the font names listed in MathType and other applications' dialogs and menus). So, for example, "Times New Roman" must be referred to as "Times-Roman" in an EPS file. Unfortunately, this is another piece of information that the operating system doesn't give up easily.

Our software can generally communicate with the operating system to get the PostScript name for any PostScript font. Names obtained this way are the correct ones for use in an EPS file. For TrueType fonts used both for the screen and printing, our software can often, but not always, obtain the PostScript name from the TrueType font. In order to handle exception cases, however, you can set the PostScript name for a font by adding to the Fonts section of FontInfo.ini. See The FontInfo.ini file and Font sections for details.

Creating a new encoding

Creating a new font encoding is easy, but a bit tedious. Font encodings are defined using a text file that is placed in the same directory as the MathFlow DLL. For an example of what this file should look like, see Font encoding example.

The filename should follow these rules:

  • It must be unique within the directory that contains the MathFlow DLL.
  • It should end with an .enc extension.
  • It should be indicative of the name of the encoding (or identical to it).

The first line of the encoding file defines the name of the encoding. Here is an example:

FontEncoding, 1.0, Byte, Symbol

Your encoding file's first line must start with exactly "FontEncoding, 1.0", which identifies this file as a font encoding file whose version number is 1.0. The next field is either Byte or Word. This means that the MTCode value will be either two or four hexadecimal characters. For example, if you use Byte, then an MTCode value is expressed as "AE" (without the quotes). If you use Word, then that same number must be expressed as "00AE". The last part of this line is the name of your encoding. So, for example, instead of "Symbol" you should use the name of your encoding (which should also be the name of the file, followed by .enc). The name of an encoding should consist of alphanumeric characters without any spaces or punctuation, starting with a letter (e.g. DatapageMath3).
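
The header rules above can be checked mechanically before the file is handed to MathFlow. The Python sketch below is a hypothetical pre-flight check, not the parser MathFlow uses; the function name parse_enc_header is invented for illustration, and tolerating extra whitespace around the commas is an assumption.

```python
import re

# Alphanumeric, no spaces or punctuation, starting with a letter.
ENCODING_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*\Z")


def parse_enc_header(line):
    """Validate the first line of an .enc file; return (width, name)."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != 4 or parts[:2] != ["FontEncoding", "1.0"]:
        raise ValueError("header must start with 'FontEncoding, 1.0'")
    width, name = parts[2], parts[3]
    if width not in ("Byte", "Word"):
        raise ValueError("third field must be Byte or Word")
    if not ENCODING_NAME.match(name):
        raise ValueError("invalid encoding name: %r" % name)
    return width, name


print(parse_enc_header("FontEncoding, 1.0, Byte, Symbol"))
print(parse_enc_header("FontEncoding, 1.0, Word, DatapageMath3"))
```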

Any blank lines are ignored. Any line that starts with # is a comment. For example:

# Purpose: Symbol font encoding

The rest of the file must contain lines that define the characters in the encoding. For example,

28,226E,NOT LESS-THAN

The three fields are (from left to right):

  • The position within the font as a hexadecimal value. This is shown in the MathType Insert Symbol dialog for the selected character in the Font position readout to the right of the character grid.
  • The MTCode (Unicode) value that uniquely defines a character. You can get this by using MathType Insert Symbol dialog to find the character in another font, then reading the value in the Unicode readout to the right of the character grid. Alternatively, you can find it in either MTCode Encoding Tables or the Font Encoding Tables.
  • A human-readable character description. This does not define the character's description but it must match the description in the MTCode tables. This redundant information helps avoid many errors.

Note: These lines must be in order by the position in the font (the first field).
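
The body of an encoding file maps font positions to MTCode values, so it can be read into a simple lookup table. The following Python sketch is illustrative only: the function name parse_enc_entries is hypothetical, it expects the lines that follow the header line, and rejecting repeated positions is an assumption that goes slightly beyond the ordering rule stated in the note.

```python
def parse_enc_entries(lines):
    """Parse 'position,MTCode,description' lines into a dict keyed by position."""
    table = {}
    last = -1
    for raw in lines:
        line = raw.strip()
        # Blank lines and lines starting with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        pos_hex, code_hex, description = line.split(",", 2)
        position, mtcode = int(pos_hex, 16), int(code_hex, 16)
        # Entries must be ordered by position in the font (first field).
        if position <= last:
            raise ValueError("entries must be ordered by font position")
        last = position
        table[position] = (mtcode, description)
    return table


body = [
    "# Purpose: Symbol font encoding",
    "",
    "28,226E,NOT LESS-THAN",
    "35,F700,UNKNOWN CHARACTER",
]
print(parse_enc_entries(body))
```

The two entries are the sample lines used in this section and are combined here only to exercise the parser; they are not meant as an accurate excerpt of any real encoding.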

Adding new characters

Although we have attempted to put every character into the MTCode encoding that is in common use by mathematicians and scientists, this is a goal that we can approach but never reach. In the process of creating a new encoding, if you come across characters that are not in MTCode (i.e. not in the Unicode specification or in the MTCode Encoding Tables), you have two choices:

  • Define the character in question as "undefined". This means MathFlow will not know the identity of the character. This is acceptable in some situations. To make a character "undefined", you would use a line like this: 35,F700,UNKNOWN CHARACTER
  • Contact us to have the character added to MTCode. To do this, send email to our tech support department at support@wiris.com and provide this information:
    • an example of the character as a GIF file (or fax a good-quality rendition of the character and, if possible, a page of math in which it occurs);
    • what font it occurs in (if possible, send us the font itself);
    • a suggested name for the character;
    • any additional information on how the character is used (e.g. is it a binary operator like '+', etc.).

We will assign a new MTCode value to the character and tell you how to proceed from there.

The FontInfo.ini file

This file will be in Windows initialization file format and consist of:

  • Comment lines with version, author, and copyright info.
  • Multiple [Font<num>] sections, each of which contains information for a specific font, including its encoding and PostScript names.
  • An [Encoding] section that, for a given encoding, specifies the file name of the encoding definition file.
  • An [MTCode] section that, for a given MTCode value, specifies its attributes.

Font sections

For each font for which additional information (i.e. PostScript name and/or encoding) is to be specified, the FontInfo.ini file must include a [Font<num>] section (<num> is simply an integer to make the section names unique; e.g. Font1, Font2). Each such section may contain one or more key/value pairs, of which, only the OSName key is required to identify the font whose attributes are being overridden. Any omitted keys will cause the corresponding value to be determined in the default manner described in the earlier sections of this document.

Name = <operating system font name>

Identifies the font being described by the other keys in the section. The <operating system font name> value is the name of the font as it appears in MathFlow's style dialog. This key is required.

Encoding = <encoding name>

Identifies the encoding of the font. The <encoding name> value must be the name of a built-in encoding or one defined in the [Encoding] section (see below for details).

PSName<num> = <style> , <PostScript name>

Where <num> is simply an integer, to make the key names unique (e.g. PSName1, PSName2). The <style> value must be P for plain, B for bold, I for italic, or BI for bold-italic. The <PostScript name> is the name to be used in the EPS file.

The Encoding section

Contains lines of the form:

<encoding name> = <encoding definition file>

where <encoding name> defines an encoding (or overrides a built-in encoding) using the data in the file referred to by <encoding definition file>. This file, whose extension is .enc, must exist and the encoding name in this line must match that stored in the file.
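
Because FontInfo.ini uses the Windows initialization file format, a draft can be sanity-checked with any INI reader. The Python sketch below uses the standard configparser module on a made-up fragment. Every font, encoding, and file name in it is a placeholder, the function name load_font_info is hypothetical, the use of ';' for comment lines is an assumption, and the font is identified with the Name key as written in the key syntax above.

```python
import configparser

SAMPLE = """\
; Hypothetical FontInfo.ini fragment; all names are placeholders.
[Font1]
Name = My Math Font
Encoding = MyMathEncoding
PSName1 = P , MyMathFont-Regular
PSName2 = B , MyMathFont-Bold

[Encoding]
MyMathEncoding = MyMathEncoding.enc
"""


def load_font_info(text):
    """Return (fonts, encodings) read from FontInfo.ini-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(text)
    fonts = {}
    for section in parser.sections():
        if not section.startswith("Font"):
            continue
        items = parser[section]
        ps_names = {}
        for key, value in items.items():
            if key.startswith("PSName"):
                # Value has the form "<style> , <PostScript name>".
                style, ps_name = (part.strip() for part in value.split(",", 1))
                ps_names[style] = ps_name
        fonts[items["Name"]] = {
            "encoding": items.get("Encoding"),
            "ps_names": ps_names,
        }
    encodings = dict(parser["Encoding"]) if parser.has_section("Encoding") else {}
    return fonts, encodings


print(load_font_info(SAMPLE))
```

A check like this only confirms that the file is well-formed INI and that the documented keys are present; it says nothing about whether MathFlow accepts the values.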

The MTCode section

It is unlikely that you will need to use this section. However, the information about this section is provided for completeness. This section contains lines of the form:

<MTCode value in hex> = <token type>,<default style>,<description>

where <token type> is NONE, NUM, VAR, FUNC, OPER, BINOP, RELOP, OPEN, CLOSE, FENCE, PUNCT, INNER, CTRL, or SPACE.

where <default style> is NONE, TEXT, FUNCTION, VARIABLE, LCGREEK, UCGREEK, SYMBOL, VECTOR, NUMBER, USER1, USER2, MTEXTRA, or TEXT_FE. (This is only used by MathFlow.)

where <description> is a human-readable description for a given character.
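
The permitted keywords for an [MTCode] line can be checked with a short script. The Python sketch below is a hypothetical validator, not MathFlow's reader; the function name parse_mtcode_line, the code value F7A0, and the description in the sample line are all invented for illustration.

```python
TOKEN_TYPES = {
    "NONE", "NUM", "VAR", "FUNC", "OPER", "BINOP", "RELOP",
    "OPEN", "CLOSE", "FENCE", "PUNCT", "INNER", "CTRL", "SPACE",
}
DEFAULT_STYLES = {
    "NONE", "TEXT", "FUNCTION", "VARIABLE", "LCGREEK", "UCGREEK", "SYMBOL",
    "VECTOR", "NUMBER", "USER1", "USER2", "MTEXTRA", "TEXT_FE",
}


def parse_mtcode_line(line):
    """Parse '<hex> = <token type>,<default style>,<description>'."""
    key, value = (part.strip() for part in line.split("=", 1))
    token_type, style, description = (p.strip() for p in value.split(",", 2))
    if token_type not in TOKEN_TYPES:
        raise ValueError("unknown token type: " + token_type)
    if style not in DEFAULT_STYLES:
        raise ValueError("unknown default style: " + style)
    return int(key, 16), token_type, style, description


# Hypothetical entry; F7A0 and the description are placeholders.
print(parse_mtcode_line("F7A0 = BINOP,SYMBOL,HYPOTHETICAL CIRCLED STAR OPERATOR"))
```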

Operators have some properties that are set in an Operator Dictionary, such as the size of the space to the left or right of the operator, how it changes when in display or inline style, how it stretches, etc. (See MathML specifications.) MathFlow sets these properties to sensible values, but in some instances one may want to fine-tune them.

To do that, create a file named mfopdict.txt. In the <custom-path>\scripts\mathml_user.acl file (which should be created if it does not yet exist), set the variable $mathml::op_dict_file to the location of mfopdict.txt.

$mathml::op_dict_file = "C:\\tmp\\mfopdict.txt";

There is a sample mfopdict.txt file located in the <custom-path>\lib\mathflow\auxiliary directory.

Format of the configurable operator dictionary

The properties (e.g., spacing) may differ according to the position of the operator in the formula: in the middle (infix), at the beginning (prefix), or at the end (postfix) of an expression. To reflect this, mfopdict.txt should have three tables, each one started by a line with its name: INFIX, PREFIX, or POSTFIX. It is also possible to change the default spacing for all operators in a table (only the spacing, not other properties) by writing a line with an operator name of "default". Values that are not explicitly set for an operator, and spacing values not supplied by a "default" line, fall back to MathFlow's internal values.

In brief:

  • Empty lines are ignored.
  • Lines starting with // are considered comments and are ignored.
  • Operators can be more than a single character.
  • There are three keywords: INFIX, PREFIX, and POSTFIX to indicate the beginning of each table. Each one should appear in a line by itself.
  • Each meaningful line is a comma-separated list, in the form opname, lspace, rspace [, largeop=true|false, stretchy=true|false, accent=true|false, moveablelimits=true|false, dsi:linebreakop=true|false, dsi:stretchby=segments|scaling, dsi:scaleratiopct=integer].
    • opname is a character, a MathML entity name, or a numerical reference (#xNNNN for hexadecimal, #NNNN for decimal). For example, comma could be represented by the name "comma", the character ",", or the numerical references #x02C or #44.
    • lspace and rspace are numbers followed by an optional unit, either 'u' (1/100 em) or 'em'. If the unit is omitted, 'u' is assumed.
    • The other properties should come at the end of the line, and are optional. See the table below for further description of each property.
  • Default spacings for each table can be given by using opname='default'
  • Numerical references start with #, e.g. #x3b1 (hexadecimal reference) or #2271 (decimal reference).
  • Do not include spaces at the ends of the lines.
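
The unit and numerical-reference rules in the list above amount to two small conversions. The Python sketch below shows one way to express them; the function names spacing_to_em and resolve_opname are hypothetical, and MathML entity names such as "comma" are returned unchanged because resolving them would require an entity table that is not reproduced here.

```python
def spacing_to_em(value):
    """Convert an lspace/rspace field to em; a bare number means 'u' (1/100 em)."""
    value = value.strip()
    if value.endswith("em"):
        return float(value[:-2])
    if value.endswith("u"):
        value = value[:-1]
    return float(value) / 100.0


def resolve_opname(opname):
    """Turn #xNNNN or #NNNN references into the character they denote."""
    if opname.startswith("#x"):
        return chr(int(opname[2:], 16))
    if opname.startswith("#"):
        return chr(int(opname[1:]))
    return opname  # a literal character or an entity name, left as-is


print(spacing_to_em("17"), spacing_to_em("5u"), spacing_to_em("3em"))
print(resolve_opname("#x02C") == resolve_opname("#44") == ",")
```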

Example:

////////////////////////////////////////////////
INFIX
////////////////////////////////////////////////
default, 17, 17
minus, 33, 33
plus, 33, 33
sum, 33, 33, moveablelimits=false
////////////////////////////////////////////////
PREFIX
////////////////////////////////////////////////
default, 17, 17
plus, 3em, 5
sum, 33, 33, moveablelimits=false
////////////////////////////////////////////////
POSTFIX
////////////////////////////////////////////////
plus, 5u, 3em
sum, 33, 33, moveablelimits=true

This mfopdict.txt file changes the default spacing of all INFIX and PREFIX operators to 0.17em on both sides, but uses MathFlow's internal default values for POSTFIX operators. The INFIX minus, plus, and sum operators have different spacing (0.33em on both sides), and the INFIX sum operator also has the moveablelimits property set to false. In the PREFIX table, plus gets a larger left space (3em) and a smaller right space (0.05em), and sum has moveablelimits set to false. In the POSTFIX table, plus gets a left space of 0.05em and a right space of 3em, and sum has moveablelimits set to true.
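
To confirm that a dictionary file is laid out as intended, the three tables can be read back with a short script. The Python sketch below is a simplified, hypothetical reader, not MathFlow's parser: the function name parse_opdict is invented, spacing fields are kept as raw strings, and an operator written as a literal "," character is not handled because the line is split on commas.

```python
SAMPLE = """\
////////////////////////////////////////////////
INFIX
////////////////////////////////////////////////
default, 17, 17
minus, 33, 33
plus, 33, 33
sum, 33, 33, moveablelimits=false
////////////////////////////////////////////////
PREFIX
////////////////////////////////////////////////
default, 17, 17
plus, 3em, 5
sum, 33, 33, moveablelimits=false
////////////////////////////////////////////////
POSTFIX
////////////////////////////////////////////////
plus, 5u, 3em
sum, 33, 33, moveablelimits=true
"""


def parse_opdict(text):
    """Return {table: {opname: (lspace, rspace, properties)}}."""
    tables = {"INFIX": {}, "PREFIX": {}, "POSTFIX": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Empty lines and lines starting with // are ignored.
        if not line or line.startswith("//"):
            continue
        if line in tables:
            current = tables[line]
            continue
        if current is None:
            raise ValueError("entry found before any table keyword")
        fields = [f.strip() for f in line.split(",")]
        opname, lspace, rspace = fields[:3]
        props = dict(f.split("=", 1) for f in fields[3:])
        current[opname] = (lspace, rspace, props)
    return tables


tables = parse_opdict(SAMPLE)
print(tables["PREFIX"]["plus"])
print("default" in tables["POSTFIX"])
```

For the example file, the POSTFIX table has no "default" entry, which corresponds to MathFlow falling back to its internal spacing values for postfix operators.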

The img_attr variable is set to the name of the <math> attribute that will hold the name of the temporary preview image file to display. The default value is "altimg" but this can be changed if it is necessary to use a different name. For example, the "other" <math> attribute could be used instead.

This change needs to be done manually in the following locations:

  • <custom-path>/scripts/mathml_user.acl — Modify this file or create it if it does not yet exist. Add the line $mathml::img_attr = "[name of desired attribute]"; (e.g., $mathml::img_attr = "other";)
  • <custom-path>/doctypes/axdocbook_math.dcf — replace "altimg" at line 210 with the desired attribute name.
  • <custom-path>/stylermodules/mathml.style — (Arbortext 5.2 and 5.3) replace "altimg" at line 645 with the desired attribute name.

mathml.style can also be changed using the Arbortext Styler, though the other files must still be changed manually. Go to the "Styler" menu and choose "Edit Stylesheet". Select the "m:mathml" tag. Go to the "Elements" menu and choose "Style Details." To change the attribute role, highlight the attribute in question, and select the appropriate attribute role. For example, "altimg" would be changed to the role of "Other", while the "other" attribute could be changed to the role of "File name".

The Java Rendering Engine can be used for the Arbortext preview images by adding the following line to <custom-path>/scripts/mathml_user.acl (which should be created if it does not yet exist):

$mathml::use_java_rendering = 1;