Math in HTML (and CSS)

How to present mathematical expressions using a language that has so little markup for them? Web authors often need to resort to images, but there are more flexible approaches, like MathJax. Moreover, if you need just some special symbols or simple expressions, a lot can be done in HTML, assisted with style sheets (CSS). This document mainly discusses relatively simple mathematical expressions rendered one-dimensionally (inline), though possibly with superscripts or subscripts.

The word “mathematical” is used in a rather broad sense here, covering different formalisms and symbols, including the symbols of physics, formal logic, etc. You might wish to take a look at Andreas Prilop’s nice document Mathematical formulas in HTML 4.0, which illustrates well what kinds of symbols and expressions are discussed here. And by looking at the HTML source there, you can see how such things can be done.

What? No math?

An extract of a Wikipedia article, which uses an image for two formulas. Note how the appearance of the letter phi (φ) is very different from its occurrence in text.

You may have wondered why web sites, such as Wikipedia, use images for presenting mathematical formulas, even in simple cases like ∇²φ = 0. Partly the reason is simplicity and uniformity: since some formulas need to be presented as images, it may sound simpler to use images for all formulas, even in cases where simple HTML markup would do fine:
∇²<i>φ</i> = 0
or
&#x2207;&sup2;<i>&phi;</i> = 0

Images provide a methodologically uniform approach, but the result is not typographically uniform at all: characters in an image are typically rather different from the same characters in text.

A formula for which no natural HTML markup exists

Although HTML markup exists in the sample case, mathematical formulas are usually much more complicated. HTML lacks markup for mathematical expressions as structures, and there is no simple way to produce anything essentially two-dimensional beyond superscripting or subscripting.

Given the fact that HTML was originally developed at CERN, the European Laboratory for Particle Physics, it may sound odd that HTML has rather little markup for mathematical expressions and other special notations used in science. It’s not a surprise that you can’t do math in HTML; after all, it’s a markup language, not a programming language. But how are we expected to write physics reports or even modern biology papers without mathematical notations?

There was once an HTML 3.0 draft, with a section titled HTML Math, suggesting relatively simple markup for some basic mathematics. But it’s all history; the draft expired in 1995. (There was also an earlier idea about HTML+, which would have had a different, more natural-looking math syntax.)

The modern trend is to regard things like mathematical markup as special “applications”, for which specialized languages are used; see W3C’s Math Home Page for information about such trends, including the MathML language. If you ask me, the language hopelessly mixes structure and appearance. In any case, support for MathML in web browsers still isn’t wide enough to justify its use on web pages in general.

However, this does not mean that it would be impossible to write mathematical documents in HTML. There are difficulties and difficult decisions to be made. For relatively simple mathematical notations, HTML can be used rather well, especially if you handle the toughest parts using images with adequate alt texts. See, for example, Stan Brown’s articles on math. Note that there’s the possibility of making an article available both in HTML format and in some other format, and this might be relatively easy if suitable tools can be found.

Alternative approaches – formats other than HTML

For documents where mathematical notations are needed a lot, other formats than HTML may turn out to be more practical. This might mean for example PDF, PostScript, or TeX format, or perhaps all of them, offered as alternatives.

The benefits of HTML in Web authoring are considerable, however, so it can be a tough decision. This document tries to help in making an informed decision as well as in implementing the HTML way, if that way is taken.

Even if you decide to publish your document in, say, PDF format only, you might still consider making its abstract available in HTML format too. Typically an abstract can be written with just a modest amount of mathematical notation.

There are also JavaScript-based approaches that use libraries for converting TeX notations or TeX-like notations into graphic format. They typically fall back to displaying such a notation, like \[P(E) = {n \choose k} p^k (1-p)^{ n-k} \]. The best-known implementation of the approach is the MathJax library, but there is also a considerably smaller and faster library, jqMath, which can produce rather amazing results.

For example, using MathJax, the code $$\sum_{k=1}^Nk(N-k+1)$$ causes the following formula to be displayed:

$$\sum_{k=1}^Nk(N-k+1)$$

The approach is simple, from an author’s point of view: you just insert a fixed script element in your document, and you put an inline formula between \( and \), a display formula between $$ pairs. In the formula, you use for example TeX (specifically, AMSTeX) notations, which are rather simple to learn for basic mathematical constructs.
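As a rough sketch of that setup, a complete page might look like the following. The script URL is an assumption (it is the commonly published jsDelivr address for MathJax 3; check the MathJax documentation for the one that suits your version), and the formulas are the ones used above.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>MathJax sketch</title>
<!-- Assumed loader URL (MathJax 3 via the jsDelivr CDN); replace it with the one your setup uses. -->
<script async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
</head>
<body>
<!-- Inline formula between \( and \) -->
<p>The probability is \(P(E) = {n \choose k} p^k (1-p)^{n-k}\).</p>
<!-- Display formula between $$ pairs -->
<p>$$\sum_{k=1}^N k(N-k+1)$$</p>
</body>
</html>
```

If the script does not load or scripting is disabled, the reader sees the TeX notation as such, which is the fallback behavior mentioned earlier.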

Avoiding the problem: simple linearized notation

Sometimes it is best to present mathematical expressions in linearized notation. For example, instead of trying to find a way of presenting the square root of 2 in the normal mathematical way, you might write just sqrt(2). For quotients, you’d use notations like (a + b)/(ab). Instead of exponents you might use notations like a**b or a^b (a to the power b). It depends on the intended audience whether you need to explain such notations, but normally you should explain any conventions that are not part of normal school mathematics.

Note that even in plain text, you can use the multiplication symbol × and do not need to resort to the asterisk (*) of computer jargon. See also other notes on using “normal” characters.

To indicate nesting of mathematical expressions, linearized notation uses parentheses. It has been common to use different types of parentheses: innermost, you use normal parentheses (); next level, you use square brackets []; and outermost you use curly braces {}, e.g. [(x + y)^(1/3)]/z. However, according to the standard on mathematical notations, ISO 80000-2, normal parentheses should be used, because other parentheses have special meanings. Example:
((x + y)^(1/3))/z.

A document containing such an expression should probably contain, near the beginning, a statement like “In this document, the circumflex character ^ is used to indicate exponentiation, i.e. raising to a power.”

Such an approach makes the document accessible to virtually all who have the necessary mathematical prerequisites, if you limit the character repertoire to ISO Latin 1. But naturally it’s a rather simplistic method. You might still consider using it as an alternative format, made available along with some more advanced but less accessible format(s), such as an image.

Formulas as images

Generating images from formulas

You can use some software, e.g. something TeX based (see also AMS TeX resources) or a mathematics program with a graphic output format option, e.g. Mathematica, to produce visual representations of formulas as images. Often such software gives the result in PostScript format, but for Web use GIF format is usually better, since it is much more widely supported in browsers by default. Various tools exist for converting from PostScript to GIF format. You could also consider using the Latex2HTML converter which can, among many other things, generate images in GIF format. (There’s another converter, TtH, which uses all kinds of tricks to convert mathematical expressions to HTML without using images. It applies very questionable methods like the Symbol font kludge. You might still get some useful ideas from its output and apply them more reasonably.)

Examples of using images

Such an approach is widely used on the Web. See, for example, the wonderful MathWorld pages. Note, by the way, that when you use mathematical terms, say binomial coefficient, on your page but cannot define all of them, it might be a very good idea to link to definitions and descriptions like those on MathWorld. Even if your terms are already known to 95% of readers, you can give the 5% a chance to find out, and the others might need to brush up their knowledge too, or just check that the term really means what they think it does. See also Eric Weisstein’s Treasure Troves of Science collection, which is impressive, too, not only in content but also in the Web presentation of mathematical formulas. As regards mathematics proper, e.g. S.O.S. MATHematics also has nice pages on several topics.

The technique: the img element

You would use the img element to embed an image into your document, and/or an a href element to create a link to it. The latter method is often worth considering, especially for large formulas. The reader may prefer reading the text without distractions and looking at the formula (image) at the very moment he is prepared to do so. Moreover, he may prefer looking at it in a separate window (which is separately adjustable in size and positionable on the screen), or perhaps print it out (due to better resolution in print than on screen).

As a very simple example, I have used the simple markup
\int_{0}^{\infty}e^{x^2}dx
in TeX and generated a GIF image file from it, then just embedded the image into an HTML document using the following markup:

<p><strong>Assignment 42</strong>. Compute
<img src="integral.gif" align="middle" alt=
"the integral of exp(x**2) for x from 0 to infinity"></p>

This looks like the following on your browser:

Assignment 42. Compute the integral of exp(x**2) for x from 0 to infinity

And here’s the same with an intentionally broken reference to an image:

Assignment 42. Compute the integral of exp(x**2) for x from 0 to infinity

Generally, the more complicated formulas you need to present, especially if nonlinear in appearance, the more seriously you need to consider using images for the purpose, despite their drawbacks.

In principle, the object element could be used to embed data in different formats, including images, animations, and interactive presentations. However, support for object is limited and buggy. Support for its “brother”, the iframe element, is somewhat less buggy. In special cases, you might consider using iframe for large formulas mainly because iframe lets you specify an area where the data is to be presented so that a browser is expected to introduce scroll bars if needed. You would then need to include the normal img element as the content of iframe, to provide a fallback for browsers that don’t support iframe.

The plain text alternative: alt attribute

If you decide to use an image, you still have the original problem, in a modified form. Due to accessibility considerations, every img element must have an alt attribute that specifies the textual alternative. This means, in practice, that you need to write an alternative presentation of the expression in pure text form, with no HTML markup. On the other hand, you will do this for “fallback” use only, and you can e.g. write rather verbose explanations if needed, since the text will normally not be seen by people who see the image.

The purpose of the alt attribute is to make the text comprehensible in a browsing situation where the image is not displayed. For some notes on the importance and practical use of alt texts, see my Guidelines on alt texts in img elements. In the simplest case, you would use a simple linearized notation as alt text.

A particular problem arises when such alt texts contain notational features that are not self-evident and are not otherwise used in the document. For example, suppose that expressions contain exponentiation, which is indicated using the usual two-dimensional formatting (an exponent appears raised, perhaps in smaller font) in image presentations, but a circumflex character ^ is used in alt texts. How can we inform people who need the alt texts, without disturbing people who see the images only? There’s a simple trick: use a “dummy” image, such as a single-pixel transparent GIF, and put the explanations into its alt attribute, e.g.
<img src="transp.gif" alt=
"In this document, the circumflex character ^ is used
to indicate exponentiation, i.e. raising to a power.">

And you would put this element somewhere near the start of the document, before the first occurrence of an alt text that uses the convention explained.

If the image has been generated using TeX, you might also provide a link to the TeX version, since some people who cannot see the image for some reason might be able to understand TeX markup, or utilize it indirectly. Perhaps the image itself could be made into a link. In fact the MathWorld and Treasure Troves of Science pages often include the TeX notation directly as the alt text; this is useful but not as useful as giving plain text presentations there and providing links to TeX versions. In the document HTML Techniques for Web Content Accessibility Guidelines 1.0, section Markup and style sheets rather than images: The example of math recommends that if an HTML document has been generated from a TeX document, the original TeX (or LaTeX) document too be made available on the Web, since there are possibilities for auditory rendition of such versions.

That document also recommends that if a formula is constructed from several images, a single alt attribute should specify a textual alternative to the entire formula. Apparently, you would put it into one of the images and use alt="" for the other images.
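A minimal sketch of that arrangement follows. The three file names are hypothetical, standing for the pieces of one formula; only the first image carries the textual alternative for the whole formula.

```html
<!-- Hypothetical image files: one formula split into three pieces. -->
<p><strong>Assignment 42</strong>. Compute
<img src="part1.gif" alt="the integral of exp(x**2) for x from 0 to infinity"><img
 src="part2.gif" alt=""><img
 src="part3.gif" alt=""></p>
```

The line breaks are placed inside the tags so that no white space appears between the pieces of the formula.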

Images in running text

Images are widely used for presenting formulas on the Web, and they are quite often the best practical way for large formulas to be presented as “blocks”. For symbols and short formulas in running text, “inline”, a similar approach can be less pleasant, since the image does not adapt to text font size and face. Here’s a simple example:

The Greek letter capital sigma is often used to denote summation, but from the character standards viewpoint, the n-ary summation symbol is a distinct character, though historically derived from the Greek letter.

If you change the basic font size from the browser’s menu, you’ll see how the image remains the same size, instead of being adjusted according to the font size. And attempts to use style sheets to scale images are not always very successful, though browsers tend to scale downwards relatively well these days; you could design the image in large size and then use <img ... style="height: 1em"> to make the browser scale it to the font size of the text. (The height attribute itself accepts only pixel values, so the em unit must be given in CSS.) Moreover, line spacing easily gets disturbed. (See notes on such issues in Guide to using special characters in HTML.)
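To illustrate the scaling idea, here is a small sketch. The class name sym and the file name sum.gif are made up for the example, and the result still depends on how well the browser scales the image.

```html
<style>
/* Hypothetical class: scale the image to the font size of the surrounding text. */
img.sym { height: 1em; width: auto; }
</style>
<p>The n-ary summation symbol
<img class="sym" src="sum.gif" alt="&#8721;">
is distinct from the Greek letter capital sigma.</p>
```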

You could also write e.g. <img src="alpha.gif" alt="&#945;">, using a numeric character reference, and then on some browsers the document would look better with images off than with images on. But why not use just &#945; then?

Using “normal” characters; esp. notations for physical quantities

Within the “normal” set of characters, namely the ISO Latin 1 character repertoire, which you can use pretty safely in HTML authoring, some characters need special attention as regards use in mathematical notations. (As regards typing them into your HTML documents, see section Typing characters in my character code tutorial.)

On Macintosh platforms, several browsers have had serious problems with some ISO Latin 1 characters, and these characters include the superscripts 1, 2, and 3 and the vulgar fractions ½ ¼ ¾ as well as the multiplication sign ×. It seems that these problems have finally lost significance: any reasonably modern browser on Mac renders these characters correctly.

Among the basic arithmetic operators, the plus sign + poses no problems. As a minus sign, the hyphen-minus character (-) is commonly used, but using the minus sign character would be more logical. It is usually visually better (longer). Nowadays, it seldom causes character problems. It avoids some line breaking problems that browsers have with hyphen-minus: we usually do not want e.g. a unary minus sign to be separated from its operand by a line break. (Browsers often break after hyphen-minus, but not minus sign.) For multiplication, there is seldom any reason to use the asterisk *, e.g. a*b (it’s a programming language idiosyncrasy), since you can use the multiplication sign, e.g. a × b or perhaps the middle dot, e.g. a · b. When referring to vector operations, you could use × for a cross product and · for a dot product. However, in principle, by the ISO 80000-2 standard, the correct multiplication dot is not the middle dot (·) but the dot operator (⋅, U+22C5). For fractions and division, the solidus (slash) character / is commonly used, but you might also consider using the division sign ÷ in some cases.
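The operators discussed above could be written in HTML source as follows. This is only a sketch using character references; the characters can also be entered as such if the character encoding permits.

```html
<p><i>a</i> &#8722; <i>b</i></p>   <!-- minus sign, U+2212 -->
<p><i>a</i> &times; <i>b</i></p>   <!-- multiplication sign, U+00D7 -->
<p><i>a</i> &middot; <i>b</i></p>  <!-- middle dot, U+00B7 -->
<p><i>a</i> &#x22C5; <i>b</i></p>  <!-- dot operator, U+22C5 -->
<p><i>a</i> &divide; <i>b</i></p>  <!-- division sign, U+00F7 -->
```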

In addition to the above-mentioned use as a multiplication symbol, the asterisk * has several uses as an operator symbol of some kind. Generally such uses are surrogate notations for various star-like symbols with more specific semantics. Since the asterisk is displayed in relatively small size and in a superscript position in most fonts, it does not look good when it should be a binary operator. So if you use it that way, for lack of better characters, you might consider using markup that suggests that * be displayed in a monospace font; the simplest method is to use tt markup, e.g. <tt>*</tt>. On the other hand, for use e.g. as a star denoting an adjoint matrix, superscript style is better, and normally achieved (if achievable in a browsing situation) by not using any font markup for it. When using “special characters”, there is, in principle, a rich repertoire of various asterisk- and star-like characters available in Unicode. There is a large number of characters with “asterisk” in their name, including “asterisk operator”. Note that there are several Unicode characters with “star” in their name, including some that are classified as mathematical (general category: Sm, i.e. Symbol, Math), such as “star operator”.

Other operators include the less than and greater than signs (< and >). Since these characters are essential as tag delimiters in HTML, they should be “escaped” using the notations &lt; and &gt; when they are to be included into the document’s textual content, as in a&lt;b. Similarly the ampersand (&), which might be used e.g. in logic as an and operator (connective), should be written as &amp;.
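For instance, a short passage using these escapes might be written as in the following sketch (the variable names are arbitrary):

```html
<!-- &lt; and &gt; stand for < and >, &amp; for the ampersand -->
<p>If <i>a</i> &lt; <i>b</i> and <i>b</i> &lt; <i>c</i>, then <i>c</i> &gt; <i>a</i>.</p>
<p>The conjunction is written <i>p</i> &amp; <i>q</i>.</p>
```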

There is nothing special about the equals sign (=); it can be typed as such. But the inequality symbol and the not equal to, less than or equal to, and greater than or equal to symbols are problematic; they do not belong to ISO Latin 1. If you have decided to use “special characters” (i.e. characters outside ISO Latin 1) despite their problems, then you would use the numeric references &#8800; and &#8804; and &#8805; (appearance on your browser: ≠ ≤ ≥). Otherwise you would have to use some surrogates like =/, <= and >=. Some authors use =< and => instead of <= and >=, but there’s anyway the risk that one of the constructs be mistaken as an implication arrow or another arrow symbol. The not equals sign is the most difficult, since it has no widely recognized surrogate. Note that /= is a “divide and assign” operator in some programming languages! In some contexts, you might use the character pair != to denote inequality, especially if you expect it to be familiar to readers from their experience with programming languages like C, Perl, and JavaScript. For a general audience, however, the != symbol could be a mystery; there is hardly anything natural in using the exclamation mark to denote negation!

When expressing physical quantities, note the following rules of the SI system of units (see Guide for the Use of the International System of Units (SI) for more information):

Web browsers generally treat any space character as a possible line break point. This is often undesirable in notations like the one discussed here, e.g. between a numeric value and a unit denotation. Thus, some method of avoiding that should be used, e.g. using a no-break space between them, instead of a normal space. See section Line breaks as problems.

As parentheses you can use normal parentheses (), square brackets [], and braces {}. You might wish to use font markup (such as <big>{</big>) to make outer parentheses look bigger, to make the structure visually clearer.

Also note that ISO Latin 1 includes some other more or less mathematical characters beyond those in Ascii: the micro sign µ and the not sign ¬. But beware of the following: assuming that the sharp s ß would be the letter beta; confusing the ordinal masculine indicator º with the degree sign ° or taking either of them as superscript 0; thinking that left guillemet « and right guillemet » would mean ‘much less than’ and ‘much greater than’; or assuming that the letter O with stroke Ø would be an empty set symbol! Don’t be tempted by casual appearances of characters; there are reasons to be strict about the meanings of characters.

Using special characters

Full Unicode in principle

In mathematical notations we very often need special symbols, such as Greek letters or a symbol for infinity. In principle, you can use the full Unicode character repertoire in HTML. Unicode Technical Report #25, Unicode Support for Mathematics, says: “Starting with version 3.2, Unicode includes virtually all of the standard characters used in mathematics.”

Limitations in browsers and fonts

Previously, browser support had serious flaws when it came to special characters. The most fundamental problems have almost completely disappeared. But very important practical limitations in fonts remain.

The problems and solutions are discussed in Guide to using special characters in HTML. The document is newer than the presentation in this document and covers some topics not discussed here.

Support for mathematical characters in fonts varies greatly. Some typical examples:

Character  Unicode   Name                          Status in fonts
±          U+00B1    plus-minus sign               Practically universal support
≤          U+2264    less-than or equal to         Very good support
∰          U+2230    volume integral               Limited, but works for most users
𝑎          U+1D44E   mathematical italic small a   Rather limited support; use <i>a</i> instead

In practice, all or most of the mathematical symbols you need are covered by the Lucida Sans Unicode font, which is normally present in Windows systems. It is not of particularly good design, so you might suggest Arial Unicode MS (shipped with Microsoft Office) as a preferred alternative. Moreover, you should include a list of fonts that are commonly available on other platforms (Linux etc.).
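A font suggestion of this kind could be sketched as follows. The class name math is hypothetical, and DejaVu Sans is only an assumed example of a font commonly found on Linux systems; adjust the list to the characters you actually use.

```html
<style>
/* Hypothetical class: preferred font first, then fallbacks for other platforms. */
.math { font-family: "Arial Unicode MS", "Lucida Sans Unicode", "DejaVu Sans", sans-serif; }
</style>
<p><span class="math"><i>a</i> &#8804; <i>b</i> &#8800; <i>c</i></span></p>
```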

For pages that make essential use of mathematical characters that are not covered by fonts normally in use on Windows platforms, it might be a good idea to include a short note about fonts. Perhaps with a few selected characters and with an explanation of what they should look like and a remark: “There are free fonts that contain these characters, such as DejaVu Serif, Quivira, and Symbola.”

Methods of entering special characters

You can enter special characters as such (if the character encoding permits that), or using “escape notations”. For example, the Greek letter alpha can be written in HTML source as such (α), as the entity reference &alpha;, as the hexadecimal character reference &#x3b1;, or as the decimal character reference &#945;. These methods are equivalent in principle, and they all work on any reasonably modern browser.
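Side by side, the four notations look like this in HTML source. The sketch assumes that the document is saved and served as UTF-8, so that the first alternative, the character as such, is possible.

```html
<meta charset="utf-8">
<p>
α         <!-- the character as such -->
&alpha;   <!-- entity reference -->
&#x3b1;   <!-- hexadecimal character reference -->
&#945;    <!-- decimal character reference -->
</p>
```

All four should render as the same letter alpha.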

HTML 4.0 contains a relatively large set of named entities like &alpha; for various symbols, including Greek letters as used in mathematics. They are nicely presented in WDG’s document HTML 4.0 Entities. (There’s also a more compactly presented list by me: Character entity references in HTML 4.0.) However, they are just “named constants”, with definitions that use numeric character references. Some early browsers (e.g. Netscape 4) supported numeric character references but not the named entities. Today, this is no longer relevant, but there’s not much practical reason to use the entities. Although the entity names are intended to be mnemonic, some of them are rather cryptic; using &sum; for the n-ary summation symbol would probably work well (in the above-mentioned sense), but how many readers would intuitively understand what &ni; means? Note: If the entity is followed by a space or a line break, the semicolon can be omitted, thereby making the notation look slightly more natural. Technically, the semicolon can be omitted if the character immediately following it is not a letter (A–Z, a–z) or a digit or one of the following: -._:; (hyphen, period, underline, colon, semicolon).

HTML5 drafts contain a long list of added entities. They are mostly rather cryptic and practically useless, though browser support was introduced to Firefox near the end of year 2011.

Note that by HTML specifications, you are not limited to using those numeric character references for which there is a named entity in HTML 4.0. You can in principle use any &#number; notation where number is the Unicode code position to which a character has been assigned.

As a practical point, note that Microsoft has defined the WGL4 (or “Pan-European”) list of characters, containing some mathematical symbols too. If you use them only, you have relatively good odds of finding a large number of users equipped with fonts that can display the characters. See Using Special Characters from Windows Glyph List 4 (WGL4) in HTML by Alan Wood.

See also the document How to find an &#number; notation for a character, and note that you will probably be interested especially in the following Unicode blocks:

If you only need Greek letters in addition to Ascii characters, then there is the possibility of using an encoding which lets you enter Greek letters directly. You could use ISO 8859-7, which is relatively widely supported. See section Simpler ways for simpler needs: simple 8-bit encodings in the above-mentioned document. But note this would give you just Greek letters, not e.g. not equals sign, nabla operator, aleph, etc. ad ∞. Moreover, you would risk losing most of the upper half of ISO Latin 1, due to browser bugs.

Pitfalls: relying on appearance

There are pitfalls in the area of characters. Never trust a smiling cat, or what a character looks like. For example, in modern character set standards, the mathematical symbol for n-ary summation (&#8721;, ∑) is distinct from Greek letter capital sigma (&#931;, Σ), although it is historically derived from it. Ditto for n-ary product symbol (&#8719;, ∏) and capital pi (&#928;, Π). I used to think that someone might still consider using sigma as a surrogate, assuming it might be better supported, but Andreas Prilop kindly pointed out my mistake, concluding with the following: “you need only a browser that knows ‘Unicode big numbers expressions’ and Windows 3.1 [or newer] or any MacOS version, both without any Greek support, to display &#8719;”.

Problems of poor rendering

Another pitfall is that character rendering may vary greatly across browsers. For example, the simple rightwards arrow character, which can be denoted e.g. as &rarr; in HTML, has very varying shapes, as you can see by using the test page for rightwards arrow. The variation ranges from an almost dash-like symbol (with just a tiny arrowhead) to a grotesque version where the arrowhead dominates (e.g., “→” in Calibri). Moreover, some common fonts like Verdana lack it, so in text with Verdana as the primary font, the arrows will appear in whatever font the browser chooses to use as a replacement font. Therefore, it may be a good idea to suggest a specific, widely available font where the appearance is acceptable. You can write e.g.
<span class="arrow">&rarr;</span>
in HTML and
.arrow { font-family: "Times New Roman"; }
in CSS. In Times New Roman, the arrow (→) corresponds to the arrow symbol traditionally used in mathematics, and it can be used in conjunction with many different fonts.

If the vertical position of the arrow is unsatisfactory, as it may be when used between capital letters of the Arial font (A → B), you can tune the rendering (A → B), by adding e.g. the CSS rule
.arrow { position: relative; bottom: 0.07em; }.

The division slash character (U+2215), being defined as a mathematical operator, is in principle more adequate than the common slash character. It belongs to quite a few fonts. But the problem is that the appearance is generally unsuitable. Its glyph usually touches the adjacent characters or even strikes through them. Here is an expression using first the common slash, then the division slash:

a/b
a∕b

Double-struck letters vs. bold letters

Double-struck letters such as ℕ, ℤ, ℚ, ℝ, ℂ are commonly used as symbols of standard sets of numbers in mathematics. Such letters are contained in Unicode, in the Letterlike Symbols block, but their browser support is limited.

It has long been acceptable to use normal letters in bold face instead. This appears to be even the original notation, and it is preferred in the international standard on mathematical notations, ISO 80000-2 (approved in 2009).

Thus, it is better to use just bold face letters such as N. You can achieve this simply by using the b element in HTML, e.g. <b>N</b>.

Similarly, even though Unicode contains characters like “mathematical bold italic small a” (U+1D482), it is much safer to use normal letters and simple markup, such as <b><i>a</i></b>, to produce a symbol like a.

Surrogate notations

This section is not particularly important any more, since most of the special characters discussed here can be used on web pages rather reliably.

For characters that cannot be presented reliably enough but would be needed, different surrogates can be considered. The obvious method is to use words or abbreviations, e.g. “infinity” or “inf” instead of the infinity symbol (∞). In mathematical formulas, words should not be used, so an abbreviation is preferable.

Sometimes one could consider using a character or a sequence of characters in a way which tries to approximate the appearance of a real character which itself cannot be used reliably. There are strong reasons to avoid playing with characters too much, and one should not do things which seriously conflict with the meanings of characters. I’d hesitate to use, say, “oo” as a surrogate for the infinity symbol. But there are some notations that one could use, at least if explained in a legend.

In particular, the logical connectives, corresponding to “and” and “or”, could rather safely be presented using /\ and \/, perhaps in reduced font size (/\ and \/). In fact, the reverse solidus \ has no fixed semantics that would be violated by such usage, and in fact it was originally introduced for use in these notations! Note that the Unicode characters for the connectives (∧ and ∨) have limited support in fonts.

A common surrogate for arrow symbols is the use of character pairs like <- or <=, though longer strings like ---> might make the meaning more obvious. See also notes on problems with some common comparison operators.

Note that some of the surrogate notations discussed here may suffer from line break problems on IE, unless precautions are taken by using nobr markup.

Since notations like /\ or ---> try to imitate the shape of the real characters they stand for, it can be useful to suggest some particular font, or a list of fonts in order of precedence, using the font face markup in HTML or the font-family property in CSS. I have written a test page that shows text in different fonts which are commonly available in Windows systems. For example, --> looks odd when Arial font is used, but --> in Tahoma looks pretty much like an arrow.

In practice, you could use a style sheet like the following:

.logop { font-family: "Times New Roman", serif }
.arrow { font-family: Tahoma, Symbol, monospace }
.darrow { font-family: "Times New Roman", serif }
and HTML markup like the following:
<span class="logop"><nobr>/\</nobr></span>
<span class="arrow"><nobr>--></nobr></span>
<span class="darrow">==></span>

This is how they look in your current browser: /\ --> ==>

Line breaks as problems

Line breaks are often undesirable inside expressions, but Web browsers generally treat every space as a potential line break position. There are several ways to deal with this:

It is debatable whether it is more logical to use the CSS way than no-break spaces. In a sense, it is a structural property of an expression like “42 m” that its two parts belong closely together, so that in any normal presentation, there should be no line break between them, or any pause in speech presentation. On the practical side, no-break spaces surely work more reliably, partly because browser support for the white-space property is still limited, partly due to general CSS caveats.

Note that the nobr markup is the only sure cure against IE’s tendency to treat every hyphen as a potential line break opportunity, even in a string like a-b or -a! But if you have decided to use “special characters”, then you might use the real minus sign, &#8722;, instead of a hyphen, and that would avoid the hyphen problem. See Dashes and hyphens.

There are even more oddities in IE: it may also treat any of -()[]{}«»%°·\ as indicating a potential line break position (before or after, depending on the character). Thus, it may well split “f(x)” to “f” on one line and “(x)” on the next! The nobr element is the most effective cure, as explained in Word division in IE, but we will discuss a special case later.
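The techniques mentioned in this section can be compared side by side in a small test page. The following is only a sketch: the class name nowrap is an arbitrary name chosen for this illustration, and nobr is nonstandard markup, although browsers widely recognize it.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Preventing line breaks inside expressions</title>
<style>
/* "nowrap" is a hypothetical class name used only in this sketch */
.nowrap { white-space: nowrap; }
</style>
</head>
<body>
<!-- 1. No-break space between a number and its unit -->
<p>The distance is 42&nbsp;m.</p>
<!-- 2. CSS: suggest that the whole expression stays on one line -->
<p>The sum is <span class="nowrap">a + b + c</span>.</p>
<!-- 3. nobr markup (nonstandard) around a string containing a hyphen -->
<p>The range is <nobr>a-b</nobr>.</p>
<!-- 4. A real minus sign (U+2212) instead of a hyphen -->
<p>The value is &#8722;a.</p>
</body>
</html>
```

Narrowing the browser window while viewing such a page shows quickly which of the methods a particular browser honors.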

Grouping digits and using “thinner spaces”

The problem with separators

It is customary and recommendable to group digits in long numbers into groups of three digits. But the method of separating the groups depends on cultural conventions and even personal style. This typically means using spaces, commas, periods, or apostrophes as separators.

It is safest to use spaces, since the other alternatives could be misinterpreted. For example, in English 1,005 would mean one thousand and five and 1.005 would mean one and five thousandths; in French, and in several other languages, it’s just the reverse! We need to make some decision concerning the decimal separator, but for integers we can avoid the problem by using spaces: 1 005 is unambiguous. It is true that e.g. in English texts it does not conform to normal English practice, but here we are discussing mathematical texts, where the practice is recommended by standards.

This raises two problems: First, the line breaking problem that we just discussed; but we saw that there are reasonable solutions to it. Second, you might regard spacing between digit groups as typographically excessive, if normal spacing (as between words) is used.

Affecting the width of spacing

Although there are several space characters of specific width in Unicode (in the range U+2000 to U+200B in the General Punctuation block), using them is not a good idea, as a rule. In practice, they don’t work well, partly due to limitations in browser support for such “special characters” (to be discussed in the next section). And in principle, they are “compatibility characters” only.

But you can use normal space characters and specify some simple CSS rules that suggest reduced spacing between “words”, using the word-spacing property with a negative value. That property specifies the spacing to be used in addition to default inter-word spacing, so a negative value suggests a reduction of the spacing. A “word” is here any sequence of non-whitespace characters. The boring part of the matter is that you typically need to include extra span markup just to have some element with which the rule can be associated. You can hopefully find some nice program tool for generating the markup needed, so that you don’t need to type it all by hand.

In my opinion, word-spacing: -0.07em creates a fairly nice result for spacing between digit groups. It means a suggestion to reduce the normal spacing by 7% of the font size, so it naturally adapts to whatever font size happens to be in use. This seems to make the digit groups separated visibly but not disturbingly. The following demonstrates first a long number without the effect of such a suggestion, then with that effect, naturally assuming that your browser supports this part of the CSS specification:

For the latter number, I used the following markup:
<span class="number">123 456 789 000 000 000</span>
and the following style sheet:
.number { word-spacing: -0.07em; white-space: nowrap; }
The white-space rule is unrelated to spacing but a good idea for other reasons, as explained above.
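To see the effect for yourself, the markup and the rule above can be put into a minimal test page. This is just a sketch that reuses the class name number from the text; the surrounding page structure is an assumption of mine.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Digit grouping with reduced spacing</title>
<style>
/* Reduce inter-group spacing by 0.07em and keep the number on one line */
.number { word-spacing: -0.07em; white-space: nowrap; }
</style>
</head>
<body>
<p>Normal spacing: 123 456 789 000 000 000</p>
<p>Reduced spacing: <span class="number">123 456 789 000 000 000</span></p>
</body>
</html>
```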

For related notes, discussing similar problems in text processing, see How to cope with international standards for the thousands separator by William S. Statler.

Spacing in expressions

The “excessive spacing” problem also arises in other contexts in mathematical expressions. It is often regarded as good style to use some spacing e.g. around mathematical operators, but not as big spacing as we get in typical browsing situations if we just use normal spaces.

In high-quality typesetting, e.g. when using TeX, spacing is controlled carefully, using advanced tools and techniques. We cannot expect to achieve the same using HTML and CSS, but we can aim at reasonable quality.

An expression like “a + b” is usually best written in HTML so that there are spaces around the operator “+”. This gives more flexibility, since we can then use word-spacing in a simple way. If the spaces were omitted, letter-spacing (which actually affects the spacing between all characters, not just letters) could be used in simple cases like this, but things would get much more complicated when variables consist of several characters (e.g., contain subscripts).

The following example shows the rendering of an expression with several operators, under different style sheets:

unstyled with spaces       a + (b × c)
word-spacing: -0.07em      a + (b × c)
word-spacing: -0.2em       a + (b × c)
unstyled without spaces    a+(b×c)

Thus, a word-spacing value like -0.07em creates an appearance that more or less resembles typical typesetting of mathematical expressions. A value of -0.2em tends to reduce spacing so that it almost corresponds to the rendering we would get if no space characters were used. The effects naturally depend on the font in use, but these observations apply to typical fonts like Times New Roman and Arial.
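The comparison can be reproduced with a page like the following. It is a sketch under my own assumptions: the class names tight and tighter are hypothetical, and a table is used only to line up the four variants.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Spacing around operators</title>
<style>
/* Hypothetical class names for the two word-spacing values discussed */
.tight   { word-spacing: -0.07em; white-space: nowrap; }
.tighter { word-spacing: -0.2em;  white-space: nowrap; }
th { text-align: left; font-weight: normal; }
</style>
</head>
<body>
<table>
<tr><th>unstyled with spaces</th>
    <td><i>a</i> + (<i>b</i> &#215; <i>c</i>)</td></tr>
<tr><th>word-spacing: -0.07em</th>
    <td class="tight"><i>a</i> + (<i>b</i> &#215; <i>c</i>)</td></tr>
<tr><th>word-spacing: -0.2em</th>
    <td class="tighter"><i>a</i> + (<i>b</i> &#215; <i>c</i>)</td></tr>
<tr><th>unstyled without spaces</th>
    <td><i>a</i>+(<i>b</i>&#215;<i>c</i>)</td></tr>
</table>
</body>
</html>
```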

Using fonts

Say No to font kludges

In the old days, one of the most common ways of trying to include Greek letters and mathematical symbols into Web pages was to use font face="Symbol", such as writing <font face="Symbol">c</font> to get the Greek letter chi (χ). You may still find web pages that propagate such usage as if it were clever and useful. I will not explain here why that approach is fundamentally wrong; I refer to the excellent presentations Using FONT FACE to extend repertoire? by Alan J. Flavell and <FONT FACE> considered harmful at the Alis Babel site. Briefly, the trick appears to work in many situations, but that’s because of browser bugs. The invariable meaning of <font face="Symbol">c</font> is the letter c, so any correctly behaving browser will display it using some physical presentation (glyph) for that character.

Useful font markup

There are several reasonable uses for font-level markup that could be applied in mathematical notations. In particular:

Italic may cause spacing problems

Browsers often tend to put characters too close to each other when a character in italics is immediately followed by a non-italics character, as in f(0). This is more or less an inherent problem with fonts rather than a browser issue. In italic, letters are usually slanted, and this often makes a tall letter hit the next character, if it is upright and tall. For example, |a| may look reasonable, but |T| probably looks bad without stylistic tuning.

You might consider using a no-break space character between them, e.g. f (0), using markup like <i>f</i>&nbsp;(0). As a side effect, this trick seems to prevent an undesired line break before “(” on IE. But this is really an ugly trick, and it might result in poor appearance on a browser which behaves more reasonably by default, leaving enough space, so that the trick would make the spacing excessive.

Perhaps the best way to deal with the italics issue is to use CSS to add some empty space after any element rendered in italics. You could use either the margin or the padding property for this. Using padding is probably better, since some day someone might set a background color for the element. (The background extends to the padding but not to the margin.) If you need the spacing just for an individual expression, you could use markup like <i style="padding-right: 0.2em">f</i>(0) for an expression like f(0). Or you might even write a general style sheet rule that sets right padding for all inline elements that are commonly rendered in italics. You might explicitly set the right margin to zero, since it is imaginable that some browsers deal with the problem by using some default right margin, and you don't want a cumulative effect. Example:
i, em, cite, dfn, var { padding-right: 0.15em; margin-right: 0; }

Prefer serif fonts

Font demo (the same character pairs in a serif font, left, and in a sans-serif font, right)
× x      × x
U ∪      U ∪
ε ∈      ε ∈
a a      a a
o o      o o

Fonts with serifs are usually better than sans-serif fonts for mathematical texts. This may sound strange, because mathematical operators, like the multiplication sign ×, typically have no serifs. They have rather similar design in all fonts. But there is usually a considerable difference between serif and sans-serif design for letters.

It is important to distinguish mathematical symbols from each other and from letters and other characters. The serifs and the varying stroke width in serif fonts often help this. Moreover, serif fonts typically make a better distinction between upright and italic style.

This usually rules out Arial, the most commonly used font on web pages. In most browsers, the default font is Times New Roman, which is a good serif font for printed matter but problematic on screen due to the much lower resolution. Suitable serif fonts that work both on screen and on paper include Cambria, Georgia, Palatino Linotype, and Bookman Old Style. Just remember to list a few of them in your font-family, in order of your preference, since none of them is universally installed on computers.

Hint: after writing a CSS rule like
body { font-family: Cambria, Georgia, "Palatino Linotype", "Bookman Old Style", serif; }
use Firefox with Web Developer Extension to view your page, then select “Edit CSS” in its “CSS” menu. Modify the style sheet by removing the first font in the list, look at the page, remove the next font, etc. This way you can quickly test the page on all the fonts you suggest, without editing the page itself.

Changing fonts

In good typography, we avoid mixing fonts in text. We can use normal, italic, and bold versions of a font, but not fonts of different design in the same paragraph or other block of text. However, in mathematical texts, it is often more or less necessary to mix fonts, taking letters and other common characters from one font and mathematical symbols from another.

The main reason is that most fonts have a limited character repertoire, as described in section Using special characters. When you need to pick up special characters, you cannot be too picky. In particular, a large number of mathematical symbols can be found in commonly available sans-serif fonts like Lucida Sans Unicode and Arial Unicode MS but not in common serif fonts.

For example, the nabla operator (U+2207) is present in several fonts, but not in any serif font commonly available on Windows systems. Thus, to write the expression ∇f you should use markup like
<font class=nabla>&#x2207;</font><i>f</i>
together with a CSS rule like the following:
.nabla { font-family: Arial Unicode MS, DejaVu Serif, Linux Libertine, Lucida Sans Unicode; }

Parentheses

While it has been conventional to some extent to use different characters for nested parentheses, using ( ), [ ], { }, and then angle brackets, such practice is not endorsed by standards. Quoting ISO 80000-2:

It is recommended to use only parentheses for grouping, since brackets and braces often have a specific meaning in particular fields. Parentheses can be nested without ambiguity.

Thus, angle brackets should only be used in specialized meanings, such as L² inner product of functions, or maybe for an arithmetic mean if the primary notation (line over) is not applicable. Instead of usage like [(a + b)/c]², normal parentheses should be used: ((a + b)/c)².

If angle brackets are used in math, then they should, according to the standard, be MATHEMATICAL LEFT ANGLE BRACKET U+27E8 and MATHEMATICAL RIGHT ANGLE BRACKET U+27E9. The HTML entities &lang; and &rang; denote other characters, U+2329 and U+232A. While they “work” more often than the correct characters, “working” here means just getting displayed in some odd font.

It is not appropriate to use the less than sign (<) and the greater than sign (>) as angle brackets. In mathematical texts, their usage should be limited to the relational meanings (though the relation could of course be other than the common ordering relation). It should never be a matter of glyph preference which character you use, though in this imperfect world, violations of this principle are sometimes understandable and forgivable.

The Unicode standard says that the use of U+2329 and U+232A as mathematical brackets is “strongly discouraged, because of their canonical equivalence to CJK angle brackets. This canonical equivalence is likely to result in unintended spacing problems if these characters are used in mathematical formulae.” In practice, when you use these characters, they will most probably be picked up from a font designed for Chinese-Japanese-Korean (CJK) “ideographs”, therefore designed to fit into a largish square, typically causing typographic mismatch. On the other hand, font support for the correct mathematical angle brackets is still rather limited, so avoid them unless the content absolutely needs them.

There’s a special oddity with the entity references for angle brackets. Even though HTML specifications clearly define &lang; and &rang; as &#9001; and &#9002;, i.e. as denoting U+2329 and U+232A, most browsers treat them as denoting U+27E8 and U+27E9. Strangely enough, in this issue, IE seems to be the only browser that works by the specifications. HTML5 drafts have silently changed the meanings of &lang; and &rang; to correspond to the behavior of most browsers. Conclusion: Avoid entity references. By using character references, you will at least know which character will be used, even though you still have all the font problems.
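As an illustration of that conclusion, an inner product could be written with numeric character references for the mathematical angle brackets. This is a minimal sketch; the sample expression is my own, and whether the brackets display at all still depends on the fonts available to the reader.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Mathematical angle brackets</title>
</head>
<body>
<!-- U+27E8 and U+27E9 written as character references,
     so that the choice of character does not depend on how
     a browser interprets &lang; and &rang; -->
<p>The inner product &#x27E8;<i>f</i>, <i>g</i>&#x27E9; is zero.</p>
</body>
</html>
```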

Fractions

Simple linearized notation

For fractions like 6/7, the common linearized notation is usually the best, especially within running text. It is not typographically good, but it is robust, and people are accustomed to such simple presentations of fractions on web pages.

Two-dimensional style

In two-dimen­sion­al display formulas, even fractions can be shown using a horizontal line, with a number above and below it, but inside text, it’s hardly feasible—the numbers would need to be such a small size that legibility would be poor. The following line illustrates an attempt at such presentation, with various compromises and relatively complicated HTML and CSS, and therefore with some fragility:

Is 1 / 3 or ⅓ better?

Fraction characters

If the only fractions in your document are vulgar fractions ½ ¼ ¾, you might consider using the ISO Latin 1 characters for them, e.g. as entity references &frac12;, &frac14;, &frac34;. But if you need other fractions too, this is not a good idea, since it would be odd if different fractions had essentially different appearances.

In Unicode there are a few more fraction characters (namely for 1/3, 2/3, 1/5, 2/5, 3/5, 4/5, 1/6, 5/6, 1/8, 3/8, 5/8, 7/8), in the Number Forms block, but it would still be a limited repertoire. Moreover, although they are covered by many fonts, the font support is far from universal. The newest fraction characters, namely those for 1/7, 1/9, 1/10, and 0/3, have very limited font support (only a few fonts, none of which is shipped with any operating system or popular software).

Constructing a fraction

You might use a linear notation with sup markup for the numerator and the sub markup for the denominator. The main problem is then that an expression like 5/8 tends to cause uneven line spacing, due to the poor quality of implementation of superscript and subscript style in most browsers. It is therefore better to use CSS to reduce font size and change vertical position.

You might also consider using the fraction slash (U+2044) character which should, according to the Unicode standard, solve the problem for numeric fractions in an elegant way. That would mean something like 5&#x2044;8 in HTML. The fraction slash character is often more slanted than the normal slash (solidus) character. This is intended to correspond to special rendering where the numbers around it are in reduced size and vertically positioned in a manner that reflects a traditional way of writing fractions. But browsers do not currently do such things, and this may result in unsatisfactory rendering: the fraction slash appears between normally styled numbers (5⁄8), possibly touching them, depending on font (e.g., in Arial, 5⁄8 looks bad). Although this could be alleviated by setting letter-spacing, it’s more natural to try to imitate the traditional fraction appearance, using CSS.

Using OpenType features

Some fonts (currently, mostly the so-called Microsoft C fonts like Cambria) contain information for constructing fractions using special shapes and positioning of digits and the slash. Using so-called OpenType features, such construction can be asked for.

On web pages, it has become possible to utilize OpenType features using the CSS property font-feature-settings. Browser support is becoming more widespread.

Using this approach, the fraction is written in simple linear notation but wrapped in an element for which the OpenType feature "frac" is requested.

OpenType also defines the feature "afrc" for alternative fraction format (typically, with horizontal line, not sloped fraction slash). It is however supported in even fewer fonts than "frac".

Using MathJax

It is easy to create fractions using MathJax, with the \frac command. However, it creates a fraction with numerator and denominator stacked, with horizontal line between them. Such a presentation is usually OK in display formulas, but less so in text. About tuning, see the Q/A pages LaTeX force slash fraction notation and How do I typeset arbitrary fractions like the standard symbol for .5 = ½?

MathML

In MathML, a fraction can be described as a special case of a fractional expression, using the mfrac element. The code is verbose, but a more serious problem is that not all browsers support MathML, especially when embedded in an HTML document. On non-supporting browsers, the code degrades to a rendering like “5 8” instead of “5/8”.

Summary

To demonstrate what the different approaches yield on your current browser with its current settings, here is a table of different presentations for 5/8:

A fraction presented using different techniques
Approach | Notation in HTML document | Appearance
Linear notation | 5/8 | 5/8 text
Special character | &#8541; | ⅝ text
sup and sub | <sup>5</sup>/<sub>8</sub> | 5/8 text
Fraction slash | 5&#8260;8 | 5⁄8 text
Fraction slash and CSS | <font class=num>5</font>&#x2044;<font class=denom>8</font> | 5⁄8 text
OpenType "frac" | <span class=ofrac>5/8</span> | 5/8 text
MathJax | \(\frac{5}{8}\) | \(\frac{5}{8}\) text
MathML | <math xmlns="http://www.w3.org/1998/Math/MathML"> <mfrac bevelled="true"> <mrow> <mn>5</mn> </mrow> <mn>8</mn> </mfrac> </math> | 5 8 text

The style sheet used is the following:

.num, .denom { font-size: 70%; }
.num { position: relative; bottom: 0.5ex; left: 0.2em; }
.denom { position: relative; left: -0.05em; }
.ofrac {
-moz-font-feature-settings: "frac";
-webkit-font-feature-settings: "frac";
-ms-font-feature-settings: "frac";
font-feature-settings: "frac";
}

There can be line breaking problems with the “/” character as well as the fraction slash, though currently on minority browsers only. To stay on the safe side, you could use markup like <span class="frac">5/6</span> for fractions and use the style sheet rule .frac { white-space: nowrap; }.

Underlines, overlines

Underlining

To underline something, you could use the u element in HTML. However, underlining is widely taken as indicating a link on the Web. Links want to be links, and you should avoid doing anything that might make something look like a link if it isn’t. But if underlining a symbol is part of an established tradition in some field, go ahead and use the u element. It would be less logical to use the CSS declaration text-decoration: underline, since here underlining is not just a suggestion on rendering style but an essential feature of content.

As an alternative, you could use the combining low line character (U+0332) after the symbol to be underlined. However, this character appears in a few fonts only and does not necessarily produce any better rendering. A simple test (with an underlined character and a symbol with combining low line): x, x̲.

Overlining

It is common to use overlining in mathematics, e.g. to indicate an average. Somewhat illogically, HTML has no markup for overline. In a style sheet you could suggest overlining, using the declaration text-decoration: overline. However, as the property name says, it’s assumed to be decoration, not part of the content proper, and in any case style sheets are for suggestions only; you should expect style sheets to get ignored fairly often. Pre­sen­ta­tion­al­ly, note that the overline appears rather high above the symbol.

Overlining something like x might be adequate if the context or explicit explanations make it clear what is meant even if the overline does not appear. For casual overlining of a single symbol, you could use an inline style as follows:
<b style="text-decoration:overline"><i>x</i></b>.

If overlining is essential, consider using the combining overline character (U+0305) after the symbol to be overlined. There are risks with fonts, but in most browsing situations, this method works. The rendering varies but is generally much better than with the CSS method, as the overline is close to the letter. A simple test, first with a CSS-overlined letter, then a letter with combining overline: x, x̅.
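The two methods could be marked up as follows. This is a sketch, not a recommendation of particular markup: I have put the combining character inside the same element as its base letter, on the assumption that this gives the browser the best chance of combining them.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Overlining a symbol</title>
</head>
<body>
<!-- CSS method: the overline is a style suggestion only -->
<p>CSS overline: <i style="text-decoration: overline">x</i></p>
<!-- Combining overline U+0305 immediately after the base letter -->
<p>Combining overline: <i>x&#x0305;</i></p>
</body>
</html>
```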

Radicals

For radicals (expressions of roots), it is customary in typeset mathematics to use a vinculum to make it more evident what belongs “under the root”. The vinculum is a horizontal line that joins with the radical sign, and the joining is difficult to arrange without using specialized software that draws math expressions.

You might consider the following options:

You might suggest overlining to produce a sort of vinculum:

√(a² + b²)

This uses simple markup where the expression in parentheses is enclosed between <span style="text-decoration:overline"> and </span>.

In this context, the relatively high vertical placement of the overline does not disturb. It is even desirable, and perhaps an even higher placement would be desirable. Things may get distorted on many browsers if there are exponents written using sup under the root.

In our example, the parentheses are redundant when overlining is applied. I have experimented with tricks which would put the parentheses inside span elements with style sheets suggesting the suppression (display:none) of them in presentation.

Instead of text-decoration: overline, you might set a top border for the radicand, e.g. .radic { border-top: solid 1px }. This seems to produce reasonable presentation even when there are exponents and subscripts (sup and sub elements) in the radicand, as the following example illustrates:

d = √[(x₂ − x₁)² + (y₂ − y₁)²]
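Markup for such a formula might look like the following. It is a sketch that uses the .radic rule quoted above; the white-space declaration and the use of var for the variables are my additions, made on the assumption that the top border should stay in one piece.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Radicand with a top border</title>
<style>
/* The top border acts as a vinculum; nowrap keeps it on one line */
.radic { border-top: solid 1px; white-space: nowrap; }
</style>
</head>
<body>
<p><var>d</var> = &#8730;<span class="radic">(<var>x</var><sub>2</sub>
&#8722; <var>x</var><sub>1</sub>)<sup>2</sup> +
(<var>y</var><sub>2</sub> &#8722; <var>y</var><sub>1</sub>)<sup>2</sup></span></p>
</body>
</html>
```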

What about roots other than the square root? There are Unicode characters for the cube root and fourth root symbols, though they are less widely supported than the square root symbol. For a general root, you might put the radical index right before the radical symbol, in superscript style. In addition, you could use CSS to suggest reduced spacing between those characters. This would mean HTML markup like the following:
<span class="radic"><sup><var>n</var></sup>&#8730;</span><span class="radicand"><var>x</var></span>
with CSS like the following:
.radic {letter-spacing:-0.15em; }
.radicand {text-decoration:overline; }

This looks like the following in your current browsing situation: ⁿ√x.

Arrays and tabulations

HTML tables are intended for presenting data which is tabular. We will not discuss here the tabulation of numeric data in general, since the basic HTML constructs are simple and the fine tuning, using attributes in HTML markup and/or style sheets, is beyond our scope. But it needs to be noted that numeric data should normally be right-aligned, which is not the default alignment in HTML tables, so you often need the align="right" attribute (in td or tr elements). It would often be desirable to align numeric data on the decimal point, but this is in practice not possible in the way defined in the HTML 4.0 specification. Instead, some tricks are needed, such as using a monospace font and right padding with no-break spaces so that the items in a column have the same number of characters to the right of the decimal point.
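The padding trick for aligning on the decimal point could look like this. It is only a sketch with made-up sample values and a hypothetical class name num; it uses the CSS counterpart of align="right", and each value is padded with no-break spaces so that all cells have the same number of characters after the position of the decimal point.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Aligning numbers on the decimal point</title>
<style>
/* "num" is a hypothetical class name; a monospace font makes
   each no-break space as wide as a digit or a decimal point */
td.num { font-family: monospace; text-align: right; }
</style>
</head>
<body>
<table>
<tr><th>Item</th><th>Value</th></tr>
<tr><td>a</td><td class="num">3.14</td></tr>
<!-- one character missing after the point: pad with one no-break space -->
<tr><td>b</td><td class="num">42.5&nbsp;</td></tr>
<!-- no point and no decimals: pad with three no-break spaces -->
<tr><td>c</td><td class="num">1005&nbsp;&nbsp;&nbsp;</td></tr>
</table>
</body>
</html>
```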

For a matrix, the conventional notation in mathematics is to use large parentheses around it. This would be rather difficult in HTML and would work well for very small matrices only (cf. the methods discussed in the Towards two-dimensionality section below). It’s probably best to use a different presentation which makes matrices have an appearance which is suitably distinctive, such as a special but not too striking background color for cells. You might write, say, <table class="matrix"> for any table which presents a matrix, and include a style sheet rule like
table.matrix td { background: #fda none; color:#000; }
This results in something like the following on your browser (when a cell spacing of 4 pixels and centering of cell contents has been suggested too):

A =
x-y -a 42
c a*b c

In the example above, the table representing a matrix has been embedded into an outer table so that we have been able to associate a symbol with the table. Similar techniques can be used e.g. when you wish to present the sum of tables; you would write an outer one-row table which contains the matrices in its cells and a plus sign in a cell of its own between them. The following example illustrates this:

1 5     2 2     3 7
3 7  +  2 2  =  5 9
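The nesting described above might be written as follows. This is a sketch under my own assumptions about the details: the cell spacing and centering are suggested in CSS, and the outer table has no class, so the matrix rule applies to the inner cells only.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Sum of two matrices</title>
<style>
table.matrix { border-spacing: 4px; }
table.matrix td { background: #fda none; color: #000; text-align: center; }
</style>
</head>
<body>
<!-- Outer one-row table: matrix, plus sign, matrix, equals sign, matrix -->
<table>
<tr>
<td><table class="matrix">
  <tr><td>1</td><td>5</td></tr>
  <tr><td>3</td><td>7</td></tr>
</table></td>
<td>+</td>
<td><table class="matrix">
  <tr><td>2</td><td>2</td></tr>
  <tr><td>2</td><td>2</td></tr>
</table></td>
<td>=</td>
<td><table class="matrix">
  <tr><td>3</td><td>7</td></tr>
  <tr><td>5</td><td>9</td></tr>
</table></td>
</tr>
</table>
</body>
</html>
```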

Subscripts, superscripts, and exponents

Alternative methods for displaying superscripts

A superscript can be presented in an HTML document in several ways:

The following table illustrates the approaches. The last may show just normal characters instead of superscripts, if your computer lacks the Cambria font or your browser does not support access to superscript variants at the font level.

For subscripts, the situation is rather similar. However, there is a more limited set of subscript characters than superscript characters.

HTML markup for subscripts and superscripts: Style or essence?

The HTML language has the sub and sup elements for subscripts and superscripts. But they should primarily be regarded as stylistic suggestions only, rather than as essential parts of the notation. (See my notes on the intended use of sub and sup.) Naturally they can be valuable for “styling” math, too. To quote the descriptions of sub and sup in WDG’s HTML 4.0 Reference, replacing their markup examples with their appearance on your browser:

Since SUB is inherently presentational, it should not be relied upon to express a given meaning. However, it can be useful for chemical formulas and mathematical indices, where the subscript presentation is helpful but not required. For example:
Since SUP is inherently presentational, it should not be relied upon to express a given meaning. However, it can be useful for mathematical exponents where the context implies the meaning of the exponent, as well as other cases where superscript presentation is helpful but not required. For example:

However, especially when superscripting is used to express exponentiation, superscripting is essential, and there need not be any contextual hints. It really makes a difference if 10⁹, which is intended to mean 10 to the power 9, actually gets displayed as 109. The same applies to using superscripts e.g. in denoting the transpose of A by Aᵀ (i.e., A immediately followed by T in superscript style).

Superscripts are often used for footnote references in print media. But in mathematical texts, such practice is best avoided in order to remove any risk of confusing such usage with exponentiation or other mathematical superscripting. Besides, such footnote references don’t work well on Web pages in general, as explained in the document Footnotes (or endnotes) on Web pages.

Superscripts and subscripts as two-dimensionality

Two-dimensionality in formulas will be discussed later, but here we mention the possibility of using superscripts and subscripts to simulate notations that should really be written above and below a symbol. The following example (which also uses special characters) shows the markup for an infinite sum and the resulting appearance on your browser.

&#8721;<sub><var>i</var>=0</sub><sup>&#8734;</sup>
<var>x<sub>i</sub></var>
∑ i=0 ∞ xi (with “i=0” as a subscript, “∞” as a superscript, and the final “i” as a subscript)

That means summation of xi (with i as subscript) from i=0 to infinity. In browsing situations where the infinity symbol is correctly displayed, the main problem on most browsers is that the infinity symbol does not appear above i=0 but to the right of it (in superscript style though). In the worst case, the reader might misunderstand the upper limit as an exponent of the lower limit! So it is perhaps better not to use a superscript at all but put the limits into a subscript, e.g. as i=0,…,∞ (which makes use of the horizontal ellipsis character; midline horizontal ellipsis would be better, but it’s less widely supported) giving the following appearance: ∑i=0,…,∞ xi.
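The two alternatives can be written side by side for comparison. This is a sketch; I have used numeric character references throughout and set the font size to 125% as an assumption, to keep the small symbols legible.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Limits of a sum</title>
</head>
<body>
<!-- Upper limit as a superscript: it appears to the right of i=0 -->
<p style="font-size: 125%">&#8721;<sub><var>i</var>=0</sub><sup>&#8734;</sup>
<var>x<sub>i</sub></var></p>
<!-- Both limits in one subscript, using the horizontal ellipsis U+2026 -->
<p style="font-size: 125%">&#8721;<sub><var>i</var>=0,&#8230;,&#8734;</sub>
<var>x<sub>i</sub></var></p>
</body>
</html>
```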

The infinity symbol ∞ might appear in fairly small size. In general, special symbols easily become unreadable when font size is reduced, so you might consider setting the font size larger than normal. In the above example with HTML markup for a formula and the formula itself side by side, the font size for the formula has been set to 125%.

The presentation of a summation expression could be tuned in different ways, some of which will be discussed in the sequel. But generally they lead to rather complicated constructs, and the complexity may cause problems on different browsers, current and future. However, some simple superscript positioning problems can be addressed relatively easily. Let us take the example of just positioning a simple one-character superscript above a one-character subscript.

In chemistry, sometimes both subscripts and superscripts are used, for example in formulas for ions. Consider the formula
NO<sub>3</sub><sup>−</sup>
where letter O has both subscript “3” and superscript “−”. The latter is the minus sign, and the ASCII hyphen is a particularly poor surrogate here due to its shortness; such problems were discussed in the section on special characters. It seems that the stylistically preferred notation for ions has the superscript in the same horizontal position as the subscript. See, for example, the Ions page in Eric Weisstein’s World of Chemistry, which uses images to create such appearance.

The markup mentioned above by default creates an appearance where the superscript is on the right of the subscript (NO₃⁻). Changing the order of the superscript and subscript would not help much. But we can try to affect the horizontal placement by using a negative margin. Since, in general, the notation for an ion always has a superscript and may or may not have a subscript, it seems practical to put the sup element first and move the subscript to the left. This would mean markup like
<span class="ions">NO<sup>&minus;</sup><sub>3</sub></span>
(or maybe with div instead of span) and a style sheet like the following:
.ions { line-height: 1.8; }
.ions sub { margin-left: -1ex; }

and it would place the superscript − directly above the subscript 3, at least approximately, depending on the font.

Creating good appearance for variables with both a subscript and a superscript is rather challenging: fine-tuning is needed, and the rendering will still greatly depend on the font used. Beware that widths of characters vary by font, so a horizontal shift created by margin-left or some other method might be adequate for some fonts and poor for others. Moreover, it cannot as such be used in conjunction with the method of making line spacing even that will be described later in this document.

The margin-left property effectively shifts the subscript to the left. The line-height property is useful for defeating some IE bugs, and it is fairly natural too, since you are using quite some height here, more than we can expect to be available by default. Note that if an element in class ions contains a lone subscript, you would need to take extra measures, since the CSS code above postulates that a sub element appears immediately after a sup element.
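
One possible extra measure, given here only as a sketch, is to restrict the shift with the adjacent sibling combinator, so that a subscript is moved only when a sup element comes directly before it. The two sample ions are chosen just for illustration.

```html
<style>
.ions { line-height: 1.8; }
/* Shift a subscript only when it is the element right after a superscript */
.ions sup + sub { margin-left: -1ex; }
</style>
<p>
<span class="ions">NO<sup>&minus;</sup><sub>3</sub></span> (subscript shifted under the superscript),
<span class="ions">H<sub>3</sub>O<sup>+</sup></span> (subscript left in place)
</p>
```

Beware that the combinator ignores text between the two elements, so a sub element that follows a sup element after some intervening letters would still be shifted.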

From the HTML perspective, the basic problem in situations like this is that the sub and sup elements have been defined in a rather presentation-oriented manner rather than structurally. When you write <sub>i</sub>, you're saying that “i” is a subscript but not what it is associated with. In a case like “aᵢ²”, “i” is a subscript for “a” whereas “2” is a superscript (exponent) for the expression consisting of “a” with subscript “i”. There is no way to express this structural relationship in HTML. Using extra parentheses, like “(aᵢ)²”, deviates from common mathematical notations and looks somewhat clumsy, but it makes the meaning unambiguous and clear.

Uneven line spacing

As you may have noticed on many web pages, subscripts and superscripts tend to mess up line spacing. For example, a superscript expression like Aᵀ makes the line have more than normal vertical spacing above it. The reason is that subscripts and superscripts may increase the vertical space needed for a line, and browsers quite naturally increase the height of the line box (making it larger than the value of the line-height property).

The simple solution to this problem is to use a style sheet that positions subscripts and superscripts vertically using relative positioning, instead of the vertical-align property. This prevents the effect that makes some lines higher than others in the same paragraph.

The method is described in more detail in the document How to prevent uneven linespacing when subscripts or superscripts are used on web pages.
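
A minimal sketch of the idea follows. The offset values are assumptions chosen for illustration, not necessarily those used in the document referred to, and they may need tuning for the font in use.

```html
<style>
/* Keep sub and sup on the baseline, so that they do not enlarge the line box */
sub, sup { vertical-align: baseline; position: relative; }
/* Then move them visually; relative positioning does not affect line height */
sup { top: -0.5em; }
sub { top: 0.4em; }
</style>
<p>Some text with A<sup>T</sup> and x<sub>i</sub> in the middle of a paragraph
that is long enough to be divided over several lines.</p>
```

The font size of the elements is deliberately left unset here, so the browser defaults for sub and sup still apply.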

Size of subscripts and superscripts

It is usually best to avoid setting the font size of sub and sup elements. The reason is that IE has a longstanding bug, with little hope of fixes:

It looks like IE (all versions till IE9) multiplies the font size of the <sub> and <sup> and their descendants with some variable coefficient (sth between 0.6 – 0.8 depending on the font-size).
stackoverflow: relative font-size of <sub> or <sup> and their descendants in IE

Even though it might seem suitable to set the font size to achieve similar sizing across browsers, it has just the opposite effect. If you don’t set it, browsers generally apply a size reduction (to about 80%) fairly consistently. But if you set font-size on sub or sup, IE will interpret it differently from other browsers.

If you really need to set the font size of subscripts or superscripts, you have a few options, like the following:

Nested superscripts

Most browsers render superscripts properly, or at least tolerably well. But they often fail to handle nested superscripts (or subscripts) well. And that means that exponentiation in an exponentiation may get lost in graphic presentation, too. For example, some old versions of Internet Explorer render a<sup>b<sup>c</sup></sup> the same way as a<sup>bc</sup>. However, modern browsers, including reasonably new versions of IE, honor nested superscripts in rendering. Here is a test for your browser: a<sup>b<sup>c</sup></sup> (the letter c should be a superscript of b).

The preceding paragraph may illustrate how nested superscripts easily cause trouble by (almost) hitting the preceding line. One approach (perhaps observable in this paragraph) is to set the line-height property (in CSS) to a relatively high value like 1.6. Moreover, it might be a good idea to set line-height globally (for the body element) to a value like 1.3, since this helps with some of the smaller problems of uneven line spacing and may improve readability in general.
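
The line height settings just described could be sketched as follows. The class name nested is a hypothetical name used for this illustration only.

```html
<style>
/* Global setting that leaves some room for ordinary subscripts and superscripts */
body { line-height: 1.3; }
/* Hypothetical class for paragraphs containing nested superscripts */
.nested { line-height: 1.6; }
</style>
<p class="nested">Here <i>a</i><sup><i>b</i><sup><i>c</i></sup></sup> appears
inside running text, which continues long enough to wrap to another line.</p>
```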

Avoiding superscript problems

There are various ways to avoid the problems with superscripts by using other notations:

The superscript and subscript problems can also be seen as special cases of a wider problem: how to present mathematical expressions in the conventional two-dimensional format?

Equation numbers

In mathematics, it is common to number equations and put the number on the right of an equation, in parentheses or brackets, as follows:

(a + b)² = a² + 2ab + b² (42)
eˣ ≈ 1 + x + x²/2 + x³/6 + x⁴/24 + x⁵/120 + x⁶/720 + x⁷/5040 + x⁸/40320 + x⁹/362880 + x¹⁰/3628800 (43)

Several approaches have been proposed to achieve such layout using just CSS and no presentational markup. However, the CSS methods (whether based on floating or on positioning) seem to suffer from various problems on current browsers. Some methods work well if the equation fits on one line but lead to confusion when it is divided over two or more lines. Since we wish to create pages that adapt to varying rendering widths (“fluid design”), a simple table is the practical solution:

<table class="eq" summary="Equation and its number." width="100%">
<tr>
<td>the equation</td>
<th align="right" valign="bottom">(number)</th>
</tr>
</table>

If you regard such a table as a deprecated “layout table”, consider its markup, specifically the summary attribute and the use of th (table header cell) element for the equation number, which logically acts as a header for the row (the equation).

You can get rid of the presentational attributes width, align and valign by using corresponding CSS properties. If you would like to prevent the equation number from appearing in bold, you should either replace (somewhat illogically) the th element by a td element, or use the font-weight property in CSS. As a whole, this could mean the following style sheet:

.eq { width: 100%; }
.eq th { text-align: right;
         vertical-align: bottom;
         font-weight: normal; }

The following equation has been formatted with such a style sheet (and its HTML markup is “pure”):

(p + q)(r + s) = (p + q)r + (p + q)s = pr + qr + ps + qs (44)
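
For illustration, equation (44) could be marked up along the following lines when combined with the style sheet above. This is a sketch of one plausible form; the actual markup of the page may differ in details.

```html
<style>
.eq { width: 100%; }
.eq th { text-align: right;
         vertical-align: bottom;
         font-weight: normal; }
</style>
<table class="eq" summary="Equation and its number.">
<tr>
<!-- The equation may wrap inside its cell; the number stays at the bottom right -->
<td>(<i>p</i> + <i>q</i>)(<i>r</i> + <i>s</i>) = (<i>p</i> + <i>q</i>)<i>r</i> +
(<i>p</i> + <i>q</i>)<i>s</i> = <i>pr</i> + <i>qr</i> + <i>ps</i> + <i>qs</i></td>
<th>(44)</th>
</tr>
</table>
```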

Towards two-dimensionality

For simplicity, let us first assume that we wish to present the expression x divided by a − b in the conventional two-dimensional format. In this trivial case, the linearization x/(a − b) would do just fine, but in more complicated cases, two-dimensionality would greatly improve the clarity. Using an image is one possibility, and if the linearized version isn’t too com­pli­cat­ed and you include it as the textual alternative, it might work fine. But let’s see some other possibilities.

Preformatted text for two-dimensionality

One might present the expression as two-dimensional preformatted text and include it using the pre element. This would be rather simple in our trivial case:

  x  
-----
a - b

In more complicated cases, you could use a sort of Ascii art like the following:

         b           
         /   f(x)    
         | ------- dx
         /  1 + x    
         a           

Several mathematical programs can format expressions that way for you, and you could just cut and paste them. Note that some markup, such as i for italics, is allowed within pre elements. And you need not be limited to Ascii; you could even use the special characters outside ISO Latin 1, in principle, though with special problems. Example (where the integral sign may or may not display correctly):

         b
              f(x)
         ∫  ------- dx
             1 + x
         a

In any case, the visual quality of this method cannot be very impressive. Moreover, it creates accessibility problems, since it’s gibberish unless seen in the exact preformatted way.

Using tables for two-dimensionality

A table might be used to make an expression appear two-dimensionally. Such an approach could be seen as an avoidable case of “using tables for mere layout”, though in a sense a table construct would reflect the structure of data, e.g. in


<table cellspacing="0" cellpadding="0">
<tr><td align="center"><i>x</i></td></tr>
<tr><td valign="middle"><img src="1px.gif" alt="divided by"
 width="100%" height="1"></td></tr>
<tr><td align="center"><i>a</i> &minus; <i>b</i></td></tr>
</table>
The only thing that’s really a trick there is the use of a black single-pixel GIF, stretched with the width attribute to a horizontal line. The “table” displays as follows:
x
divided by
a − b

Reasonable appearance might be achieved that way, with perhaps tolerably graceful degradation in text-only media: a simple character cell browser like Lynx would basically display it as
x
divided by
a - b

which might be understandable. But designing and writing suitable table markup would be rather awkward for nontrivial expressions.

It would be simpler and more natural to have a table with two rows only, using a bottom border for the first cell. The border would create a suitable horizontal line. However, in text-only presentation and in nonvisual presentation, the data would appear as
x
a - b

which can be difficult to interpret. This could be partly addressed by using a summary attribute for the table, e.g. <table summary="This is a fractional expression, with the numerator in the first cell, the denominator in the second">.
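
A sketch of such a two-row table might look like the following. The class names frac and num and the padding value are assumptions made for this example.

```html
<style>
.frac { border-collapse: collapse; }
.frac td { text-align: center; padding: 0 0.3em; }
/* The bottom border of the numerator cell acts as the fraction line */
.frac .num { border-bottom: 1px solid; }
</style>
<table class="frac" summary="This is a fractional expression, with the numerator in the first cell, the denominator in the second">
<tr><td class="num"><i>x</i></td></tr>
<tr><td><i>a</i> &minus; <i>b</i></td></tr>
</table>
```

Since both cells are in the same column, the line is as wide as the wider of the numerator and the denominator.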

Let’s see what we could do with the integral above. Using somewhat contrived table markup, we might get something like the following:

         b
              f(x)
         ∫  _______ dx
             1 + x
         a

In any simple text presentation, it would look rather awful, though. You might consider providing a separate link to an alternative presentation, for such reasons.

For a rather common case of a definition that is most naturally presented in two lines, this approach might work tolerably, for simple definitions:

δᵢⱼ = {
1, if i = j
0, if i ≠ j

Returning to simple examples, let us consider how we might present (a − b)/x two-dimensionally using style sheets, specifically the display property in CSS1 (cf. the ideas about removing redundant parentheses above). We would start from the simple linear notation (a − b)/x. We would put the parentheses and the slash each inside a span element containing that character only, and we would also use span for the numerator and denominator. Then, using class names assigned to the span elements, we would suggest in CSS the suppression of the display of those characters, presenting both the numerator and the denominator as a block, and underlining the numerator. This means markup like the following:

<span class="nom"><span class="lin">(</span><i>a</i> &minus;
<i>b</i><span class="lin">)</span></span><span
class="lin">/</span>
<span class="den"><i>x</i></span>
with the following style sheet:

.lin { display: none; }
.den, .nom { display: block;  width:100%; text-align:center }
.nom { text-decoration: underline; } 

This is what it looks like on your browser:

(a − b)/ x

And it degrades to (a − b)/x in non-CSS browsing situations.

We might still add
.den { line-height: 0.65; }
to reduce the spacing between the line and the denominator, so that the expression would look more natural. The value 0.65 is a compromise. On many browsers, it doesn’t improve things much, and a smaller value like 0.5 would be better, in a case where the denominator has just letters with no ascenders. But there is the risk that on some browsers, a small value chops off the top of the text in the denominator. It gives the following appearance on your browser:

(a − b)/ x

If the denominator is wider than the numerator, you would want to overline the denominator instead of underlining the numerator.
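
As a sketch of the overlining variant, consider x/(a − b), where the denominator is the wider part. The class name over is a hypothetical addition; the other class names follow the markup above. The numerator is left without underlining here.

```html
<style>
.lin { display: none; }
.den, .nom { display: block; width: 100%; text-align: center; }
/* Hypothetical class: draw the fraction line above the denominator */
.over { text-decoration: overline; }
</style>
<span class="nom"><i>x</i></span><span
class="lin">/</span>
<span class="den over"><span class="lin">(</span><i>a</i> &minus;
<i>b</i><span class="lin">)</span></span>
```

Without CSS, this degrades to x/(a − b), just as in the underlining case.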

Using HTML5 for two-dimensionality

In HTML5, the canvas element lets you write text to specific positions, by pixels. This is not supported by IE before version 9, though tools exist for partially simulating canvas on older versions of IE. The writing operations are based on JavaScript functions to be standardized in HTML5, and the code needed to draw an equation is relatively bulky. In special cases, this may be feasible, especially if you also use canvas for other purposes (like an animation).
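
To give an idea of the amount of code involved, here is a minimal sketch that draws x divided by a − b on a canvas. The id value, the canvas size, the font, and all pixel positions are arbitrary assumptions made for this illustration.

```html
<canvas id="frac" width="160" height="90">x/(a &minus; b)</canvas>
<script>
// Sketch only: positions and sizes below are picked by hand, not computed from font metrics
var canvas = document.getElementById("frac");
if (canvas.getContext) {            // skip drawing if canvas is not supported
  var ctx = canvas.getContext("2d");
  ctx.font = "italic 20px serif";
  ctx.textAlign = "center";         // x coordinates below refer to the center of each text
  var num = "x";
  var den = "a \u2212 b";           // U+2212 is the minus sign
  var center = canvas.width / 2;
  // Make the fraction line as wide as the wider text, plus a small margin
  var lineWidth = Math.max(ctx.measureText(num).width,
                           ctx.measureText(den).width) + 8;
  ctx.fillText(num, center, 35);    // numerator, baseline at y = 35
  ctx.beginPath();                  // fraction line
  ctx.moveTo(center - lineWidth / 2, 43.5);
  ctx.lineTo(center + lineWidth / 2, 43.5);
  ctx.stroke();
  ctx.fillText(den, center, 65);    // denominator, baseline at y = 65
}
</script>
```

The content inside the canvas element is a linearized fallback, shown by browsers that do not support canvas.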

Conclusion

We could consider other approaches too, such as using positioning with style sheets instead of tables. But it probably suffices to conclude with the following note:

In HTML, a rich set of mathematical symbols and some other basic notations can be used, but currently with accessibility problems that users need to solve. Two-dimensional presentation of expressions via HTML markup is trickery and handcraft, now and in the foreseeable future. Thus, quite often it is better to use JavaScript-based tools or images for such purposes.