This document consists of annotations to HTML 4.0 in Netscape and Explorer, a review of differences between the specification and its implementation in Netscape Navigator 4.0 and Microsoft Internet Explorer 4.0. The review, written by Stephanos Piperoglou and published on the Web by WebReference, is very valuable to Web authors who wish to use HTML 4.0 features. In addition to describing support, or lack of support, for HTML 4.0 in the two browsers, it contains a well-written discussion of some essential topics of HTML authoring.
As a whole, the document HTML 4.0 in Netscape and Explorer gives a fine overview of its subject area, and it also presents useful background information. However, the details contain numerous errors.
The structure of this document reflects the structure of the document being commented on.
This document tries to clarify some details and emphasize some issues in the review - as well as point out some errors. (Notes about errors are based partly on practical testing, partly on the documentation of the browsers themselves. For Netscape, there is a convenient reference for HTML tags as supported by Netscape 4.0 and earlier, but of course we cannot know for sure whether it exactly corresponds to reality.) Disclaimer: I have not systematically checked the details in the document I'm commenting on; there might well be errors which I have missed.
It is interesting that the document itself does not conform to the HTML 4.0 specification, for instance due to the lack of ALT attributes in IMG elements. This, however, is probably caused by a technical editing process for making the document conform to the publisher's company policies (which thus would need some revisions).
Notice that WebReference's site is a very useful resource to Web authors and contains, among other things, material about Internet glossaries and Dmitry's design lab.
The headings below are links to the corresponding sections in the document being commented on.
The introductory section explains well what the document is intended to cover and what has been left out and why. However, it fails to emphasize things in a suitable manner. It is crucial for a reader to realize what the document is about and what it is not about.
Specifically, it should be emphasized that the document does not discuss "internationalization" issues. It is understandable that they were left out, but it is a very important omission, since "i18n" is one of the important new features, if not the new feature, in HTML 4.0 as compared with HTML 3.2.
A similar lack of emphasis can be observed as regards browser versions. Although the introduction says that the document only discusses the 4.0 versions of the two browsers, a reader - especially a casual reader who just visits a page discussing the support for some specific HTML feature - may easily get the wrong impression.
It should also be noted that Netscape 4.0 and IE 4.0 are not monoliths. There are differences between minor versions, among other things. For this reason, when these annotations say that there is an error in HTML 4.0 in Netscape and Explorer, this is to be taken as referring to the observed behavior of at least some releases of Netscape 4.0 and IE 4.0.
Notice, for example, that HTML Tag Support History flags several features as being supported by Netscape 4.0 and IE 4.0 from some specific minor version onwards. And it does not even mention any support in Netscape 4.0 for the OBJECT element! (This probably means that the first releases of Netscape 4.0 did not support it.)
It is somewhat confusing that the document discusses, in addition to support for HTML 4.0 in the browsers, some elements and attributes which are proprietary in the sense of not being in HTML 4.0. In addition to the section Proprietary Elements, such notes (admittedly interesting per se to some authors) appear in different parts of the document alongside information relating to the central topic of the document.
This short section contains a nice introduction to some basic characteristics of the HTML language. However, there are some exaggerations:
The HTML 4.0 specification is a huge, intensely technical document that contains hundreds of different elements and can have all sorts of applications. It is at times difficult to understand. This is not an error on behalf of the authors. In fact, very few people are expected to read the specification; HTML was originally intended to be a document format that the author wouldn't learn or see, but something that would be created by programs designed for this task.
First, the specification can be called large, but hardly huge. It does not contain hundreds of elements, at least if the word "element" is taken in the sense that it is used in the specification. It is hardly true that very few people are expected to read it. The specification contains several introductory and tutorial parts - many of them rather good - which obviously imply that it was written for a wide audience. And the specification itself says, in its About the HTML 4.0 Specification part:
This document has been written with two types of readers in mind: authors and implementors. We hope the specification will provide authors with the tools they need to write efficient, attractive, and accessible documents, without over-exposing them to HTML's implementation details.
And who says HTML was not intended to be writeable and readable by humans? There seems to be a rumor that Tim Berners-Lee himself has said something like that, but if that's true, perhaps someone did not get the joke. Anyway, the HTML language was from the beginning defined as being of media type (MIME type) text/html, and the choice of major type text (as opposed to application) implied a certain commitment. RFC 1521 (now superseded by later RFCs) contained the following characterization of the type text:
The primary subtype, "plain", indicates plain (unformatted) text. No special software is required to get the full meaning of the text, aside from support for the indicated character set. Subtypes are to be used for enriched text in forms where application software may enhance the appearance of the text, but such software must not be required in order to get the general idea of the content.
And finally, the statement that Netscape and IE "implement most of it [the HTML 4.0 specification] incorrectly" is a gross exaggeration. Useful as it may be as a warning to people who have fallen for the widespread propaganda that the version numbers of HTML, Netscape and IE go hand in hand, it simply isn't true. It might be regarded as true if "it" referred to the new features in HTML 4.0, but that's another story.
This section discusses the often misrepresented issue of "standards" in a very enlightening way. However, some descriptions of the background paint the wrong picture. I am specifically referring to the statement that "pages written for Navigator used features that didn't exist in any other browser, and since people wanted them, they used Navigator". This is a confusing oversimplification. In reality, Netscape Navigator became popular because it was the only graphic browser from a large company. One need not postulate that "people" wanted Netscape-specific enhancements.
The document recommends HTML 3.2 "for simple applications". This probably gives the wrong impression. The rational reason for starting a gradual switch from HTML 3.2 to HTML 4.0 (usually via HTML 4.0 Transitional) is not the ability to include new features that might work on Netscape 4.0 and IE 4.0 at least partially. One part of the reason is that HTML 4.0, especially in the Strict version, contains more restrictions, such as the requirement that IMG elements must contain an ALT attribute. But more importantly, HTML 4.0 allows us to include things like the LANG attribute, which are an investment for the future; we should expect them to be ignored by current browsers but supported by future ones. And by using such features, with great potential on one hand and graceful degradation on the other, we encourage the developers of browsers and other user agents to start actually making use of the enhanced markup.
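The requirement that IMG elements carry an ALT attribute can be checked mechanically. The following is only a minimal sketch in Python, using the standard html.parser module; the class name, the function name and the sample markup are hypothetical illustrations, and the check is far weaker than real validation against a DTD.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the SRC value of every IMG start tag that has no ALT attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            self.missing.append(attributes.get("src"))


def images_without_alt(markup):
    """Return the SRC values of IMG elements lacking ALT (hypothetical helper)."""
    finder = MissingAltFinder()
    finder.feed(markup)
    finder.close()
    return finder.missing


# Hypothetical sample markup: the first image lacks ALT, the second has it.
sample = '<P><IMG SRC="a.gif"><IMG SRC="b.gif" ALT="Logo"></P>'
missing = images_without_alt(sample)
```

An empty ALT value (ALT="") counts as present in this sketch, since the attribute itself is what the DTD requires.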
The statement that HTML 4.0 is "most probably" the last HTML specification is, hopefully, just provocative. The continuation "If all goes well, HTML will slowly die off and give its place to XML, which solves almost all of the problems present in HTML", when written by a competent author, must be some kind of joke.
This section contains very important information under a heading which could hardly be less exciting. It warns about the inadequacies of Netscape and IE in the field of basic processing of HTML, at the level of lexical and syntactic analysis. One can refer to this section when one wants to give an explanation of what it means to call those browsers "tag soup" or "tag salad" browsers.
In the discussion of character references, the document is as confused as the HTML 4.0 specification. It uses the term "character reference" for two things that must be kept distinct, both for theoretical and for practical reasons: numeric character references, such as &#233;, and named entity references, such as &eacute;.
Although those constructs can be used for similar purposes, they are essentially different from the SGML point of view. Notice that as far as specifications are considered, the numeric character references only depend on the so-called document character set whereas the named entity references depend on separate entity declarations which vary from one HTML version to another. Thus, numeric character references work more universally than named entity references. This applies especially to Netscape 4.0. The situation is worth emphasizing, since it is opposite to the idea most people intuitively have if they have understood the basics of character code issues!
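The distinction can be illustrated with a small sketch in Python, which is used here only for illustration. It assumes the standard html module, whose html.entities.name2codepoint table holds the entity names defined for HTML 4; the sample references are my own choice. The numeric reference can be resolved with nothing but the code number, whereas the named one requires a lookup in a table of declared entities.

```python
import html
from html.entities import name2codepoint

numeric_reference = "&#233;"    # numeric character reference
named_reference = "&eacute;"    # named entity reference

# A numeric reference depends only on the document character set:
# the number itself identifies the character.
from_number = chr(int(numeric_reference[2:-1]))

# A named reference depends on an entity declaration, modelled here
# by a lookup table of entity names.
from_name = chr(name2codepoint[named_reference[1:-1]])

# The library function resolves both kinds of reference.
resolved_numeric = html.unescape(numeric_reference)
resolved_named = html.unescape(named_reference)
```

A user agent without the relevant entity declaration has no way of resolving the second form, which is why the numeric form is the more robust one.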
The practical recommendation on comments is a good one as regards the intended meaning, but the formulation is confusing, if not misleading. There is a better formulation in the Web Design Group's Web Authoring FAQ (see answer to question 3).
The discussion of URIs in this section is very confusing. For example, what does it refer to when it says that browsers do "this" automatically? Obviously to encoding characters in URIs - the correct term is "encode", not "escape" - but does it mean that the browsers really do the encoding? The reality seems to be that sometimes they do, sometimes they don't. Anyway, the question is irrelevant in the comparison of the browsers against the HTML 4.0 specification, since the specification imposes no requirements on the processing of incorrect URIs (i.e. URIs which contain characters which may not occur in a URI without encoding).
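What encoding characters in a URI means can be shown with a short sketch in Python, using the standard urllib.parse module. The file name is a hypothetical example; the point is only that a character which may not occur in a URI as such (here, a space) is replaced by a percent sign followed by its code in hexadecimal.

```python
from urllib.parse import quote, unquote

# Hypothetical file name containing a space, which may not
# occur in a URI without encoding.
raw_path = "my report.html"

encoded_path = quote(raw_path)       # "my%20report.html"
round_trip = unquote(encoded_path)   # back to the original string
```

An author who writes the encoded form in the document does not depend on whether a particular browser happens to do the encoding.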
The paragraph about Frame Target Names seems to be based on a misunderstanding. It complains that "The two browsers both interpret target attributes as window names and not frame names if a corresponding frame doesn't exist". But the observed behavior is just what browsers are required to do by the specification (right where you'd expect to find it, in the specification of target semantics):
If any target attribute refers to an unknown frame F, the user agent should create a new window and frame, assign the name F to the frame, and load the resource designated by the element in the new frame.
(That is, a new window must be opened, and it will act as a frame for the purpose of interpreting any subsequent link traversals with the TARGET attribute set.)
Consequently, the practical advice to avoid "this technique" (the TARGET attribute) "to display documents in other windows" and to use a scripting language instead is not well founded. If one wants to have a link which opens in a new window by default or, as in Netscape and IE, as the only way of following it, then the TARGET attribute works more universally than a script-based solution.
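The rule quoted from the specification above can be sketched as a small function. This is a simplified model in Python, not a description of how any browser is implemented: the function name and the set of frame names are hypothetical, and reserved target names such as _blank are deliberately left out.

```python
def resolve_target(name, known_frames):
    """Model of the quoted rule for a link whose TARGET is `name`.

    `known_frames` is a set of existing frame (and window) names.
    Returns a pair (frame_name, created), where `created` tells
    whether a new window and frame had to be created.
    """
    if name in known_frames:
        # The frame exists: load the resource there.
        return name, False
    # Unknown frame F: create a new window and frame and assign
    # the name F to it, so that later links can refer to it.
    known_frames.add(name)
    return name, True


frames = {"nav", "main"}                      # hypothetical frameset
first = resolve_target("main", frames)        # existing frame
second = resolve_target("help", frames)       # unknown: new window
third = resolve_target("help", frames)        # now known: reused
```

The third call shows why the new window "acts as a frame": once created, it is found by name like any other frame.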
The statement "Both Netscape and Explorer ignore - - any SGML construct in HTML except for comments" is potentially very misleading. Of course, HTML markup itself is "SGML constructs". One really cannot express things like this very compactly, so it would be wise to refer to subsection SGML features with limited support in the specification.
The document seems to recommend omitting tags such as the HEAD tags, by saying that they "can safely be omitted". Even more strangely, it says that the HTML tags have no real use; it actually says "it" has no use in a context where the only singular correlate would be "the html element" (perhaps confusing tags and elements). The recommendation is not wise, since the HTML start tag gives the author the best place to specify the overall language used in the document (with the LANG attribute). Moreover, the use of HEAD tags makes the structure clearer to anyone reading the HTML source; authors often have difficulties in realizing what belongs to the head part and what belongs to the body part, and explicitly marking up these parts should help them keep the two apart.
In the discussion of the TITLE attribute, there is a good recommendation that it is "useful to use title to give more information about hyperlinks" whose text says little about the link destination, but the continuation "(like the all too common phrase 'please click here')" might be read as an encouragement to use such link texts and just sugar them with TITLE attributes.
The document says that it is important (emphasis mine) "to note that the body element can contain character data that is not included in a block-level element such as a paragraph". Notice that it is only allowed in HTML 4.0 Transitional, not in HTML 4.0 Strict, and hardly serves a useful purpose. Moreover, it can confuse things when style sheets are used. (The document sort of says this, but in a somewhat odd way.)
The discussion of whitespace is confusing. It first says that Netscape and IE "support whitespace rules with the exception of the rule that whitespace immediately after a start tag or before an end tag should be ignored". This implies that they break the rules - notice that it is a "shall", not "should" - and they actually do. But then the continuation "this should not be a problem as long as you code according to the specification" does not make any sense. For a short discussion of the real problems Netscape and IE cause in this area, please refer to White Space Bugs in Browsers by E. Stephen Mack.
The information about the presentation of the DFN element is incorrect: Netscape presents its content as normal text (as explained in section DFN - Defined Term in the HTML 4.0 Reference).
The statement that "subscripts and superscripts are by default rendered at the same size as the surrounding text" is incorrect. Both Netscape and IE present them in a smaller font.
The statement which says that IE displays a tooltip for an ACRONYM element when it has a TITLE attribute does not seem to be true generally.
As regards hyphenation, it is true that the soft hyphen is treated as a normal hyphen by both browsers. However, as a simple test shows, IE may break a line at a hyphen (normal or soft).
In the PRE element as supported by Netscape, the COLS attribute does not correspond to the WIDTH attribute as defined in the specifications. Instead, it instructs Netscape to wrap lines if the specified number of columns is exceeded, quite contrary to the basic meaning of PRE.
Although it is true that IE supports the INS and DEL elements, the support is rather simplistic and potentially misleading, especially for the former. IE uses (by default) underlining to indicate insertion (INS), which may very easily make people regard it as a link, due to the very common practice of using underline for links. Thus, the implementation is hardly satisfactory; notice that the HTML 4.0 specification says that "user agents should render inserted and deleted text in ways that make the change obvious" (emphasis mine).
The statement that "lists are supported by both browsers exactly as the specification states" should be taken with a grain of salt. The people who programmed the browsers never bothered trying to render DIR and MENU lists in the manner suggested in the specifications. The HTML 4.0 specification now says, with resignation:
The DIR element was designed to be used for creating multicolumn directory lists. The MENU element was designed to be used for single column menu lists. Both elements have the same structure as UL, just different rendering. In practice, a user agent will render a DIR or MENU list exactly as a UL list.
I'd say that the difference between UL, DIR and MENU was structural, reflecting substantial differences in what kind of a list we have. It's probably mostly due to lack of adequate implementations that this issue has now been turned into a purely presentational one.
Similarly to the general statement about HTML 4.0 support in the browsers in section Of HTML and learning it, a claim is made that "Netscape supports almost none of [the HTML 4.0 Table model]". Here, luckily, this is immediately followed by a statement which indicates that the author is referring to the novelties of HTML 4.0 as compared with HTML 3.2.
It is well known that there are serious bugs in the implementation of table rendering in the browsers. For example, the Netscape bug with nested tables (often completely incorrect rendering when omissible end tags are omitted) still prevails in Netscape 4.0.
It is true that "Netscape also supports the nowrap attribute to table cells as defined in the HTML 4.0 specification", but since this appears in a paragraph which begins "Netscape supports one or two additional features on top of the HTML 3.2 Table Model", it must be noted that the nowrap attribute was defined already in HTML 3.2.
The statement which says that IE supports the THEAD and TFOOT elements is technically correct in the sense that it recognizes the tags and is able to present the TFOOT part after the content of the table. However, both the heading part and the footer part are rendered similarly to normal table rows, without any scrollability with fixed headings on screen and without any repeated headings on new pages on paper output. This is rather far from the intended implementation; the HTML 4.0 specification says (in fact twice) the following:
Table rows may be grouped into a head, foot, and body sections, (via the THEAD, TFOOT and TBODY elements, respectively). Row groups convey additional structural information and may be rendered by user agents in ways that emphasize this structure. User agents may exploit the head/body/foot division to support scrolling of body sections independently of the head and foot sections. When long tables are printed, the head and foot information may be repeated on each page that contains table data.
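Since simply ignoring the THEAD and TFOOT tags (but not their content), which is what Netscape probably does, involves graceful degradation for THEAD but not for TFOOT (whose content must precede the table body in the source and would thus be shown before the body rows), we could draw the following practical conclusion: it is safe, and advisable, to use THEAD, but it is probably best to avoid using TFOOT.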
The document does not discuss the problems that have been
observed in the implementation of links
in some versions of the browsers at least. These include the failure
to support an
A NAME element with empty content
and failures to follow properly a link with a target anchor
inside a table.
The remark that the TITLE attribute for an A element "is very useful when referring to documents without specifying their full title" is valid, but the parenthetic remark '(like the all too common "please click here")' is misleading. It is easy to see it as a suggestion to keep using "please click here" and just add a TITLE attribute.
The statement "Sadly, the link element remains largely unimplemented" is true, but the explanation "mostly because there was never a concerned effort to define standard link types before the HTML 4.0 specification" is questionable. First, the main reason is that Netscape and Microsoft did not want to implement the element even in a very simple and obvious way - such as constructing buttons from LINK elements, using whatever REL (or REV) value there is. Second, there were many "concerned efforts". Third, the HTML 4.0 specification does not change the situation very much; it is about as vague as the HTML 3.2 specification in this respect, and the list of link types in the HTML 4.0 specification is wanting - it does not even contain a value indicating a simple and very common link upwards in a hierarchy (Up in some proposals).
As I remarked already in
my comments on the HTML 4.0 draft:
The draft lists some values without making any requirement on or even suggestion to supporting them. Authors "may use" some "recognized" link types which are claimed to have "conventional interpretations".
The full heading of this section continues with "Make it look nice". This suggests a rather one-sided view on the use of images and other illustrations.
The statement that "the img element is supported as stated in the specification by both browsers, with the exception of the longdesc attribute" can probably be regarded as technically true. However, there are serious flaws in the quality of implementation. The presentation of ALT texts when browsing with image loading off is often very poor. In particular, if the WIDTH and HEIGHT attributes are specified, they determine the area reserved for the image, and if the ALT text does not fit there, the browsers make no effort to fix the situation. Moreover, when those attributes are used to scale an image, the quality of the scaled image is often very poor. This suggests that authors should consider WIDTH and HEIGHT as defectively implemented and use them with caution.
The description of the implementation (or, rather, lack of implementation) of the OBJECT element is mostly correct and enlightening. However, the wording is partly sloppy. One should not characterize the OBJECT element as "the most reliable way to embed objects in HTML documents to date"! (The intended meaning is probably that it would be the most reliable if implemented adequately.) Antti Näyhä has composed a detailed account of OBJECT implementation on a set of browsers, including IE 4 and Netscape 4.
In the statement "The applet element is supported - - except for the object element" the latter "element" should read as "attribute".
This short section discusses only support for style sheet related HTML constructs, not support for style sheets themselves. This is natural due to the purpose of the document, but the short note about stylesheet support might give too optimistic a view: "Netscape and Explorer support stylesheets to a limited degree."
In reality, the support is not only limited but also buggy. See e.g. CSS References by WDG for information about the status of CSS support. In particular, there is a detailed Style Sheets Compatibility Chart by Web Review.
The following statement is oddly formulated:
The contents of the noframes element are ignored by both browsers, so it is a good idea to follow the specification when including such an element in a frameset document.
Of course, the reason for including a NOFRAMES element
(which is strongly recommendable, even required by the
is that there are browsers which are capable of presenting
its content, either as the only alternative for pages with frames
or according to user options (since there are browsers which
are both frames-capable and nonframes-capable).
The fact that Netscape and IE do not render the content of NOFRAMES simply implies that an author need not worry about causing problems to users of these browsers when writing that content.
The ACCEPT attribute for the FORM element, in addition to not being supported by Netscape and IE, is not even included in the formal syntax (DTD) in the HTML 4.0 specification. This is probably an oversight. Notice that due to this, the W3C validator gives an error message if a FORM element contains an ACCEPT attribute.
The discussion of the
WRAP attribute has a typo:
"wrap=soft or wrap=physical"
"wrap=soft or wrap=virtual".
But more importantly, it seems that the basic content here is wrong;
both browsers support
The major difference is that wrap=soft is the default on IE whereas Netscape has the correct default, no wrapping (which can be explicitly specified as wrap=off). For more information, see How to limit the number of characters entered in a TEXTAREA in an HTML form, especially section Implementations, especially wrapping.
This overview of proprietary elements recognized by Netscape and IE might be interesting to people who have to maintain pages written using them. However, despite the adequate warning which discourages their use, some formulations (e.g. "it is useful to note that multicol elements can be nested") might be read as encouraging, not discouraging.
Originally written in June 1998. No major changes have been made after that; a link was added, the presentation was (hopefully) made more pleasant by adding some styles and a table of contents, and a major correction regarding TEXTAREA was added.
Date of last update: 2000-02-18
Jukka Korpela, firstname.lastname@example.org