This document has been preserved for historical reasons only. It describes my experiences with the paid version of Britannica Online in 1995, when I worked at Helsinki University of Technology. Quite a lot has changed since then. Most of the links in this document probably no longer work.
Since this document is aimed not only at Encyclopædia Britannica Inc as feedback but also at various institutions considering subscription to Britannica Online, a short introduction to Britannica Online is provided.
Britannica Online is a network-accessible hypertext version of the well-known Encyclopædia Britannica. It also includes Merriam-Webster's Collegiate Dictionary and the Britannica Book of the Year. Britannica Online is copyrighted by Encyclopædia Britannica, Inc.
The idea of having a hypertext version of a high-quality encyclopædia on the net is very exciting. Consulting a large encyclopædia is something one would really like to do on a computer in a hypertext fashion, with fast access to information, computer-based searching tools, and hypertext links to follow. Such possibilities have existed for a few years in the form of encyclopædias on CD, but such an approach requires special equipment for the user and, more importantly, does not allow the information to be really up-to-date. In a network version, there is practically no delay between the time of updating the information by the information provider and the time of accessing the updated information by the user. It's all instantaneous. Moreover, a networked version can easily be linked to other sources of information on the network.
Technically Britannica Online, hereafter abbreviated BO, is a collection of files on the World Wide Web (WWW) system, accessible with any WWW browser such as Mosaic, Lynx, or Netscape. The files reside on a single server, www.eb.com. This of course implies limitations which may turn out to be difficult if (or when) the use of BO becomes extensive. It is not known whether BO is easily scalable by creating mirror servers.
However, the accessibility is restricted by arrangements related to the commercial nature of BO. There is a freely accessible description of BO (with demos) at URL http://www.eb.com/ but access to BO itself must be based on a paid contract (subscription).
However, it is quite odd that the pricing information is not on the publicly accessible pages. A potential customer would certainly like to have some idea of what a subscription costs.
The demos on the public pages only give a rough idea of what BO really is. For a genuine evaluation, more realistic use of BO is needed. Fortunately the company allowed a test (preview) period for Helsinki University of Technology, where there was considerable interest in BO, both in the Computing Centre and in the Library as well as elsewhere. This document is mostly based on experiences from the test period. Most of the testing was done using a graphical WWW browser (X-Mosaic), but the service seems to work quite well with text browsers (such as Lynx), too.
Search Britannica Online there.
It is also possible to select the corresponding URL directly,
http://www.eb.com:180/cgi-bin/g/Articles/HTML/0/ebonline/http/eb.html?Mode=F
but the length of the URL makes this a bit awkward. (Moreover, one cannot give the URL directly as an argument to the command for starting a WWW browser under Unix, since the question mark is a special character in normal Unix shells; the argument must be quoted.)
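The quoting problem can be illustrated with a minimal sketch. It assumes that Python's standard shlex module is available and uses lynx merely as a stand-in for whatever command starts the browser; neither is part of BO.

```python
import shlex

# The URL quoted in the text above; "lynx" stands in for any browser command.
url = "http://www.eb.com:180/cgi-bin/g/Articles/HTML/0/ebonline/http/eb.html?Mode=F"

# shlex.quote wraps the argument in single quotes, because "?" is a
# wildcard character that the shell would otherwise try to expand.
command = "lynx " + shlex.quote(url)
print(command)
```

Typing the single quotes by hand around the URL has the same effect.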
Having entered BO itself one can of course put the document into one's hotlist, to make it easier to access it in later sessions.
The document mentioned above is a fill-in form with two possibilities:
If one gives, for example, the word
Finland, one gets a list of items relevant to Finland,
and the first of them is a link to rather extensive information about
the country. However, one gets a bit confused by the fact that it does
not look like a normal encyclopædia entry. Basically the page contains
a list of contents, from which one can select the items one is
interested in. However, the first items on the page are titled
MICROPAEDIA MACROPAEDIA BRITANNICA BOOK OF THE YEAR 94 STATISTICAL INFORMATION: see BRITANNICA BOOK OF THE YEAR
and at this point the user probably wonders what Micropaedia and Macropaedia are. Thus, the user interface is not quite intuitive. The interface should at least be improved by explicit statements which recommend that the user consult the Micropaedia entry for a short description of the topic and the Macropaedia entry for a long description, and suggest that the user can pick up an interesting subtopic from the list that follows.
The text search example suggests that answering natural language (English) questions is supported, which probably means promising a bit too much. With some testing one can find out that the search is based on the words in the search string, not on full grammatical and semantic analysis.
An annoying feature is that if there is just one match in the search, it is not displayed directly. The user sees a message which says there was one match and contains a link to it. Thus a search which is in a sense as successful as a search can be (exactly one match) has the frustrating effect of forcing the user to do something that could easily be done by a computer. Usually just a few seconds of the user's time are wasted, but the psychological effect might be important.
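The behaviour suggested here is simple to express. The following is only a sketch of the idea, not a description of how BO works; the function name, the result format, and the example paths are all hypothetical.

```python
def respond(query, hits):
    """Decide what to show for a search; hits is a list of (title, url) pairs."""
    if len(hits) == 1:
        # Exactly one match: show the document itself, not a one-item list.
        _title, url = hits[0]
        return {"action": "show_document", "url": url}
    # Zero or several matches: show the usual report with links.
    return {
        "action": "show_list",
        "message": f"{len(hits)} items relevant to '{query}'",
        "hits": hits,
    }


# Hypothetical example data.
single = respond("preface", [("Preface", "/doc/1")])
several = respond("islands", [("Canary Islands", "/doc/2"), ("Faroe Islands", "/doc/3")])
```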
More generally, there is insufficient information about the search methods. The search reports are rather technical, and they do not describe the logic of searches. Especially an advanced user with difficult questions would appreciate a good explanation of the search logic, both in order to understand the limitations and in order to formulate the search expressions in a better way. For instance, is the order of words in a search string significant?
As regards search reports, they emphasize the internal technical issues, not the user view. In particular, they report, for each word in the query, the technical processing of the words, instead of simply listing the relevant words, ie those words that were actually used, as opposed to words which are effectively treated as grammatical noise.
Typically it seems to take 5 - 10 seconds to get a page from BO. Occasionally there are longer delays. Just following links is not substantially faster than getting the result of a search. This suggests that the search methods have been implemented efficiently.
Sometimes a search fails with a message reporting that the server could not be accessed. This means that the WWW client cannot connect to the server fast enough, ie a timeout occurs. This problem is related to WWW as a whole and normally caused by an overload of communication lines or the server. However, it emphasizes the need for a distributed implementation of BO.
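A user-oriented report of the kind suggested above could look roughly like the following sketch. The list of noise words and the sample query are assumptions made for illustration; the actual word list used by BO is not known to me.

```python
import re

# Assumed list of "grammatical noise" words; BO's real list is unknown.
NOISE_WORDS = {"a", "an", "the", "of", "is", "what", "how", "can", "from", "to"}


def split_query(query):
    """Split a query into the words actually used and the words ignored."""
    words = re.findall(r"\w+", query.lower())
    used = [w for w in words if w not in NOISE_WORDS]
    ignored = [w for w in words if w in NOISE_WORDS]
    return used, ignored


def user_report(query):
    """Build a one-line report from the user's point of view."""
    used, ignored = split_query(query)
    return f"Searched for: {', '.join(used)}. Ignored: {', '.join(ignored)}."


print(user_report("What is the capital of Finland?"))
```

A report in this form tells the user at a glance which words mattered, which also helps in reformulating a query.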
Britannica Online is a fully searchable and browsable collection of authoritative references, including Britannica's full encyclopædic database, Merriam-Webster's Collegiate Dictionary (Tenth Edition), the Britannica Book of the Year, and more.
The phrase 'and more' suggests that the list is not exhaustive. Moreover, a normal user probably does not know what 'Britannica's full encyclopædic database' is and what Propaedia, Micropaedia and Macropaedia, which occur in documents in BO, are. Some conceptual model, an orientation basis, should be provided to assist people in understanding what there is in BO. It actually exists but under the somewhat misleading name Databases in Britannica Online, and it should be extended with more pragmatic issues such as how the databases relate to each other and which of them are suggested for various purposes.
By the way, the Book of the Year is the 1994 edition, describing events of 1993. In March 1995, one might expect to find (also) the Book of the Year 1995, or at least information about the date of its availability.
The basic search page does not explicitly state what is the collection of information (database) from which searches are made. Each user has to figure out by himself that probably just the encyclopædic databases are searched - but exactly what databases? To make a dictionary search, one has to explicitly select the dictionary, for example. This is possibly a good idea, since a user who really wants encyclopædic information may prefer getting a failure message to getting a dictionary entry containing no useful information.
However, it would be useful to provide, as an option, a search through all available databases, either sequentially or in parallel (in a manner similar to multithreaded query gateway).
There is a fundamental flaw in the manner in which search reports
are expressed. If one searches, for example, for the word
preface from the basic search form (with index search),
the report says
Britannica Online contains 1 item relevant to 'preface'.
and if the search is made from the dictionary, the report says
Britannica Online contains 2 items relevant to 'preface'.
although the three items are all disjoint. This also implies that the system may report failure (0 items) even if BO does contain relevant items - in another database. Thus, the reports should explicitly refer to a particular database (or set of databases), not to Britannica Online as a whole. The information returned to the user should always indicate the database from which the information has been extracted. In addition to assisting the user, this would make it much easier to report errors in the information contents. (For instance, such reports in this document are not necessarily accurate enough, since I do not always know the database.)
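The two suggestions made above, an optional search through all databases in parallel and reports that name the database, can be sketched together as follows. The in-memory dictionaries are hypothetical stand-ins for the real databases, and the entry names are invented; only the counts (1 and 2 items for 'preface') come from the reports quoted above.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the separate databases behind BO.
DATABASES = {
    "encyclopaedia index": ["preface"],
    "dictionary": ["preface (noun)", "preface (verb)"],
}


def search_one(name, term):
    """Search a single database; return its name with the matching entries."""
    entries = DATABASES[name]
    return name, [entry for entry in entries if term.lower() in entry.lower()]


def search_all(term):
    """Search every database in parallel and keep the results apart."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda name: search_one(name, term), DATABASES)
        return dict(results)


def report(term):
    """One report line per database instead of one line for BO as a whole."""
    return [
        f"{name}: {len(hits)} item(s) relevant to '{term}'"
        for name, hits in search_all(term).items()
    ]


for line in report("preface"):
    print(line)
```

With reports of this kind a result of 0 items in one database can no longer be mistaken for a statement about Britannica Online as a whole.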
The implementation of references (links) looks silly. For example:
WWW: see World Weather Watch [Cross ref]
Here the words in brackets act as WWW links, instead of the much more natural and WWW-like style of making the words or expressions themselves into links. Presumably there have been some technical difficulties in converting the book form of Encyclopædia Britannica into hypertext.
Anyway, it is not comfortable to read something like
The order [Index] Anseriformes includes the well-known [Index] ducks, [Index] geese, and [Index] swans (family [Index] Anatidae) and the little-known [Index] screamers (family Anhimidae).
A reader who does not, for the moment, care about links but just about the text itself finds it difficult to read. And when one wants to follow links, one often does not know what they relate to (eg in the first link above, does it relate to the biological concept order or to the particular order Anseriformes?). Still worse, the URLs are usually complicated and usually do not suggest too well what the document is about.
Such a document should of course contain pointers (links) to more detailed information, but it should be as such sufficient for the normal user for getting started. It should also be in a form suitable for printing on paper (ie preferably as a single HTML document) and declared public domain, so that subscriber organizations could make paper copies of it (or translations of it into various languages) to their members, to promote the use of BO.
I am looking for information about Eero Saarinen, the architect.
The simple, obvious method of index search with string
Eero Saarinen was successful. However, I encountered
the annoying feature (explained later in some more detail) that BO
also offers me information about Eero Erkko and Eliel Saarinen.
Following the obvious link is an obvious thing to do, but I think I should not have been compelled to do it, since even a computer should be able to figure out that only one of the hits was a true hit. Now I see the following:
Saarinen, Eero (Am. arch.)
collaboration with
Eames
Roche
contribution to modern architecture
Gateway Arch
If I pretend to be an inexperienced user of BO, I cannot quite see
that the first line is a link to a (good) biography of Eero Saarinen,
with links to the topics mentioned above, among other things.
Perhaps this is just a matter of taste, but I would prefer getting directly to the biography and following the links there if I like. The current approach suggests that the only information about Eero Saarinen on BO is about his collaboration with Eames and Roche, his contribution to modern architecture, and Gateway Arch (which logically is part of the contribution, by the way).
Problems do not end here. In fact I was interested in the Gateway Arch. (It is a natural thing to expect that there is some picture of it, but that is a different issue.) It is mentioned at the end of the biography, and there is even a link to more information - at least that is what it looks like. In reality, following the link gives me
Jefferson National Expansion Memorial (St. Louis, Mo., U.S.) design by Saarinen
which is something I had just read. Being an optimist, I assume the link "design by Saarinen" leads me to information about the design. In reality it gets me back to the end of Saarinen's biography.
Oh, but now I notice the symbol containing the words Next section. So the biography actually continues (but with no new information about the Arch). Perhaps I am stupid, but I really did not realize at first that the biography was divided into sections, linked together. Whether this is a good approach is debatable, especially because the division does not seem logical, and it is not even practical since the sections are longer than one page, so that the user must anyway know how to scroll within a section. It would be better to organize information like this into a single WWW document. (Larger documents should of course be organized hierarchically, with tables of contents, which actually seems to be the approach adopted.)
Now, I still hope to know more about the Arch, so I return to the page I originally got as response to the search, and I pick up the link to Gateway Arch there. I find myself at the end of some document. Knowing something about my WWW browser, I scroll upwards to see that the document is about St. Louis. There really is some additional information about the Arch in the document. Wanting still more, I look carefully at the text
stainless-steel [Index] Gateway Arch, designed by Eero [Index] Saarinen
which promises me two links. The latter is probably to something I have already read, since I am able to guess that [Index] refers to an entry about Eero Saarinen, although it is funnily placed between his first and last name. The first link is intuitively less clear. I had learned that [Index] is often a link to information about something that is mentioned before it, and currently I am not so enthusiastic about stainless steel. But it turns out, as I had anticipated, that this time the link is to information about something mentioned after the code [Index]. What I get is
Gateway Arch (mon., St. Louis, Mo., U.S.) Saint Louis [Ref 1]; [Ref 2]
This seems to lead me back to the article about St. Louis, which presumably mentions the Arch in two contexts. Fine, in a sense. But in those contexts there are links which are not natural cross-references within the same document but links to the above-mentioned page which in turn contains links to the two contexts. That is, yet another frustrating indirection. This approach is probably good when there is a large number of contexts to refer to, and perhaps it would take too much work to handle simpler cases in a simpler, more user-friendly way.
However, another link (a link to the entry labelled sweeteners) pointed to a document with numerically different information: If sucrose is taken as a standard of 1, the sweetness of - - lactose is 0.27. This deviation is of course caused by an inconsistency in Encyclopædia Britannica itself, not by the search methods of BO. In fact, BO is a valuable tool in improving the contents of Encyclopædia Britannica, since using BO one can relatively easily access different Encyclopædia articles related to the same topic and find inconsistencies.
Finland leads
to information about Finland in a nice (but not optimal) way.
It is easy to navigate to those issues about Finland which
one is interested in.
However, the information contents can be criticized. There are misspellings such as Helsingen Sanomat for Helsingin Sanomat. (This misspelling occurs not only in the title but also in the article itself, and the article also misspells the earlier name Päivälehti of the paper as Paivalehti.) The selection of topics is not very balanced. It looks odd to a Finn that the only subtitle under the title communications is Helsingen Sanomat, or that the only social issues in Finland seem to be alcohol consumption and prohibition. It is difficult to say how the situation could be improved, but obviously the idea is to present links to special articles about some topics in a classified manner. Care should be taken to avoid the impression that the list of articles covers everything there is about the whole topic on BO.
There are even some plain errors. For instance, the Macropaedia entry for Finland says: independence was formally recognized by the Soviet Union in 1920. In fact it was recognized at the end of 1917 (and reconfirmed in the Tartu peace treaty in 1920), not by the Soviet Union which did not exist at that time but by Russia. (As the BO entry for the Soviet Union correctly describes, the Union was established in 1922. Whether the Union was a nation, as the explanation (hist. nation, Eurasia) suggests, is debatable.)
These remarks apply of course to Encyclopædia Britannica itself, not to BO in particular, but a user of BO inevitably judges the product on the basis of the correctness of its information contents. Moreover, the nature of BO would give an excellent opportunity to get feedback from readers, for instance in the form of suggested corrections submitted through WWW forms.
However, there is a serious error in the information contents: The statement Almost half of the inhabitants are Swedish-speaking is utterly false. The percentage of Swedish-speaking inhabitants has been around 20 % for decades.
Olli Lounasmaa
(the name of a famous Finnish contemporary scientist) gives one
hit, a link named Joensuu.
Following that link, the user sees a text in which the string
Olli is highlighted.
The information (about the town of Joensuu) is correct, but it
is totally misleading that the search returns a link to it.
'Olli' is a common first name in Finland, and returning
a link to a document which happens to mention some Olli when
the search string was a name containing that first name is
quite unacceptable behaviour.
In general, the searches seem to be much too powerful in the sense that eg with a two-word search string a match is reported when there is a match for one word only. Notice, in particular, that the first response to the query says:
Britannica Online contains 1 item relevant to 'Olli Lounasmaa'.
without mentioning at all that a match was found for
Olli only.
(The detailed search report does describe the situation, although
not very clearly, but users normally don't consult such technical
reports unless they see an obvious reason to do so.)
Canary Islands
results in a long list of hits, obviously just because the
word islands appears in them.
The good point is that the real hit, for the Canary Islands, appears
on top - not as the first but as the second, the first one being
'Canary Islands chaffinch (bird)'!
However, returning irrelevant information in addition to relevant
is annoying,
when there is an obvious technical method to avoid it
(strongly prefer full match to partial match),
and it is probably a symptom of the same flaw as the
bad behaviour described above.
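The rule "strongly prefer full match to partial match" can be made concrete with a small sketch. It is not a description of BO's actual ranking; the function names are my own, and the title 'Faroe Islands' is a hypothetical example of a hit that matches only the word islands.

```python
import re


def tokens(text):
    """Lower-case words of a text, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())


def rank_hits(query, titles):
    """Order titles so that full matches precede partial matches.

    Among equally complete matches, titles with fewer words foreign to
    the query come first; titles sharing no word with the query are dropped.
    """
    wanted = set(tokens(query))
    scored = []
    for title in titles:
        have = set(tokens(title))
        if not wanted & have:
            continue
        missing = len(wanted - have)  # query words absent from the title
        extra = len(have - wanted)    # title words absent from the query
        label = "full" if missing == 0 else "partial"
        scored.append((missing, extra, title, label))
    scored.sort()
    return [(title, label) for _, _, title, label in scored]


hits = rank_hits(
    "Canary Islands",
    ["Canary Islands chaffinch (bird)", "Faroe Islands", "Canary Islands"],
)
for title, label in hits:
    print(label, title)
```

Labelling each hit as full or partial also addresses the case described earlier, where a match for one word only was reported as if it were a match for the whole search string.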
Admittedly the first hit in the list returned does contain the answer, and in the displayed excerpt from the document the word capital is in bold face.
Thus, text search seems to work reasonably well for simple questions, but the format of giving answers is far from optimal.
To solve the practical problem of producing vinegar from my own wine, I asked the question How can I make vinegar from wine? Quite obviously the question was ill-posed, relative to the search strategies of BO, since the first entries in its answer were about
Studying the answers and the search report closely, I found out that BO had used the words I, make, vinegar, and wine in the search, ignoring other words. Paying attention to the pronoun I explains the strange search results.
In fact, in the light of my previous experiences with BO I had anticipated this problem, and actually I first used index search with the string vinegar. The entry for this topic was returned, and it indeed gives a good tutorial introduction to the process of making wine vinegar. I will try it.
The lesson is that index search can be much more efficient than text search, at least at the current stage of the development of text search strategies within BO.
WWW
gives a link to a short note which gives a link to
World Weather Watch.
A naive user might expect that BO, being implemented with the aid of
World Wide Web (WWW),
would contain some information about it.
On the other hand, there is a good (although short) article
about Internet. But in this context I noticed that obviously
some articles contain links to information about their authors.
In this case the link is named B.Ka. However,
following that link I get a page on which there is information
about several writers. I have to scan through it in order to find
the information about B.Ka. This is something that would better be
performed by a computer program, ie an author link should
point directly to information about that person.
Admittedly, BO provides a method for restricting the search to those articles containing all the terms in the query. Such a method is useful but insufficient, since for complicated searches a hit for all terms is often very improbable.
In general, the approach adopted is a good compromise between readable and natural notation on one hand and manageability by WWW browsers on the other hand. However, as indicated above, there are some problems in presenting correctly e.g. the Scandinavian letter ä.
Moreover, there are deficiencies in presenting mathematical notations and deviations from the principles presented in the FAQ. For instance, the entry for ampere contains the notation
2 {times} 10{sup -7} newton
which is almost unreadable, or at least requires imagination.
A better approach for such notations would be to present them
using the clumsy but rather generally understood linearized notation like
2*10**(-7) newton
or, in this particular context, 200 nanonewton or 0.2 micronewton.
Notice that the SI system recommends that when using an exponent notation, the exponent of 10 should be evenly divisible by 3. This principle is logically compatible with the use of prefixes like k, m, and M (indicating multiplication by a power of ten with an exponent divisible by 3), but it is not generally applied in the representation on BO.
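The conversion to a prefixed unit is mechanical, as the following sketch shows. It assumes Python's decimal module, covers only a small, arbitrarily chosen range of prefixes, and writes the prefix names out in full; the function name is my own.

```python
from decimal import Decimal

# Prefixes whose powers of ten are divisible by 3 (a deliberately small range).
SI_PREFIXES = {-9: "nano", -6: "micro", -3: "milli", 0: "", 3: "kilo", 6: "mega", 9: "giga"}


def with_si_prefix(value, unit):
    """Express value (given as a string such as '2e-7') with an SI prefix."""
    number = Decimal(value)
    if number == 0:
        return f"0 {unit}"
    # Round the power of ten down to a multiple of 3, within the table's range.
    exponent = (number.adjusted() // 3) * 3
    exponent = max(-9, min(9, exponent))
    scaled = number.scaleb(-exponent).normalize()
    return f"{format(scaled, 'f')} {SI_PREFIXES[exponent]}{unit}"


print(with_si_prefix("2e-7", "newton"))
```

Decimal arithmetic is used instead of floating point so that a value like 2e-7 scales to exactly 200 without rounding noise.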
A similar problem is the notation
4{degree} C
in the Macropaedia entry for Metric system.
If one wishes to avoid using the notation 4° C (because
it may be displayed incorrectly by browsers), it is much better
to write it without abbreviations, ie 4 degrees Celsius, than to
use an ad hoc notation which indicates an abbreviation.
By law, no part of this work may be reproduced or utilized in any form, except for copying of brief excerpts as permitted under U.S. copyright law.
This is a tricky legal issue. There are different copyright laws in different countries, and normally an act is to be judged according to the laws of the country where the act took place. For instance, the Finnish copyright law does not limit the right to quote published works to "brief excerpts" but on other grounds.
It can be argued that the existence of the copyright notice constitutes an agreement, ie that the user commits himself to obeying U.S. legislation in this context and a violation could be treated as violation of that agreement. The tricky point is that signed agreements are not made by end users but by an institution, which cannot control the acts of its members in this respect.
The pragmatic side of the issue is that information searches are made for various purposes, including purposes for which it is essential to be able to quote parts of the information. The question arises which institution decides the extent of allowable quotations and on what grounds.
Subscribers only area,
after the arrangements made for the test period, and after having
accessed that area successfully, I got (when using X-Mosaic under Unix)
the following error message:
Britannica Online Home Page Unauthorized Access You have accessed a hypertext link to the Encyclopædia Britannica proprietary database from bastion.eb.com (198.242.219.5). This host has not been enabled for access.
This error was intermittent; subsequent accesses were successful.
The beta version 1.1 looks quite different from the current "normal" version. It is difficult to say to what extent the changes are improvements. In general, I think that frequently used user interfaces should be kept stable, making changes only if they are definite and considerable improvements. However, the customer base of BO is probably not very wide yet, so some experimentation is understandable at this stage.
The basic difference seems to be that instead of two search methods there is, superficially, only one. The difference between index search and text search is probably now implemented in the selection of reference in the unified search scheme. I am inclined to think that this is a better approach, but once more I emphasize the need for better user documentation. A beginning user should be assisted so that he gets very early an idea of the difference between index search and text (or free) search.
There is quite a lot of development work to be done, and apparently it is being done, but even in its present state Britannica Online can significantly improve the productivity of teachers, research workers, other staff, and students at universities.