How to limit the number of characters entered in a textarea
in an HTML form
and other notes on textarea elements

Especially in applications where data is entered to a database or stored onto disk on the server, some limitations must be imposed on the amount of data. It would be nasty if a database crashed or a disk got filled with terabytes of data, sent by some user out of ignorance, mistake, or malevolence.

But in HTML, there is no way to limit the number of characters entered by the user in a textarea element. Browsers may impose some limitations, but they are a problem, not a solution. The server-side script that handles the form submission needs to check for an excessive amount of input. Client-side scripting in JavaScript can be used for auxiliary checks in order to give the user faster feedback when he tries to exceed a limit. This document briefly describes both simple JavaScript checking on form submission and more real-time checking based on counting characters as they are typed.

The meanings of the rows and cols attributes

In an HTML form, a textarea element specifies a text input area. The author can, and indeed must, suggest the visible size of the area, using the rows and cols attributes. But these should not be taken as limitations to the amount of data. The HTML specifications have explicitly said that from the very beginning; the HTML 2.0 specification said: "HTML user agents should allow text to extend beyond these limits by scrolling as needed", and the current specification, HTML 4.01, repeats this more verbosely:

rows = number [CN]
This attribute specifies the number of visible text lines. Users should be able to enter more lines than this, so user agents should provide some means to scroll through the contents of the control when the contents extend beyond the visible area.
cols = number [CN]
This attribute specifies the visible width in average character widths. Users should be able to enter longer lines than this, so user agents should provide some means to scroll through the contents of the control when the contents extend beyond the visible area. User agents may wrap visible text lines to keep long lines visible without the need for scrolling.

Implementations, especially wrapping

All browsers seem to allow the input of an unlimited number of lines in a textarea, except that there can be a browser-specific limit on the total number of characters.

However, several browsers (e.g. Internet Explorer and Opera) violate the above-cited principle on line length. Instead of allowing users to scroll horizontally, they "soft wrap": when line length would exceed the visible width, they visually display the text in two or more lines. The actual data sent by the browser does not contain line breaks in the positions where the browser has "softly" broken a line; only actual line breaks entered by the user are included in the data. This causes confusion: the user sees the text in lines without knowing how it will actually be sent. Sometimes this doesn't matter, sometimes it does.

Some Web authors regard Netscape 4's default behavior - horizontal scrolling when needed - as a problem. This typically indicates that the author is trying to use a textarea element for something other than data input. The Netscape 4 behavior in this respect surely complies with the specifications. Whether IE behavior is a bug or just poor quality is debatable.

By using the attribute wrap="off" in the textarea tag one can make Internet Explorer behave according to the specifications, without affecting other browsers. It is of course stupid that we need to use a nonstandard attribute to make a browser behave in a standard manner, but what can you do? (Unfortunately, there does not seem to be any way to make Opera behave in this respect; it ignores the wrap attribute.)

Using Cascading Style Sheets (CSS), you can achieve the same effect with white-space: nowrap; overflow: auto;. Thus, the wrap attribute can be regarded as outdated.

The wrap attribute is a horrendous kludge in other ways too. It is not in any HTML specifications, but it is recognized by (sufficiently new versions of) IE and Netscape as follows:

The wrap attribute for textarea on IE and Netscape
attribute | wrapping behavior | note
wrap="off" | no wrapping; horizontal scrolling if needed | default on Netscape 4
wrap="soft" | "soft" wrapping: the browser divides the text into lines to make it fit horizontally but does not thereby introduce actual line breaks into the data | default on IE and Netscape 6 (a mistake)
wrap="hard" | "hard" wrapping: the browser divides the text into lines to make it fit and thereby introduces actual line breaks into the data | not supported by Netscape 6

Stephanos Piperoglou's generally excellent review HTML 4.0 in Netscape and Explorer says, in section Forms, that IE does not support values soft and hard but recognizes virtual and physical instead. This does not seem to be correct. Miko O'Sullivan's detailed Mikodocs Guide to HTML says, in the description of wrap for textarea: "You may from time to time see other variations on WRAP, such as VIRTUAL or PHYSICAL. Netscape introduced these attributes a few years ago as proposed extensions to HTML 3.0, then abandoned them." It's very hard to say how different versions of Netscape, on different platforms, actually behave. My tests with Netscape 4.04 on WinNT suggest that the default is (correctly) no wrapping, the value off just confirms the default, hard and soft work as described above, but any other value is equivalent to soft! Naturally a browser should ignore an attribute setting it does not recognize, such as wrap="foobar", but Netscape treats it as equivalent to the non-default setting wrap="soft".

Thus, wrap="off" might be useful since it overrides the wrong default (wrap="soft") on IE. It is questionable whether the other values should ever be used.

Although "hard" wrapping might appear to be a way to limit the line length (to the value specified by the cols attribute), this is not a reliable method. The wrap attribute is poorly documented and probably does not have any effect on most browsers other than IE and Netscape, and it may behave differently even on different versions of IE and Netscape. You still need server-side checking (or other processing of too long lines) if it is essential that lines not exceed a limit you need to set. And users who are accustomed to "soft" wrapping (which is, after all, the default on IE) might easily be lured into thinking that their text just "soft wraps", without realizing that their lines will actually be sent as broken.
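A server-side check of line lengths can be quite short. The following is only a minimal sketch in JavaScript, assuming that the form handler already has the submitted text available as a string; the function name findLongLines and the limit of 30 characters are made up for this illustration.

```javascript
// Hypothetical helper: report the lines that exceed maxLen characters.
// Line breaks may arrive as CR LF, LF, or CR, so all three are accepted.
function findLongLines(text, maxLen) {
  const lines = text.split(/\r\n|\r|\n/);
  const tooLong = [];
  lines.forEach((line, index) => {
    if (line.length > maxLen) {
      tooLong.push({ lineNumber: index + 1, length: line.length });
    }
  });
  return tooLong;
}

// Example: the second line has 45 characters, which exceeds the limit of 30.
const sampleText = "short line\r\n" + "x".repeat(45) + "\r\nanother short line";
console.log(findLongLines(sampleText, 30));
```

What to do with the offending lines (reject the submission, or break the lines on the server) depends on the application.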

Wrapping implies some potentially very nasty effects. When wrapping, either "soft" or "hard", is applied, the browsers (Netscape, IE, Opera) basically break between "words", i.e. strings separated by spaces. This is natural and acceptable, assuming that wrapping is acceptable at all. But if a "word" is longer than the textarea width, as set by the cols attribute, the browsers will split it! You can probably see this below, where we have a few textareas with prefilled content "Supercalifragilisticexpialidocious." as one line but with cols set to 20. Your browser probably splits at least some of the strings into two lines.

No wrap attribute:

With wrap="off":

With wrap="hard":

Such splitting can be disastrous if the user wants to type a URL into a textarea, especially if "hard" wrapping is on, since the URL will actually be split into pieces. Even with "soft" wrapping the user will be confused by the apparent splitting; how can he know whether it will actually be sent that way? (You can hardly expect a normal user to peek at the HTML source and consult references which say what it causes and figure out which reference gets it right.)
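The effect can be mimicked with a simplified model. The sketch below is not the algorithm of any actual browser (browsers work with rendered widths, not character counts); it just assumes a fixed number of characters per row, breaks at spaces, and cuts any "word" that is longer than the row. The function name wrapLine and the example URL are made up for this illustration.

```javascript
// Simplified model of wrapping one line of input to rows of `cols` characters.
// Assumption for illustration only: every character has the same width.
function wrapLine(line, cols) {
  const rows = [];
  let current = "";
  for (const word of line.split(" ")) {
    let rest = word;
    const candidate = current === "" ? rest : current + " " + rest;
    if (candidate.length <= cols) {
      current = candidate;          // the word still fits on the current row
      continue;
    }
    if (current !== "") {
      rows.push(current);           // start a new row for this word
      current = "";
    }
    while (rest.length > cols) {    // a "word" longer than the row gets cut
      rows.push(rest.slice(0, cols));
      rest = rest.slice(cols);
    }
    current = rest;
  }
  if (current !== "") rows.push(current);
  return rows;
}

console.log(wrapLine("Supercalifragilisticexpialidocious.", 20));
// → [ 'Supercalifragilistic', 'expialidocious.' ]

// With "hard" wrapping, the row breaks become part of the data:
const hardWrapped = wrapLine("http://www.example.com/some/long/path", 20).join("\r\n");
console.log(JSON.stringify(hardWrapped));
// → "http://www.example.c\r\nom/some/long/path"
```

In this model, the URL that arrives at the server is no longer a usable URL, which is exactly the problem described above.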

As if this weren't enough trouble, IE takes liberties in splitting words and "words". Just as it splits a word containing a hyphen after the hyphen when formatting normal data for display, it will break foo-bar into foo- and bar in textarea input if the first part still fits into the line but the second part won't. It also breaks after several special characters, which is harmful for URLs and other strings like foobar%zap, especially when hard wrapping is in effect.

Since Netscape 6 seems to have soft wrapping on by default and does not even honor wrap="off", and since IE 6 and Opera 6 keep wrapping too, should we deduce that the original intended processing of textareas has been replaced by a "de facto" browser standard? Maybe, but that's very unfortunate.

Background: the two models of text input

The approach described above has been criticized for not being user-friendly. And there's certainly a point in the note that the "HTML way" of text input does not correspond to the intuitive expectations of people acquainted with text processing programs. A Usenet article by Simon Brooke summarizes this well:

Textareas are for input of larger amounts of text. Sometimes this text necessarily has arbitrarily long lines. Very often it doesn't. Naive users, or users carrying expectations over from other software, become confused and disoriented either when the caret goes out of the viewport, or the viewport scrolls laterally. For these users, the 'valid' form of the textarea widget is a user-hostile control.

The fundamental problem here is that there are two different mental models (and corresponding implementations) of "typing text". The first one, the older one, is based on explicit line breaks entered by the user. It corresponds to typing on a typewriter, it is common among programmers, and it's also the model on which several Internet protocols (like E-mail and Usenet protocols) are based. The second one, now more common among "ordinary users", was introduced by text processing programs (as opposed to text editors), and it means that the user need not, and normally should not, hit Enter or Return but just watch the program divide the text into lines. Enter or Return generally means end of paragraph in this model.

Quite some confusion has arisen when the two models, or conventions, have been used in the same environment without any conventions and arrangements for conversions. The confusion is described somewhat more technically at the end of an otherwise all-too-technical Unicode report on newline guidelines.

The HTML specifications are clearly based on the first mental model as regards textarea. Quite a few browsers have implemented textarea more or less according to the "text processing" model, at least optionally. And it's optional in the sense of being, to some extent, settable by the author.

This adds confusion to confusion. If the textarea element were specified more flexibly, it could be a browser option whether it works in "typewriter mode" or "text processing mode". Each user could then select the method he is familiar with, or just prefers. But now users have to guess how each textarea works; you can't see it without trying, or peeking at the HTML markup.

Since both browser behavior and user behavior (i.e., users' understanding of how their input will be handled, by the browser and by the server, if they can tell the difference) vary, you cannot really know much about the intended newlines in input. You can't even know, in general, whether they were entered by the user or inserted by the "friendly" browser. So if it's some text that might logically consist of paragraphs, you can't recognize paragraphs without special conventions. The best workaround is probably an explicit statement "please use an empty line between paragraphs", if you wish to be able to recognize paragraphs, as you probably do quite often, even for a simple guestbook application.
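If users follow the "empty line between paragraphs" convention, recognizing paragraphs in the form handler is straightforward. Here is a minimal sketch in JavaScript; the function name splitParagraphs is made up for this illustration, and the choice to join the lines of a paragraph with single spaces is just one possible design.

```javascript
// Hypothetical helper: split textarea data into paragraphs, treating one or
// more empty (or whitespace-only) lines as the paragraph separator.
function splitParagraphs(text) {
  return text
    .replace(/\r\n|\r/g, "\n")                 // normalize line breaks to LF
    .split(/\n(?:[ \t]*\n)+/)                  // split at blank lines
    .map((para) =>
      para
        .split("\n")
        .map((line) => line.trim())
        .join(" ")                             // single line breaks become spaces
        .trim()
    )
    .filter((para) => para !== "");            // drop empty leftovers
}

const input = "First line\r\nstill first.\r\n\r\nSecond paragraph.\r\n\r\n\r\nThird.";
console.log(splitParagraphs(input));
// → [ 'First line still first.', 'Second paragraph.', 'Third.' ]
```

Since single line breaks inside a paragraph are turned into spaces, it does not matter here whether they were typed by the user or inserted by "hard" wrapping, except when the browser has split a word.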

Limitations imposed by browsers

Some browsers impose some limits on the amount of data that can be entered in a textarea. Limits like 32 or 64 kilobytes (32,768 or 65,536 characters) have been observed. Such limits, if they exist, are caused by simplistic implementations, and they are independent of the values of the rows and cols attributes.

Such limitations cannot be taken as solutions to the problem of limiting textarea input size. Instead, they constitute a problem: they are just browser-specific limitations, which shouldn't really exist and will hopefully be removed in new versions.

It is unlikely that a user takes the trouble of typing more than 32,768 characters into a textarea when filling out a form. Browsers' user interfaces for such purposes are generally very poor, with extremely limited editing capabilities. But a user might cut and paste some long text which he has typed using an editor or a text processing program.

If you expect that some users might wish to include very long texts when filling out a form of yours, consider making it possible to use alternative methods of data submission. Depending on the case, this might mean one or more of the following:

About textarea vs. input type="text"

Basically, a textarea is for unlimited, usually multi-line input of text, whereas input type="text" is for single-line input.

For input type="text", we can use the size attribute to specify the visible size of the field, in characters. But we can also use the maxlength attribute to specify the maximum number of characters that can be entered. Browsers generally enforce such a limit. However, an author should still assume that the limit can be exceeded and test things server side (as explained below); for reasons for this "paranoia", see some words of warning (in How can I make a field readonly in a Web form?).

Thus, for small amounts of user input, you could use one or more input type="text" elements instead of a textarea element. If you need to include several single-line input fields, note that the maxlength attributes set a separate limit on each field, and note that the user cannot simply continue typing, or press enter, after typing a line and wishing to continue. (He can use tabbing, though.) In fact, relatively often pressing enter in a single-line input field submits the form! So this approach is a bit problematic. (There's of course the additional problem that the server-side script needs to process all the single-line input fields, perhaps concatenating them into one string, adding line breaks or spaces between the values.)
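The server-side processing of such a set of fields might look like the following sketch in JavaScript. It assumes that the handler has already collected the field values into an array, in order; the function names and the limit of 14 characters per field are just assumptions for this illustration.

```javascript
// Hypothetical helper: check each field value against the per-field limit,
// since maxlength in the markup does not guarantee anything.
function fieldsWithinLimit(values, maxLen) {
  return values.every((value) => value.length <= maxLen);
}

// Hypothetical helper: combine the non-empty field values into one string,
// using the given separator (e.g. a line break or a space).
function joinFields(values, separator) {
  return values
    .map((value) => value.trim())
    .filter((value) => value !== "")
    .join(separator);
}

const fields = ["first line", "", " second line "];
if (fieldsWithinLimit(fields, 14)) {
  console.log(JSON.stringify(joinFields(fields, "\r\n")));
  // → "first line\r\nsecond line"
} else {
  console.log("At least one field exceeds the limit - submission rejected.");
}
```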

The following form lets you test how the idea works on your browser.

Please enter data (at most 14 characters per line):



Server-side checks

Generally, the server-side script that handles a form submission should perform data consistency and acceptability checks on the form data before doing anything else.

At the simplest, the form handler could first just look at the Content-Length header in HTTP headers and discard the submission (politely, perhaps), if there is no such header or its value is larger than some limit. But you would still need the code for actually processing the data when its size is acceptable, and that code must check the amount of data, since a cracker could have faked Content-Length: 42 and still sent you megabytes of junk.
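The two-step idea can be sketched as follows in JavaScript, here assuming a Node.js environment (for Buffer.byteLength). The function names are made up, and it is an assumption of this sketch that the headers are available as a plain object with lower-case names and that the body arrives as a sequence of chunks.

```javascript
// Step 1: decide, from the declared size alone, whether to read the body.
// `headers` is assumed to be a plain object with lower-case header names.
function precheckContentLength(headers, maxBytes) {
  const raw = headers["content-length"];
  if (raw === undefined || !/^\d+$/.test(raw)) {
    return { accept: false, reason: "missing or malformed Content-Length" };
  }
  if (Number(raw) > maxBytes) {
    return { accept: false, reason: "declared size exceeds the limit" };
  }
  return { accept: true, reason: "declared size is acceptable" };
}

// Step 2: the declared size can be faked, so count what actually arrives.
// `chunks` is assumed to be an array of strings or Buffers, in arrival order.
function bodyWithinLimit(chunks, maxBytes) {
  let total = 0;
  for (const chunk of chunks) {
    total += Buffer.byteLength(chunk);
    if (total > maxBytes) return false;   // stop as soon as the limit is passed
  }
  return true;
}

// A request claiming 42 bytes but actually delivering far more:
console.log(precheckContentLength({ "content-length": "42" }, 1000).accept); // true
console.log(bodyWithinLimit(["a".repeat(600), "b".repeat(600)], 1000));      // false
```

In a real handler, the second check would be applied while reading the request, so that the reading can be stopped without storing the junk.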

In such checks, checking for the amount of data entered in a textarea is usually rather simple. At the simplest, you just get the length of the data in characters and compare it against a limit. The implementation depends on the server-side interface technology (CGI, ASP, something else) and on the programming or scripting language used (Perl, C, C++, sh, whatever).

As a very simple illustration, consider the following form:

<form action=
"http://www.cs.tut.fi/cgi-bin/run/~jkorpela/chkarea.pl"
method="post">
Please enter data, at most 42 characters:<br>
<textarea name="box" rows="5" cols="30">
</textarea>
<br><input type="submit">
</form>

The script that handles the submission just checks the amount of data corresponding to the textarea. In a real-life situation, this would be preliminary to any further processing. In a CGI script written in Perl, using the CGI.pm module, the code is essentially the following:

 # $query is the CGI.pm query object; $limit holds the maximum length (42 here).
 my $box = $query->param('box');
 if (!defined($box)) {
   print "No data included under the expected name - submission rejected.\n";
 }
 elsif (length($box) > $limit) {
   print "Too much data!\n";
 }
 else {
   print "The data was virtually accepted.";
 }
 

The following form is identical to the one above except for the action attribute, which here points to a script which sends back a copy of the form to be fixed and resubmitted, if there is too much data:

Please enter data, at most 42 characters:

The CGI.pm module contains handy tools for creating such forms which contain, as prefilled data, user input from a previous form submission, optionally after some editing.

Note: A line break in a textarea counts as two characters. The reason is that it is represented, in the data, as two control codes ("control characters"), namely carriage return (CR) and linefeed (LF). Reference: HTML 4.01 Specification, section Form content types.

The statement above applies to "hard returns" which are actually sent by the browser as part of the data, as opposed to possible "soft returns", i.e. browsers just visually displaying the data to the user as divided into lines. See Implementations, especially wrapping above.
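This matters whenever characters are counted somewhere else than in the final form handler: a string held by a script may contain line breaks as a single LF (or CR), so its length can be smaller than the length of the data as submitted. The following JavaScript sketch computes the length with each line break counted as CR LF; the function name submittedLength is made up for this illustration.

```javascript
// Hypothetical helper: length of a text when every line break, however it is
// stored in the string (CR LF, CR, or LF), is counted as the pair CR LF.
function submittedLength(value) {
  return value.replace(/\r\n|\r|\n/g, "\r\n").length;
}

console.log(submittedLength("line one\nline two"));   // 8 + 2 + 8 = 18
console.log(submittedLength("line one\r\nline two")); // also 18
console.log("line one\nline two".length);             // 17, one less
```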

Helping users with JavaScript checks

It is possible to write simple (or complicated) client-side scripting code in JavaScript in order to help the user to stay within the given limits. This could be based on checking the amount of data (entered in a textarea) when the user is about to submit the form, or to move to the next field in the form, or even "real-time" as he types the text.

Since one cannot rely on JavaScript being enabled, the client-side checks should be regarded as extra convenience only, to those users who can and wish to make use of it.

Simple JavaScript check

At the simplest, you could use just an onsubmit attribute in the form tag, containing JavaScript code like the following (for our sample form discussed above, with name="ourform" attached to it):

 onsubmit = "return ok(42);" 
 
with ok() defined as
 function ok(maxchars) {
 if(document.ourform.box.value.length > maxchars) {
   alert('Too much data in the text box! Please remove '+
    (document.ourform.box.value.length - maxchars)+ ' characters');
   return false; }
 else
   return true; }
 

The return false statement means that normal form submission does not take place. In practice, you would probably want to make that code a function

The following sample form uses this technique, so you can test it if you have JavaScript enabled:

Please enter data, at most 42 characters:

It would be possible to perform such checks when the user leaves a textarea e.g. by tabbing to the next field or clicking on another field. One could use the onblur attribute then, or onfocus attributes for other fields. Such "intermediate" solutions (as opposed to checking on submit or checking while typing) could be especially useful when there are several textarea fields in the form and you wish to try to give immediate feedback when a limit is about to be exceeded. The following JavaScript code, to be used e.g. in
<textarea name="box_name" onchange="maxlength('box_name', 42)" ...>
and assuming the form is named pooh, was suggested by Oliver Tickell:

 function maxlength(element, maxvalue)
     {
     // Look the field up by its name in the form; this avoids eval().
     var q = document.pooh.elements[element].value.length;
     var r = q - maxvalue;
     var msg = "Sorry, you have input "+q+" characters into the "+
       "text area box you just completed. It can return no more than "+
       maxvalue+" characters to be processed. Please abbreviate "+
       "your text by at least "+r+" characters";
     if (q > maxvalue) alert(msg);
     }
 

Here's the code in action (note that with JavaScript enabled, you'll have the input volume checked as soon as you leave the textarea field e.g. by tabbing):


"Real-time" JavaScript check

In order to count characters as the user types them, we need JavaScript 1.2 features, so this enhancement won't work on all JavaScript-enabled browsers. (See Events and Event Handlers by Martin Webb for information on support for event handlers in different JavaScript implementations.)

Our approach is to use the onkeyup attribute for the textarea element and associate some checking code with it. Specifically, the code updates a text field displaying the length of the value of the textarea field, i.e. the number of characters entered. It also checks that value against the given limit. There are different things that we could do when the limit is exceeded. One approach is to display an alert message. We can also add code that changes the counter display to red and bold, though the features needed for this currently work on IE 4+ only; but it's only an additional hint. This way, the user can continue typing and later delete something from the area to get below the limit. Sample code for the routine to be invoked via the onkeyup attribute is:

function update() {
   // limit is assumed to be a global variable holding the maximum length
   var old = document.f.counter.value;
   document.f.counter.value = document.f.box.value.length;
   if(document.f.counter.value > limit && old <= limit) {
     // the limit was just exceeded: warn once and highlight the counter
     alert('Too much data in the text box!');
     if(document.styleSheets) {
       document.f.counter.style.fontWeight = 'bold';
       document.f.counter.style.color = '#ff0000'; } }
   else if(document.f.counter.value <= limit && old > limit
           && document.styleSheets) {
     // back within the limit: restore the normal counter appearance
     document.f.counter.style.fontWeight = 'normal';
     document.f.counter.style.color = '#000000'; }
   }

The following form uses this technique; for testing purposes, the textarea size limit is set to a ridiculously small value (eight characters). It has been written (using a noscript element) so that when JavaScript is not enabled at all, a message is shown explaining what could be achieved by using a JavaScript-enabled browser:

(If you used a JavaScript enabled browser, preferably supporting JavaScript version 1.2 or equivalent (as supported e.g. by Internet Explorer 4 and Netscape Navigator 4), you would have some help from the browser in trying to remain within the limit.)
And for JavaScript-enabled browsers not supporting JavaScript 1.2 we have just the simple checking on submit in operation. To avoid confusing users, we won't include the counter field when it doesn't work. This is achieved by generating the markup for it dynamically, with code that should get executed by JavaScript 1.2 capable browsers only. Unfortunately the method for this, the inclusion of language="JavaScript1.2" into the script element, does not seem to work on Opera, i.e. Opera users with JavaScript enabled will see a counter field which doesn't work.

Please enter your message. The maximum number of characters allowed is 8, counting each end of line as two characters. If you fill out and submit this form repeatedly, you may wish to reset it before starting to type a new message.