VERIFICATION OF THE BEHAVIOUR OF REACTIVE SOFTWARE WITH CFFD-SEMANTICS AND ARA TOOLS

A. Valmari¹ and R. Savola²

1) Tampere University of Technology, Software Systems Laboratory, P. O. Box 553, FIN-33101 Tampere, Finland, ava@cs.tut.fi

2) VTT Electronics, P. O. Box 1100, FIN-90571 Oulu, Finland
(currently with Elektrobit Ltd., Tutkijantie 8, FIN-90570 Oulu, Finland, Reijo.Savola@elektrobit.fi)

Copyright Notice

This paper has been published in

Proceedings of an International Symposium on On-board Real-time Software, ESTEC, Noordwijk, The Netherlands, 13-15 November 1995, ESA SP-375, January 1996, pp. 173-180.

and was made available on this WWW page with the permission of ESA. Some changes were made to this WWW version. Excluding changes in formatting, the changes are shown by "[WWW note: . . .]".

ABSTRACT

An unusually user-friendly formal approach to the validation and verification of embedded software is described. It is based on a process-algebraic semantic model known as CFFD-semantics, and is supported by a set of automatic tools. An important feature of the method is a technique for the simultaneous visualisation of information about all possible executions of the system. This kind of visualisation makes it easy to detect haphazard concurrency errors such as deadlocks and duplication of messages. CFFD-semantics was chosen, because it is a good compromise between expressive power and ability to visualise. The article introduces process-algebraic verification in general, explains CFFD-semantics and its significance, describes the visualisation technique, discusses tool support, and gives an example.

Keywords: validation, verification, concurrency, LTS, Lotos

1. INTRODUCTION

The software within computer-controlled electronic or mechanical systems is often called embedded software. Embedded software is typically reactive and concurrent. A system is reactive if it is in continuous interaction with its environment. A concurrent system consists of several co-operating autonomous units, which are often called tasks or processes. It is difficult to specify how a reactive system should behave. It is even more difficult to ensure that a concurrent system behaves as required by the specification, because concurrency introduces nondeterminism; that is, given the same input and initial state, the system may produce several different outputs. Consequently, reliable embedded software is difficult to design.

The difficulties in the design of reactive and concurrent software have led to the development of several formal specification, analysis and verification methods. With them, the required behaviour of a system can be rigorously specified, and properties of a design can be analysed even up to the point of showing that the design is correct with respect to a formal specification. Formal methods are mostly in the basic research stage, although some non-trivial experimental applications have been reported in the literature (e.g. [3]). Formal methods typically apply only to greatly simplified models of systems, require lots of human and / or computer resources, and can be used only by mathematically talented persons. On the other hand, the kind of certainty they provide regarding the correctness of the behaviour of the system cannot be obtained by any other means.

This article describes an unusually user-friendly formal approach to the validation and verification of reactive and / or concurrent software. The approach is based on a process-algebraic semantic model known as CFFD-semantics [16, 17], and is supported by Ara Tools (Advanced Reachability Analysis Tool Set) [10, 14]. It has much in common with many other approaches and tools based on the process algebraic tradition (see e.g. [6]). The novelties of our approach are the use of a semantic model which is particularly well-suited for verification tasks [12], and a technique for the visualisation of even the nondeterministic aspects of the global behaviour of a system, which makes our approach more amenable to users with little or no mathematical background. An earlier introduction of our approach was given in [11].

The basic ideas of process algebraic verification are described in Section 2. Our approach is introduced in Section 3, and Ara Tools are discussed in Section 4. Section 5 consists of an annotated analysis example, and Section 6 contains our conclusions.

2. PROCESS ALGEBRAIC VERIFICATION

2.1 Systems and Processes

In the process algebraic tradition, a system is thought of as a box with a number of gates. The system communicates with its environment by executing actions in the gates. The execution of an action is synchronous. That is, an action can be executed if and only if both the system and its environment are willing to execute it, and both the system and the environment participate in the execution of the action. As such, gates and actions do not involve any notion of direction in the sense of one of the partners doing "output" and the other doing "input", but flow of data from one partner to the other may be specified by adding parameters (often called attributes) to an action. Actually, many process algebras also allow synchronisation among more than two partners, and data parameters without a clear notion of direction, but we omit such subtleties in this presentation.

To give an example, Fig. 1 shows a communication protocol together with its environment. The environment consists of two processes, Source and Sink. We assume that Source and Sink are physically separated, so that establishing reliable communication between them is a non-trivial problem. In this example, the purpose of AB-Protocol is to reliably deliver messages from Source to Sink. The gate send of AB-Protocol is for accepting messages from Source, rec for delivering the messages to Sink, and with ok and err AB-Protocol can inform Source whether the transmission was successful. We have denoted the direction of the flow of messages by arrowheads in Fig. 1. No data is associated with the actions ok and err; therefore, the corresponding gates are shown without arrowheads. (Alternatively, we could have had only one gate report instead of ok and err. Then success / failure information would have been given as a parameter of each report-action.)

Of course, AB-Protocol has some internal structure. We assume that it consists of a sender process located at the same site as Source, a receiver process located near Sink, and an unreliable bidirectional channel connecting the two sites. AB-Protocol works by sending data and acknowledgement messages to (gates dats and acks) and receiving them from (datr, ackr) the channel. This structure is shown in Fig. 2.

To complete the description of AB-Protocol, it is necessary to specify the order in which actions are performed in gates. For the sake of an example, we have done that for Receiver in Fig. 3 using a finite automaton augmented with variables. We assume that AB-Protocol works according to the well-known alternating bit principle, where, in order to distinguish new messages from retransmissions of earlier messages, each message is augmented by a sequence number 0 or 1 [1]. Receiver keeps the sequence number it expects next in the variable "sn". It starts by inputting from Channel a packet containing a data message and a sequence number, and stores the message in the variable "msg" and the sequence number in "sr". Then it checks whether the sequence number was what it expected, and, according to the result, either delivers the message to the environment (i.e. to Sink) by action rec!msg and sends an acknowledgement to Sender through Channel, or just sends the acknowledgement. If the sequence number of the incoming message was correct, Receiver also swaps its expected sequence number from zero to one or vice versa.

Depending on our point of view, we can consider each of Sender, Receiver and Channel as a system in its own right, and investigate its behaviour using process algebraic tools. Alternatively, we can consider the parallel composition of the three as one system (or one process). We have already given a name for that process, namely "AB-Protocol". Process algebras contain rules with which the behaviour of this kind of a composite process can be computed from the behaviours of its components.
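To make the last remark concrete, the following minimal sketch computes the parallel composition of two LTSs that must synchronise on a given set of actions and may interleave all other actions. It is written in Python purely for illustration; the LTS encoding (an initial state plus a list of (source, action, target) triples) and the toy component processes are our own simplifications, not the composition rules of Lotos or the data format of Ara Tools.

# Minimal sketch: parallel composition of two labelled transition systems.
# An LTS is (initial_state, transitions), where transitions is a list of
# (source, action, target) triples. Actions in `sync` must be executed jointly;
# all other actions, including the invisible action "tau", are interleaved.
def compose(lts1, lts2, sync):
    init1, trans1 = lts1
    init2, trans2 = lts2
    init = (init1, init2)
    states, transitions, work = {init}, [], [init]
    while work:
        state = work.pop()
        s1, s2 = state
        moves = []
        for (p, a, q) in trans1:          # joint moves on synchronised actions
            if p == s1 and a in sync:
                moves += [(a, (q, t)) for (r, b, t) in trans2 if r == s2 and b == a]
        for (p, a, q) in trans1:          # independent moves of the first component
            if p == s1 and a not in sync:
                moves.append((a, (q, s2)))
        for (r, b, t) in trans2:          # independent moves of the second component
            if r == s2 and b not in sync:
                moves.append((b, (s1, t)))
        for a, target in moves:
            transitions.append((state, a, target))
            if target not in states:
                states.add(target)
                work.append(target)
    return init, transitions

# Toy example: a sender that alternates send / dats / ackr, composed with one
# unreliable one-place channel; the two components synchronise on dats only.
sender = ("idle", [("idle", "send", "ready"), ("ready", "dats", "wait"), ("wait", "ackr", "idle")])
channel = ("empty", [("empty", "dats", "full"), ("full", "datr", "empty"), ("full", "tau", "empty")])
print(compose(sender, channel, {"dats"})[1])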

2.2 Abstract Semantic Models

A key feature of process algebras is a two-step abstraction mechanism, with which it is possible to concentrate attention on the externally observable features of a system, and totally ignore its internal structure. At the first step, the actions of the system are divided into visible (or external) and invisible (or internal), depending on whether the environment is supposed to see them (or to be able to synchronise with them, to be more precise). The visible actions of AB-Protocol are those which occur in gates send, ok, err and rec, and its invisible actions are dats, datr, acks and ackr together with all internal actions of its component processes Sender, Receiver and Channel. Invisible actions are often denoted by the special symbol "tau" [WWW note: denoted by tau in the sequel].

The second step of the abstraction mechanism is the representation of the behaviour of the system in a form where invisible actions are not shown. From this kind of representation it is possible to check, for instance, whether each rec-action is preceded by a send-action, but it is not possible to check how many invisible actions (such as dats) occurred between the send and rec. Typically these representation forms, or abstract semantic models, preserve at least all sequences of visible actions which may arise from different executions of the system. Such sequences are known as traces.

The traces of a system are obviously an important aspect of its interface with the environment, because they determine all the orderings in which the system is willing to execute visible actions. The trace actually executed when the system is started in some environment is chosen from among the traces of the system partly by the environment and partly by nondeterminism or "chance". It is important to understand that the state of a system is not necessarily fully determined by the trace it has executed, but it may depend also on invisible actions and on details which are not shown at all in the specification. For instance, the choice between executing ok or err by AB-Protocol depends on whether the most recent message managed to get through the unreliable channel. It is not, however, reasonable (or even possible) to show in detail in the specification the exact conditions in which messages get lost. Therefore, at the level of the specification, the choice between losing or not losing a message is just a nondeterministic choice.

Because the behaviour of a system may be nondeterministic, there are important behavioural properties which are not shown by its traces; ability to deadlock or livelock, for instance. There is no obvious answer to the question of what properties an abstract semantic model should preserve in addition to traces (and, as we will see in Section 3.1, there is an important model which does not even preserve the trace information fully). Furthermore, in order to be useful for verification, a semantic model has to have certain mathematical properties. Often they cannot be established without making some compromises. These facts have led to the development of several families of semantic models containing altogether literally hundreds of members.
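The point can be illustrated with two toy LTSs that have exactly the same traces but differ in their ability to deadlock. The sketch below is written in Python with our own LTS encoding (an initial state plus a list of (source, action, target) triples; "tau" stands for the invisible action) and is not part of Ara Tools; it enumerates the traces of bounded length and searches for reachable states with no outgoing transitions.

# Toy illustration: two LTSs with the same traces, only one of which can deadlock.
def traces(lts, depth):
    """All sequences of visible actions executable in at most `depth` steps."""
    init, trans = lts
    result = set()
    def explore(state, trace, steps):
        result.add(trace)
        if steps == 0:
            return
        for (p, a, q) in trans:
            if p == state:
                explore(q, trace if a == "tau" else trace + (a,), steps - 1)
    explore(init, (), depth)
    return result

def deadlocks(lts):
    """Reachable states with no outgoing transitions."""
    init, trans = lts
    seen, work = {init}, [init]
    while work:
        s = work.pop()
        for (p, a, q) in trans:
            if p == s and q not in seen:
                seen.add(q)
                work.append(q)
    return {s for s in seen if not any(p == s for (p, a, q) in trans)}

# Both systems have the traces (), (send,), (send, rec), ..., but the second one
# may also make an invisible move after send into a state where nothing is possible.
safe = (0, [(0, "send", 1), (1, "rec", 0)])
risky = (0, [(0, "send", 1), (1, "rec", 0), (1, "tau", 2)])
print(traces(safe, 4) == traces(risky, 4))    # True: identical traces up to this depth
print(deadlocks(safe), deadlocks(risky))      # set() vs. {2}: only `risky` can deadlock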

2.3 Analysis and Verification

An abstract semantic model makes it possible to analyse the behaviour of a system in various ways. Most importantly, one may check whether the abstracted behaviours of two systems are the same; that is, whether the systems are behaviourally equivalent according to the model. For instance, one can check - at least in principle - whether the AB-Protocol partly shown in Figures 2 and 3 produces the service shown in Fig. 4 by checking if their abstracted behaviours are the same. In this way it is possible to verify that a protocol design conforms exactly to a service specification. (Fig. 4 is grossly over-simplified, but it can be used to illustrate the idea.)

Abstract semantic models can be used also for simplifying the behaviour of a system before analysis. This is particularly useful when the system consists of several processes connected in parallel, because an explicit representation of the total behaviour of such a system is usually huge.

There are also abstract orderings, with which it can be checked whether a system "implements" a specification in the sense of behaving in the same way, or "better". For instance, a protocol using a reliable channel perhaps never executes the error indication action err. Thus it cannot be behaviourally equivalent to, but it may well be "better" than the specification in Fig. 4 which contains err-actions.

Many of the above-mentioned verification techniques can be automated. Although process algebraic theories allow data parameters in actions and manipulation of data within processes, automatic verification almost invariably fails if the system under verification stores more than a very small amount of data. This is because verification is computationally very demanding in general, and the handling of large amounts of intra-system data during process algebraic verification is extremely difficult. As a consequence, automatic tools allow only limited ways of representing and manipulating data within processes. For instance, when using automatic tools, the variable "msg" in Figures 3 and 4 should be dropped, or its allowed range should be limited to two or three values. On the other hand, in Fig. 3, the variables "sn" and "sr" storing the expected and received sequence numbers cannot be dropped, because their contents determine whether the action rec should be executed or not. Fortunately, their range is very small.

Process algebraic verification tools usually represent the behaviour of a process as a labelled transition system (LTS). LTSs resemble Figures 3 and 4, but they do not contain variables. Manipulation of data is modelled by the structure of the LTS, and only constants are used for parameters of actions. Fig. 5 shows the behaviour of Receiver in LTS form with the actual messages omitted.
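The idea can also be conveyed by writing Receiver down as a plain transition relation. The sketch below is one possible rendering in Python (our own encoding, which may differ in detail from Fig. 5): the variable "sn" has been folded into the state names, and the sequence-number parameters of actions appear only as the constants 0 and 1.

# One possible rendering (ours) of Receiver as an LTS without variables. The
# sequence-number variable "sn" is encoded in the state names, the action
# parameters are the constants 0 and 1, and the message contents are omitted.
receiver_lts = (
    "expect0",
    [
        # expected sequence number is 0
        ("expect0",  "datr!0", "deliver0"),   # correct number: deliver, then acknowledge
        ("deliver0", "rec",    "ack0"),
        ("ack0",     "acks!0", "expect1"),    # flip the expected number to 1
        ("expect0",  "datr!1", "reack1"),     # retransmission: just acknowledge
        ("reack1",   "acks!1", "expect0"),
        # expected sequence number is 1 (symmetric half)
        ("expect1",  "datr!1", "deliver1"),
        ("deliver1", "rec",    "ack1"),
        ("ack1",     "acks!1", "expect0"),
        ("expect1",  "datr!0", "reack0"),
        ("reack0",   "acks!0", "expect1"),
    ],
)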

When using abstract semantic models, it is important to keep in mind that the results are valid only to the extent that the model preserves the behavioural properties of interest.

3. OUR APPROACH

The verification approach we are advocating in this paper is based on the process algebraic techniques discussed in Section 2. Two features distinguish it from other computer-aided process-algebraic verification approaches: the choice of the abstract semantic model, and the use of LTS visualisation as an easy way of getting general information about the behaviour of a nondeterministic system.

3.1 The CFFD Semantic Model

As we already mentioned, there are literally hundreds of abstract semantic models to choose from, and the choice affects the properties that can be verified. One might think that it would be best to choose the model that preserves the most behavioural properties, but this is not the case. If the model preserves irrelevant aspects of behaviour, then the verification of behavioural equivalence of two systems may fail because of some irrelevant difference noticed by the model. Furthermore, analysis of a system may fail because irrelevant details preserved by the model hide the essentials. Therefore, it is better to choose a model which does not preserve unnecessary behavioural features.

We have chosen to use the Chaos-Free Failures Divergences (CFFD) model [16, 17]. The CFFD model preserves full information on traces, traces leading to a deadlock, and traces leading to a livelock (so-called divergence traces; by "livelock" we mean a never-terminating sequence of invisible actions). Furthermore, the CFFD model preserves full information on so-called stable failures. After any trace, the visible action to be executed next is determined both by the system and by its environment. If the system has not livelocked, and there is no visible action such that both the system and the environment are willing to execute it, then the system deadlocks. Because of nondeterminism, the set of actions a system is willing to do after some trace is not necessarily unique. Therefore, to determine whether the system may deadlock in some environment, it is necessary to know, for each trace, the minimal sets of actions that the system may be willing to do in a non-livelock situation after the trace. Stable failures record, in essence, this information. (To ensure that the CFFD model has all the mathematical properties needed in the contexts where we use it, it preserves also the one bit of information telling whether the first action of the system may be invisible.)
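Stated slightly more formally (and following the informal description above rather than the precise definitions of [16, 17]), the CFFD semantics of a process P can be summarised as a tuple

    CFFD(P) = ( stab(P), tr(P), sfail(P), divtr(P) )

where stab(P) is the one bit of initial stability information, tr(P) is the set of traces, divtr(P) is the set of divergence traces, and sfail(P) is the set of stable failures, which in the usual refusal-set formulation are pairs (t, A) such that after the trace t the process can reach a stable state (one with no invisible action enabled) in which it refuses every visible action in A. The traces after which a deadlock is possible are then exactly those t for which (t, S) is a stable failure, where S is the set of all visible actions.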

The CFFD model resembles the well-established CSP model [5], but, unlike the latter, does not suffer from the "chaotic divergences" problem. In essence, the CSP model considers livelocks as errors. It does not preserve any information whatsoever about the behaviour of a system which has executed a divergence trace, although it may be possible that the system did not actually enter the livelock or managed to escape from it. In the CSP world, any process that has executed a divergence trace is called "Chaos". In the CFFD model, on the other hand, systems do not collapse to "Chaos" after executing a divergence trace. Therefore, the CFFD model is more useful in the presence of livelocks than the CSP model.

The relative merits and demerits of the CFFD model compared to many other models are discussed in detail in [12, 17].

3.2 LTS Visualisation

By LTS visualisation we mean the representation of an LTS in a graphical form on a computer screen or paper for inspection by humans. With current lay-out algorithms based on the principle of simulated annealing, it is possible to generate automatically reasonably readable graphical representations of LTSs with two or three dozen states.

Of course, LTSs of typical systems contain many, many more than three dozen states; even three million states is often an underestimate. But here the abstraction mechanisms provided by process-algebraic theories are of great help. Although AB-Protocol has a complicated internal structure, it has only four gates. We will see in Section 5 that the CFFD-abstracted behaviour of AB-Protocol obtained when only the actions send, rec, ok and err are visible is actually simple. Furthermore, for many abstract semantic models there is a reduction algorithm which, given an LTS, generates another LTS, which has the same abstracted behaviour as the input LTS, but is often much smaller. One such algorithm for the CFFD model is described in [16]. Therefore, instead of visualising the LTS of the system, one may visualise the LTS produced from it by a reduction algorithm.

If the system has too many gates, then it is not possible to obtain a readable visualised LTS even by reduction. But, even then, one may obtain useful information by making some originally visible actions invisible before reduction and visualisation. Actually, one may check several properties of a system one at a time, by repeating the procedure with different subsets of visible actions hidden. This idea was used in [15] for the verification of Peterson's well-known mutual exclusion protocol [9]. The mutual exclusion property was verified with one choice of visible actions, yielding an LTS with four states and five transitions. Another set of visible actions led to an LTS with three states and four transitions, from which the eventual access property was easy to check. The case study also contained experimentation with incorrect versions of the protocol. The biggest LTS encountered contained 7 states and 11 transitions.
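At the LTS level, making a visible action invisible is a purely mechanical relabelling: every transition carrying that action is renamed to tau before the LTS is reduced and visualised. A minimal sketch in Python (with our own LTS encoding; this is not the user interface of Ara Tools):

# Minimal sketch of hiding: relabel the chosen visible actions to the invisible
# action "tau" before reduction and visualisation.
def hide(lts, hidden):
    init, trans = lts
    return init, [(p, "tau" if a in hidden else a, q) for (p, a, q) in trans]

# Example: hide ok and err, leaving only send and rec visible.
toy = (0, [(0, "send", 1), (1, "ok", 0), (1, "rec", 2), (2, "err", 0)])
print(hide(toy, {"ok", "err"}))
# (0, [(0, 'send', 1), (1, 'tau', 0), (1, 'rec', 2), (2, 'tau', 0)])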

The most attractive feature of LTS visualisation is that it makes it easy to get feedback on the global behaviour of a system. By "global" we mean that a visualised LTS shows simultaneously information about all possible executions of the system, not just one execution. In this respect it is very different from simulation, animation and ordinary testing. For instance, a nondeterministically reachable deadlock is not revealed in a test if the system does not make the choices necessary to reach the deadlock. This is the main reason why deadlocks and other concurrency-oriented errors are so difficult to detect with testing. On the other hand, deadlocks are immediately apparent in visualised LTSs as states without outgoing transitions.

LTS visualisation is easy to use, because it does not require the user to specify a set of verification questions or a reference LTS for LTS comparison. Instead, answers to many verification questions are obtained by first choosing a suitable set of visible actions, then generating, reducing and visualising the corresponding LTS, and finally investigating the result. Many properties of systems can be easily checked from visualised LTSs. Furthermore, a visualised LTS may reveal unexpected behaviour of the system under verification. In this way it may draw the attention of a system designer to a problem which he perhaps would not otherwise have ever even thought of. We will illustrate this in Section 5.

The above-mentioned features of LTS visualisation make it useful for obtaining feedback from tentative designs. The idea of using visualisation as a design aid was developed in some detail in [13].

It is important for LTS visualisation that the reduced LTSs can be made reasonably small. In our experience, this is often the case when the CFFD model is used. In [15], the verification of Peterson's algorithm was repeated with another semantic model, namely a divergence-preserving variant [4] of Milner's observation equivalence [8]. The reduced LTS for mutual exclusion grew to 33 states and 64 transitions, and the LTS for eventual access grew to 10 states and 16 transitions. These figures suggest that it is crucial for LTS visualisation to use a model which does not preserve much more information than the CFFD model.

Although we concentrated above on LTS visualisation, our approach does not rule out the use of other process-algebraic verification techniques, such as the comparison of two LTSs for behavioural equivalence. We advocate LTS visualisation as a new tool in the verifier's tool box. We claim that it is easier to use than many other tools, and it is exceptionally good for getting rapid feedback. But, of course, it is not always the best tool for a verification problem.

4. TOOL SUPPORT

The Technical Research Centre of Finland (VTT), Electronics research institute, has developed a set of tools supporting the verification approach described in the previous section. The tool set is called "Ara Tools". It is more an academic prototype than a polished product, but it can be used to test the practicality of our verification approach. Its development was supported by the Technology Development Centre of Finland (TEKES), and it has been used for research and teaching purposes in at least three Finnish universities. An earlier version of Ara Tools was described in [14].

Ara Tools consist of four main parts: State Space Generator, Normaliser, Comparator and Illustrator.

Ara State Space Generator (SSG) [10] inputs system specifications written in a variant of the ISO Specification language Lotos [2] and generates LTS representations of their behaviours. The variant is called Ara Lotos. As was explained in Section 2.3, process algebraic verification tools usually offer only restricted data manipulation facilities in their input languages, because they cannot handle systems containing large amounts of data. Correspondingly, Ara Lotos lacks most of the data handling features of the official Lotos. Ara Lotos has only the type Boolean, enumerated types and integers with a limited set of arithmetic operators.

On the other hand, Ara Lotos contains some verification-oriented additions to the official Lotos. The special keywords $error and $catastrophe can be used to mark situations which should never occur in the system. If such a situation is encountered during LTS generation, Ara SSG prunes that branch of the LTS, or stops the LTS generation completely. In this way Ara SSG can be prevented from wasting lots of time generating states which are not needed for verification, because the system made an error before entering them. (This and other kinds of error detection during LTS or state space construction are generally called on-the-fly verification.)
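The effect of such markings on LTS generation can be pictured with the following sketch. It is written in Python, illustrates the pruning idea only, and is not Ara SSG's actual algorithm; in particular, the helper functions successors and classify are hypothetical stand-ins for the evaluation of the Lotos specification and of the $error / $catastrophe markings.

# Sketch of on-the-fly pruning during state space generation. `successors(state)`
# is assumed to yield the (action, next_state) pairs enabled in `state`, and
# `classify(state)` to return "ok", "error" (prune this branch of the LTS) or
# "catastrophe" (abandon the whole generation). Both helpers are hypothetical.
def generate(initial, successors, classify):
    states, transitions, work = {initial}, [], [initial]
    while work:
        state = work.pop()
        kind = classify(state)
        if kind == "catastrophe":
            return states, transitions, state      # stop generation and report the state
        if kind == "error":
            continue                               # prune: successors are not explored
        for action, nxt in successors(state):
            transitions.append((state, action, nxt))
            if nxt not in states:
                states.add(nxt)
                work.append(nxt)
    return states, transitions, None               # no catastrophe encountered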

Ara Lotos allows the use of an LTS in the place of the definition of a process. This makes it possible to use the well-known compositional LTS construction method for generating LTSs of large systems. Compositional LTS construction consists of the replacement of one or more subsystems of a large system by their reduced LTSs before constructing an LTS for the large system. The replacement reduces the size of the LTS of the large system without changing its CFFD semantics. Compositional LTS construction makes it possible to handle many systems which would otherwise be far beyond the capacity of Ara Tools.

Ara SSG can construct LTSs with up to 10 000 states in a few minutes, and 100 000 states in an hour or so. However, one should not pay too much attention to capacity figures reported as numbers of states, because the number of states of a system may vary a lot depending on how its specification was written, and techniques such as compositional LTS construction make absolute numbers of states meaningless anyway. It is better to measure capacity by the nature and complexity of systems that can be handled. We will return to this question in Section 6.

Ara Normaliser transforms LTSs into normal form. The normal form of an LTS is the smallest "semi-deterministic" LTS that has the same CFFD semantics. It is essentially a deterministic finite automaton with additional tau-transitions for representing information about divergence and stable failures. Ara Normaliser is based on well-known algorithms for determinisation and minimisation of finite automata, but contains some extra features for handling divergences and stable failures. The algorithms underlying Ara Normaliser have been described in [16].
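The determinisation at the heart of normalisation is the classical subset construction applied to tau-closed sets of states. The Python sketch below (our own simplification) shows only that core; the extra bookkeeping with which Ara Normaliser records divergences and stable failures as additional tau-transitions, and the subsequent minimisation, are omitted.

# Sketch of the determinisation core of normalisation: subset construction over
# tau-closed sets of states. An LTS is (initial_state, transitions) with "tau"
# as the invisible action.
def tau_closure(states, trans):
    closure, work = set(states), list(states)
    while work:
        s = work.pop()
        for (p, a, q) in trans:
            if p == s and a == "tau" and q not in closure:
                closure.add(q)
                work.append(q)
    return frozenset(closure)

def determinise(lts):
    init, trans = lts
    start = tau_closure({init}, trans)
    dstates, dtrans, work = {start}, [], [start]
    while work:
        current = work.pop()
        by_action = {}                             # successors grouped by visible action
        for (p, a, q) in trans:
            if p in current and a != "tau":
                by_action.setdefault(a, set()).add(q)
        for a, targets in by_action.items():
            nxt = tau_closure(targets, trans)
            dtrans.append((current, a, nxt))
            if nxt not in dstates:
                dstates.add(nxt)
                work.append(nxt)
    return start, dtrans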

Because of determinisation, the normal form of an LTS may in theory be exponentially bigger than the LTS itself. Fortunately, this is seldom the case in practice. Experience has shown that normal forms of LTSs arising from verification problems are typically (but not always) much smaller than the original LTSs. Because of this, Ara Normaliser can be used for LTS reduction most of the time. When normalisation fails, alternative algorithms are available, such as minimisation with respect to divergence-preserving observation equivalence [4].

Ara Normaliser can usually handle input LTSs with up to 5 000 or 10 000 states, but its capacity can be increased by using the above-mentioned algorithm from [4] as a preprocessing stage. Visualisations of LTSs produced by Ara Normaliser are pleasant to interpret, because they are essentially deterministic.

With Ara Comparator one may check whether two LTSs are equivalent according to the CFFD model. If they are not, then Ara Comparator provides an example of their differences in the form of a trace, divergence trace, or stable failure that one of them can execute but the other cannot. (Ara Comparator may also report that they differ in the ability to perform an invisible action as their first action.) Ara Comparator is based on the fact that the normal forms of CFFD-equivalent LTSs are isomorphic. Comparison of normal forms is computationally cheap, but Ara Comparator uses Ara Normaliser for obtaining the normal forms, and thus suffers from its capacity restrictions.

Ara Illustrator is an LTS visualisation tool. It produces fairly readable pictures of LTSs with at most 20 or 30 states. Because it is often difficult to find room for long action names near the corresponding transitions, Ara Illustrator represents action names by showing transitions in different colours.

5. AN EXAMPLE

An Ara Lotos specification of AB-Protocol is shown in Fig. 6. The protocol has unreliable channels of capacity one, and it makes at most two retransmissions in case of loss or delay of messages in the channels. The actions send, ok, err and rec were specified as visible by listing them on the second line of the Lotos text. All other actions model the internal operation of the protocol, and they have been made invisible with the Lotos hide operator.


(* enumerated data type for the alternating bit *)
$sort seq_nr is (s0, s1)

(* main process name + list of visible actions *)
process Abprot[ send, ok, err, rec ] :=

  hide dats, datr, acks, ackr in       (* hide actions in the channel *)
    Sender[ send, dats, ackr, ok, err ]( s0 )
      |[ dats, ackr ]|
    ( Channel[ dats, datr ] ||| Channel[ acks, ackr ] )
      |[ datr, acks ]|
    Receiver[ rec, datr, acks ]( s0 )

where

  process Sender[ send, dat, ack, ok, err ]( sn : seq_nr ) :=
    send; Try[ send, dat, ack, ok, err ]( sn, 2 )
  where
    process Try[ send, dat, ack, ok, err ]
        ( sn : seq_nr, retry_cnt : int ) :=
      hide timeout in
        dat ! sn;
        (    (* receive ack with correct seq. nr. *)
             ack ! sn; ok; Sender[ send, dat, ack, ok, err ]( sn+1 )  
          [] [ retry_cnt > 0 ]-> timeout;  (* timeout -> retry *)
             Try[ send, dat, ack, ok, err ]( sn, retry_cnt-1 )
          [] [ retry_cnt = 0 ]-> timeout; err;  (* timeout -> give up *)
             Sender[ send, dat, ack, ok, err ]( sn )
        )
    endproc
  endproc

  process Receiver[ rec, dat, ack ]( sn : seq_nr ) :=
    (* receive message from channel *)
    dat ? sr : seq_nr;
    (    (* correct seq. nr. -> deliver and ack *)
         [ sn = sr ]-> rec; ack ! sr; Receiver[ rec, dat, ack ]( sn+1 )
      [] (* incorrect seq. nr. -> acknowledge *)
         [ sn /= sr ]-> ack ! sr; Receiver[ rec, dat, ack ]( sn )
    )
  endproc

  process Channel[ min, mout ] :=
    min ? sn : seq_nr;                  (* message enters the channel *)
    (    mout ! sn; Channel[ min, mout ](* message leaves the channel *)
      [] (* message disappears in the channel
            ("i" generates a tau-transition) *)
         i; Channel[ min, mout ]
    )
  endproc

endproc

Figure 6. Ara Lotos specification of AB-Protocol


We constructed an LTS representation of AB-Protocol with Ara State Space Generator. The result has 1623 states and 3718 transitions. It is thus too big to be investigated as such. Therefore, we reduced it with Ara Normaliser, and obtained an LTS containing only 8 states and 11 transitions. The result is so small that Ara Illustrator can easily show it on a computer screen. We have redrawn it into Fig. 7, using written action names instead of colours. Although Fig. 7 was drawn by us, it is faithful to the lay-out designed by Ara Illustrator.

[WWW note: this is how ARA Illustrator shows it, except that we have moved the legend closer to the LTS.]

According to Fig. 7, the first visible action of the system is send, that is, a sending request by the customer (the "Source" of Fig. 1). The next action in Fig. 7 is either rec, or the invisible action tau followed by err. Please ignore tau-transitions for the moment; we will soon explain their significance. The send-rec-tau-ok cycle in Fig. 7 obviously corresponds to successful behaviour of AB-Protocol, because in it each message which is sent (send) is also received (rec), and the sending customer is informed about successful transmission (ok). If this cycle is exited, the sending customer is immediately informed about failure (err), and the current message is either lost (send-tau-err-tau) or delivered (send-rec-tau-err or send-tau-err-rec). If the message is delivered, then the next message will be lost (the vertical send has no subsequent rec, although it may be followed by ok).

The tau-transitions in Fig. 7 represent information about stable failures. In essence, they reveal whether the choice between two alternative courses of action was made by AB-Protocol, its environment, or jointly by both. We can see that after send and rec, the next visible action is either ok or err. Furthermore, the choice between ok and err is made by the protocol, because the ok- and err-transitions start at distinct states preceded by tau-transitions. In other words, it is the protocol, not the environment, which decides whether failure or success is reported.

The choice between rec and err after the initial send-action is more intricate. AB-Protocol has the property that the acknowledgement is sent only after the message is delivered to the receiving customer. Because actions are synchronous, the receiving customer can prevent the acknowledgement from being sent, by refusing to execute the action rec, i.e. by not accepting the arrived message. Then Sender never receives an acknowledgement and will eventually declare error. On the other hand, the environment cannot force the rec-branch to be taken by refusing the action err. If AB-Protocol chooses the err-branch and the environment refuses err, then the system deadlocks immediately before the err-transition.

It is not at all surprising that AB-Protocol may lose a message and then give an error indication. But it is surprising at first sight that AB-Protocol may deliver the message, and still give the error indication. And it is actually quite worrying that if this happens, then AB-Protocol is guaranteed to lose the next message, and may still give an ok-indication for it!

These properties of AB-Protocol have a natural explanation, however. If the original message and all its retransmissions are lost, then Sender declares an error, and AB-Protocol continues normally. But if at least one of the messages gets through and their acknowledgements are lost, then Receiver delivers the message to the receiving customer and changes its sequence number, but Sender does not change its own sequence number. Therefore, Receiver will interpret all transmissions of the next message as retransmissions of the previous message, and will acknowledge them without delivering the message to the customer. If an acknowledgement reaches Sender, it incorrectly concludes that the message was delivered, and declares success.

So we see that it is necessary to add some mechanism to AB-Protocol with which Sender and Receiver can synchronise their sequence numbers after errors. [WWW note: such a mechanism is explained and analysed in [13].] The moral of the story is that even if we had not originally thought about the above scenario, it would have been quite difficult to fail to see from Fig. 7 that something is wrong in the protocol.

6. DISCUSSION

As the example in Section 5 demonstrates, our approach is based on constructing and showing a black-box view of the system under analysis. In this view, everything "essential" regarding the visible actions is preserved, while invisible activity is discarded. What exactly is considered as "essential" is determined by the CFFD model.

The CFFD model is somewhat abstract. The presence and absence of tau-transitions in the figures produced by Ara Normaliser and Ara Illustrator may seem surprising until the user thoroughly understands what the CFFD model means. Despite this, we believe firmly that most people will find it easier to verify systems by interpreting (automatically constructed) LTS figures than by writing and (more or less automatically) proving, say, linear temporal logic formulae. Actually, it can be proven that the information preserved by the CFFD model is roughly the same as the information preserved by typical linear temporal logics [7]. Therefore, most of the problems in understanding our approach are present also in competing approaches; our emphasis on visualisation just makes them more apparent. (We have to admit, however, that - like almost all process-algebraic approaches - ours cannot handle so-called fairness assumptions as elegantly as temporal logics can, see [15].)

The use of black-box views is inherent in process algebras, so it is definitely not original to us. However, the observation that black-box views can be made small enough for visualisation is ours, as are the basic techniques (e.g. CFFD-semantics, see [12]) which make this possible.

Our approach makes it possible to obtain feedback on design ideas easily and rapidly. The feedback is comprehensive in the sense that it represents simultaneously all possible executions of the system. Therefore, it provides information that cannot be acquired by testing. These properties make our approach valuable during the development of a design.

A few words concerning the kind and size of systems that can be handled with our approach are appropriate. Although we used a telecommunication protocol in the examples of this paper, our approach can be applied to concurrent and reactive systems in general. It is at its best when the interactions between the processes of the system are complicated, but the processes are otherwise relatively simple. In particular, the processes should not contain lots of local data, or their behaviours should be mostly independent of it.

Like all methods based on explicit representation of the behaviour of a system, ours suffers from the well-known state explosion performance problem. Many techniques for alleviating the state explosion problem have been suggested in the literature, and many of them could be, and some of them have been, included in Ara Tools, most notably compositional LTS construction. With these methods, non-trivial and sometimes spectacularly complex systems can be verified. In general, though, Ara Tools (or its competitors) can be made to swallow only very small "real" systems. Even so, certainly much bigger systems can be verified with Ara Tools than with only paper and pencil.

ACKNOWLEDGEMENTS

This paper summarises a large amount of work that was funded mainly by the Technology Development Centre of Finland (TEKES) and VTT Electronics. Part of the work was done within the European Union ESPRIT III BRA project REACT (6021). The authors are indebted to Jukka Kemppainen for his contribution to the development of Ara Tools.

REFERENCES

[1] Bartlett, K. A., Scantlebury, R. A. & Wilkinson, P. T.: A Note on Reliable Full-Duplex Transmission over Half-Duplex Links. Communications of the ACM 12 (5) 1969, pp. 260-261.

[2] Bolognesi, T. & Brinksma, E.: Introduction to the ISO Specification Language LOTOS. Computer Networks and ISDN Systems 14, 1987, pp. 25-59.

[3] Craigen, D., Gerhart, S. & Ralston, T.: Formal Methods Reality Check: Industrial Usage. Proc. Formal Methods Europe '93, Lecture Notes in Computer Science 670, Springer-Verlag 1993, pp. 250-267.

[4] Eloranta, J.: Minimal Transition Systems with Respect to Divergence Preserving Behavioural Equivalences. Doctoral Thesis, University of Helsinki, Department of Computer Science, Report A-1994-1, Helsinki, Finland, 1994, 162 p.

[5] Hoare, C. A. R.: Communicating Sequential Processes. Prentice-Hall 1985, 256 p.

[6] Inverardi, P. & Priami, C.: Evaluation of Tools for the Analysis of Communicating Systems. EATCS Bulletin 45, October 1991, pp. 158-185.

[7] Kaivola, R. & Valmari, A.: The Weakest Compositional Semantic Equivalence Preserving Nexttime-less Linear Temporal Logic. Proc. CONCUR '92, LNCS 630, Springer-Verlag 1992, pp. 207-221.

[8] Milner, R.: Communication and Concurrency. Prentice-Hall 1989, 260 p.

[9] Peterson, G. L.: Myths about the Mutual Exclusion Problem. Information Processing Letters 12 (3) 1981, pp. 115-116.

[10] Savola, R.: A State Space Generation Tool for LOTOS Specifications. VTT Publications 241, Technical Research Centre of Finland (VTT), Espoo, Finland 1995, 107 p.

[11] Setälä, M. & Valmari, A.: Validation and Verification with Weak Process Semantics. Proc. Nordic Seminar on Dependable Computing Systems 1994, Lyngby, Denmark, August 1994, pp. 15-26.

[12] Valmari, A.: Failure-Based Equivalences Are Faster than Many Believe. Proc. Structures in Concurrency Theory 95, Workshops in Computing Series, Springer-Verlag, 1995, pp. 326-340.

[13] Valmari, A., Karsisto, K. & Setälä, M.: Visualisation of Reduced Abstracted Behaviour as a Design Tool. Accepted to PDP '96, Fourth Euromicro Workshop on Parallel and Distributed Processing, Braga, Portugal, January 24-26, 1996.

[14] Valmari, A., Kemppainen, J., Clegg, M. & Levanto, M.: Putting Advanced Reachability Analysis Techniques Together: the "ARA" Tool. Proc. Formal Methods Europe '93, LNCS 670, Springer-Verlag 1993, pp. 597-616.

[15] Valmari, A. & Setälä, M.: Visual Verification of Safety and Liveness. Tampere University of Technology, Software Systems Laboratory, Report 17, Tampere, Finland, September 1995, 20 p. [WWW note: to appear in Proc. Formal Methods Europe '96.]

[16] Valmari, A. & Tienari, M.: An Improved Failures Equivalence for Finite-State Systems with a Reduction Algorithm. Protocol Specification, Testing and Verification XI (Proc. 11th International Symposium), North-Holland 1991, pp. 3-18.

[17] Valmari, A. & Tienari, M.: Compositional Failure-Based Semantic Models for Basic LOTOS. Formal Aspects of Computing (1995) 7: 440-468. [WWW note: updated, was "in print in . . .".]