UNIVERSITY OF OSLO Department of Informatics
8th International Symposium on Formal Aspects of Component
Software
Prelinary Proceedings
Research Report 410 Farhad Arbab
and
Peter Csaba Ölveczky (editors)
I SBN 82-7368-372-9 I SSN 0806-3036
September 2011
Preface
This volume contains the preliminary proceedings of the8th International Sym- posium on Formal Aspects of Component Software (FACS 2011), held at the Department of Informatics, University of Oslo, on September 14–16, 2011. FACS 2011 is the eighth event in a series of events founded by the International Insti- tute for Software Technology of the United Nations University (UNU-IIST).
The objective of FACS is to bring researchers and practitioners of component software and formal methods together in order to foster a deeper understanding of reliable component-based systems development and their applications, us- ing formal methods. The component-based software development approach has emerged as a promising paradigm to cope with the complexity of present-day software systems by bringing sound engineering principles into software engineer- ing. However, many challenging conceptual and technological issues still remain in component-based software development theory and practice. Moreover, the advent of service-oriented computing has brought to the fore new dimensions, such as quality of service and robustness to withstand inevitable faults, that re- quire revisiting established component-based concepts in order to meet the new requirements of the service-oriented paradigm.
FACS 2011 is concerned with how formal methods can be used to make component-based and service-oriented software development succeed. Formal methods have provided a foundation for component-based software by success- fully addressing challenging issues such as mathematical models for components, composition and adaptation, or rigorous approaches to verification, deployment, testing, and certification. The symposium seeks to address the applications of formal methods in all aspects of software components and services.
In the call for papers for FACS 2011, we solicited high-quality submissions reporting on: original research contributions; applications and experiences; sur- veys, comparisons, and state-of-the-art reports; and tools. We also solicited sub- missions to the Doctoral Track of FACS 2011, in the form of two-page abstracts, concisely describing PhD-work-in-progress, conveying related theme, context, research questions, envisaged contributions, and partial results.
We received 46 submissions from 26 countries, out of which the program com- mittee accepted 16 as regular papers, and, furthermore, conditionally accepted 4 additional papers. We received 9 submissions to the FACS 2011 Doctoral Track, out of which we accepted 4. Each submission to FACS 2011 was reviewed by at least 3 independent referees. In addition to the contributed papers, we consider ourselves very fortunate that our program also includes invited talks by Jos´e Meseguer, John Rushby, and Ketil Stølen.
Revised versions of accepted regular papers will appear in the post-proceedings of FACS 2011 that will be published as a volume in Springer’s LNCS series. Ex- tended versions of selected papers from the symposium will also appear in a special issue of theScience of Computer Programming journal.
Many colleagues and friends have contributed to FACS 2011. First, we thank the authors who submitted their work to FACS 2011 and who, by their contri- butions and participation, make this symposium a high-quality event. We thank
the program committee members and their sub-reviewers for their timely and insightful reviews as well as for their involvement in the post-reviewing discus- sions. We are also grateful to the FACS Steering Committee for its guidance, to the invited speakers, and, in particular, to Lucian Bentea for all his assistance in organizing this event. Finally, we thank Andrei Voronkov for the excellent EasyChair conference system, which made our job so much simpler, and the Re- search Council of Norway and the Department of Informatics at the University of Oslo for financially supporting the symposium.
September, 2011 Farhad Arbab
Peter Csaba ¨Olveczky
Symposium Organization
Program Chairs
Farhad Arbab (Leiden University and CWI) Peter Csaba ¨Olveczky (University of Oslo)
Steering Committee
Zhiming Liu (coordinator) (IIST UNU, Macau) Farhad Arbab (Leiden University and CWI) Lu´ıs Barbosa (University of Minho)
Carlos Canal (University of M´alaga)
Markus Lumpe (Swinburne University of Technology) Eric Madelaine (INRIA, Centre Sophia Antipolis) Peter Csaba ¨Olveczky (University of Oslo) Corina P˘as˘areanu (NASA Ames)
Bernhard Sch¨atz (fortiss GmbH)
Program Committee
Erika ´Abrah´am (RWTH Aachen University) Farhad Arbab (CWI and Leiden University) Christel Baier (Technical University of Dresden) Lu´ıs Barbosa (Universidade do Minho)
Mihaela Bobaru (NASA/JPL)
Christiano Braga (Universidade Federal Fluminense) Roberto Bruni (University of Pisa)
Carlos Canal (University of M´alaga) Frank De Boer (CWI)
Francisco Duran (University of M´alaga)
Rolf Hennicker (Ludwig-Maximilians-Universit¨at M¨unchen) Alexander Knapp (Augsburg University)
Zhiming Liu (IIST UNU)
Markus Lumpe (Swinburne University of Technology) Eric Madelaine (INRIA, Centre Sophia Antipolis) Sun Meng (Peking University)
Peter Csaba ¨Olveczky (University of Oslo) Corina P˘as˘areanu (NASA Ames)
Frantiˇsek Pl´aˇsil (Charles University) Gwen Sala¨un (Grenoble INP - INRIA) Bernhard Sch¨atz (fortiss GmbH) Wolfram Schulte (Microsoft Research)
Nishant Sinha (NEC Labs, Princeton) Marjan Sirjani (Reykjavik University) Volker Stolz (University of Oslo) Carolyn Talcott (SRI International) Emilio Tuosto (University of Leicester)
External Reviewers
Adam, Ludwig Khamespanah, Ehsan
Ardourel, Gilles Khosravi, Ramtin
Bauer, Sebastian Kofron, Jan
Baumeister, Hubert Komuravelli, Anvesh
Bertolini, Cristiano Lang, Frederic
Blech, Jan Olaf Lepri, Daniela
Chen, Zhenbang Lluch Lafuente, Alberto
Choppy, Christine Loup, Ulrich
Corzilius, Florian Malohlava, Michal
Dan, Li Melgratti, Hernan
Faber, Johannes Morisset, Charles
Guanciale, Roberto Nellen, Johanna
Helvensteijn, Michiel Ouederni, Meriem
Henrio, Ludovic Pfaller, Christian
H¨olzl, Florian Poch, Tomas
Jaghoori, Mohammad Mahdi Ramalho, Franklin
Jansen, Nils Rodrigues, Genaina
Jezek, Pavel Sabouri, Hamideh
Jongmans, Sung Schlatte, Rudolf
Kemper, Stephanie Sch¨af, Martin
Keznikl, Jaroslav Verdejo, Alberto
Khakpour, Narges Vogler, Walter
Khalil, Maged
Table of Contents
Taming Distributed System Complexity through Formal Patterns . . . 1 Jose Meseguer
Composing Safe Systems. . . 3 John Rushby
Components and Risk . . . 12 Ketil Stølen
Synthesis of Hierarchical Systems . . . 13 Benjamin Aminof, Fabio Mogavero and Aniello Murano
A Modal Specification Theory for Components with Data. . . 31 Sebastian Bauer, Kim Guldstrand Larsen, Axel Legay, Ulrik Nyman and Andrzej Wasowski
Evaluating the performance of model transformation styles in Maude. . . . 49 Roberto Bruni and Alberto Lluch Lafuente
Interactive Transformations from Object-Oriented Models to
Component-Based Models. . . 67 Li Dan, Xiaoshan Li, Zhiming Liu and Volker Stolz
Runtime Verification of Temporal Patterns for Dynamic
Reconfigurations of Components. . . 85 Julien Dormoy, Olga Kouchnarenko and Arnaud Lanoix
Timed Conformance Testing for Orchestrated Service Discovery. . . 103 Jose Pablo Escobedo, Christophe Gaston and Pascale Le Gall
Realizability of Choreographies for Services Interacting Asynchronously. . 121 Gregor G¨ossler and Gwen Sala¨un
Networks of Real-Time Actors: Schedulability Analysis and Coordination 138 Mohammad Jaghoori, ´Olafur Hlynsson and Marjan Sirjani
A Formal Model of Object Mobility in Resource-Restricted Deployment
Scenarios . . . 156 Einar Broch Johnsen, Rudolf Schlatte and Silvia Lizeth Tapia Tarifa The Logic of XACML . . . 174
Carroline Dewi Puspa Kencana Ramli, Hanne Riis Nielson and Flem- ming Nielson
A proof assistant based formalization of MDE components . . . 192 Mounira Kezadri, Benoˆıt Combemale, Marc Pantel and Xavier Thiri- oux
Controlling an iteration-wise coherence in dataflow. . . 210 S´ebastien Limet, Sophie Robert and Ahmed Turki
Learning from Failures: a Lightweight Approach to Run-Time
Behavioural Adaptation . . . 228 Jos´e Antonio Mart´ın, Antonio Brogi and Ernesto Pimentel
Analyzing Reconfigurable Component-Based Systems Using Attribute
Grammars . . . 245 Thomas Noll
Verifying Safety of Fault-Tolerant Distributed Components. . . 263 Ameur-Boulifa Rab´ea, Raluca Halalai, Ludovic Henrio and Eric Made- laine
Reducing the Model Checking Cost of Product Lines Using Static
Analysis Techniques. . . 281 Hamideh Sabouri and Ramtin Khosravi
Bigraphical Modelling of Architectural Patterns. . . 298 Alejandro Sanchez, Luis Soares Barbosa and Daniel Riesco
A formal framework for coordinated simulation of heterogeneous
service-oriented applications. . . 316 Patrizia Scandurra, Elvinia Riccobene, Davide Brugali and Luca Gher- ardi
Verifying Temporal Properties of Use-Cases in Natural Language . . . 334 Viliam Simko, David Hauzar, Tomas Bures, Petr Hnetynka and Fran- tisek Plasil
Connectors as Designs: the Time Dimension. . . 352 Meng Sun
A New Component Model for Highly Distributed Environments. . . 370 Antoine Beugnard and Ali Hassan
A Rewriting-Logic-Based Tool for Object-Oriented Formal Modeling
and Analysis of Interacting Hybrid Systems . . . 373 Muhammad Fadlisyah
Formal Aspects of Component-Based Design of Embedded Real-Time
Systems . . . 376 Andrey Kruglyak
Analysis of Cooperating Systems by Refined Over-Approximations . . . 379 Nils Semmelrock
Taming Distributed System Complexity through Formal Patterns
Jos´e Meseguer
University of Illinois at Urbana-Champaign Many current and future distributed systems are or will be:
– real-time and cyber-physical
– probabilistic in their operating environments and/or their algorithms – safety-critical, with strong qualitative and quantitative formal requirements – reflective and adaptive, to operate in changing and potentially hostile envi-
ronments.
Their distributed features, their adaptation needs, and their real-time and prob- abilistic aspects make such systems quite complex and hard to design, build and verify, yet their safety-critical nature makes their verification essential. One im- portant source of complexity, causing many unforeseen design errors, arises from ill-understood and hard-to-test interactions between their different distributed components.
Methods to tame and greatly reduce system complexity are badly needed.
System complexity has many aspects, including the complexity and associated cost of:
– designing – verifying – developing – maintaining and – evolving
such systems.
The main goal of this talk is to propose the use of formal patterns as a way of drastically reducing all the above system complexity aspects. By a ”formal pattern” I mean a solution to a commonly occurring problem that is:
– as generic as possible, with precise semantic requirements for its parameters – formally specified
– executable, and
– comes with strong formal guarantees.
This means that a formal pattern can be applied to a potentially infinite set of concrete instances, where each such instance is correct by construction and enjoys the formal guarantees ensured by meeting the semantic requirements of the pattern’s parameters.
The overall vision is that distributed systems should be designed, verified, and built by composing formal patterns that are highly generic and reusable and
come with strong formal guarantees. A large part of the verification effort is spent in an up-front, fully generic manner, and is then be amortized across a potentially infinite number of instances. As I will show through concrete examples, this can achieve very drastic reductions in all aspects of system complexity, including the formal verification aspect. It can lead to high-quality, highly reliable distributed systems at a fraction of the cost required not using such patterns.
To develop formal patterns for distributed systems with features such as those mentioned above an appropriate semantic framework is needed, one supporting:
– concurrency
– real time and probabilities
– distributed reflection and adaptation, and – formal verification methods.
I will use rewriting logic as such a semantic framework, and will show in number of examples its adequacy to specify and verify formal patterns of this nature.
Composing Safe Systems
?John Rushby Computer Science Laboratory
SRI International 333 Ravenswood Avenue Menlo Park, CA 94025 USA
Abstract. We consider component-based systems that must ensure crit- ical properties such as safety. We describe the value of partitioning, and of assumption synthesis, and suggest areas for further research.
1 Introduction
We build systems from components, but what makes something a system is that its properties and behavior are distinct from those of its components. As engi- neers and designers, we wish to predict and calculate the properties of systems from those of their components and their interconnections, and we are quite suc- cessful at doing this,most of the time. For many systems and properties, “most of the time” is good enough: we can live with it if our laptop occasionally locks up, our car doesn’t start, or our music player seems to lose our playlists. But we will be considerably more aggrieved if our laptop catches fire, our car fails to stop, or our music player loses the songs that we purchased. As we move from personal systems to those with wider impact and from properties about normal function to those that concern safety or security, so “most of the time” becomes inadequate: we want those properties to be trueall the time.
Often, properties that we want to be true “all the time” fail to be so, and sub- sequent investigations generally reveal some unexpected interaction among the system’s components. Thus, attempts to reason about the properties of systems by combining orcomposing the properties of their components, while generally successful for “most of the time” properties, are less successful for “all the time”
properties. It is for this reason that regulatory bodies examine only complete systems (e.g., the FAA certifies only airplanes and engines) and not compo- nents: they need to consider the behavior and possible interactions of multiple components in the context of a specific system.
Now, although it is generally infeasible at present to guarantee critical “all the time” properties by compositional (or “modular”) methods, it is a good research
?This work was supported by National Science Foundation Grant CNS-0720908. The content is solely the responsibility of the author and does not necessarily represent the official views of NSF.
topic to investigate why this is so, and how we might extend the boundaries of what is feasible in this area. Safety, in the sense of causing no harm to the public, is one of the most demanding properties, and so the motivation for the title of this paper is to indicate a research agenda focussed on methods that might allow certification of safety for complex systems by compositional means.
As mentioned, system safety failures and the attendant flaws in compositional reasoning are generally due to unanticipated interactions among components.
These interactions can be classified into those that exploit a previously unantici- pated pathway for interaction, and those that are due to unanticipated behavior along a known interaction pathway. One way to control the first class of unantic- ipated interactions is to use integration frameworks that restrict the pathways available for component interactions; in avionics, this approach is called “parti- tioning” and it is the topic of Section 2.
There are two complementary ways to deal with the second class of unan- ticipated interactions: one is to design components that deal gracefully with anything their environment can do to them; the other is to figure out what each component can deal with, and to ensure that its environment does not subject it to anything outside that space. These ways of trying to “anticipate the unanticipated” are among the topics of Section 3, which mainly focuses on assume/guarantee and methods for assumption synthesis. Brief conclusions are provided in Section 4.
2 Partitioning
Aircraft are safe, yet employ many interacting subsystems, so the techniques they employ are worthy of interest. Traditionally, the various avionics functions on board aircraft were provided by fairly independent subsystems loosely integrated as a “federated” system. This meant that the autopilot, for example, had its own computers, replicated for redundancy, and so did the flight management system.
The two system would communicate though the exchange of messages, but their relative isolation provided a natural barrier to the propagation of faults: a faulty autopilot might send bad data to the flight management system, but could not destroy its ability to calculate or communicate.
Modern aircraft employ Integrated Modular Avionics (IMA) where many critical functions share the same computer system and communications net- works, and so there is naturally concern that a fault in one function could prop- agate to others sharing the same resources. Hence, the notion of “robust par- titioning” has developed [11]: the idea is that applications that use the shared resources should be protected from each other as if they were not sharing and each had their own private resource.
The primary resources that require partitioning are communication and com- putation: i.e., networks and processors. For networks, the concern is that a faulty or malicious component will not adhere to the protocol and will transmit over other traffic, or will transmit constantly, thereby denying good components the ability to communicate. The only way to provide partitioning in the face of
these threats is to employ redundancy, so that components’ access to the net- work is mediated by additional components that limit the rate or the times at which communication can occur. Of course, these additional components and the mechanisms they employ may themselves be afflicted by faults (e.g., transient hardware upsets caused by ambient radiation), and so the design and assurance of these partitioning network technologies are very demanding [14], but they are now reasonably well understood and available “off the shelf.”
For processors, the concerns are that faulty or malicious processes will write into the memory of other processes, monopolize the CPUs, or corrupt the pro- cessor’s state. Partitioning against these threats can be provided by a strong operating system or, more credibly, by a hypervisor or virtual machine environ- ment; minimal hypervisors specialized to the partitioning function are known as “separation kernels” [12] and, like partitioning networks, efficient and highly assured examples are now available “off the shelf.”
Partitioning for the basic resources of communication and computation can be leveraged to provide partitioning for additional resources synthesized above them, such as file systems. A collection of partitioning resources may be config- ured to specify quite precisely what software components are allocated to each partition and what interactions are allowed with other components (the configu- ration for an IMA is many megabytes of data). For some system-level properties, for example certain notions of security, such configurations, which may be por- trayed as box and arrow diagrams and formalized as “policy architectures” [1], provide strong assurance.
sanitizer unclassified
secret
Fig. 1.A Partitioned System Configured to Support the System Purpose
The properties for which partitioning, on its own, provides adequate enforce- ment and assurance are those that can be stated in terms of the absence of information flow. As mentioned, certain security concerns are of this kind (e.g.,
“no flow from secret to unclassified”), but most properties also concern the computations that take place in (some of) the partitions. For example, many secure systemsdo allow flow from secret to unclassified provided the informa- tion concerned is suitably “sanitized” by some function interposed in the flow, as portrayed by the minimal policy architecture of Figure 1. The partitioning configuration ensures the sanitizer cannot be bypassed, but we still require as- surance that the sanitizer does its job. More complex properties, such as most notions of safety, cannot be ensured by individual components; instead, they emerge from the interactions of many—but partitioning eliminates unintended interactions and allows us to focus on correctness of the intended interactions, which is the topic of the next section.
3 Assumption Synthesis
If we suppose that “traditional software engineering” is able to develop systems that work “most of the time” then it might be possible to turn these into systems that work “all the time” by simply blocking the events or interactions that cause failures, or by controlling the failures that are precipitated. These topics are ad- dressed by a variety of techniques such as systematic exception handling [4], anomaly detection [2], safety kernels [13] and enforceable security [18], and run- time monitoring [6]. All these techniques merely reduce the frequency or severity of failures (e.g., by turning “malfunction” or “unintended function” into “loss of function”) rather than eliminate them. However, they can be very valuable in systems with many layers of redundancy or fault management, since these often cope very well with the “clean” failure of subsystems, but less well with their misbehavior. Some aircraft systems employ “monitored architectures,” where a very simple component monitors the system safety property and shuts down the operational component if this is violated; these architectures can support rather strong assessments of reliability [9].
To get from clean failures to true “all the time” systems by compositional means, we need to be able to calculate the properties of the composed system from those of its components; if the calculation is automated, then it can support an iterative design loop: if the composed system does not satisfy the properties required, then we modify some of the components and their properties and repeat the calculation.
The established way to calculate the properties of interacting components is by assume/guarantee reasoning [7]: we verify that component Adelivers (or guarantees) propertypon the assumption that its environment delivers property q, and we also verify that B guarantees q assuming p; then when A and B are composed, each becoming the environment of the other, we may conclude (under various technical conditions) that their compositionA||Bguarantees both p and q. There is, however, a practical difficulty with this approach: if A and B are intended for general use, they are presumably developed in ignorance of each other, and it will require good fortune or unusual prescience that they should each make just the right assumption and guarantee that they fit together perfectly.
Shankar proposes an alternative approach [19] that treats assumptions as abstract components; here, we establish thatpis a property ofAin the context of an ideal environmentE; if we can then show thatBas a refinement of (i.e., a component with fewer behaviors than)E, then the composition ofAandBalso deliversp. This requires less prescience because we do not need to know about Bat the time we designA; we do, however, need to postulate a suitableE.
One interesting idea is to designA, thencalculate Eas the weakest environ- ment under which we can guarantee that A delivers p. When A is a concrete state machine, this can be done usingL∗learning [5]. Early in the design process, however, we are unlikely to have developedAto the point where it is available as a fully concrete design; in this case we can often perform assumption synthesis interactively using infinite bounded model checking (inf-BMC)
control_out errorflag data_in
data_in
control_out errorflag
m_data
ideal
distributor checker
controller c_data
ideal_out
safe_out
fault con_out
data_in
merror cerror
mon_out controller
(monitor)
assumptions
violation
Fig. 2.A Self-Checking Pair, and Additional Components Used in Analysis
Inf-BMC performs bounded model checking on state machines defined over the theories supported by an SMT solver (i.e., a solver for Satisfiability Mod- ulo Theories) [15]; these theories include equality over uninterpreted functions, possibly constrained by axioms, so it is possible to specify very abstract state machines. An example, taken from [16], is illustrated in Figure2. Here, the goal is to deduce the assumptions under which a self-checking pair works correctly.
Self-checking pairs are used quite widely in safety-critical systems to provide protection against random hardware faults: two identical processors perform the same calculations and their results are compared; if they differ the pair shuts down (thereby becoming a “fail-stop” processor [17]) and some higher-level fault management activity takes over. Obviously, this does not work if both processors become faulty and compute the same wrong result. We would like to learn if there are any other scenarios that can cause a self-checking pair to deliver the wrong result; we can then assess their likelihood (for example, the double fault scenario just described may be considered extremely improbable) and calculate the overall reliability of this architecture.
The scenarios we wish to discover are, on one hand, thehazards to the design and, on the other (i.e., when negated), its assumptions. At the system level, hazard analysis is the cornerstone of safety engineering [8]; in component-based design, assumption discovery—its dual—could play a similar rˆole: it helps us understand when it is safe for one component to become the environment for another.
Because of its context (the environment of a self-checking pair really is the natural environment, rather than another system), this example is closer to haz- ard discovery than assumption synthesis—but since these are two sides of the same coin, it serves to illustrate the technique. The idea is that thecontroller and the monitor are identical fault-prone computers that compute some un- interpreted function f(x); a distributor provides copies of the input x to both computers and the results are sent to achecker; if the results agree, the checker passes one of them on assafe out, otherwise it raises afaultflag. The distributor as well as the two computers can deliver incorrect outputs, but for simplicity of exposition the checker is assumed to be perfect (the checker can be eliminated by having thecontrollerandmonitorcross-check their re- sults). Anideal computer, identical to the others but not subject to failures, serves as the correctness oracle, and an assumptions module, which operates as asynchronous observer, encodes the evolving assumptions. In the figure, the idealand assumptionsmodules and their associated data are shown in red to emphasize that these are artifacts of analysis, not part of the component under design.
Initially, the assumptions are empty and we use inf-BMC to probe correctness of the design (i.e., we attempt to verify the claim that if thefaultflag is down, thensafe outequalsideal out). We obtain a counterexample that alerts us to a missing assumption; we add this assumption and iterate. The exercise terminates after the following assumptions are discovered.
1. When both members of the self-checking pair are faulty, their outputs should differ (this is the case we already thought of).
2. When the members of the pair receive different inputs1(i.e., when the dis- tributor is faulty), their outputs should differ. There are two subcases here.
(a) Neither member of the pair is faulty. The scenario here is that instead of sending the correct valuex, the distributor sendsyto one member of the pair andzto another, butf(y) =f(z) (andf(y)6=f(x)).
(b) One or both of the pair are faulty. Here, the scenario is the distributor sends the correct valuexto the faulty member, and an incorrect valuey to the nonfaulty member, butf(y) =f0(x), wheref0is the computation of the faulty member.
3. When both members of the pair receive the same input, it is the correct input.
Inf-BMC can verify that the self-checking pair works, given these four as- sumptions, so our next task is to examine them.
Cases 1 and 2(b) require double faults and may be considered improbable.
Case 2(a) is interesting because it probably would not be discovered by finite state model checking, where we do not have uninterpreted functions: instead,
1Readers may wonder how a distributor, whose implementation could be as simple as a solder joint connecting two wires, can change values; one possibility is it adds resistance and drops the voltage: some receivers will see a weak voltage as a 1, and others as a 0.
the usual way to analyze an “abstract” design is to provide a very simple “con- cretization,” such as replacingf(x) byx+1. This case is also interesting because, once discovered, it can be eliminated by modifying the design: simply cause each member of the pair to pass its input as well as its output to the checker; since both computers are nonfaulty, the inputs will be passed correctly to the checker, which will then rise the fault flag because it sees that the inputs differ. That leaves case 3 as the one requiring further consideration (which we do not pursue here) by those who would use a self-checking pair.
This example has illustrated, I hope, how automated methods such as inf- BMC can be used to help calculate the weakest assumptions required by a com- ponent, and thereby support the design of systems in which components’ assumes and guarantees are mutually supportive, without requiring prescience.
4 Conclusions
All fields of engineering build on components, and it is natural that computer science should do the same. However, component-based systems can be rather more challenging in computer science than in other fields because of the com- plexity of interaction—unintended as well as in intended—that is possible. This complexity of interaction becomes even more vexatious when we aim to develop safety-critical and other kinds of system that must work correctly all the time.
(Perrow [10] would argue that unintended interactions and their enablers, “inter- active complexity” and “tight coupling,” are the primary causes of disasters in all engineering fields; however, computer systems generally have more complexity of these kinds, even in normal operation, than those of other fields.)
Unintended interactions can be divided into those that deliver unintended be- havior along intended pathways, and those that employ unintended pathways. I have outlined techniques that can ameliorate these concerns.Partitioning aims to eliminate unintended pathways for interaction in networks and processors and higher-level resources built on these. Partitioning guarantees “preservation of prior properties” when new components are added to an existing system; it also seems sufficient, on its own, to guarantee certain kinds of information flow security properties, and to simplify the assured construction of more complex properties of this kind [3]. With unintended pathways controlled by partitioning, we can turn to interactions along known pathways. Various techniques akin to wrapping and monitoring can reduce bad inputs and outputs, but ultimately we need to calculate the composed behavior of interacting components. Traditional methods of assume/guarantee reasoning demand a degree of prescience to en- sure that the assumes of one component are met by the guarantees of another, designed in ignorance of it. One way to lessen this need for prescience is to derive the weakest assumptions under which a component can deliver its guarantees, and I sketched how inf-BMC can be used to help automate this process (which is closely related to hazard analysis) very early in the design cycle.
Compositional design and assurance for critical systems that must function correctly, or at least safely, all the time, are challenging and attractive research
topics. Further systematic examination and study of the methods and directions I described could be worthwhile, but fresh thinking would also be welcome.
References
1. Boettcher, C., DeLong, R., Rushby, J., Sifre, W.: The MILS component integration approach to secure information sharing. In: 27th AIAA/IEEE Digital Avionics Systems Conference. The Institute of Electrical and Electronics Engineers, St. Paul, MN (Oct 2008) 3
2. Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: A survey. ACM Com- puting Surveys 41(3) (2009) 4
3. Chong, S., van der Meyden, R.: Using architecture to reason about information security. Tech. rep., University of New South Wales (2009) 7
4. Cristian, F.: Exception handling and software fault tolerance. IEEE Transactions on Computers C-31(6), 531–540 (Jun 1982) 4
5. Giannakopoulou, D., Pasareanu, C.S., Barringer, H.: Component verification with automatically generated assumptions. International Journal on Automated Soft- ware Engineering 12(3), 297–320 (2005) 4
6. Havelund, K.: Program Monitoring; Course material for part II of Caltech CS 119 (2008 May), available athttp://www.runtime-verification.org/course/ 4 7. Jones, C.B.: Tentative steps toward a development method for interfering pro-
grams. ACM Trans. Progr. Lang. Syst. 5(4), 596–619 (1983) 4
8. Leveson, N.G.: Safeware: System Safety and Computers. Addison-Wesley (1995) 5 9. Littlewood, B., Rushby, J.: Reasoning about the reliability of fault-tolerant systems in which one component is “possibly perfect”. IEEE Transactions on Software Engineering (2011), accepted for publication 4
10. Perrow, C.: Normal Accidents: Living with High Risk Technologies. Basic Books, New York, NY (1984) 7
11. Requirements and Technical Concepts for Aviation, Washington, DC: DO-297: In- tegrated Modular Avionics (IMA) Development Guidance and Certification Con- siderations (Nov 2005), also issued as EUROCAE ED-124 (2007) 2
12. Rushby, J.: The design and verification of secure systems. In: Eighth ACM Sympo- sium on Operating System Principles. pp. 12–21. Asilomar, CA (Dec 1981), (ACM Operating Systems Review, Vol. 15, No. 5) 3
13. Rushby, J.: Kernels for safety? In: Anderson, T. (ed.) Safe and Secure Computing Systems, chap. 13, pp. 210–220. Blackwell Scientific Publications (1989) 4 14. Rushby, J.: Bus architectures for safety-critical embedded systems. In: Henzinger,
T., Kirsch, C. (eds.) EMSOFT 2001: Proceedings of the First Workshop on Embed- ded Software. Lecture Notes in Computer Science, vol. 2211, pp. 306–323. Springer- Verlag, Lake Tahoe, CA (Oct 2001) 3
15. Rushby, J.: Harnessing disruptive innovation in formal verification. In: Hung, D.V., Pandya, P. (eds.) Fourth International Conference on Software Engineering and Formal Methods (SEFM). pp. 21–28. IEEE Computer Society, Pune, India (Sep 2006) 5
16. Rushby, J.: A safety-case approach for certifying adaptive systems. In: AIAA In- fotech@Aerospace Conference. American Institute of Aeronautics and Astronau- tics, Seattle, WA (Apr 2009), aIAA paper 2009-1992 5
17. Schlichting, R.D., Schneider, F.B.: Fail-stop processors: An approach to designing fault-tolerant computing systems. ACM Transactions on Computer Systems 1(3), 222–238 (Apr 1983) 5
18. Schneider, F.: Enforceable security policies. ACM Transactions on Information and System Security 3(1), 30–50 (Feb 2000) 4
19. Shankar, N.: Lazy compositional verification. In: de Roever, W.P., Langmaack, H., Pnueli, A. (eds.) Compositionality: The Significant Difference (Revised lectures from International Symposium COMPOS’97). Lecture Notes in Computer Science, vol. 1536, pp. 541–564. Springer-Verlag, Bad Malente, Germany (Sep 1997) 4
Components and Risk
Ketil Stølen
SINTEF ICT, P.O. box 124 Blindern, 0314 Oslo, Norway Department of Informatics, University of Oslo, Norway
{ketil.stolen}@sintef.no
Abstract. Risk analysis is an important tool to establish the apropri- ate protection or safety level of a system. Unfortunately, the shifting environment of components is not adequately addressed by traditional risk analysis methods. Furthermore, the issue of risk is hardly addressed within methods for component-based system development. This talk will present challenges with respect to components and risk from three view- points, namely an industrial viewpoint, a modelling viewpoint and a the- oretical viewpoint. The presentation builds on the following publications [3], [1] and [2].
References
1. Brændeland, G., Refsdal, A., Stølen, K.: Modular analysis and modelling of risk scenarios with dependencies. J. Syst. Softw. 83(10), 1995–2013 (2010)
2. Brændeland, G., Refsdal, A., Stølen, K.: A denotational model for component-based risk analysis. Research report 363, University of Oslo (2011)
3. Lund, M.S., Solhaug, B., Stølen, K.: Model-Driven Risk Analysis – The CORAS Approach. Springer (2011)
Synthesis of Hierarchical Systems
Benjamin Aminofa, Fabio Mogaverob, and Aniello Muranob?
aHebrew University, Jerusalem 91904, Israel.
bUniversità degli Studi di Napoli “Federico II”, 80126 Napoli, Italy.
[email protected] {mogavero, murano}@na.infn.it
Abstract In automated synthesis, given a specification, we automatically cre- ate a system that is guaranteed to satisfy the specification. In the classical tem- poral synthesis algorithms, one usually creates a “flat” system “from scratch”.
However, real-life software and hardware systems are usually created using pre- existing libraries of reusable components, and are not “flat” since repeated sub- systems are described only once.
In this work we describe an algorithm for the synthesis of a hierarchical system from a library of hierarchical components, which follows the “bottom-up” ap- proach to system design. Our algorithm works by synthesizing in many rounds, when at each round the system designer provides the specification of the cur- rently desired module, which is then automatically synthesized using the initial library and the previously constructed modules. To ensure that the synthesized module actually takes advantage of the available high-level modules, we guide the algorithm by enforcing certain modularity criteria.
We show that the synthesis of a hierarchical system from a library of hierarchi- cal components is EXPTIME-complete forµ-calculus, and 2EXPTIME-complete for LTL, both in the cases of complete and incomplete information. Thus, in all cases, it is not harder than the classical synthesis problem (of synthesizing flat systems “from scratch”), even though the synthesized hierarchical system may be exponentially smaller than a flat one.
1 Introduction
Synthesis is the automated construction of a system from its specification. The ba- sic idea is simple and appealing: instead of developing a system and verifying that it is correct w.r.t. its specification, we use instead an automated procedure that, given a specification, constructs a system that is correct by construction. The first formulation of synthesis goes back to Church [7]; the modern approach to this problem was initiated by Pnueli and Rosner who introduced linear temporal logic (LTL) synthesis [23], later extended to handle branching-time specifications, such asµ-calculus [10].
The Pnueli and Rosner idea can be summarized as follows. Given setsΣI andΣO
of inputs and outputs, respectively (usually,ΣI=2I andΣO=2O, whereIis a set of input signals and O is a set of output signals), we can view a system as a strategy P:Σ∗I →ΣO that maps a finite sequence of sets of input signals into a set of output signals. WhenPinteracts with an environment that generates infinite input sequences, it associates with each input sequence an infinite computation overΣI∪ΣO. Though the systemPis deterministic, it induces a computation tree. The branches of the tree
?Partially supported by ESF GAMES grant 4112 and Vigevani Project Prize 2010-2011.
correspond to external nondeterminism, caused by different possible inputs. Thus, the tree has a fixed branching degree |ΣI|, and it embodies all the possible inputs (and hence also computations) ofP. When we synthesizePfrom an LTLspecificationϕ, we requireϕto hold in all the paths ofP’s computation tree. However, in order to impose possibility requirements onP, we have to use a branching-time logic likeµ-calculus.
Given a branching specification ϕ overΣI∪ΣO, realizability ofϕ is the problem of determining whether there exists a systemPwhose computation tree satisfiesϕ. Correct synthesis ofϕthen amounts to constructing such aP.
In spite of the rich theory developed for system synthesis in the last two decades, little of this theory has been reduced to practice. In fact, the main approaches to tackle synthesis in practice are either to use heuristics (e.g., [13]) or to restrict to simple spec- ifications (e.g., [22]). Some people argue that this is because the synthesis problem is very expensive compared to model-checking [16]. There is, however, something mis- leading in this perception: while the complexity of synthesis is given with respect to the specification only, the complexity of model-checking is given also with respect to a program, which can be very large. A common thread in almost all of the works concern- ing synthesis is the assumption that the system is to be built “from scratch”. Obviously, real-world systems are rarely constructed this way, but rather by utilizing many pre- existing reusable components, i.e., a library. Using standard preexisting components is sometimes unavoidable (for example, access to hardware resources is usually under the control of the operating system, which must be “reused”), and many times has other benefits (apart from saving time and effort, which may seem to be less of a problem in a setting of automatic - as opposed to manual - synthesis), such as maintaining a common code base, and abstracting away low level details that are already handled by the preexisting components. Another reason that may also account, at least partially, for the limited use of synthesis in practice, is the fact that synthesized systems are usually monolithic and look very unnatural from the system designer’s point of view. Indeed, in classical temporal synthesis algorithms, one usually creates a “flat” system, i.e., a system in which sub-systems may be repeated many times. On the contrary, real-life software and hardware systems are hierarchical (or even recursive) and repeated sub- systems (such as sub-routines) are described only once. While hierarchical systems may be exponentially more succinct than flat ones, it has been shown that the cost of solv- ing questions about them (like model-checking) are in many cases not exponentially higher [5, 6, 12]. Hierarchical systems can also be seen as a special case of recursive systems [2, 3], where the nesting of calls to sub-systems is bounded. However, having no bound on the nesting of calls gives rise to infinite-state systems, and this results in a higher complexity.
In this work we provide a uniform algorithm, for different temporal logics, for the synthesis of hierarchical systems (or, equivalently,transducers) from a library of hi- erarchical systems, which mimics the “bottom-up” approach to system design, where one builds a system by iteratively constructing new modules based on previously con- structed ones1. More specifically, we start the synthesis process by providing the algo- rithm with an initial libraryL0of available hierarchical components (as well as atomic ones). We then proceed by synthesizing in rounds. At each roundi, the system designer provides a specification formula ϕi of the currently desired hierarchical component,
1While for systems built from scratch, a top-down approach may be argued to be more suitable, we find the bottom-up approach to be more natural when synthesizing from a library.
which is then automatically synthesized using the currently available components as possible sub-components. Once a new component is synthesized, it is added to the li- brary to be used by subsequent iterations. We show that while hierarchical systems may be exponentially smaller than flat ones, the problem of synthesizing a hierarchical sys- tem from a library of existing hierarchical systems is EXPTIME-complete forµ-calculus, and 2EXPTIME-complete for LTL. Thus, this problem is not harder than the classical synthesis problem of flat systems “from scratch”. Furthermore, we show that this is true also in the case where the synthesized system has incomplete information about the environment’s input.
Observe that it is easily conceivable that if the initial libraryL0contains enough atomic components then the synthesis algorithm may use them exclusively, essentially producing a flat system. We thus have to direct the single-round synthesis algorithm in such a way that it produces modular and not flat results. The question of what makes a design more or less modular is very difficult to answer, and has received many (and often widely different) answers throughout the years (see [21], for a survey). We claim that some very natural modularity criteria are regular, and show how any criterion that can be checked by a parity tree automaton can be easily incorporated into our automata based synthesis algorithm.
Related work The issues of specification and correctness of modularly designed sys- tems have received a fair attention in the formal verification literature. Examples of important work on this subject are [8, 17, 26, 27]. On the other hand, the problem of automatic synthesis from reusable components, which we study here, has received much less attention. The closest to our work is Lustig and Vardi’s work on LTLsyn- thesis from libraries of (flat) transducers [18]. The technically most difficult part of our work is an algorithm for performing the synthesis step of a single round of the multiple- rounds algorithm. To this end, we use an automata-theoretic approach. However, unlike the classical approach of [23], we build an automaton whose input is not a computation tree, but rather a system description in the form of aconnectivity tree(inspired by the
“control-flow” trees of [18]), which describes how to connect library components in a way that satisfies the specification formula. Taken by itself, our single-round algorithm extends the “control-flow” synthesis work from [18] in four directions.(i)We consider not only LTLspecifications but also the modalµ-calculus. Hence, unlike [18], where co-Büchi tree automata were used, we have to use the more expressive parity tree au- tomata. Unfortunately, this is not simply a matter of changing the acceptance condition.
Indeed, in order to obtain an optimal upper bound, a widely different approach, which makes use of the machinery developed in [6] is needed.(ii)We need to be able to handle libraries of hierarchical transducers, whereas in [18] only libraries of flat transducers are considered.(iii)A synthesized transducer has no top-level exits (since it must be able to run on all possible input words), and thus, its ability to serve as a sub-transducer of another transducer (in future iterations of the multiple-rounds algorithm) is severely limited (it is like a function that never returns to its caller). We therefore need to ad- dress the problem of synthesizing exits for such transducers.(iv)As discussed above, we incorporate into the algorithm the enforcing of modularity criteria.
Recently, an extension of [18] appeared in [19], where the problem ofNested-Words Temporal Logic(NWTL) synthesis from recursive component libraries has been inves- tigated. NWTL extends LTL with special operators that allow one to handle “call and return” computations [1] and it is used in [19] to describe how the components have to be connected in the synthesis problem. We recall that in our framework the logic
does not drive (at least not explicitly) the way the components have to be connected.
Moreover, the approach used in [19] cannot be applied directly to the branching frame- work we consider in this paper, as we recall that already the satisfiability problem for µ-calculus with “call and return” is undecidable even for very restricted cases [4].
Due to lack of space some proofs are omitted and reported in a full version found at the authors’ web page.
2 Alternating Tree Automata
LetDbe a set. AD-treeis a prefix-closed subsetT ⊆D∗such that ifx·c∈T, where x∈D∗andc∈D, then alsox∈T. ThecompleteD-treeis the treeD∗. The elements ofT are callednodes, and the empty wordεis therootofT. Given a wordx=y·d, withy∈D∗andd∈D, we definelast(x)to bed. Forx∈T, the nodesx·d∈T, where d∈D, are the sonsof x. Aleaf is a node with no sons. ApathofT is a set π⊆T such thatε∈T and, for everyx∈π, eitherxis a leaf or there is a uniqued∈D such thatx·d∈π. For an alphabetΣ, aΣ-labeledD-treeis a pairhT,ViwhereT ⊆D∗is a D-tree andV:T→Σmaps each node ofT to a letter inΣ.
Alternating tree automataare a generalization of nondeterministic tree automata [20] (see [16], for more details). Intuitively, while a nondeterministic tree automaton that visits a node of the input tree sends exactly one copy of itself to each of the sons of the node, an alternating automaton can send several copies of itself to the same son.
An (asymmetric)Alternating Parity Tree Automaton(APT) is a tupleA=hΣ,D,Q, q0,δ,Fi, whereΣ,D, andQare non-empty finite sets ofinput letters,directions, and states, respectively;q0∈Qis aninitial state,F is aparity acceptance conditionto be defined later, andδ:Q×Σ7→B+(D×Q)is analternating transition function, which maps a state and an input letter to a positive boolean combination of elements inD×Q.
Given a setS⊆D×Qand a formulaθ∈B+(D×Q), we say thatSsatisfiesθ(denoted byS|=θ) if assigningtrueto elements inSandfalseto elements in(D×Q)\S, makes θtrue. Arunof an APTAon aΣ-labeledD-treeT =hT,Viis a(T×Q)-labeled IN-tree hTr,ri, where IN is the set of non-negative integers, such that(i) r(ε) = (ε,q0)and(ii)for ally∈Tr, withr(y) = (x,q), there exists a setS⊆D×Q, such thatS|=δ(q,V(x)), and there is one sony0ofy, withr(y0) = (x·d,q0), for every(d,q0)∈S. Given a nodexof a runhTr,ri, withr(x) = (z,q)∈T×Q, we definelast(r(y)) = (last(z),q). An alternating parity automatonA is nondeterministic (denoted NPT), iff when its transition relation is rewritten in disjunctive normal form each disjunct contains at most one element of {d} ×Q, for everyd∈D. An automaton is universal (denoted UPT) if all the formulas that appear in its transition relation are conjunctions of atoms inD×Q.
Asymmetricalternating parity tree automaton withε-moves (SAPT) [14] does not distinguish between the different sons of a node, and can send copies of itself only in a universal or an existential manner. Formally, an SAPT is a tupleA=hΣ,Q,q0,δ,Fi, whereΣis a finite input alphabet;Qis a finite set of states, partitioned into universal (Q∧), existential (Q∨),ε-and (Q(ε,∧)), andε-or (Q(ε,∨)) states (we also writeQ∨,∧= Q∨∪Q∧, andQε=Q(ε,∨)∪Q(ε,∧));q0∈Qis an initial state;δ:Q×Σ→(Q∪2Q)is a transition function such that for allσ∈Σ, we have thatδ(q,σ)∈Qforq∈Q∨,∧, and δ(q,σ)∈2Qforq∈Qε; andFis a parity acceptance condition, to be defined later. We assume thatQcontains in addition two special statesffandtt, calledrejecting sink andaccepting sink, respectively, such that∀a∈Σ:δ(tt,a) =tt,δ(ff,a) =ff. The classification offfandttis arbitrary. Transitions from states inQεlaunch copies ofA
that stay on the same input node as before the transition, while transitions from states inQ∨,∧ launch copies that advance to sons of the current node (note that for an SAPT the setDof directions of the input trees plays no role in the definition of a run). When a symmetric alternating tree automatonA runs on an input tree it starts with a copy in stateq0whose reading head points to the root of the tree. It then followsδin order to send further copies. For example, if a copy ofA that is in stateq∈Q(ε,∨)is reading a nodexlabeledσ, andδ(q,σ) ={q1,q2}, then this copy proceeds either to stateq1or to stateq2, and its reading head stays inx. As another example, ifq∈Q∧andδ(q,σ) =q1, thenA sends a copy in stateq1to every son ofx. Note that different copies ofA may have their reading head pointing to the same node of the input tree. Formally, arunof Aon aΣ-labeledD-treehT,Viis a(T×Q)-labeled IN-treehTr,ri. A node inTrlabeled by(x,q)describes a copy ofAin stateqthat reads the nodexofT. A run has to satisfy r(ε) = (ε,q0)and, for ally∈Tr withr(y) = (x,q), the following hold:
– Ifq∈Q∧ (resp.q∈Q∨) andδ(q,V(x)) =p, then for each son (resp. for exactly one son)x·dofx, there is a nodey·i∈Trwithr(y·i) = (x·d,p).
– Ifq∈Q(ε,∧) (resp.q∈Q(ε,∨)) andδ(q,V(x)) ={p0,...,pk}, then for alli∈ {0..k} (resp. for onei∈ {0..k}) the nodey·i∈Tr, andr(y·i) = (x,pi);
Aparity conditionis given by means of a coloring function on the set of states.
Formally, aparity conditionis a functionF:Q→C, whereC={Cmin,...,Cmax} ⊂IN is a set of colors. The size|C|ofCis called theindexof the automaton. For an SAPT, we also assume that the special statettis given an even color, andffis given an odd color. For an infinite pathπ⊆Trof a runhTr,ri, letmaxC(π)be the maximal color that appears infinitely often alongπ. Similarly, for a finite pathπ, we definemaxC(π)to be the maximal color that appears at least once inπ. An infinite pathπ⊆Tr satisfies the acceptance conditionFiffmaxC(π)is even. A runhTr,riis accepting iff all its infinite paths satisfyF. The automatonA accepts an input treehT,Viif there is an accepting run ofA onhT,Vi. The language ofA, denotedL(A), is the set ofΣ-labeledD-trees accepted byA. We say that an automatonAis nonempty iffL(A)6=/0.
A wide range of branching-time temporal logics can be translated to alternating tree automata (details can be found in [16]). In particular:
Theorem 1. [11, 16] Given a temporal-logic formulaϕ, it is possible to construct a SAPTAϕsuch thatL(Aϕ)is exactly the set of trees satisfyingϕ. Moreover, (i) ifϕis a µ-calculus formula, thenAϕis an alternating parity automaton with O(|ϕ|)states and index O(|ϕ|); and (ii) ifϕis anLTLformula, thenAϕis a universal parity automaton with2O(|ϕ|)states, and index2.
3 Hierarchical Transducers
In this section, we introducehierarchical transducers(alternatively,hierarchical Moore machines), which are a generalization of classical transducers in which repeated sub- structures (technically,sub-transducers) are specified only once. Technically, some of the states in a hierarchical transducer areboxes, in which inner hierarchical transducers are nested. Formally, a hierarchical transducer is a tuple K =hΣI,ΣO,hK1,...,Knii, whereΣI andΣO are respectively non-empty sets of input and output letters, and for every 1≤i≤n, the sub-transducerKi=hWi,Bi,ini,Exiti,τi,δi,Λiihas the following elements.
– Wi is a finite set of states.ini∈Wi is an initial state2, and Exiti⊆Wi is a set of exit-states. States inWi\Exitiare calledinternal states.
– A finite setBiofboxes. We assume thatW1,...,Wn,B1,...,Bnare pairwise disjoint.
– Anindexing functionτi:Bi→ {i+1,...,n} that maps each box of thei-th sub- transducer to a sub-transducer with an index greater thani. Ifτi(b) =jwe say that b referstoKj.
– Atransition function δi:(Sb∈Bi({b} ×Exitτi(b))∪(Wi\Exiti))×ΣI→Wi∪Bi. Thus, when the transducer is at an internal stateu∈(Wi\Exiti), or at an exiteof a boxb, and it reads an input letterσ∈ΣI, it moves either to a states∈Wi, or to a boxb0∈Bi. A move to a boxb0implicitly leads to the unique initial state of the sub-transducer thatb0refers to.
– Alabeling functionΛi:Wi→ΣOthat maps states to output letters.
The sub-transducerK1is called thetop-levelsub-transducer ofK. Thus, for exam- ple, the top-level boxes ofK are the elements ofB1, etc. We also callin1 the initial state ofK, andExit1theexitsofK. For technical convenience we sometimes refer to functions (like the transitions and labeling functions) as relations, and in particular, we consider /0to be a function with an empty domain. Note that the fact that boxes can refer only to sub-transducers of a greater index implies that the nesting depth of trans- ducers is finite. In contrast, in therecursivesetting such a restriction does not exist.
Also note that moves from an exite∈Exiti of a sub-transducer Ki are not specified by the transition functionδi of Ki, but rather by the transition functions of the sub- transducers that contain boxes that refer toKi. The exits ofK allow us to use it as a sub- transducer of another hierarchical transducer. When we say that a hierarchical trans- ducerK =hΣI,ΣO,hK1,...,Kniiis a sub-transducer of another hierarchical transducer K0=hΣI,ΣO,hK10,...,Kn00ii, we mean that{K1,...,Kn} ⊆ {K20,...,Kn00}. The size|Ki|of a sub-transducerKi is the sum|Wi|+|Bi|+|δi|. The size|K|ofK is the sum of the sizes of its sub-transducers. We sometimes abuse notation and refer to thehierarchical transducerKiwhich is formally the hierarchical transducerhΣI,ΣO,hKi,Ki+1,...,Knii obtained by takingKito be the top-level sub-transducer.
Flat transducers A sub-transducer without boxes isflat. A hierarchical transducer K =hΣI,ΣO,hW,/0,in,Exit,/0,δ,Λiiwith a single (hence flat) sub-transducer is flat, and we denote it using the shorter notationK =hΣI,ΣO,hW,in,Exit,δ,Λii. Each hierarchi- cal transducerK can be transformed into an equivalent flat transducerKf=hΣI,ΣO, hWf,in1,Exit1,δf,Λfii(called itsflat expansion) by recursively substituting each box by a copy of the sub-transducer it refers to. Since different boxes can refer to the same sub-transducer, states may appear in different contexts. In order to obtain unique names for states in the flat expansion, we prefix each copy of a sub-transducer’s state by the sequence of boxes through which it is reached. Thus, a state(b0,...,bk,w)ofKfis a vector whose last component wis a state in ∪ni=1Wi, and the remaining components (b0,...,bk)are boxes that describe its context. The labeling of a state(b0,...,bk,w)is de- termined by its last componentw. For simplicity, we refer to vectors of length one as elements (that is, w, rather than (w)).3 Formally, given a hierarchical transducer
2We assume a single entry for each sub-transducer. Multiple entries can be handled by dupli- cating sub-transducers.
3A helpful way to think about this is using a stack — the boxesb0,...,bkare pushed into the stack whenever a sub-transducer is called, and are popped in the corresponding exit.
K =hΣI,ΣO,hK1,...,Knii, for each sub-transducerKi=hWi,Bi,ini,Exiti,τi,δi,Λiiwe inductively define its flat expansionKif=hWif,ini,Exiti,δfi,Λfiias follows.
– The set of statesWif⊆Wi∪(Bi×(Snj=i+1Wjf))is defined as follows: (i) ifwis a state ofWithenwbelongs toWif; and (ii) ifbis a box ofKiwithτi(b) =j, and the tuple(u1,...,uh)is a state inWfj, then(b,u1,...,uh)belongs toWif.
– The transition functionδfi is defined as follows: (i) Ifδi(u,σ) =v, whereu∈Wi, or u= (b,e)withb∈Biande∈Exitτi(b), then ifvis a state, we have thatδfi(u,σ) =v;
and ifvis a box, we have thatδfi(u,σ) = (v,inτi(v)). Note that(v,inτi(v))is indeed a state ofWif by the second item in the definition of states above; and (ii) ifbis a box ofKi, andδfτi(b)((u1,...,uh),σ) = (v1,...,vh0)is a transition of Kτfi(b), then δfi((b,u1,...,uh),σ) = (b,v1,...,vh0)is a transition ofKif.
– Finally, ifu∈WithenΛfi(u) =Λi(u); and ifu∈Wifis of the formu= (b,u1,...,uh), whereb∈Bi, thenΛi(u) =Λfτi(b)(u1,...,uh).
The transducerhΣI,ΣO,hK1fiiis the required flat expansionKfofK. Anatomic trans- duceris a flat transducer made up of a single node that serves as both an entry and an exit. For each letterς∈ΣOthere is an atomic transducerKς=h{p},p,{p},/0,{(p,ς)}i whose single statepis labeled byς.
Run of a hierarchical transducer Consider a hierarchical transducerK withExit1= /0that interacts with its environment. At point jin time, the environment providesK with an inputσj∈ΣI, and in responseK moves to a new state, according to its transition relation, and outputs the label of that state. The result of this infinite interaction is a computation ofK, called thetraceof the run ofK on the wordσ1·σ2···. In the case thatExit16=/0, the interaction comes to a halt wheneverK reaches an exite∈Exit1, since top-level exits have no outgoing transitions. Formally, a runof a hierarchical transducerK is defined by means of its flat expansionKf. Given a finite input word v=σ1···σm∈Σ∗I, arun(computation) ofK onvis a sequence of statesr=r0···rm∈ (Wf)∗such thatr0=in1, andrj=δf(rj−1,σj), for all 0<j≤m. Note that sinceK is deterministic it has at most one run on every word, and that ifExit16=/0thenK may not have a run on some words. Thetraceof the run ofK onvis the word of inputs and outputs trc(K,v) = (Λf(r1),σ1)···(Λf(rm),σm)∈(ΣO×ΣI)∗. The notions of traces and runs are extended to infinite words in the natural way.
The computations ofK can be described by acomputation treewhose branches correspond to the runs ofK on all possible inputs, and whose labeling gives the traces of these runs. Note that the root of the tree corresponds to the empty wordε, and its labeling is not part of any trace. However, if we look at the computation tree ofK as a sub-tree of a computation tree of a transducerK0 of whichK is a sub-transducer, then the labeling of the root of the computation tree ofK is meaningful, and it corre- sponds to the last element in the trace of the run ofK0 leading to the initial state of K. Formally, givenσ∈ΣI, the computation treeTK,σ=hTK,σ,VK,σi, is a(ΣO×ΣI)- labeled (Wf×ΣI)-tree, where: (i) the root ε is labeled by (Λf(in1),σ); (ii) a node y= (r1,σ1)···(rm,σm)∈(Wf×ΣI)+ is in TK,σ iff in1·r1···rm is the run ofK on v=σ1···σm, and its label isVK,σ(y) = (Λf(rm),σm). Thus, for a nodey, the labels of the nodes on the path from the root (excluding the root) toyare exactly trc(K,v). Ob- serve that the leaves ofTK,σcorrespond to pairs(e,σ0), wheree∈Exit1andσ0∈ΣI.