Self-Configurating Computer Systems
Jahn Arne Johnsen
Master of Science in Communication Technology
Submission date: June 2006
Supervisor: Lill Kristiansen, ITEM
Co-supervisor: Egil Østhus, TANDBERG
Norwegian University of Science and Technology
Department of Telematics
The objective of this master’s thesis is to assess how one can utilise ontologies encoded using the Web Ontology Language (OWL), possibly alongside other Semantic Web technologies, to analyse computing systems with the goal of classifying error scenarios. The thesis should also review how one can further use these technologies to specify which actions the unit should take if any of these pre-specified situations occur (i.e. a variant of "ontology based self-configuring"). The aim is to provide a solid theoretical foundation which later works and realisations can build on.
Although the technology could be used by several domains or systems, the H.323 domain will be used as an ongoing example both in scenarios and later modelling and feasibility testing. More specifically, the thesis limits its scope to the domain’s solvable software configuration problems, thereby leaving a feasibility study for hardware issues to future work. The H.323 domain was chosen largely because of its accessible documentation and the easy access to domain experts at TANDBERG R&D.
Assignment given: 20. January 2006 Supervisor: Lill Kristiansen, ITEM
To live is to battle with trolls in the vaults of the heart and brain.
To write: that is to sit in judgment over one’s self
- Ibsen
Abstract
The explosive growth, in both size and complexity, of communication and computing systems has made it increasingly difficult to manage and configure them. In today’s ever-changing environments, systems frequently need to adapt and reconfigure their components to suit the restrictions imposed on them by the same environment. However, such configuration tasks may be quite time consuming and are often suited only for trained personnel. If the complexity continues to grow, it will at some point grow beyond any human’s ability to manage. Consequently, there has recently been great interest in making computer systems more autonomic, thus giving them the ability to configure themselves.
The purpose of this master’s thesis was to assess how one could create such self-configuring systems using Semantic Web technologies to classify and recover from pre-defined error scenarios. The Semantic Web is an effort to give meaning to information in a machine-readable way, through the use of a knowledge-representation technique known as ontologies.
Ontologies are formal and explicit specifications of shared conceptualisations [103] that can be used to model and represent an entire domain, including its units and relations. This is useful in self-configuring systems as such world models are considered good starting points for reasoning as well as error solutions [3].
The thesis describes how one can utilise these ontologies, encoded using the Web Ontology Language (OWL) and combined with another Semantic Web technology, the Semantic Web Rule Language (SWRL), in order to classify the pre-defined error scenarios. The thesis suggests realising this classification in three separate steps: first, the given domain is modelled and an OWL-encoded ontology is created; then, the necessary unit information is inserted into the ontology as ontology instances; finally, SWRL’s Horn-style rules are used to determine which error scenario the given situation should be classified as. It is further suggested that an event-based monitoring solution be used to decide when such a classification is needed. The thesis also shows how one can use the OWL service specification ontology, OWL-S, to specify and describe which actions a failed unit should take in order to recover from an error.
In addition to a thorough theoretical assessment of how Semantic Web technologies could be used in self-configuring systems, an architecture design for the realisation of such a system is proposed. The proposal includes block diagrams, state machines describing functionality, and message sequence charts, all of which conform to the UML 2.0 standard. The use of such formal modelling languages allows for easy translation into an executable framework to which the more specific functionality may be added (including OWL-S execution etc).
To test the feasibility of the suggested approach, two experiments were designed and conducted. These set out to test whether SWRL rules could classify error scenarios and if it was possible to execute repair procedures encoded in OWL-S. Although some of the sub-tests failed due to the immaturity of the Semantic Web field and its technologies, the approach seemed to be promising. A complete realisation of the architecture will however require that underlying technology issues are resolved.
Preface
This master’s thesis was completed as part of the Master of Technology programme at the Norwegian University of Science and Technology (NTNU). It was carried out at the Department of Telematics in cooperation with TANDBERG ASA in the period January 2006 to June 2006.
I would like to thank my supervising professor Lill Kristiansen for helpful ideas, thoughts and suggestions throughout this thesis. My supervisor at TANDBERG R&D, Egil Østhus, also deserves many words of acknowledgement for proposing such a difficult and demanding, but in the end very rewarding, research project as this thesis turned out to be. I would further like to express my gratitude to both of them for their encouraging words during times when things did not go as originally planned. Thank you. In the words of Woody Allen:
“If you’re not failing every now and again, it’s a sign you're not doing anything very innovative.”
Finally, I would like to thank my brother, Terje Johnsen, as well as fellow students, Trond Øivind Eriksen, Lars Erik Karlsen and Jonny Mauland, for extensive proof reading and last minute comments in the final process of this report.
Trondheim, June 2006
Jahn Arne Johnsen
Abbreviations and Thesaurus
Conceptualisation Description of the concepts and relations that can exist in a given domain.
Consistency In relation to ontologies, consistency concerns whether structural constraints are met and whether the ontology contains contradictory information.
EXPTIME The class of decision problems that can be solved by a deterministic sequential machine in an amount of time that is an exponential function of the problem size.
Decidable / Undecidable A decision problem is said to be decidable if one can construct an algorithm which, for every input, gives the correct answer and terminates after a finite amount of time. Similarly, the problem is known as undecidable if no such algorithm exists.
Declarative A computer program is called declarative if it describes what is, rather than how to create it, e.g. a collection of rules, axioms etc. which can be used to solve a reasoning problem.
Model An abstract representation of a domain.
Monotonic In a monotonic program, adding new knowledge will not cause the retraction of previous conclusions.
NEXPTIME The class of decision problems whose solution can be found in exponential time on a non-deterministic machine.
Ontology A formal and explicit specification of a shared conceptualisation.
OWL Web Ontology Language
URI Uniform Resource Identifier
URL Uniform Resource Locator
RDF Resource Description Framework
SWRL Semantic Web Rule Language
Taxonomy A class hierarchy
XML Extensible Markup Language
Table of contents
ABSTRACT
PREFACE
ABBREVIATIONS AND THESAURUS
TABLE OF CONTENTS
TABLE OF FIGURES
INDEX OF TABLES
1 INTRODUCTION
1.1 MOTIVATION
1.2 RESEARCH STATEMENT
1.3 SCENARIOS
1.3.1 Scenario 1 – Today
1.3.2 Scenario 2 – Near Future
1.4 RELATED WORK
1.5 THESIS OUTLINE
2 THE H.323 STANDARD
2.1 COMPONENTS
2.1.1 Terminal
2.1.2 Gatekeeper
2.1.3 Gateway
2.1.4 Multipoint Control Units
2.2 SECURITY
3 AUTONOMIC COMPUTING
3.1 DEFINITION AND CHARACTERISTICS
3.2 ARCHITECTURE
3.2.1 Knowledge
3.2.2 Monitor
3.2.3 Analyse
3.2.4 Plan
3.2.5 Execute
4 ONTOLOGIES
4.1 DEFINITION OF ONTOLOGIES
4.2 REASONS FOR USE
4.3 GENERAL STRUCTURE OF AN ONTOLOGY
4.3.1 Concepts / Classes
4.3.2 Slots / Properties
4.3.3 Restrictions / Facets
4.3.4 Individuals / Instances
5 SEMANTIC WEB TECHNOLOGIES
5.1 THE SEMANTIC WEB
5.1.1 Proposed Architecture
5.2 DESCRIPTION LOGICS
5.2.1 DL Families
5.2.2 The Open World Assumption
5.2.3 DL Reasoning
5.3 THE WEB ONTOLOGY LANGUAGE – OWL
5.3.1 OWL Full
5.3.2 OWL DL
5.3.3 OWL Lite
5.4 SWRL – SEMANTIC WEB RULE LANGUAGE
5.5 OWL-S: SERVICE DESCRIPTIONS USING OWL
5.5.1 The OWL-S Service Process Model
6 THESIS METHODOLOGIES
6.1 LITERATURE REVIEW
6.2 SOFTWARE ENGINEERING
6.3 ONTOLOGICAL ENGINEERING
6.3.1 Ontology Purpose and Scope
6.3.2 Knowledge Acquisition and Conceptualisation
6.3.3 Ontology Integration
6.3.4 Implementation
6.3.5 Evaluation & Documentation
7 OSCOS – ONTOLOGY BASED SELF-CONFIGURATION SYSTEM
7.1 SYSTEM DESIGN
7.1.1 Choosing a Principle for System Monitoring
7.1.2 Making Units More Self-Aware
7.1.3 Unit Ontology Maintenance
7.1.4 Choosing a Suitable OWL Dialect
7.1.5 Ontology Interoperability Approaches
7.1.6 Classifying Error-Scenarios
7.1.7 Resolving Error-Scenarios
7.1.8 Proposed Architecture
8 EXPERIMENTS
8.1 DOMAIN ONTOLOGY DESIGN
8.1.1 Top Level Concepts
8.1.2 Important N-ary Relations
8.2 PROCESS ONTOLOGY DESIGN
8.3 EXPERIMENT DESCRIPTIONS
8.3.1 Test Case 1
8.3.2 Test Case 2
8.3.3 Test Case 3
8.4 EXPERIMENT RESULTS
8.4.1 Experiment 1 – Complete Rule Base Firing
8.4.2 Experiment 2 – OWL-S Execution
9 EVALUATION
9.1 SCALABILITY
9.2 RULE-BASED SYSTEMS
9.3 TECHNOLOGY ISSUES
9.3.1 OWL Reasoning
9.3.2 Limitations in OWL Expressivity
9.3.3 SWRL Reasoning
9.3.4 OWL-S Process Models
9.3.5 Ontology Inconsistencies
10 CONCLUSIONS AND FUTURE WORK
10.1 CONTRIBUTIONS
10.1.1 Theoretical Foundation for Semantic Web Driven Self-configuration
10.1.2 Proposed Architecture
10.1.3 Feasibility Studies
10.1.4 H.323 Ontologies
10.2 FUTURE WORK
11 REFERENCES
A APPENDIX
A.1 ONTOLOGY BROWSING AND EDITING RELATED
A.1.1 How to Install Protégé
A.1.2 How to Open Ontology in Protégé
A.1.3 How to Install Pellet OWL Reasoner
A.1.4 How to Browse Ontology in OWLDoc
A.2 ATTACHED E-MAILS
A.3 MORE ON THE H.323 STANDARD
A.3.1 RAS Messages
A.3.2 H.323 Message Sequence Charts
A.3.3 Codecs
Table of figures
Figure 1-1 – Architecture of an ontology-based management system [111]
Figure 2-1 – Components in the H.323 Architecture
Figure 3-1 – The autonomic control loop [61]
Figure 4-1 – An example ontology of animals
Figure 4-2 - Example of a n-ary relation [85]
Figure 5-1 - The Layered Architecture of the Semantic Web [10]
Figure 5-2 - RDF statement triple
Figure 5-3 – OWL-S ontology [26]
Figure 5-4 – OWL-S model of an H.323 Endpoint’s UnregisterWithGatekeeper procedure
Figure 6-1 – The five different phases of the master’s thesis research
Figure 6-2 – Three viewpoints of reality [16]
Figure 6-3 – Differences in where effort between the elaboration and the translation approach
Figure 6-4 – The Uschold and Gruninger ontological engineering methodology [110]
Figure 6-5 - The different kinds of ontologies, according to their level of dependence [48]
Figure 7-1 – Structure when using the single ontology standardisation approach
Figure 7-2 – Ontology structure when applying a interlingua approach
Figure 7-3 – OSCOS System Components
Figure 7-4 – Composition of the TerminalAgent component
Figure 7-5 – The state machine describing the EventHandlerManager
Figure 7-6 – The EventHandlingSession which is created to handle a event
Figure 7-7 – Message Sequence Diagram for a successful handling of a reported error
Figure 7-8 – The OntologyMapperManager’s state machine
Figure 7-9 – MSC showing how a component registers with the OntologyMapper
Figure 7-10 – The OntologyObjectMapper which keeps the component ontology up-to-date
Figure 7-11 – State machine of the SelfReconfigurationEngine
Figure 7-12 – The OSCOS Repair Service
Figure 7-13 – The RepairManager’s state machine
Figure 7-14 – Part one of the behavioural state machine of the RepairRequestHandler
Figure 7-15 – Sequence diagram showing a successful request for a repair procedure
Figure 7-16 - Part two of the behavioural state machine of the RepairRequestHandler
Figure 7-17 – The contents of the Ontology Repository
Figure 7-18 – The general state machine for the components of the OntologyRepository
Figure 7-19 – The extension added to construct the RuleBase component
Figure 8-1 – The top-level of the engineered ontology
Figure 8-2 – The Codec concept representing different computer codecs
Figure 8-3 – The Communication_Device concept is at the heart of the proposed ontology
Figure 8-4 – The Computer Network concept with its sub-concepts
Figure 8-5 – The Document concept
Figure 8-6 – The Relation concept which represents n-ary relations in the ontology
Figure 8-7 – The Unique_ID concept with its three sub-concepts
Figure 8-8 – The Unit_Network_Registration representing a device’s registration to a network
Figure 8-9 – The Endpoint_Call which represents a call between two H.323_Endpoints
Figure 8-10 – Endpoint_Registration, representing a H.323 unit’s registration to a gatekeeper
Figure 8-11 – Process ontology for the H.323 Terminal with its atomic and composite services
Figure 8-12 – Topology used for test cases
Figure 8-13 – Repair process for the first case scenario specified in OWL-S
Figure 8-14 – Self repair process believed to correct the second case scenario
Figure 8-15 – Recovery process for test case 3 specified in OWL-S
Figure A-1 – Choosing which components of Protégé that should be installed
Figure A-2 - Showing which ontology file to open
Figure A-3 – Ontology opened in Protégé
Figure A-4 – Navigating Protégé menus
Figure A-5 – Screenshot of web browser showing OWLDoc generated files
Figure A-6 – H.323 Gatekeeper discovery
Figure A-7 – H.323 Endpoint Registration
Figure A-8 – H.323 Endpoint unregistration
Figure A-9 - Direct Endpoint Call signalling
Figure A-10 – Location Request
Index of tables
Table 2-1 – An H.323 gatekeeper’s required and optional services [25]
Table 5-1 – DL Operators and connectives
Table 5-2 – DL Extensions [8]
Table 5-3 - Set of relations between OWL languages [77]
Table 5-4 - SWRL rule atoms [53]
Table 8-1 – Identified atomic and composite services of a basic H.323 Terminal
Table 8-2 – SWRL variables
Table 8-3 – SWRL rules used for test case 1
Table 8-4 – SWRL rules used for test case 2
Table 8-5 – SWRL rules used for test case 3
Table 8-6 – Results from testing of SWRL on the domain ontology
Table 8-7 – The results from experiments on OWL-S parsing
Table 12-1 – E-mail from the SWRL plugin developer regarding built-in functions
Table 12-2 – Messages used in the Registration Admission Status standard [15]
Table 12-3 – Overview of important audio, security and video codecs [15]
1 Introduction
This chapter will present and elaborate on the motivation for this thesis, as well as present scenarios, a research statement and finally a summary of known related work.
1.1 Motivation
Throughout history, advances in technology have brought on many revolutionary changes.
During the last few decades, the world has seen the advent of the personal computer, the breakthrough of the Internet and the introduction of mobile telephony. From the evolution of single machines into today’s huge networks of personal and mainframe computers, an unmistakable pattern emerges: computer and information technology have grown at almost exponential rates and have made incredible progress in almost every aspect. The architectures of computer and telecommunication systems have become increasingly sophisticated, and are now governed by literally millions of lines of code which have been put together by thousands of programmers. By interconnecting these systems through networks, we add yet another layer of complexity, but also provide vast possibilities to the users.
As the complexity of modern communication systems increases, so does the need for proper configuration and management. In an ever-changing environment, these systems frequently need to adapt by performing actions and (re)configuration of their services and workflow.
Such system configuration and management can be a time consuming and error-prone process which already today is often suited only for skilled IT personnel. In the future, not only will there be a shortage of people with the right knowledge and skills to manage computer and telecommunications systems, but the complexity itself is growing beyond any human’s ability to manage it. One can only imagine the impact on users having to go through three thousand manual pages in order to install and use a newly acquired conferencing system. It is therefore becoming increasingly desirable to design computer systems that are able to configure themselves. According to Horn [53]:
“It’s time to design and build computing systems capable of running themselves, adjusting to varying circumstances, and preparing their resources to handle most efficiently the workloads we put upon them.” [53]
Because it attempts to accomplish this self-management through a computing model analogous to the human autonomic nervous system, this approach has been named Autonomic Computing [99]. Autonomic Computing helps address the growing complexity by using technology to manage technology [61]. Autonomic systems accomplish their functions by taking appropriate action based on one or more situations which they sense in their environment. For an autonomous system to behave appropriately in an uncertain environment, it has been suggested that:
“The system must have an internal representation (world model) of what it feels and experiences as it perceives entities, events, and situations in the world. It must have an internal model that captures the richness of what it knows and learns, and a mechanism for computing values and priorities that enables it to decide what it wishes to do.”[3]
This is in accordance with Eracar [31], who states that a key factor for achieving self-awareness, an essential element of any autonomic system, is self-modelling. While there is a large body of work in various knowledge representation and ontology areas, relatively little has been applied to the area of world modelling in autonomous systems. It would thus be desirable to research ways in which such formal world models can be used in autonomic systems.
1.2 Research Statement
The objective of this master’s thesis is to assess how one can utilise ontologies encoded using the Web Ontology Language (OWL), possibly alongside other Semantic Web technologies, to analyse computing systems with the goal of classifying error scenarios. The thesis should also review how one can further use these technologies to specify which actions the unit should take if any of these pre-specified situations occur (i.e. a variant of "ontology based self-configuring"). The aim is to provide a solid theoretical foundation which later works and realisations can build on.
Although the technology could be used by several domains or systems, the H.323 domain will be used as an ongoing example both in scenarios and later modelling and feasibility testing. More specifically, the thesis limits its scope to the domain’s solvable software configuration problems, thereby leaving a feasibility study for hardware issues to future work. The H.323 domain was chosen largely because of its accessible documentation and the easy access to domain experts at TANDBERG R&D.
1.3 Scenarios
In [82], Muller states that:
“(…) we think of a scenario as a story that one person (or group of people) can tell to another, and that describes human work, human collaboration or human activities (with or without computers)”. [82]
This is similar to Regnell et al., who state that the purpose of scenarios is to give a partial description of system usage as seen by its users [91]. Thus, the scenarios presented in this section will be used as a means to justify and explain the technology for the reader, as well as to obtain goals and a purpose for a desired system. The two scenarios are in essence based on one single story presented in two separate variations. Scenario 1 presents a plausible current situation of a given user, while Scenario 2 shows a different outcome of the situation based on the desired technology which will be researched throughout this thesis.
1.3.1 Scenario 1 – Today
Alice, a project manager for a large telecommunication firm, is about to attend a meeting using her brand new video conferencing system. Sitting down at her desk in her home office, she starts dialling the number which she received by email from the secretary at her client’s firm. After the number has been entered into her video conferencing unit, nothing happens.
The unit merely states “Call setup failed, please try again”. Alice tries again, but with the same result. Since this is Alice’s first attempt to use the unit on her own, she decides to try to get in touch with the IT personnel in the Help department at her firm. After a while, she finally gets hold of Bob, who is head of the company’s video conferencing project. As Bob takes her through the configuration steps, he discovers that her unit is not properly registered with any video gatekeeper. He looks up the needed gatekeeper ID and tells Alice to type it in using the configuration manager. After some time, Alice manages to register with the gatekeeper and is ready to attend the video conference with her client. By now, Alice is quite frustrated with the equipment, half an hour late, and feels a bit amateurish as she calls the client.
1.3.2 Scenario 2 – Near Future
Alice, a project manager for a large telecommunication firm, is about to attend a meeting using her brand new video conferencing system. Sitting down at her desk in her home office, she starts dialling the number which she received by email from the secretary at her client’s firm. Once the number has been entered into her video conferencing unit, nothing happens, but the unit states “Call setup failed, please wait while the unit tries to resolve the problem”.
After a few seconds, the screen shifts to state “Problem resolved, redialling”. Although Alice does not quite understand what just happened, she appreciates the fact that the unit seemed to fix itself and goes on with her scheduled meeting.
1.4 Related Work
Although there is, to the author’s knowledge, no published research on a Semantic Web driven approach to self-configuring systems, some related previous work can be found.
The problem of configuring complex computer systems has existed for years and a large number of management standards such as the Simple Network Management Protocol (SNMP), the Desktop Management Interface (DMI) and the Telecommunications Management Network (TMN) have been proposed [76]. Common to these is the definition of protocols for exchanging information between the managing and the managed computing devices, as well as the definition of basic manipulation functionality such as information access, addition, editing and deletion. Unfortunately, the various initiatives all define their own resource models and protocols without any coordination between them, resulting in severe interoperability issues, as well as incompatibility between concepts and terminologies [108].
In order to overcome these issues, the Web Based Enterprise Management (WBEM) initiative, developed by the Distributed Management Task Force (DMTF), was introduced in 1996. WBEM is a standard independent of the underlying platforms and resources. It defines a management architecture, a management protocol, a management schema, and an object manager. It uses a schema called the Common Information Model (CIM) to represent its entities and management information. The CIM schema provides definitions for servers, desktops, peripherals etc. Its object oriented approach also makes it easier to track relationships between managed objects [108]. However, even with this new approach, interoperability between different management schemes is still cumbersome and dependent on separate translation approaches. In addition, there are situations where policies are not transferable between the different management domains [111].
In [111], Vergara et al. address this issue and state that management policies should be generic and independent of the models used to define the resources. In order to accomplish this, they suggest using a common ontology to define the domains, and specifying on this ontology the policies that should be enforced on the system. They suggest representing the policies in a predicate logic language such as KIF or OCL [111]. These policies would then be applied to the different sub-management systems through specialised gateways. The architecture is shown in Figure 1-1.
Figure 1-1 – Architecture of an ontology-based management system [111]
Guerrero et al. have later suggested replacing the ontology and policy technology components of this initial architecture [49]. They propose using the Web Ontology Language (OWL) to represent the ontology and the Semantic Web Rule Language (SWRL) to express the different management policies, making it a Semantic Web driven approach [49]. However, no complete architecture is suggested, nor are behavioural models or any general logic for the system. Also, there are no empirical attempts to show the feasibility or soundness of the proposal.
1.5 Thesis Outline
This thesis has been divided into 11 chapters and one appendix. In the following the different chapters will be briefly described.
The H.323 standard provides a solid foundation for multimedia conferencing over packet switched networks. Since the H.323 domain is used as an on-going example in this thesis, the second chapter provides a brief introduction to the technology. This includes its architecture as well as main components.
The third chapter presents the concept of autonomic computing. Autonomic computing has been inspired by the human autonomic nervous system and is aimed at designing computer systems that are self-managing [99]. The chapter further explains and elaborates on the important characteristics that an autonomic system needs to possess, and presents the control architecture needed for such systems.
Chapter four discusses one of the central concepts of this thesis: the ontology. An ontology is a “formal and explicit specification of a shared conceptualisation” [103]. The chapter begins by discussing reasons for use, before the general structure and properties are presented.
In the fifth chapter, the Semantic Web initiative and its technologies are presented in detail.
The chapter starts out by justifying the need for a new Semantic Web, before one of the enabling technologies for its knowledge representation formats, description logics, is presented. In the remainder of the chapter, the Web Ontology Language (OWL), the Semantic Web Rule Language (SWRL) and the OWL service description ontology (OWL-S) are described.
These first five chapters are meant to provide the reader with an introduction to the necessary background knowledge so that the reasoning and thoughts in the remainder of the thesis are easily understood.
The sixth chapter presents several methodologies which have been used to drive the research through the different phases of the thesis. Specifically, methods for literature reviews, system engineering and ontology engineering are described.
In chapter seven, the proposed architecture for a Semantic Web driven self-configuring system is presented alongside a thorough discussion of each of the parts necessary for realising such a system. This includes choosing how units should perform self-modelling as well as how they can classify error scenarios and recover from these. At the end of the chapter, the proposed architecture is presented through block diagrams, state machines explaining its behaviour, and sequence diagrams which make the general flow of the system easier to understand.
Chapter eight contains experiments designed to test the feasibility of the approaches proposed in chapter seven. As a prerequisite to these experiments, the H.323 domain had to be modelled, and an OWL ontology of the domain is thus designed and presented. In order to specify which actions a unit needs to take in order to recover from a given error scenario, the services of an H.323 Terminal were also modelled using the OWL-S ontology.
In the ninth chapter of this thesis, an evaluation of the proposed architecture is given. This includes a discussion of its scalability, the architectural implications of using a rule-based way to classify the error scenarios as well as underlying technology weaknesses and limitations.
Finally, the tenth and last chapter contains the conclusions of this thesis, including a presentation of the main contributions to the research area and suggestions for future work.
2 The H.323 Standard
H.323 is an umbrella standard which references many other ITU-T recommendations. It basically aggregates a set of standards for multimedia conferencing over packet switched networks such as the Internet [106]. As it is used for the scenarios and cases throughout the thesis, this chapter will give a brief overview of the standard, its architecture and main components.
2.1 Components
[Figure 2-1 shows H.323 terminals, an H.323 gatekeeper and an H.323 MCU connected through an H.323 gateway to terminals on other networks (GSTN, QoS LAN, N-ISDN and B-ISDN): V.70, H.324, H.322, H.320, H.321 and speech terminals, and an H.310 terminal operating in H.321 mode.]
Figure 2-1 – Components in the H.323 Architecture
Shown at the top of Figure 2-1 are the major network components of the H.323 architecture: the mandatory terminal and the optional multipoint control unit (MCU), gatekeeper and gateway [106], [107]. Each of these will be further explained in the following.
2.1.1 Terminal
H.323 terminals are required to support real-time two-way communication with other H.323 components through the use of some minimum signalling protocols (H.245, Q.931, and RAS) and audio codecs (minimum G.711). Optionally they can support video (at least QCIF H.261) and data communications (typically T.120) [107]. If the terminal does not support these minimum standards, it is not considered an H.323 standards-compliant endpoint [106]. The terminal is typically identified by H.323 IDs, which are arbitrary, case-sensitive text strings, or E.164 aliases, which are basically telephone numbers that may be local or global [106].
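As an illustration only, the minimum requirements above can be sketched as a simple capability check. The class and function names below (Terminal, missing_capabilities, is_compliant) are hypothetical and are not taken from the H.323 recommendations or from this thesis; the sketch assumes that a terminal's capabilities can be represented as plain sets of protocol and codec names.

```python
from dataclasses import dataclass, field

# Minimum capabilities named in the text above. The set names are
# illustrative assumptions, not identifiers from the standard.
REQUIRED_SIGNALLING = {"H.245", "Q.931", "RAS"}
REQUIRED_AUDIO = {"G.711"}


@dataclass
class Terminal:
    """Hypothetical, simplified model of an H.323 terminal."""
    h323_id: str                                      # arbitrary, case-sensitive text string
    signalling: set = field(default_factory=set)
    audio_codecs: set = field(default_factory=set)
    video_codecs: set = field(default_factory=set)    # optional, e.g. H.261
    e164_aliases: list = field(default_factory=list)  # optional telephone-number aliases


def missing_capabilities(terminal):
    """Return the mandatory protocols and codecs the terminal lacks."""
    return (REQUIRED_SIGNALLING - terminal.signalling) | (
        REQUIRED_AUDIO - terminal.audio_codecs
    )


def is_compliant(terminal):
    """A terminal is treated as compliant when nothing mandatory is missing."""
    return not missing_capabilities(terminal)


if __name__ == "__main__":
    unit = Terminal("alice-home", {"H.245", "Q.931"}, {"G.711"})
    print(is_compliant(unit), missing_capabilities(unit))
```

The check ignores video and data support because the text describes these as optional.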
2.1.2 Gatekeeper
Although the gatekeeper is an optional element in an H.323 network, it is still a rather important component. It is basically responsible for managing all the other entities of an H.323 network (also known as an H.323 zone) and may provide many services such as call control [107]. The required and optional functions provided by the H.323 gatekeeper are shown in Table 2-1.
Required functions
Address Translation: Translation of E.164 aliases or H.323 IDs to IP addresses using a table typically updated with registration messages.
Admission Control: Authorisation of LAN access based on call authorisation, bandwidth or other criteria. Support of Admission Request, Confirm and Reject (ARQ, ACF, ARJ).
Bandwidth Control: Support for Bandwidth Request, Confirm and Reject (BRQ, BCF, BRJ). This may be based on bandwidth management.
Zone Management: The gatekeeper will provide the functions mentioned to terminals, MCUs and gateways which have registered within its control zone.
Optional functions
Call-control signalling: In a point-to-point conference, the gatekeeper may process the call control signals or, alternatively, let the endpoints send them directly to each other.
Call authorisation: The gatekeeper may reject a call from a terminal based on restricted access to particular terminals or gateways or restricted access during a certain period of time.
Bandwidth management: Places a limit on the amount of bandwidth the terminal may use on the network.
Call management: The gatekeeper may maintain a list of ongoing calls in order to indicate that a called terminal is busy or to provide information for the Bandwidth management function.
Table 2-1 – An H.323 gatekeeper’s required and optional services [25]
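To make the required functions in Table 2-1 more concrete, they can be sketched in a few lines of Python. The sketch is not taken from the H.323 recommendation or from any real protocol stack: the class `Gatekeeper`, its method names, the aliases, the addresses and the bandwidth units are all hypothetical and chosen only for illustration. It assumes a zone with a fixed bandwidth budget, and shows address translation from a registration table together with admission control that answers an admission request (ARQ) with a confirm (ACF) or a reject (ARJ).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Gatekeeper:
    """Toy model of a gatekeeper's required functions (illustrative only)."""
    zone_bandwidth: int                                 # total budget, arbitrary units (assumption)
    registrations: dict = field(default_factory=dict)   # alias -> IP address
    used_bandwidth: int = 0

    def register(self, alias: str, ip: str) -> None:
        # Zone management: an endpoint registers and thereby joins the zone.
        self.registrations[alias] = ip

    def translate(self, alias: str) -> Optional[str]:
        # Address translation: H.323 ID or E.164 alias -> IP address (None if unknown).
        return self.registrations.get(alias)

    def admission_request(self, caller: str, callee: str, bandwidth: int):
        # Admission control: answer an ARQ with either ACF or ARJ.
        if caller not in self.registrations:
            return ("ARJ", "caller not registered")
        callee_ip = self.translate(callee)
        if callee_ip is None:
            return ("ARJ", "callee unknown")
        if self.used_bandwidth + bandwidth > self.zone_bandwidth:
            return ("ARJ", "insufficient bandwidth")
        self.used_bandwidth += bandwidth
        return ("ACF", callee_ip)


gk = Gatekeeper(zone_bandwidth=1000)
gk.register("room-101", "10.0.0.1")      # H.323 ID style alias (made up)
gk.register("5551234", "10.0.0.2")       # E.164 style alias (made up)
first = gk.admission_request("room-101", "5551234", 768)
second = gk.admission_request("5551234", "room-101", 768)
```

In this sketch the first request is confirmed, while the second is rejected because the zone's bandwidth budget would be exceeded. A real gatekeeper would also release bandwidth when calls end, which is left out here.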
2.1.3 Gateway
The gateway provides support for interoperability with other communication networks, as shown in the bottom part of Figure 2-1. More specifically, H.323 gateways provide the following functionalities [64]:
• Translation between transmission formats, e.g. H.225.0 and H.221
• Translation between communication procedures, e.g. H.245 and H.242
• Translation between audio and video codecs
• Call setup and clearing on both the LAN side and the switched-circuit network side
Gateways are not required unless connections to other networks are needed.
2.1.4 Multipoint Control Unit
The Multipoint Control Unit (MCU) supports multipoint conferences, i.e., conferences between three or more endpoints. An MCU is required to have a Multipoint Controller (MC) for signalling, and zero or more Multipoint Processors (MP) for mixing, switching and processing media streams and / or data bits. The MCU can be a single dedicated entity, or may be integrated in other components [64].
2.2 Security
Version 2 of the H.323 recommendation contains a number of improvements for IP telephony, including a completely new security recommendation, H.235, which was developed to provide a full security framework for H.323 and other multimedia systems. This recommendation provides services for authentication (which can be used for authorisation), privacy and integrity [63]. The services are accomplished through the use of so-called security profiles, which include [63]:
1) a simple, password-based profile
2) a profile using digital certificates and dependent on a fully-deployed public-key infrastructure
3) a combined use of both 1) and 2)
This can then be used in e.g. the encryption of a conference media stream. Use of these security profiles is optional and is activated by each unit. Endpoints may be required to use specific security features either by their gatekeeper, or by the recipient of any conference call.
3 Autonomic computing
The advances made in computing and telecommunication technologies over the last decades have resulted in an explosive growth in computing systems and applications which impact all aspects of our daily life. However, this growth and the vast possibilities it brings are accompanied by problems. The applications are often highly advanced, heterogeneous and dynamic, and their configuration and management are clearly becoming more and more complex. This has led key firms and researchers to consider alternative approaches based on strategies used by biological systems. Autonomic Computing is one of these emerging strategies. It was first proposed by Paul Horn, IBM’s Senior Vice President of Research, in a keynote presentation at the AGENDA 2001 Conference and later published in [50].
This chapter will present the Autonomic Computing concept, and further elaborate on the important characteristics of an Autonomic System which this thesis will focus on.
3.1 Definition and Characteristics
Autonomic Computing is inspired by the human autonomic nervous system and aims at designing and constructing systems that are self-managing [99]. According to Horn, eight general elements, or characteristics, define a true Autonomic Computing system [50]. Several terms have been suggested to describe these characteristics [74]:
• Self-aware
o In order to be autonomic a computing system needs to “know itself”. It must also consist of components having a form of system identity.
• Self-configuring
o An autonomic computing system must be able to adapt dynamically to changes in the environment and configure and re-configure itself under varying and unpredictable conditions.
• Self-optimising
o Being an autonomic computing system means never settling for status quo, but always looking for ways to optimise how it works.
• Self-healing
o An autonomic system must also be able to perform some sort of healing, i.e. it must be able to recover from routine and extraordinary events which might cause some of its parts to malfunction. This must be done without any loss of data or noticeable delays in processing.
• Self-protecting
o Since a virtual world is no less dangerous than the physical one, an autonomic system must therefore be an expert in self-protection.
• Environment-aware
o An autonomic system is aware of the surroundings and context of its activity and acts accordingly.
• Openness
o The autonomic system does not exist in a hermetic environment. It must therefore be able to function in a heterogeneous world based on open standards.
• Anticipatory
o An autonomic system hides its complexity from the user while still anticipating the optimal resources which are needed in the future.
Any system which seeks to be a fully autonomous system must, according to Horn in [50], possess at least all these characteristics.
3.2 Architecture
Some of the characteristics mentioned in section 3.1 form a necessary basis which the other autonomic functions rely on. For example, according to Albus, any autonomic system needs an internal model of itself and its environment, and thus needs to be self-aware and environment-aware [3]. The other functions will, according to Ganek and Corbi, be accomplished by taking an appropriate action based on one or more situations sensed in the environment. This is realised by combining the basis characteristics with control loops that collect details from the system and act accordingly [39]. This control loop is shown in Figure 3-1.
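[Figure 3-1: the Monitor, Analyse, Plan and Execute functions arranged around shared Knowledge and connected to the managed component through a sensor and an effector; the functions exchange symptoms, change requests and change plans]
Figure 3-1 – The autonomic control loop [61]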
As we see, the control loop revolves around knowledge, which, depending on which capability the control loop realises, may be information about the system itself, its environment or similar. Besides the central knowledge component, the control loop consists of four separate parts that share knowledge [61]:
• The monitor function which provides mechanisms for collecting, aggregating, filtering and reporting details collected from a managed system.
• The analyse function provides the mechanisms that correlate and model complex situations.
• The plan function constructs actions needed to achieve goals and objectives.
• The execute function provides the system with mechanisms which control the execution of a given plan.
The four parts work together to provide the control loop functionality, communicating with one another and exchanging appropriate knowledge and data. The components are described further in the following sub-chapters. It is worth noticing that the autonomic computing architecture does not prescribe specific implementation choices for the autonomic manager, but rather specifies how its internal structure should be organised [39].
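The interplay between the four functions and the shared knowledge can be sketched as follows. This is only a minimal illustration under stated assumptions, not the architecture prescribed in [61] or [39]: the managed component is modelled as a plain dictionary, the knowledge consists of a few hypothetical policy values and a log, and all function names are invented for this example.

```python
# Hypothetical managed component: a dictionary of configuration values.
managed = {"codec": "G.729", "registered": False}

# Shared knowledge: policies the component should satisfy (assumed for illustration)
# and a log of the actions that have been carried out.
knowledge = {
    "policies": {"codec": "G.711", "registered": True},
    "log": [],
}


def monitor(component):
    """Collect the current state of the managed component (sensor side)."""
    return dict(component)


def analyse(state, knowledge):
    """Compare the state with the policies; return a change request or None."""
    violations = {k: v for k, v in knowledge["policies"].items() if state.get(k) != v}
    return violations or None


def plan(change_request):
    """Turn the change request into an ordered list of (key, value) actions."""
    return sorted(change_request.items())


def execute(actions, component, knowledge):
    """Apply each action (effector side) and record it as new knowledge."""
    for key, value in actions:
        component[key] = value
        knowledge["log"].append((key, value))


def control_loop(component, knowledge):
    """Run one monitor, analyse, plan and execute pass."""
    state = monitor(component)
    change_request = analyse(state, knowledge)
    if change_request is not None:
        execute(plan(change_request), component, knowledge)
    return change_request


first_pass = control_loop(managed, knowledge)
second_pass = control_loop(managed, knowledge)
```

The first pass finds two policy violations and corrects them; the second pass finds nothing to change. Note how the execute step writes back to the knowledge, so that later passes can see which actions have been performed.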
3.2.1 Knowledge
Data used by the control loop’s four functions are stored as shared knowledge. This includes data such as topology information, detailed knowledge of its components, the current status of different variables, capacities, interconnections with other systems and available resources.
The use of such knowledge is in accordance with Eracar, who states in [31] that a key factor for achieving self-awareness, an essential element of any autonomic system, is self-modelling.
The knowledge used in the control loop can, according to a white paper published by IBM ([61]), be obtained in one of three ways [61]:
1. The knowledge is sent to the control loop through a pre-determined interface. Policies and analysing rules are examples of such knowledge.
2. The knowledge might be retrieved from an external information service. This might be world model details about units in its environment, or updates on other services.
3. The control loop might create the knowledge itself. This could be internal system information which is collected through sensors or logging of notifications. Another feasible scenario is that the execute part of a control loop might update the knowledge to indicate which actions have been carried out.
The autonomic computing blueprint also identifies several types of knowledge which are needed depending on the control loop function. These include topology knowledge, policy knowledge and problem determination knowledge [61].
3.2.2 Monitor
As previously mentioned, the monitor function collects details from the system and organises them. These details may include topology information, configuration properties and so on, and are used to keep the central knowledge base up-to-date. Whenever an error symptom is detected, it is passed to the analyse function [61] [39].
3.2.3 Analyse
The analyse function observes and analyses situations to determine whether changes need to be made. As an example, the requirement to enact a change in configuration might occur when the analyse function determines that some policy is not being met [39]. Initially, such responses will probably follow rules generated by human experts, but over time they will probably be supplemented by self-learning processes inherent in an autonomic system [74].
If there are any changes which are deemed necessary or desirable, the analyse component will send a change request to the plan function, including the prescribed modifications [61] [39].
3.2.4 Plan
When a change request is sent, it is the plan component which creates or selects a procedure which will enact the desired alteration. This plan function can take many forms, ranging from a single command to a complete work flow [39].
3.2.5 Execute
Finally, the execute function provides the mechanisms needed to schedule and perform the necessary changes to the system. It is responsible for carrying out the procedure generated by the plan component [61].
4 Ontologies
The word “ontology” has through the years been used in a number of contexts and interpreted in just as many ways. Originally, the term ontology came from philosophy and the study of metaphysics. Metaphysics tries to give a general and fundamental account of what exists in the world and how. In practice, it is impossible to represent the real world with all its possible details. Therefore, to be able to represent a phenomenon or a small part of the world, it is necessary to focus on a limited number of concepts which provide a sufficient and relevant abstraction [20]. This brings us closer to the modern concept of ontology normally used in computer science: a formal point of view where certain features and attributes of objects become more relevant than others. Guarino names in [48] several fields which have embraced the ontology concept, including knowledge engineering, knowledge representation, qualitative modelling, language engineering, database design, information retrieval and extraction, and knowledge management and organisation. This concept is presented in more detail in the following.
4.1 Definition of Ontologies
In some cases, the term “ontology” is just used as a fancy name for denoting the results of modelling activities carried out by means of standard methodologies. There are, however, numerous formal descriptions and definitions of what an ontology is. In [35], Fensel described it as:
“(…) a shared and common understanding of a domain that can be communicated between people and heterogeneous and distributed systems.” [35]
Although it captures most of the essence of the common concept of an ontology, Fensel’s definition also covers many things which are not ontologies, e.g. general UML models and purely textual descriptions of domains. Huhns and Singh shared most of Fensel’s definition, but restricted the ontology concept to computerised versions when they in [59] defined an ontology as:
“(…) [an ontology is] a computer model of some portion of the world.” [59]
This is however also quite ambiguous as it can also be used to categorise e.g. a computerised 3D model of a landscape. Gruber’s definition as presented in [45], is more specific:
“(…) an ontology is an explicit specification of a conceptualisation.” [45]
This is the most referenced definition in the literature, but has later been considered as too broad. In 1998, Studer et al. slightly modified Gruber’s definition in [103] and stated:
“(…) an ontology is a formal and explicit specification of a shared conceptualisation.” [103]
This accentuates the need for a formalisation of the knowledge, and stresses the fact that the ontology should be based on a shared, and thus consensual, conceptualisation of the domain. This means that a central aspect of the development of an ontology is the development of the conceptualisation. A conceptualisation represents a way to conceive the world and decide what to model in a knowledge representation. Studer et al. [103] define a conceptualisation as:
“Conceptualization refers to an abstract model of some phenomenon in the world by having identified the relevant concepts of that phenomenon.” [103]
In addition to this, both Gruber and Studer et al. based their work on a definition by Genesereth and Nilsson which stated:
“(…) [a conceptualisation is] a structure <D,R>, where D is a domain and R is a set of relevant relations on D” [42]
According to this, an ontology is thus a form of knowledge representation and a way to express an information model in an explicit and machine readable form. It contains the consensual knowledge of a domain and will contain all types of basic concepts with attributes as well as relationships and related constraints between different concepts.
When the terms ontology and conceptualisation are used throughout the rest of this thesis, they refer to Studer’s definitions.
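The structure <D,R> quoted above can be illustrated with a small Python sketch. The domain elements and relation names are hypothetical and only loosely inspired by the H.323 example; the helper `is_conceptualisation` is an invention of this sketch and merely checks that every relation is defined on the domain.

```python
# Domain D: the objects we choose to talk about (hypothetical example).
D = {"terminal_a", "terminal_b", "gatekeeper_1"}

# R: a set of relevant relations on D, each given as a set of tuples over D.
R = {
    "registered_with": {("terminal_a", "gatekeeper_1"), ("terminal_b", "gatekeeper_1")},
    "is_gatekeeper": {("gatekeeper_1",)},   # a unary relation
}


def is_conceptualisation(domain, relations):
    """Check that every relation only mentions elements of the domain."""
    return all(
        element in domain
        for tuples in relations.values()
        for tup in tuples
        for element in tup
    )


valid = is_conceptualisation(D, R)
# "terminal_c" is not part of D, so this relation is not a relation on D.
invalid = is_conceptualisation(D, {"calls": {("terminal_a", "terminal_c")}})
```

An ontology in the sense of Studer et al. goes further than such a bare structure, since it also requires the specification to be formal, explicit and shared.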
4.2 Reasons for Use
Yang and Calmet state in [117] that a key reason for employing ontologies in intelligent systems is that they:
“(…) enable the representation of background knowledge about a domain in a machine understandable form.” [117]
This fits very well with the research statement of this thesis. In addition to this, Noy and McGuinness point out several other possible reasons for developing ontologies [84]:
• Share common domain information among humans and machine agents
o Ontologies can be used to ensure that all participants of a domain share a common understanding of important terms and concepts. This can be utilised by e.g. a computer agent searching for book prices, provided that all the book shops share and publish the same ontology for all their terms.
• Enable reuse of domain knowledge
o Once an ontology has been created, it can be reused in other ontologies and domains. This means that if one needed to build a large ontology, this could be put together by integrating several existing ontologies, each describing a small part of the complete ontology.
• Analyse domain knowledge and make it more explicit
o Explicitly stating domain knowledge makes it easier for new users to learn what terms in a given domain mean. It also facilitates formal analysis of a domain.
• Separate domain knowledge from the operational knowledge
o The domain knowledge can be kept separate from the implementation, e.g. by describing a task which assembles components into a radio according to a required (ontological) specification, and implementing an algorithm which performs this assembly independently of the product and its components.
All these, combined or by themselves, have led to rapid adoption in numerous research fields [48].
4.3 General Structure of an Ontology
An ontology has three basic elements [84]: descriptions of concepts (classes) in a domain, properties (slots, also called roles) of each concept describing various features and attributes of the given concept, and restrictions (facets, also called role restrictions) on these slots.
These, combined with a set of individual instances of classes, constitute what we call a knowledge base. In reality, there is often a fine line between where the ontology ends and where the knowledge base begins.
4.3.1 Concepts / Classes
Classes describe the concepts in a domain and are the main focus point of most of today’s ontologies. As an example, a class of animals represents all animals, while a class of plants represents all plants. Classes and concepts will usually constitute a taxonomic hierarchy where a class can have subclasses which are more specific than their superclass [84]. This means that the example class of animals can be divided into e.g. fish, birds, mammals and reptiles. These subclasses can later be divided into further sub-subclasses, e.g. by dividing the mammal class into carnivore and herbivore, and by adding a necrophagous class under the carnivore class, see Figure 4-1. It is worth noticing that ontology hierarchies are often depicted as “is-a” hierarchies, where each sub-concept “is-a” super-concept.
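Figure 4-1 – An example ontology of animals (an is-a hierarchy over the classes Thing, Animal, Plant, Mammal, Reptile, Herbivore, Carnivore and Necrophagous)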
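A taxonomic “is-a” hierarchy of this kind can be sketched as a simple mapping from each class to its direct superclass. The sketch below is an illustration only: it assumes single inheritance and a root class named Thing, and the function names are hypothetical. The point it shows is that “is-a” is transitive, so a necrophagous animal is also a carnivore, a mammal and an animal.

```python
# Direct is-a links for the animal example.
# Single inheritance and the root "Thing" are assumptions of this sketch.
IS_A = {
    "Animal": "Thing",
    "Plant": "Thing",
    "Mammal": "Animal",
    "Reptile": "Animal",
    "Herbivore": "Mammal",
    "Carnivore": "Mammal",
    "Necrophagous": "Carnivore",
}


def superclasses(cls):
    """Return all ancestors of cls, nearest first, by following is-a links."""
    result = []
    while cls in IS_A:
        cls = IS_A[cls]
        result.append(cls)
    return result


def is_a(sub, sup):
    """True if sub equals sup or is a direct or indirect subclass of it."""
    return sub == sup or sup in superclasses(sub)


ancestors = superclasses("Necrophagous")
```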
4.3.2 Slots / Properties
Each concept has different properties which describe various features and attributes. Continuing the animal example from above, a mammal can have four legs and has a preference for what to eat (e.g. a plant or another animal). The latter is an object property and the former a datatype property. An object property, also called a relation, relates individuals to other individuals. A datatype property relates individuals to datatype values such as booleans, integers, floats and strings.
In most ontology representation languages, a property is a binary relation. This means that it can only be used to link together two individuals, or one individual and a value. There are however cases where the natural, and certainly most convenient, way to represent certain aspects is to link an individual to several individuals or values. These are called n-ary relations. One common solution to this problem, which has lately also been endorsed by the W3C [85], is to introduce a new class for the relation. This class will contain n new properties to represent the n-ary relation. This is illustrated in Figure 4-2.
Figure 4-2 - Example of an n-ary relation [85] (a Person is linked through has_diagnosis to a Diagnosis_Relation, which in turn is linked through diagnosis_value to a Disease and through diagnosis_probability to a Probability)
This approach has been extensively used throughout the ontology modelling in chapter 8.1.
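The pattern of introducing a class for the relation can be sketched in Python as follows. The property names has_diagnosis, diagnosis_value and diagnosis_probability follow the figure, while the class name `DiagnosisRelation`, the helper functions and the individuals are hypothetical. The sketch only illustrates the idea and says nothing about how the modelling in this thesis is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosisRelation:
    """Class standing in for the n-ary relation; one property per participant."""
    diagnosis_value: str          # the disease
    diagnosis_probability: str    # e.g. "high" or "low"


# Only binary properties are available: (subject, property, object) triples.
triples = set()


def add_diagnosis(person, disease, probability):
    """Link a person to a disease and a probability via a relation instance."""
    relation = DiagnosisRelation(disease, probability)
    triples.add((person, "has_diagnosis", relation))
    return relation


def diagnoses_of(person):
    """Return all relation instances attached to the given person."""
    return [o for s, p, o in triples if s == person and p == "has_diagnosis"]


add_diagnosis("person_1", "influenza", "high")
result = diagnoses_of("person_1")
```

The person is connected to a single relation instance by one binary property, and that instance carries the remaining participants, so no property ever links more than two things.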
4.3.3 Restrictions / Facets
A property might have different restrictions describing the value type, allowed values, the number of values (cardinality) or other features of the values which the slot can take. As an example, the “name” property of a person is a string. This means that the only valid values of the property are strings. Common restrictions are [84]:
• Cardinality, i.e. how many values a property may have. Some systems distinguish only between single and multiple cardinalities, while others allow for a precise definition. It is also common for some to allow for a minimum and maximum cardinality restriction.
• The valid value types of a property, i.e. what kinds of values can fill a property (e.g. a Boolean).
• The domain and range of a property.
Which restrictions can legally be stated depends on the chosen ontology representation language.
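As an illustration of how value type and cardinality restrictions could be checked, consider the following minimal Python sketch. The class `Restriction` and the function `check` are hypothetical names; real ontology languages express such facets declaratively and leave the checking to a reasoner or validator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Restriction:
    """Facets on one property: value type plus minimum and maximum cardinality."""
    value_type: type
    min_card: int = 0
    max_card: Optional[int] = None   # None means unbounded


def check(values, restriction):
    """Return a list of violation messages for the given property values."""
    problems = []
    if len(values) < restriction.min_card:
        problems.append("too few values")
    if restriction.max_card is not None and len(values) > restriction.max_card:
        problems.append("too many values")
    if not all(isinstance(v, restriction.value_type) for v in values):
        problems.append("wrong value type")
    return problems


# The "name" property of a person: exactly one value, which must be a string.
name = Restriction(value_type=str, min_card=1, max_card=1)
ok = check(["Bob"], name)
bad = check(["Bob", 42], name)
```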
4.3.4 Individuals / Instances
Individuals are the actual things represented by a concept, such as Bob being an instance of the concept Man.
5 Semantic Web Technologies
The immense growth of the current World Wide Web which now contains billions of documents has transformed the concept of information. In response to its growing complexity, later initiatives by the World Wide Web Consortium point toward a new world where all the information will be machine processable and understandable; the world of the Semantic Web.
Such technologies can, however, have several other uses which are interesting for self-configuring systems. As they may help machines understand arbitrary information, they may also help them publish and process information about themselves and other units in the domain. In this chapter, a thorough introduction to the different technologies and parts of the Semantic Web is given.
5.1 The Semantic Web
“(...) the Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in co-operation.” [12]
- Tim Berners-Lee et al.
In 2001 the inventor of the World Wide Web, Tim Berners-Lee, and some of his colleagues presented in [12] their vision of the future for the web. The vision was for a “web of meaning” which was designed to enable computers to help us seek out information and even derive information from the available sources that did not exist before. In this vision computers as well as people should be able to:
“(…) read, find, understand and use data over WWW to accomplish useful goals” [88]
To accomplish this, the information on the WWW needs to be understandable for computers. The concept “Semantic Web” was therefore not a vision where computers try to learn natural languages, but rather an extension of the current WWW where the available information is given a well-defined meaning or semantics. Thus, one of the key differences between the Semantic Web and the current Web lies in how the information is represented:
“(…) the Semantic Web is supposed to make data located anywhere on the Web accessible and understandable, both to people and to machines.” [88]
In the present human-oriented Web, machines are only expected to reason on a syntactical level. Their main focus is largely on how text and images should be rendered for human viewing. The Semantic Web however, is intended to use a representation of the information which will allow machines to process and reason about the information at the semantic level.
Berners-Lee et al. presented in their vision ([12]) an advanced example of this sort, in which a personal assistant automatically schedules tasks for its owner based on information which it retrieves from the Semantic Web by itself, without requiring much human interaction.
As presented in [12] by Berners-Lee et al., the main components of the Semantic Web can be described as:
• Well-structured annotations (expressed in e.g. XML) on web pages which extend the traditional WWW and enable agents to capture some key concepts such as the author and his / her contact information. Adding such meta-data will allow us to e.g. search for documents created by different people.
• These key concepts can be linked to some further information contained within ontologies. This makes it possible for any agent to derive the “meaning” of the information and from this possibly deduce further information through the defined inference rules and restrictions.
• The concepts stored in these ontologies are uniquely identified by the use of URIs as identifiers. The use of URIs ensures also that the concepts are not only words in a document but are tied to a unique definition that everyone can find on the Web. This provides a way to create bridges between ontologies by linking concepts which exist in both ontologies with each other.
• The software agents which access this information are computer programs or scripts which act on behalf of a human (or organisation) and are therefore often considered “personal assistants”. In contrast to other computer programs, software agents exhibit properties like adaptability, autonomy of action and an aptitude for learning. They are expected to carry out their tasks by communicating with each other, using ontologies and inference capabilities.
• A need for trust and encryption is also present, but this is not described in much detail.
5.1.1 Proposed Architecture
To realise the Semantic Web, several different technologies need to be in place. Its architecture can be represented in the form of a cake of specifications and languages layered on top of each other. This is shown in Figure 5-1, where the main components as pointed out by Berners-Lee in [10] and later described by Matthews in [77] are depicted.
Figure 5-1 - The Layered Architecture of the Semantic Web [10]
5.1.1.1 Unicode and URI
The bottom layer of the Semantic Web architecture is made up of Unicode, which is the standard representation of characters ([105]), and Uniform Resource Identifiers (URI, [11]), which are a generalised addressing mechanism for specifying a unique address for an item. A Uniform Resource Locator (URL) is a type of URI. This layer is responsible for representing characters and uniquely identifying web resources [77].
5.1.1.2 XML
The eXtensible Markup Language (XML) grew out of the demand to make HTML more flexible by allowing for the addition of arbitrary structures in web documents [32]. Its main advantage comes from the fact that it facilitates easy formation of a structured document. The technology itself has two levels. On the bottom level, it is an open standard which describes how to build tree-based data structures using markup tags, while on a more conceptual level, it can be used as a strategy for information management. The structuring has no particular semantics to indicate what the structure means. XML only plays the role of a syntax carrier and thus corresponds to a basic syntax layer [77].
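The role of XML as a pure syntax carrier can be illustrated with a short Python sketch using the standard library module `xml.etree.ElementTree`. The document and its tag names are made up for this example. The parser recovers the tree structure, but nothing in the XML itself tells a machine what a "terminal" or a "codec" is.

```python
import xml.etree.ElementTree as ET

# A made-up document; the tag names are arbitrary and carry no agreed meaning.
doc = """
<terminal>
  <alias kind="h323-id">room-101</alias>
  <codec>G.711</codec>
</terminal>
""".strip()

root = ET.fromstring(doc)

# The parser recovers the tree structure: element names, attributes and text.
children = [(child.tag, child.attrib, child.text) for child in root]
```

Renaming every tag consistently would give an equally well-formed tree, which is why the layers above XML are needed to attach meaning to the structure.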
5.1.1.3 RDF
The Resource Description Framework (RDF) is a family of standards which extends the bottom two layers by allowing documents to be described in the form of metadata. This metadata is a mechanism to give meaning to the data [29].