N. Guelfi et al. (Eds.): FIDJI 2004, LNCS 3409, pp. 78–90, 2005. © Springer-Verlag Berlin Heidelberg 2005

A Survey of Software Development Approaches Addressing Dependability

Sadaf Mustafiz and Jörg Kienzle
School of Computer Science, McGill University, Montreal, Quebec, Canada
sadaf@cs.mcgill.ca, joerg.kienzle@mcgill.ca

Abstract. Current mainstream software engineering methods rarely consider dependability issues in the requirements engineering and analysis stage. If at all, they only address it much later in the development cycle. Concurrent, distributed, or heterogeneous applications, however, are often deployed in increasingly complex environments. Such systems, to be dependable and to provide highly available services, have to be able to cope with abnormal situations or failures of underlying components. This paper presents an overview of the software development approaches that address dependability requirements and other non-functional requirements like timeliness, adaptability and quality of service. Software development methods, frameworks, middleware, and other proposed approaches that integrate the concern of fault tolerance into the early software development stages have been studied. The paper concludes with a comparison of the various approaches based on several criteria.

1 Introduction

Due to the increasing responsibilities and number of requirements that modern applications have to address, the average complexity of software systems is growing. Elaborate user interfaces, multi-media features or interaction with real-time devices require software to respond promptly and reliably. Situations such as node failures, network partitions, overloaded resources, irregular load, component failures, heterogeneity, abnormal behavior of subsystems or the environment, and also software design faults must be handled in order to provide highly available services.
Surprisingly enough, dependability and fault tolerance are not addressed by current mainstream software engineering methods. In general, dependability and fault tolerance are considered “non-functional” requirements, and are therefore considered too late during the development of an application. Ad-hoc solutions that try to increase dependability by adding fault tolerance once the main functionality of the system has been implemented often result in complex system structure, hard-to-maintain code and poor performance.

This paper summarizes the results of a survey of specialized software development methods, frameworks, middleware, software architectures, and other approaches that assist developers in producing dependable software. Dependability can be attained by fault prevention, fault removal, fault tolerance (an overview of software fault tolerance techniques can be found in [36]) and fault forecasting [3]. The investigation focuses on the non-functional requirements that are part of dependability, i.e. availability, reliability, safety, security, and maintainability [2], but also timeliness, which includes responsiveness, orderliness, freshness, temporal predictability and temporal controllability [29]. Adaptability, i.e. the need to remain functional even when modifications are carried out in the system, is also considered. In addition, the review includes, for each approach, its application environment, the covered failure domain, and what fault tolerance techniques, if any, have been incorporated into the process.

The approaches presented in this paper are structured into three categories. Section 2 reviews software development methods. Section 3 discusses software architectures, middleware and frameworks. Section 4 presents other approaches that propose notations or consider elements that help in the development of dependable systems.
Finally, Section 5 presents a comparison of the surveyed approaches.

2 Software Development Methods

Software development methods define a step-by-step process that leads a developer from the elaboration of an initial requirements document, through the analysis, architecture and design phases, to the final implementation.

2.1 HRT-HOOD

HOOD (Hierarchical Object-Oriented Design) [32] is an architectural design method developed by the European Space Agency in 1987, with Ada as the target programming language. HRT-HOOD (Hard Real-Time HOOD) [9] was later developed to address issues of timeliness in the early stages of the development process, with explicit support for common hard real-time abstractions.

HRT-HOOD introduces cyclic and sporadic object types to take into account the timing properties of real-time systems. These objects are annotated with information about the period of execution, minimum arrival time, offset times, deadlines, budget times, worst-case execution time (WCET), and importance. HRT-HOOD uses exceptions to handle timing faults, so the implementation language should provide support for programming recovery handlers. For sporadic objects, method invocation should be monitored in order to prevent early execution or an overly high invocation frequency. The method does not provide fault-tolerance support or ways of identifying the mentioned non-functional requirements, but focuses on how to integrate them into the design phase. The STOOD tool supports real-time software development based on the HOOD (version 4) and HRT-HOOD methods.

2.2 The OOHARTS Approach

Object-Oriented Hard Real Time System (OOHARTS) [35] is a process for developing dependable hard real-time systems. It is based on UML and the hard real-time constructs of HRT-HOOD. Various extensions to UML are proposed, e.g. stereotypes such as <>, <>, <>, <>, and <> to describe different real-time objects.
A special form of UML state diagram called Object Behavior Chart (OBC) is used to define object behavior. It provides means for representing timing constraints like deadline and period. The UML concurrency attribute, which can be sequential, guarded, or concurrent, is extended to include <> (mutual exclusion), <> (write execution request), and <> (read execution request).

The OOHARTS method follows the traditional software development phases. Both functional and non-functional requirements are specified in the requirements definition phase. It introduces an additional phase in the HRT-HOOD software development life cycle, hard real-time analysis, which provides a framework for defining the structure and behavior of hard real-time systems using UML and the new extensions defined [35].

2.3 Extension of the Catalysis Method

In [31], a fault-tolerant software architecture for component-based systems based on the idealized fault-tolerant component (IFTC) [38][44] is proposed. The architecture can handle software faults, providing higher levels of dependability. Based on this work, [45] proposes a way of incorporating exception handling and error recovery into the Catalysis [17] process.

At the requirements level, exceptional behavior, which includes recovery scenarios and failure scenarios, is added to use-case specifications in a formal manner. The system is structured with IFTCs and the propagation of exceptions is clearly modeled. In the next phase, collaborations are derived from the use cases. Pre- and postconditions are mapped to actions, which include refinements of the defined exceptions. A template is used to describe the collaboration, and class hierarchies of normal and exceptional behavior are produced. Following the design, ways to move on to implementation are suggested.
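To make the separation of normal and exceptional behavior more concrete, the following is a minimal Python sketch of a component structured in the spirit of the idealized fault-tolerant component. It is an illustration only and is not taken from [31] or [45]: the class names `IdealizedComponent`, `InterfaceException` and `FailureException`, and the idea of passing a primary and a fallback function, are assumptions made for this example.

```python
class InterfaceException(Exception):
    """Signalled to the caller when a request violates the interface."""


class FailureException(Exception):
    """Signalled to the caller when the service cannot be delivered."""


class _InternalError(Exception):
    """Local exception: an error detected during normal activity."""


class IdealizedComponent:
    """Hypothetical component with separate normal and abnormal activity."""

    def __init__(self, primary, fallback):
        self._primary = primary      # function implementing the normal service
        self._fallback = fallback    # function used by the recovery handler

    def request(self, value):
        # Illegal requests are rejected before any activity starts.
        if not isinstance(value, (int, float)):
            raise InterfaceException(f"numeric argument expected, got {value!r}")
        try:
            return self._normal_activity(value)
        except _InternalError:
            # Control moves to the abnormal (exception handling) part.
            return self._abnormal_activity(value)

    def _normal_activity(self, value):
        try:
            return self._primary(value)
        except Exception as exc:
            raise _InternalError(str(exc)) from exc

    def _abnormal_activity(self, value):
        try:
            # Recovery succeeded: return to normal service delivery.
            return self._fallback(value)
        except Exception as exc:
            # Recovery failed: propagate a failure exception to the caller.
            raise FailureException("service could not be delivered") from exc
```

A caller only ever sees a normal reply, an interface exception, or a failure exception; the internal error and its handler stay hidden inside the component, which is the structuring idea that the exception propagation model relies on.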
2.4 The KAOS Approach

The KAOS framework [22] provides a goal-oriented approach for requirements modeling, specification, and analysis, which addresses both functional and non-functional requirements. Three types of non-functional goals are considered: quality-of-service, development, and architectural constraints. These goals address the need for safety, security, usability, performance, interoperability, accuracy, maintainability, reusability, and issues of distribution, and physical and logical organization. Exceptional behavior, defined as obstacles, is also addressed during requirements engineering [48]. Goals and obstacles are expressed in a formal language.

Based on this framework, van Lamsweerde has proposed a method in [49] for deriving the software architecture from the requirements. To begin with, the software specification is developed from the requirements; it is then used to build the architectural design. The design evolves with recursive refinements, which consider constraints and non-functional goals. The refinement is pattern-based; Figure 1, for example, shows how to introduce reliable communication by means of replication. The KAOS approach is supported by the GRAIL tool [22].

Fig. 1. Architectural refinement pattern for a QoS goal [49]

2.5 The B Method

The B formal method [1] covers the development process from the specification to the implementation phase and is based on a mathematical model of set theory and first-order logic. The B method mainly comprises two activities: writing formal texts and proving the texts. The process evolves from specification to coding using a series of refinements. The methodology has been used for developing error-free software for critical systems by focusing on traceability of safety-related constraints [10]. The method is supported by the tools B-Toolkit and Atelier-B.
Modeling tools (B4free), editors, and parsers are also available for B.

3 Fault Tolerance Frameworks and Middleware

This section reviews software architectures, middleware and frameworks. Software architectures do not offer any methodological support, but instead provide a structure (usually hardware design) based on which applications can be built. Middleware is software that is used to integrate heterogeneous software applications or products efficiently and reliably in a distributed computing environment. It is the middle layer between the application program and the platform and provides abstractions necessary for interfacing. A framework is an environment composed of software components that can be tailored according to the needs of the application being developed. Finally, a middleware framework is a structure that offers users multiple middleware styles that can be customized for application as well as device constraints.

3.1 TARDIS

The Timely and Reliable Distributed Information Systems (TARDIS) project [8] was initiated in 1990, and is targeted towards avionics, process control, military, and safety-critical applications. The proposed framework addresses non-functional requirements (dependability, timeliness, and adaptability) and implementation constraints from the early stages of software development. In the architectural design phase, design choices are addressed, for example the choice between replication and dynamic reconfiguration for improving reliability. The framework is generic, and does not impose any software design methods or languages on the developer.

The initial proposal, however, was not completed. The project continued with a focus on the development of real-time systems. [28] discusses the architectural design of non-functional requirements related to real-time issues using the specification language Z and RTL (Real-Time Logic). Detailed design using TARDIS is considered in [7][28].
According to [28], the TARDIS framework can also be applied to the design of systems where non-functional requirements like reliability, security, safety, fault tolerance, and system reconfiguration need to be satisfied.

3.2 TIRAN

TaIlorable fault toleRANce frameworks for embedded applications (TIRAN) [20] is a European Strategic Program for Research in Information Technology (ESPRIT) project completed in October 2000. The primary goal of the project was to develop a software framework to provide fault tolerance capabilities to embedded automation systems. To reduce development costs, the framework aims to solve problems in fault-affected applications by considering error detection, isolation and recovery, reconfiguration and graceful degradation. It considers physical and design faults in the permanent, temporary omission and Byzantine failure domains.

The framework provides a library of basic tools implementing fault tolerance mechanisms like watchdog, distributed memory, local voter, output delay, stable memory, distributed synchronization and time-out management. A control backbone, which functions as a middleware, extracts information about the application’s topology, its progress and its status. It maintains this information in a replicated database and coordinates fault tolerance actions at runtime via user-defined recovery strategies. A domain-specific language named ARIEL was developed as part of the project to configure the basic tools and to specify the recovery strategies. More information about ARIEL can be found in [20].

TIRAN provides the users of the framework with a methodology for collecting, specifying, and validating fault tolerance requirements, with a characterization of framework elements, and guidelines for using the framework.
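As an illustration of one of the basic tools listed for TIRAN, the following Python sketch shows a watchdog that triggers a user-defined recovery action when a monitored task stops reporting progress. It is not the TIRAN library and does not reflect ARIEL syntax; the `Watchdog` class, its `kick` and `check` methods, and the injectable `clock` parameter are assumptions made for this example.

```python
import time


class Watchdog:
    """Hypothetical watchdog: fires a recovery action after a silent period."""

    def __init__(self, timeout, on_expiry, clock=time.monotonic):
        self._timeout = timeout        # maximum allowed silence, in seconds
        self._on_expiry = on_expiry    # user-defined recovery strategy
        self._clock = clock            # injectable time source (eases testing)
        self._last_kick = clock()
        self._fired = False

    def kick(self):
        """Called by the monitored task to signal that it is still alive."""
        self._last_kick = self._clock()
        self._fired = False

    def check(self):
        """Called periodically by the monitor; returns True if expired."""
        expired = self._clock() - self._last_kick > self._timeout
        if expired and not self._fired:
            # Run the recovery action once per detected timeout.
            self._fired = True
            self._on_expiry()
        return expired
```

Passing the recovery action as a callback mirrors, in a very reduced form, the separation between a detection mechanism and a separately specified recovery strategy that the framework description emphasizes.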
The specification of fault tolerance is based primarily on UML package diagrams and class diagrams, and TRIO (Tempo Reale ImplicitO) temporal logic, a language that has been developed by ENEL (Italy’s largest power distributor) specifically for real-time systems. The methodology has been tried out on a pilot application, a primary substation automation system, and is discussed in [18][25].

3.3 DepAuDE

Dependability for embedded Automation systems in Dynamic Environments with intra-site and inter-site distribution aspects (DepAuDE) [24] is an IST (Information Society Technologies) project partially based on TIRAN and completed in 2003. It has been developed primarily for two target application areas: monitoring/control of energy transport and distribution, and distributed embedded systems.

The DepAuDE framework provides “a methodology and an architecture to ensure dependability for non-safety critical, distributed, embedded automation systems with both IP (inter-site) and dedicated (intra-site) connections” [21]. The methodology support is similar to that outlined in TIRAN, but includes inter-site communication features for specification, validation, and modeling of requirements. It also adds support for quality-of-service (QoS) levels. Furthermore, the DepAuDE framework has been applied to the pilot applications to evaluate and show the feasibility of the framework.

3.4 EFTOS: FT Approach to Embedded Supercomputing

Embedded Fault-Tolerant Supercomputing (EFTOS) is an ESPRIT project completed in 1998, targeted towards industrial process-control, real-time applications, and embedded systems. It aims to provide a middleware framework to implement fault tolerance to make embedded supercomputing applications more dependable.
Similar to TIRAN, EFTOS also follows a layered approach comprising basic fault-tolerance tools and mechanisms, a backbone, and a high-level recovery language for specifying recovery strategies [23]. The FT tools provided include a watchdog timer, a trap handler for exception handling, an atomic action tool, assertions, and a distributed voting mechanism.

3.5 Middleware Architectures

DCE (Distributed Computing Environment), DCOM (Distributed Component Object Model), Java RMI (Remote Method Invocation), and CORBA (Common Object Request Broker Architecture) are general middleware that have limited fault tolerance support, like mechanisms for replication and time-outs [46]. The TAO (The ACE ORB) implementation of CORBA supports fixed-priority real-time scheduling. Electra, another CORBA implementation, provides fault tolerance with object replication. Real-time CORBA 1.0 supports QoS with standard policies and techniques [46]. CORBA also defines a transaction service (OTS).

Chameleon is an adaptive infrastructure, which supports multiple fault-tolerance strategies in a networked environment. Chameleon uses reliable agents that support user-specified levels of fault tolerance. It considers satisfying dependability in terms of availability. With some additional features, Chameleon can be used for real-time applications [19][4].

ROAFTS is a middleware architecture providing real-time object-oriented adaptive fault-tolerance support. ROAFTS offers fault-tolerance schemes that can be applied to both process-structured and object-structured distributed real-time (RT) applications. These schemes are used to tolerate processor faults, communication link faults, interconnection network faults, and application software faults. ROAFTS is meant for implementation on COTS (Commercial Off-The-Shelf) and guarantees RT fault tolerance when required [19][37].
FRIENDS (Flexible and Reusable Implementation Environment for your Next Dependable System) is a software architecture, which provides fault tolerance and limited security support. It is built on subsystems and libraries of meta-objects. There is a fault-tolerance subsystem that incorporates fault-tolerance mechanisms for error detection, failure detectors, replication, reconfiguration, and stable storage. It does not provide specific support for real-time and quality-of-service requirements [19][27].

AQuA (Adaptive Quality of Service for Availability) is an adaptive architecture for building dependable distributed systems. Fault tolerance is provided by Proteus, a dependability manager integrated into the architecture. Fault tolerance support is given to CORBA applications with replication of objects, and different levels of desired dependability and quality-of-service are provided. AQuA is capable of handling crash failures, value faults, and time faults. It incorporates means for detecting errors, treating faults, and reliable communication [19][14].

3.6 Software Architectures

Some architectures considering fault tolerance and other dependability attributes worth mentioning are discussed below. Because of space constraints, it was not possible to describe them in detail.

Delta-4 [5], an ESPRIT project, provides an open architecture for development of dependable distributed real-time systems. Delta-4 tolerates hardware failures with hardware and software redundancy, and also supports active and passive replication of software components residing in homogeneous computers. Voting mechanisms and systematic and periodic strategies for check-pointing are provided.

MAFTIA (Malicious and Accidental Fault Tolerance for Internet Applications) is a European Union project completed in 2003 and is said to be the first project to address the need to tolerate malicious and accidental faults in large-scale distributed systems [39].
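Active replication combined with voting, as mentioned for Delta-4 and for the replication-based architectures in Section 3.5, can be sketched in a few lines of Python. The sketch below is a generic majority voter and is not the mechanism of any surveyed system; the function names `majority_vote` and `replicated_call`, and the choice to treat a crashed replica as contributing no vote, are assumptions made for this example.

```python
from collections import Counter


class NoMajorityError(Exception):
    """Raised when no output is backed by a strict majority of replicas."""


def majority_vote(outputs, replica_count):
    """Return the output produced by more than half of all replicas.

    Outputs must be hashable so that identical results can be counted.
    """
    if outputs:
        value, count = Counter(outputs).most_common(1)[0]
        if 2 * count > replica_count:
            return value
    raise NoMajorityError("replica outputs do not agree on a majority value")


def replicated_call(replicas, argument):
    """Send the same request to every replica and vote on the replies."""
    outputs = []
    for replica in replicas:
        try:
            outputs.append(replica(argument))
        except Exception:
            # A crashed replica contributes no vote (crash failure).
            continue
    return majority_vote(outputs, len(replicas))
```

With three replicas, this arrangement masks one crash failure or one wrong value; two simultaneous faults leave no strict majority and are reported to the caller instead of being masked.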
GUARDS (Generic Upgradeable Architectures for Real-Time Dependable Systems) [41] is an ESPRIT project aiming to provide methods, techniques, and tools for design, implementation, and validation support in safety-critical real-time systems.

MARS (Maintainable Real-Time System) [43] is an architecture specialized for time-triggered applications, and addresses fault-tolerance with active replication means and other hardware FT measures to satisfy hard real-time requi