Antonia Bertolino (http://www.isti.cnr.it/People/A.Bertolino) is a
Research Director of the Italian National Research Council at ISTI in
Pisa, where she leads the Software Engineering Laboratory. She also
coordinates the Pisatel laboratory, sponsored by Ericsson Lab Italy.
Her research interests are in architecture-based, component-based
and service-oriented test methodologies, as well as methods for
analysis of non-functional properties.
She is an Associate Editor of the Journal of Systems and Software
and of the Empirical Software Engineering journal, and has previously
served the IEEE Transactions on Software Engineering. She is the
Program Chair for the joint ESEC/FSE Conference to be held in
Dubrovnik, Croatia, in September 2007, and is a regular member of
the Program Committees of international conferences, including ACM
ISSTA, Joint ESEC-FSE, ACM/IEEE ICSE, IFIP TestCom. She has
(co)authored over 80 papers in international journals and
conferences.
Future of Software Engineering(FOSE'07)
0-7695-2829-5/07 $20.00 © 2007
Software Testing Research: Achievements, Challenges, Dreams
Antonia Bertolino
Istituto di Scienza e Tecnologie dell’Informazione “A. Faedo”
Consiglio Nazionale delle Ricerche
56124 Pisa, Italy
antonia.bertolino@isti.cnr.it
Abstract
Software engineering comprises several disciplines devoted to preventing and remedying malfunctions and to warranting adequate behaviour. Testing, the subject of this paper, is a widespread validation approach in industry, but it is still largely ad hoc, expensive, and unpredictably effective.
Indeed, software testing is a broad term encompassing a va-
riety of activities along the development cycle and beyond,
aimed at different goals. Hence, software testing research
faces a collection of challenges. This paper proposes a consistent roadmap of the most relevant challenges to be addressed. Its starting point is constituted by some important past achievements, while its destination consists of four identified goals to which research ultimately tends, but which remain as unreachable as dreams. The routes from the achievements to the dreams are paved by the outstanding research challenges, which are discussed in the paper along with interesting ongoing work.
1. Introduction
Testing is an essential activity in software engineering.
In the simplest terms, it amounts to observing the execu-
tion of a software system to validate whether it behaves
as intended and identify potential malfunctions. Testing is
widely used in industry for quality assurance: indeed, by directly scrutinizing the software in execution, it provides realistic feedback on its behavior, and as such it remains an inescapable complement to other analysis techniques.
Beyond the apparent straightforwardness of checking a
sample of runs, however, testing embraces a variety of activ-
ities, techniques and actors, and poses many complex chal-
lenges. Indeed, with the complexity, pervasiveness and crit-
icality of software growing ceaselessly, ensuring that it be-
haves according to the desired levels of quality and depend-
ability becomes more crucial, and increasingly difficult and
expensive. Earlier studies estimated that testing can con-
sume fifty percent, or even more, of the development costs
[3], and a recent detailed survey in the United States [63]
quantifies the high economic impacts of an inadequate soft-
ware testing infrastructure.
Correspondingly, novel research challenges arise, such as how to reconcile model-based derivation of test cases with modern dynamically evolving systems, or how to effectively select and use runtime data collected from real usage after deployment. These newly emerging challenges augment longstanding open problems, such as how to qualify and evaluate the effectiveness of testing criteria, or how to minimize the amount of retesting needed after the software is modified.
Over the years, the topic has attracted increasing interest
from researchers, as evidenced by the many specialized events
and workshops, as well as by the growing percentage of
testing papers in software engineering conferences; for in-
stance at the 28th International Conference on Software En-
gineering (ICSE 2006) four out of the twelve sessions in the
research track focused on “Test and Analysis”.
This paper organizes the many outstanding research
challenges for software testing into a consistent roadmap.
The identified destinations are a set of four ultimate and un-
achievable goals called “dreams”. Aspiring to those dreams,
researchers are addressing several challenges, which are
here seen as interesting viable facets of the bigger unsolv-
able problem. The resulting picture is proposed to the software testing research community as a work-in-progress fabric to be adapted and expanded.
In Section 2 we discuss the multifaceted nature of software
testing and identify a set of six questions underlying any test
approach. In Section 3 we then introduce the structure of
the proposed roadmap. We summarize some more mature
research areas, which constitute the starting point for our
journey in the roadmap, in Section 4. Then in Section 5,
which is the main part of the paper, we overview several
outstanding research challenges and the dreams to which
they tend. Brief concluding remarks in Section 6 close the
paper.
2. The many faces of software testing
Software testing is a broad term encompassing a wide
spectrum of different activities, from the testing of a small
piece of code by the developer (unit testing), to the cus-
tomer validation of a large information system (acceptance
testing), to the monitoring at run-time of a network-centric
service-oriented application. In the various stages, the test
cases could be devised aiming at different objectives, such
as exposing deviations from user’s requirements, or assess-
ing the conformance to a standard specification, or evaluat-
ing robustness to stressful load conditions or to malicious
inputs, or measuring given attributes, such as performance
or usability, or estimating the operational reliability, and so
on. Besides, the testing activity could be carried out according to a controlled formal procedure, requiring rigorous planning and documentation, or rather informally and ad hoc (exploratory testing).
As a consequence of this variety of aims and scope, a
multiplicity of meanings for the term “software testing”
arises, which has generated many peculiar research chal-
lenges. To organize the latter into a unifying view, in the
rest of this section we attempt a classification of problems
common to the many meanings of software testing. The first concept to capture is the common denominator, if one exists, among all the different testing "faces". We propose that such a common denominator can
be the very abstract view that, given a piece of software
(whichever its typology, size and domain) testing always
consists of observing a sample of executions, and giving a
verdict over them.
Starting from this very general view, we can then con-
cretize different instances, by distinguishing the specific as-
pects that can characterize the sample of observations:
WHY: why do we make the observations? This question concerns the test objective: e.g., are we looking for faults? Do we need to decide whether the product can be released? Or do we rather need to evaluate the usability of the user interface?
HOW: which sample do we observe, and how do we choose it? This is the problem of test selection, which can be done ad hoc, at random, or in a systematic way by applying some algorithmic or statistical technique. It has inspired much research, which is understandable not only because it is intellectually attractive, but also because how the test cases are selected (the test criterion) greatly influences test efficacy.
HOW MUCH: how big a sample? Dual to the question of how we pick the sample observations (test selection) is that of how many of them we take (test adequacy, or the stopping rule). Coverage analysis and reliability measures constitute two "classical" approaches to answering this question.
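The interplay between HOW and HOW MUCH can be sketched in a hypothetical example (all names below are invented): test inputs are selected at random, while a coverage-based adequacy criterion, exercising every input partition, acts as the stopping rule:

```python
import random

def classify(x: int) -> str:
    """Toy function under test with three input partitions."""
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    else:
        return "positive"

PARTITIONS = {"negative", "zero", "positive"}

def random_testing_until_adequate(max_tests: int = 10_000, seed: int = 1):
    """Random test selection with a coverage-based stopping rule:
    stop once every partition has been exercised (or the budget runs out)."""
    rng = random.Random(seed)
    covered = set()
    executed = 0
    while covered != PARTITIONS and executed < max_tests:
        covered.add(classify(rng.randint(-5, 5)))
        executed += 1
    return covered, executed

covered, executed = random_testing_until_adequate()
```

The same skeleton accommodates other adequacy criteria by swapping the coverage set, and other selection strategies by swapping the input generator.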
WHAT: what is it that we execute? Given the (possibly composite) system under test, we can observe its execution either taking it as a whole, or focusing only on a part of it, which can be larger or smaller (unit test, component/subsystem test, integration test) and more or less precisely defined: this aspect gives rise to the various levels of testing, and to the scaffolding necessary to permit test execution of a part of a larger system.
WHERE: where do we perform the observation? Closely related to what we execute is the question of whether this is done in house, in a simulated environment, or in the final target context. This question assumes the highest relevance when it comes to the testing of embedded systems.
WHEN: at what point in the product lifecycle do we perform the observations? The conventional argument is that the earlier, the better, since the cost of fault removal increases as the lifecycle proceeds. But some observations, in particular those that depend on the surrounding context, cannot always be anticipated in the laboratory, and no meaningful observation can be carried out until the system is deployed and in operation.
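A minimal sketch of such post-deployment observation, under the simplifying assumption of invariant monitoring (the decorator, function, and invariant here are all invented for illustration): every in-the-field execution becomes an observation, and violations are recorded for later analysis rather than raised:

```python
violations = []

def monitor(invariant):
    """Decorator: check an invariant on every execution of the deployed
    function, recording (not raising on) violations for later analysis."""
    def decorate(func):
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            if not invariant(result):
                violations.append((func.__name__, args, result))
            return result
        return wrapper
    return decorate

@monitor(lambda ms: ms >= 0)
def response_time_ms(load: int) -> int:
    """Toy deployed function; a real deployment would measure a live service."""
    return 100 - load

response_time_ms(30)   # within the invariant
response_time_ms(150)  # violation recorded for later analysis
```

The key point is that the observation happens in the target context, on real inputs that laboratory testing could not have anticipated.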
These questions provide a very simple and intuitive characterization schema of software testing activities, which can help in organizing the roadmap for future research challenges.
3. Software testing research roadmap
A roadmap provides directions to reach a desired desti-
nation starting from the “you are here” red dot. The soft-
ware testing research roadmap is organised as follows:
• the “you are here” red dot consists of the most notable
achievements from past research (but note that some of
these efforts are still ongoing);
• the desired destination is depicted in the form of a set of four dreams: we use this term to signify that these are asymptotic goals at the end of the four identified routes for research progress. They are unreachable by definition, and their value lies precisely in acting as poles of attraction for useful, farsighted research;
• in the middle are the challenges faced by current and
future testing research, at more or less mature stage,
and with more or less chances for success. These chal-
lenges constitute the directions to be followed in the
journey towards the dreams, and as such they are the
central, most important part of the roadmap.
The roadmap is illustrated in Figure 1. In it, we have
situated the emerging and ongoing research directions in the
center, with more mature topics (the achievements) on their left, and the ultimate goals (the dreams) on their right.

Figure 1. Roadmap

Four
horizontal strips depict the identified research routes toward
the dreams, namely:
1. Universal test theory;
2. Test-based modeling;
3. 100% automatic testing;
4. Efficacy-maximized test engineering.
The routes are ordered bottom-up roughly according to progressive utility: the theory is at the basis of the adopted models, which in turn are needed for automation, which is instrumental to cost-effective test engineering.
The challenges horizontally span over six vertical strips
corresponding to the WHY, HOW, HOW MUCH, WHAT,
WHERE, and WHEN questions characterizing software
testing faces (in no specific order).
Software testing research challenges find their place in this plan: vertically, depending on the long-term dream, or dreams, towards which they mainly tend; and horizontally, according to which question, or questions, of the introduced software testing characterization they mainly center on.
In the remainder of this paper, we will discuss the ele-
ments (achievements, challenges, dreams) of this roadmap.
We will often compare this roadmap with its 2000 predecessor by Harrold [43], to which we will henceforth refer as FOSE2000.
4. You are here: Achievements
Before outlining the future routes of software testing research, we attempt here a snapshot of some topics which constitute the body of knowledge in software testing (for a readily available, more detailed guide see also [8]), or in which important research achievements have been established. In the
roadmap of Figure 1, these are represented on the left side.
The origins of the literature on software testing date back to the early 70's (although one can imagine that the very notion of testing was born simultaneously with the first experiences of programming): Hetzel [44] dates the first conference devoted to program testing to 1972. Testing was conceived as an art, and was exemplified as the "destructive" process of executing a program with the intent of finding errors, as opposed to design, which constituted the "constructive" party. Dijkstra's most-cited aphorism about software testing, that it can only show the presence of faults but never their absence [25], dates from these years.
The 80’s saw the assumption of testing to the status of an
engineered discipline, and a view change of its goal from
just error discovery to a more comprehensive and positive
view of prevention. Testing is now characterized as a broad
and continuous activity throughout the development process
([44], pg.6), whose aim is the measurement and evaluation
of software attributes and capabilities, and Beizer states:
More than the act of testing, the act of designing tests is
one of the best bug preventers known ([3], pg. 3).
Testing process. Indeed, much research in the early
years has matured into techniques and tools which help
make such “test-design thinking” more systematic and in-
corporate it within the development process. Several test
process models have been proposed for industrial adoption,
among which probably the “V model” is the most popular.
All of its many variants share the distinction of at least the
Unit, Integration and System levels for testing.
More recently, the V model's implication of a phased and formally documented test process has been criticized by some as inefficient and unnecessarily bureaucratic, and in contrast more agile processes have been advocated. Concerning testing in particular, a different model gaining attention is test-driven development (TDD) [46], one of the core extreme programming practices.
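The TDD cycle can be illustrated with a small invented example (not drawn from [46]): the test is written first, fails against the not-yet-existing code, and then the simplest implementation that makes it pass is added:

```python
# Step 1 (red): the test is written before the code it exercises exists.
def test_shopping_cart_total():
    cart = ShoppingCart()
    cart.add("apple", price=3)
    cart.add("pear", price=2)
    assert cart.total() == 5

# Step 2 (green): the simplest implementation that makes the test pass.
class ShoppingCart:
    def __init__(self):
        self._items = []

    def add(self, name, price):
        self._items.append((name, price))

    def total(self):
        return sum(price for _, price in self._items)

# Step 3: the cycle closes when the test passes; refactoring would follow.
test_shopping_cart_total()
```

In TDD the test thus doubles as an executable specification, written before any design decision is committed to code.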
The establishment of a suitable process for testing was listed in FOSE2000 among the fundamental research topics, and indeed it remains an active research area today.
Test criteria. Extremely rich is the set of test criteria de-
vised by past research to help the systematic identification
of test cases. Traditionally these have been distinguished
between white-box (a.k.a. structural) and black-box (a.k.a.
functional), depending on whether or not the source code is
exploited in driving the testing. A more refined classifica-
tion can be laid according to the source from which the test
cases are derived [8], and many textbooks and survey arti-
cles (e.g., [89]) exist that provide comprehensive descrip-
tions of existing criteria. Indeed, so many criteria now exist to choose among that the real challenge becomes the capability to make a justified choice, or rather to understand how they can be most efficiently combined. In recent years the greatest attention has turned to model-based testing; see Section 5.2.
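To make the white-box/black-box distinction concrete, consider this invented sketch: a black-box suite is derived purely from the leap-year specification, while hand-rolled instrumentation (standing in for a real white-box coverage tool) measures how much of the code's branching behavior that suite happens to exercise:

```python
covered_branches = set()

def leap_year(year: int) -> bool:
    """Instrumented implementation: each decision outcome records a branch
    id (a hand-rolled stand-in for a white-box coverage tool)."""
    if year % 400 == 0:
        covered_branches.add("b1_true")
        return True
    covered_branches.add("b1_false")
    if year % 100 == 0:
        covered_branches.add("b2_true")
        return False
    covered_branches.add("b2_false")
    if year % 4 == 0:
        covered_branches.add("b3_true")
        return True
    covered_branches.add("b3_false")
    return False

# Black-box (functional) suite: derived from the specification alone.
spec_cases = {2000: True, 1900: False, 1996: True, 1999: False}
for year, expected in spec_cases.items():
    assert leap_year(year) == expected

ALL_BRANCHES = {"b1_true", "b1_false", "b2_true", "b2_false",
                "b3_true", "b3_false"}
branch_coverage = len(covered_branches & ALL_BRANCHES) / len(ALL_BRANCHES)
```

Here the functional suite happens to achieve full branch coverage; in general the two criteria diverge, which is precisely why combining them is of interest.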
Comparison among test criteria. In parallel with the
investigation of criteria for test selection and for test ade-
quacy, much research has addressed the evaluation of the
relative effectiveness of the various test criteria, and espe-
cially of the factors which make one technique better than
another at fault finding. Past studies have included several
analytical comparisons between different techniques (e.g.,
[31, 88]). These studies have made it possible to establish a subsumption hierarchy of relative thoroughness between comparable criteria, and to understand the factors influencing the probability of finding faults, focusing in particular on comparing partition (i.e., systematic) against ran-
dom testing. “Demonstrating effectiveness of testing tech-
niques” was in fact identified as a fundamental research
challenge in FOSE2000, and still today this objective calls
for further research, whereby the emphasis is now on empirical assessment.
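The classical partition-versus-random comparison can be illustrated by a small simulation (an invented sketch, not one of the cited studies); here the failure-causing inputs are assumed to fill one whole subdomain, the situation known to favor partition testing:

```python
import random

DOMAIN_SIZE = 1000
FAULTY = set(range(900, 1000))  # assumed: failures fill one whole subdomain

def detects(inputs) -> bool:
    """Does a test suite (a list of inputs) expose at least one failure?"""
    return any(x in FAULTY for x in inputs)

def random_suite(rng, n=10):
    """Random testing: n inputs drawn uniformly from the whole domain."""
    return [rng.randrange(DOMAIN_SIZE) for _ in range(n)]

def partition_suite(rng, partitions=10):
    """Partition testing: one input per equal-width subdomain."""
    width = DOMAIN_SIZE // partitions
    return [rng.randrange(i * width, (i + 1) * width) for i in range(partitions)]

def detection_rate(make_suite, trials=2000, seed=7) -> float:
    """Estimate the probability that a suite exposes the fault."""
    rng = random.Random(seed)
    return sum(detects(make_suite(rng)) for _ in range(trials)) / trials

p_random = detection_rate(random_suite)        # roughly 1 - 0.9**10
p_partition = detection_rate(partition_suite)  # one test always hits
```

Under this assumption partition testing detects the fault with certainty, while random testing of the same size does not; with failures scattered uniformly instead, the two strategies perform comparably, which is exactly the kind of factor the cited analytical studies characterize.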
Object-oriented testing. Indeed, at any given period, the dominating development paradigm has catalyzed testing research into adequate approaches, as we further develop in Section 5.5. In the 90's the focus was on testing of object-oriented (OO) software. Having rejected the myth that the enhanced modularity and reuse brought forward by OO programming could even prevent the need for testing, researchers soon realized not only that everything already learnt about software testing in general also applied to OO code, but also that OO development introduced new risks and difficulties, hence increasing the need for, and the complexity of, testing [14]. In particular, among the core mechanisms
of OO development, encapsulation can help hide bugs and makes testing harder; inheritance requires extensive retesting
of inherited code; and polymorphism and dynamic bind-
ing call for new coverage models. Besides, appropriate
strategies for effective incremental integration testing are
required to handle the complex spectrum of possible static
and dynamic dependencies between classes.
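A small invented example of why dynamic binding calls for new coverage models: a single statically covered call site can hide several dynamic bindings, each of which a polymorphic coverage criterion would require a test to exercise:

```python
exercised_bindings = set()

class Shape:
    def area(self):
        raise NotImplementedError

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        exercised_bindings.add("Square.area")  # record the dynamic binding
        return self.side ** 2

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        exercised_bindings.add("Circle.area")  # record the dynamic binding
        return 3.14159 * self.radius ** 2

def total_area(shapes):
    # One static call site (sh.area()), several possible dynamic bindings.
    return sum(sh.area() for sh in shapes)

total_area([Square(2)])               # covers the statement, one binding only
total_area([Square(2), Circle(1.0)])  # exercises every binding at this site
```

Statement coverage is satisfied by the first call alone; a binding-aware criterion is needed to demand the second.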
Component-based testing. In the late 90’s, component-
based (CB) development emerged as the ultimate approach
that would yield rapid software development with fewer
resources. Testing within this paradigm introduced new
challenges, which we would distinguish between technical
and theoretical in kind. On the technical side, components must be generic enough to be deployed in different platforms and contexts; therefore the component user needs to retest the component in the assembled system where it is deployed. But the crucial problem here is to cope with the lack
of information for analysis and testing of externally devel-
oped components. In fact, while component interfaces are
described according to specific component models, these
do not provide enough information for functional testing.
Therefore research has advocated that appropriate information, or even the test cases themselves (as in Built-In Testing), be packaged along with the component to facilitate testing by the component user, and also that the "contract" that the components abide by should be made explicit, to allow for verification.
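A minimal sketch of the Built-In Testing idea, with an invented component: the test cases travel with the component itself, so its user can re-run them in the deployment context to check the contract in situ:

```python
class Stack:
    """Invented component shipped with a built-in test, so the component
    user can re-validate its contract in the deployment context."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty Stack")
        return self._items.pop()

    def size(self):
        return len(self._items)

    def built_in_test(self) -> bool:
        """Packaged test cases: run by the user after deployment."""
        probe = Stack()
        probe.push(1)
        probe.push(2)
        ok = probe.pop() == 2 and probe.pop() == 1 and probe.size() == 0
        try:
            probe.pop()
            return False  # the contract requires an error on empty pop
        except IndexError:
            return ok

deployed_ok = Stack().built_in_test()
```

The component user need not know the implementation to revalidate it: the packaged tests encode the contract.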
The testing of component-based systems was also listed
as a fundamental challenge in FOSE2000. For a more recent
survey see [70].
What remains an open evergreen problem is the theoret-
ical side of CB testing: how can we infer interesting prop-
erties of an assembled system, starting from the results of
testing the components in isolation? The theoretical founda-
tions of compositional testing still remain a major research
challenge destined to last, and we discuss some directions
for research in Section 5.1.
Protocol testing. Protocols are the rules that govern the
communication between the components of a distributed
system, and these need to be precisely specified in order to
facilitate interoperability. Protocol testing is aimed at veri-
fying the conformance of protocol implementations against
their specifications. The latter are released by standards organizations or by consortia of companies.
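The conformance-testing idea can be sketched as follows (an invented toy protocol, not a standardized one): the specification is a finite-state machine, and test sequences check each observed transition of the implementation against it:

```python
# Invented toy connection protocol, specified as a finite-state machine:
# (state, input) -> next state.
SPEC = {
    ("closed", "connect"): "open",
    ("open", "send"): "open",
    ("open", "close"): "closed",
}

class ProtocolImpl:
    """Implementation under test (deliberately conforming here)."""

    def __init__(self):
        self.state = "closed"

    def step(self, event):
        self.state = SPEC[(self.state, event)]
        return self.state

def conformance_test(impl_factory, sequences) -> bool:
    """Replay input sequences, checking each observed transition of the
    implementation against the specification."""
    for seq in sequences:
        impl, spec_state = impl_factory(), "closed"
        for event in seq:
            spec_state = SPEC[(spec_state, event)]
            if impl.step(event) != spec_state:
                return False
    return True

conforms = conformance_test(ProtocolImpl,
                            [["connect", "send", "close"],
                             ["connect", "close"]])
```

Real protocol test suites are derived systematically from such state-machine specifications so that every transition, not just a sampled few, is exercised.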