Formalising the L4 microkernel API

Formalising the L4 microkernel API Formalising the L4 microkernel API Rafal Kolanski Gerwin Klein National ICT Australia Ltd. (NICTA) Locked Bag 6016 The University of New South Wales Sydney NSW 1466 Australia Email: {rafal.kolanski|gerwin.klein}@nicta.com.au Abstract This paper gives an...

Formalising the L4 microkernel API Rafal Kolanski Gerwin Klein National ICT Australia Ltd. (NICTA) Locked Bag 6016 The University of New South Wales Sydney NSW 1466 Australia Email: {rafal.kolanski|gerwin.klein}@nicta.com.au Abstract This paper gives an overview of a pilot project on the specification and verification of the L4 high- performance microkernel. Of the three aspects ex- amined in the project, we describe one in more de- tail: the formalisation of the kernel’s Application Pro- gramming Interface using the B Method. We con- clude that machine-supported formal verification of software is at a turning point; that it is now feasible, and desirable, to formally verify production-quality operating systems. Keywords: B Method, Operating System Specifica- tion, Software Verification 1 Introduction The operating system (OS) kernel is defined to be the part of the OS that runs in the privileged mode of the hardware and thus is able to bypass hardware protec- tion mechanisms. A microkernel is a kernel designed to be minimal in code size and concepts. L4 is a second generation microkernel [14]. It pro- vides the traditional advantages of the microkernel approach to system structure, namely improved re- liability and flexibility, while overcoming the perfor- mance limitations of the previous generation of mi- crokernels. With implementation sizes in the order of 10,000 lines of C++ and assembler code it is an order of magnitude smaller than Mach and two orders of magnitude smaller than Linux. The correctness and reliability of any nontrivial system clearly critically depends on the operating sys- tem and its kernel. In terms of security, the OS is part of the trusted computing base, that is, the hardware and software necessary for the enforcement of a sys- tem’s security policy. It has been repeatedly demon- strated that current operating systems fail at correct- ness, reliability, and security. Microkernels address the problem by applying the principles of minimal- ity and least privilege to OS architecture. To gain confidence in the overall system, it is therefore highly desirable to formally verify the correctness of this de- sign and its implementation. Copyright c©2006, Australian Computer Society, Inc. This pa- per appeared at Computing: The Australasian Theory Sympo- sium (CATS2006), Hobart, Australia. Conferences in Research and Practice in Information Technology (CRPIT), Vol. 51. Barry Jay and Joachim Gudmundsson, Eds. Reproduction for academic, not-for profit purposes permitted provided this text is included. National ICT Australia is funded by the Australian Govern- ment’s Department of Communications, Information Technol- ogy, and the Arts and the Australian Research Council through Backing Australia’s Ability and the ICT Research Centre of Excellence programs. Figure 1: Overview The L4 kernel is of a size which makes formali- sation and verification feasible. Compared to other OS kernels, L4 is very small; compared to the size of other verification efforts, 10,000 lines of code is still considered a very large and complex system. Our methodology for solving this verification problem is shown in figure 1. It is a classic refinement strategy. We start out from an abstract model of the kernel that is phrased in terms of user concepts as they are explained in the L4 reference manual [13]. This is the level at which most of the safety and security the- orems will be shown. We then formally refine this abstract model in multiple property preserving steps towards the implementation of L4. The last step con- sists of verifying that the C++ and assembler source code of the kernel correctly implements the most con- crete refinement level. At the end of this process, we will have shown that the kernel source code satis- fies the safety and security properties we have proved about the abstract model. We conducted a pilot project to judge the feasibil- ity of this verification task. The project investigated three main aspects: a formalisation of the kernel’s Application Programming Interface (API) using the B Method (the first horizontal formalisation layer in figure 1), a full refinement proof for a non-trivial sub- system of the kernel using Isabelle/HOL (the vertical slice in figure 1), and a literature survey on formalis- ing safety and security properties on the design level (the right-hand side of figure 1). In this paper we give an overview of the first of these aspects: the API formalisation using the B Method, depicted as abstract model in figure 1. The L4 API provides three basic abstractions: threads, synchronous inter process communication (IPC), and virtual memory management (VMM). Our formal- isation covers threads and IPC in detail and con- tains the basic structure for VMM. The latter has been formalised in depth in the vertical slice part of the project and is already described in earlier pub- lications [20, 10]. Our formalisation is based on re- lease version 0.3 of the L4Ka::Pistachio implementa- tion [12]. We chose the B Method [1], because there existed a significant amount of experience with this approach among our student population and we wanted to com- pare at least two different formalisms before embark- ing on the full verification task. The B Method is a formal development methodology based on set the- ory with first-order logic. It allows progress from an initial high-level specification all the way to imple- mentation via formal refinement. In this part of the project we have not done any formal refinement, but used the B Method and tool for formalisation only. The B Toolkit [2] allows for animation of the top-level specification which makes validating the specification more convenient. In this mode, the user becomes the implementation of all non-deterministic or undefined aspects. After reviewing related work in section 2 and in- troducing B concepts and notation in section 3, we describe our formalisation of the L4 API in section 4. Section 5 gives pointers to further work and con- cludes. 2 Related Work Earlier work on operating system kernel formalisation and verification includes PSOS [15] and UCLA Secure Unix [22]. The focus of this work was on capability- based security kernels, allowing security policies such as multi-level security to be enforced. These efforts were hampered by the lack of mechanisation and ap- propriate tools available at the time and so while the designs were formalised, the full verification proofs were not practical. Later work, such as KIT [3], describes verification of properties such as process isolation to source or object level but with kernels providing far simpler and less general abstractions than modern microkernels. There exists some work in the literature on the modelling of microkernels at the abstract level with varying degrees of complete- ness. Bevier and Smith [4] specify legal Mach states and describe Mach system calls using temporal logic. Shapiro and Weber [18] give an operational seman- tics for EROS and prove a confinement security pol- icy. A number of case studies [6, 5, 21] in the lit- erature describe the IPC and scheduling subsystems of microkernels in PROMELA and verify the formal descriptions with the SPIN model checker. These ab- stractions were not necessarily sound, having been manually constructed from the implementations, and so while useful for discovering concurrency bugs do not provide guarantees of correctness. The VFiasco project, working with the Fiasco im- plementation of L4, has published exploratory work on the issues involved in C++ verification at the source level [9]. The VeriSoft project [8] is attempting to verify a whole system stack, including hardware, compiler, applications, and a simplified microkernel called VAMOS that is inspired by, but not very close to L4. While the simplifications are appropriate for the goals of VeriSoft, it is doubtful that the VAMOS kernel will show the necessary performance to be rel- evant for industrial use. Spivey uses Z, a predecessor formalism of B, to specify a simple kernel for a safety-critical X-ray diag- nostic machine [19]. In abstracting the kernel from its implementation and documenting it for future reim- plementations (possibly on different architectures), he finds a flaw in the system that could potentially have caused the X-ray machine to inflict damage. The more academic approach of using formal de- sign and specfication in the kernel development pro- cess up front and then proceeding with the imple- mentation is utilised to good effect by Fowler and Wellings [7] for an Ada95 runtime support system in a hard real-time environment. From the verification perspective, this approach is more efficient than the post-hoc formalisation that is commonly found and which we are presenting here. The drawback is that while this process ensures a correct kernel, it is hard to get the runtime performance that separates prac- tical microkernels from impractical ones. In fact, we propose to do post-hoc formalisation of the existing L4 microkernel whose architecture has proven to deliver the required performance, but for the verification task itself we reserve the freedom to change details in the code base that make the verifi- cation process easier. 3 Notation At the top specification level, the B Method uses ma- chines, which represent finite state automata. Refine- ments further refine these, and implementations are the most concrete in the chain. Since our formalisa- tion is entirely contained at the top level, we will only describe machine notation. A machine consists of the following sections: DEFINITIONS They are purely syntactic transla- tions. Any single-letter token counts as a so- called joker and can represent any set of tokens, similar to #define in C and C++. One defi- nition cannot use another one within the same machine. VARIABLES A comma separated list of variables. SETS Enumerations and abstract sets. CONSTANTS Declares constant sets, members of sets, or functions (which are also represented as sets). PROPERTIES Restrictions on sets and constants. INVARIANT The invariants of the machine, used to define variable types, properties, and relation- ships. INITIALISATION Initial values for variables. OPERATIONS The state transitions of the ma- chine. At the abstract machine level, only par- allel composition is allowed, i.e. all statements in the operation (including invoking other oper- ations) occur at the same time; the operation itself is instantaneous. An operation may only invoke operations in other machines, and only when permitted by inter-machine relationships. Operations can have preconditions. The relationship between machines is restricted. A machine may INCLUDE (read-access to everything plus invoking operations), or SEE (read-access only to sets and constants) other machines. Write-access is only permitted through operations. If machine X includes Y, it can select which of B’s operations are visible when another machine includes X with PRO- MOTES. Since there is only one name space in B, we use naming conventions to avoid collisions. We use pre- fixes for all enumerations and a ‘d’ prefix for most definitions, as well as long classifying names such as thread ipc waiting timeout. Robinson provides a good reference to B syntax [17]. We define a few core notational concepts here and explain other non-standard notation as it occurs. B supplies built-in sets such as the natural num- bers, NAT and NAT1 (N − {0}). The library also provides machines defining sets such as INTEGER and BOOL = {FALSE, TRUE}. Two frequently used operations are the relational image and domain restriction. The relational image of set S under relation r is defined as: r [ S ] = { y | ∃x · x ∈ S ∧ x 7→ y ∈ r } If r is a function, this means all y such that r(x) = y. The pair (x, y) is denoted x 7→ y. For a relation r and set S, the domain restriction operator (B) is defined as follows: r B S = {x 7→ y | x 7→ y ∈ r ∧ y ∈ S } The predefined functions dom and ran return the domain and range of relations and functions; card is the the cardinality of a set. 4 The Formalisation This section describes the formalisation of the L4 mi- crokernel API. As far as possible, we will introduce the kernel concepts together with their formal coun- terparts. The goals of the formalisation were the fol- lowing. Learning Animation of the model to improve and accelerate understanding of L4 internals for new users and developers. Documentation L4 is continually developed and improved for efficiency. Boiling down the sys- tem to its essentials in the form of an abstract model will help to document and clarify the ini- tial intentions and underlying basic mechanisms. Experimentation The current version of the L4 API lacks an efficient communication restriction for information flow and also is vulnerable to denial of service attacks on kernel memory re- sources. One of the follow-up projects to this formalisation is revising the L4 API to fix these shortcomings. It is one of the goals of the formal- isation presented here to serve as an experiment in kernel modelling to find out which methods work well. The API is the boundary between user space and kernel space. When building a model, the question is Whose viewpoint do we model? From the perspective of a thread running in the system, kernel operations are system calls. They re- turn values and they return them immediatly. From the kernel’s perspective, however, internal state changes are visible. For instance, the kernel might pick out the parameters from the thread’s reg- isters and memory, then pass them to an internal op- eration which implements the required functionality. The operation does not have to return immediately. The kernel can freeze the thread, change its state, put it on a waiting queue and so forth. The system call also does not simply return a value internally, but in- stead copies return values into the thread’s registers and memory. To document the kernel behaviour in detail, we chose the second viewpoint. It allows, for example, modelling of thread state transitions in a natural way. Taking the inside view does not mean that we are exposing implementation details of the API — the formalisation remains at the conceptual level of a ref- erence manual. The API for instance already implies that the kernel manages thread control states and the formalisation describes how these states are affected by operations. The formalisation does not describe which data structures are used to implement thread states in the kernel. Context MachinesWeakSysCall IpcBase IpcCore Thread AddressSpace API Figure 2: Inclusion diagram for the B development. Figure 2 shows the module structure of the for- malisation. Microkernels strive for the minimal set of functionality that is sufficient to build an OS. This means that the L4 kernel is effectively a single mod- ule in which everything is intertwined. We were able to separate out B modules for each of the major sub- systems (address spaces, threads, IPC), but as figure 2 shows, they still depend on each other. In the formalisation, we have placed all types and constants in separate context machines. There is one such context machine for each machine containing op- erations. In this presentation, we summarise these under the label ContextMachines and describe them in section 4.1. The rest of the presentation follows figure 2 bottom up. In section 4.2 we describe the address space stub (this subsystem has already been formalised separately), in section 4.3 the concept of threads and in section 4.4 the inter process commu- nication subsystem. We leave out the description of WeakSysCall which collects these together into the L4 API functions, but does not contain any precondition checking yet. Section 4.5 describes the final inter- face available to the user. Due to space constraints, our description does not cover all of the formalisation at the same level of detail. The full formalisation is available elsewhere [11]. 4.1 Context Machines This section defines the basic types and constants used in the rest of the formalisation. In addition to type information, all systems man- age a finite set of resources. By defining abstract sets of things (such as thread numbers) and restricting their cardinality, we implicitly define an upper limit on the number of such things in the system. In L4, a structure called the Kernel Information Page (KIP) contains all the constant values in the system (how many interrupts, first id of a user thread, etc.) The context machines serve a similar purpose. We start off with the three main limiting aspects of the kernel: the number of threads in the sys- tem (kMaxThreads), the number of address spaces in the system (kMaxAddressSpaces), and the number of threads in an address space (kMaxThreadsPerSpace). These three constants have the following properties: PROPERTIES kMaxThreads ∈ N1 ∧ 3 ≤ kMaxThreads ∧ kMaxThreadsPerSpace ∈ N1 ∧ kMaxAddressSpaces ∈ N1 ∧ 3 ≤ kMaxAddressSpaces Each thread must have an address space; an ad- dress space can only be created by also creating a thread [13, section 2.4]. There are three address spaces initially in the system: the sigma0 space, the root server space and the kernel space. The minimum number of address spaces is therefore 3, and the same goes for threads. Hence, the maxima must be at least 3, too. In order to talk about address spaces within the model, we define the abstract set of all possible ad- dress spaces and restrict them to the maximum num- ber of address spaces in the system: SETS ADDRESS SPACE PROPERTIES card ( ADDRESS SPACE ) = kMaxAddressSpaces ∧ kRootServerSpace ∈ ADDRESS SPACE ∧ kSigma0Space ∈ ADDRESS SPACE ∧ kKernelSpace ∈ ADDRESS SPACE ∧ kRootServerSpace 6= kSigma0Space ∧ kSigma0Space 6= kKernelSpace ∧ kRootServerSpace 6= kKernelSpace In the above, we define three new constants. Their function is to reserve three arbitrary members of AD- DRESS SPACE for the three core address spaces mentioned before: kSigma0Space, kRootServerSpace, and kKernelSpace. These address spaces have special status in L4, they are privileged: DEFINITIONS dIsPrivilegedSpace ( s ) b= s ∈ { kSigma0Space , kRootServerSpace , kKernelSpace } The following constants describe the control states that threads can experience in L4: tsAborted the thread exists, but has not been ini- tialised tsRunning the thread has been initialised and if scheduled, can run tsPolling thread is waiting on an IPC send to an- other thread tsWaitingTimeout thread is waiting for incoming IPCs from one or more threads, with a finite time-out tsWaitingForever as above, but the time-out is in- finite Figure 3 presents an overview of the possible tran- sitions between these states. We show a complete diagram below in figure 4, section 4.4.3. These states differ from the ones in the L4 imple- mentation in following ways: • Multiprocessing-related states are missing since our model is too abstract to demonstrate effects of multiple-CPU interaction; • The halted state is missing. According to discus- sions with the L4 developers, this state is better modelled by a flag. As defined in [13, section 2.3], halting a thread prevents it from execut- ing in user mode, while ongoing IPC is not af- fected. This means that it simply prevents the thread from being scheduled. Furthermore, the ExchangeRegisters system call needs to re- sume halted threads, creating the need for an- other (saved) thread state. This preserves func- tionality, but makes for a simpler model; Figure 3: A simplified diagram of possible thread state transitions. • The aborted state has a slightly different mean- ing than in the L4 implementation. In L4, all kernel thread control blocks are preallocated and their initial state is aborted. When a thread gets created inactive, the state remains aborted. The actual existence of a thread is

                    本文档为【Formalising the L4 microkernel API】，请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑，
                    图片更改请在作品中右键图片并更换，文字修改请直接点击文字进行修改，也可以新增和删除文档中的内容。 
 该文档来自用户分享，如有侵权行为请发邮件ishare@vip.sina.com联系网站客服，我们会及时删除。

                    [版权声明] 本站所有资料为用户分享产生，若发现您的权利被侵害，请联系客服邮件isharekefu@iask.cn，我们尽快处理。

                    本作品所展示的图片、画像、字体、音乐的版权可能需版权方额外授权，请谨慎使用。

                    网站提供的党政主题相关内容(国旗、国徽、党徽..)目的在于配合国家政策宣传，仅限个人学习分享使用，禁止用于任何广告和商用目的。
                

下载需要：免费已有0 人下载

立即下载

Formalising the L4 microkernel API

你可能还喜欢