首页 Data Stream Based Global Event Monitoring Using Pairwise Interactions

Data Stream Based Global Event Monitoring Using Pairwise Interactions

举报
开通vip

Data Stream Based Global Event Monitoring Using Pairwise Interactions J. Parallel Distrib. Comput. 68 Data-stream-based global event moni Punit Chandra, Ajay D Department of Computer Science, University of Illino fo ne 4 ute ed rac ape pro s p communication structures to determine which of the intervals being examined at ...

Data Stream Based Global Event Monitoring Using Pairwise Interactions
J. Parallel Distrib. Comput. 68 Data-stream-based global event moni Punit Chandra, Ajay D Department of Computer Science, University of Illino fo ne 4 ute ed rac ape pro s p communication structures to determine which of the intervals being examined at any time may never satisfy the stipulated interaction type for that pair of processes, and therefore that interval(s) need no longer be considered as forming a part of any solution. Based on this theory, the paper proposes two on-line distributed algorithms to solve the problem. c© 2008 Elsevier Inc. All rights reserved. Keywords: Causality; Data streams; Data fusion; Distributed system; Global state; Interactions; Intervals; Snapshot; State observation 1. Introduction A global state is represented by a collection of local states, one from each process. The problem of global state observation in a consistent manner is fundamental to distributed systems, as identified by Chandy’s and Lamport’s seminal paper on recording global states [14]. Furthermore, the problem of monitoring and analyzing data streams is becoming increasingly important, particularly for crisis management (see [13,38]). Often, data streams have to be analyzed on-line [1,2] under different constraints and modalities. For example, I Some portions of this paper have appeared in [P. Chandra, A.D. Kshemkalyani, Detection of orthogonal interval relations, in: Proceedings 9th International High Performance Computing Conference (HiPC), in: Lecture Notes in Computer Science, vol. 2552, Springer-Verlag, 2002, pp. 323–333; P. Chandra, A.D. Kshemkalyani, Analysis of interval-based global state much research focuses on data streams in sensor networks [19, 31,32], and in generic middleware roles [30,36]. In this paper, we consider applications that require the con- sistent observation and continuous processing of data in the data streams. The consistent observation of the global state [14,24] requires observing the causality constraints among the events in the execution [28]. This paper generalizes the problem of global state observation to executions where the “event” at a process is a high-level abstract event that can contain multiple events that span across a time interval. The interval has the property that a predicate defined on local variables is always true in the interval and is false immediately before and immediately af- ter the interval. The semantics of the interval depend on the predicate which is application-specific [17,18,20,22,25,27,29]; application areas such as sensor networks, distributed debug- ging, deadlock characterization [26], predicate detection [7,11, 12], checkpointing [3–5,33,34], and industrial process control Received 14 November 2006; received in revised Available onli Abstract The problem of global state observation is fundamental to distrib distributed systems can be analyzed in terms of the building block form causality-based pairwise interactions by which two processes may inte pair of processes (Pi , P j ), let interaction type ri, j be of interest. This p is specified in terms of such pairwise interaction types, one per pair of a global state in which the interaction type specified for each proces detection, in: Proceedings 2nd Int. Conference on Distributed Computing and Internet Technology (ICDCIT), in: Lecture Notes in Computer Science, vol. 3816, Springer, 2005, pp. 203–216; P. Chandra, A.D. Kshemkalyani, Global state detection based on peer-to-peer interactions, in: Proceedings IFIP Int. Conference on Embedded and Ubiquitous Computing (EUC), in: Lecture Notes in Computer Science, vol. 3824, Springer, 2005, pp. 560–571]. ∗ Corresponding author. E-mail address: ajayk@cs.uic.edu (A.D. Kshemkalyani). 0743-7315/$ - see front matter c© 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jpdc.2008.01.006 (2008) 729–751 www.elsevier.com/locate/jpdc toring using pairwise interactionsI . Kshemkalyani∗ is at Chicago, Chicago, IL 60607, United States rm 16 October 2007; accepted 23 January 2008 March 2008 d systems and to the analysis of data streams. Many interactions in by the pairwise interactions of intervals at two processes. Considering t with each other, there are 40 orthogonal interaction types. For each r examines the problem: “If a global state of interest to an application cesses, how can such a global state be detected?” A solution identifies air is satisfied. This paper formulates the specific conditions on the model such intervals. Such high-level abstract events at pro- cesses and the corresponding time intervals that they span have been explicitly studied [18,20,21,29]. Fig. 1(a) shows a consistent global state GS2 and an inconsistent global state GS1 in a distributed execution where events are modeled as atomic send, receive, or internal events, ra te. Relations [6,22]. Problem processes identify t that each Devisi DOOR i at differe relations states. In that can b distribute 1) which pair of ntified as ), gives a usality in identified R (also [7]. The DOOR: Given a relation ri, j from R for each pair of Pi and Pj , devise a distributed on-line algorithm to he intervals, if they exist, one from each process, such relation ri, j is satisfied by the (Pi , Pj ) process pair. ng an efficient on-line algorithm to solve problem s a challenge because of having to track the intervals nt processes and to determine the pairwise orthogonal in a search space of an exponential number of global this paper, we first identify the underlying principles 4. The process of devising the principle (Theorem determines whether at least one interval in any intervals being examined at any time can be ide never forming a part of any solution (Lemma 4 deeper insight into the nature of reasoning with ca a distributed execution. Schwarz and Mattern have this as an important problem [37]. A centralized algorithm to solve problem DOO referred to as problem Fine Rel,) was given in 730 P. Chandra, A.D. Kshemkalyani / J. Pa Fig. 1. (a) Event positions determine the consistency of a global sta as per the traditional model of a distributed execution [28]. Fig. 1(b) shows high-level abstract events that may contain multiple events and span across a time interval. The intervals are shown as a rectangle. An application-specific predicate defined at each process becomes true at the start of each interval shown and it becomes false at the end of each interval shown. Intervals X , Y , and Z occur at Pi , Pj , and Pk , respectively. Observe that if interval Y were to begin at event a marked by a vertical line, the inherent causality-based interaction type of Y with respect to Z as well as with respect to X would be different from what is shown. Thus, the relative placement of one interval with respect to another defines different interaction types between the interval pair. It has been observed that causality-based interactions in distributed executions can be analyzed in terms of the building block formed by point-to-point pairwise interactions of intervals at two processes [20]. A detailed analysis of the causality-based pairwise interactions by which two processes may interact with each other identified 29 (40) causality- based orthogonal interaction types, denoted as R, under the dense (and non-dense) time model, respectively [20]. The orthogonality of the interaction types in R implies that no interaction type can be expressed in terms of the other interaction types. Each interaction type in R between a pair of intervals is essentially a relationship between the two intervals. Hence, the interaction types in R are also interchangeably termed as relations. For each pair of processes (Pi , Pj ), let interaction type ri, j be of interest. This paper examines the state detection problem: “If a global state of interest to an application is specified in terms of such pairwise interaction types, one per pair of processes, how can such a global state be detected?” For any relationship r ∈ R, let ri, j (X i , Y j ) denote that r holds for interval X i at process Pi and interval Y j at process Pj . The above state detection problem is formally formulated as the following problem DOOR for the Detection of Orthogonal e used to solve problem DOOR. We then propose two d algorithms to solve the problem. llel Distrib. Comput. 68 (2008) 729–751 (b) Interval positions define the interaction types in a global state. Summary of results and contributions: 1. To devise any efficient solution to problem DOOR, this paper formulates specific conditions on the structure of the causal communication patterns to determine which of two intervals being examined from processes Pi and Pj may never satisfy ri, j , and therefore that interval will never form part of any solution and should no longer be considered. This result is embodied as: • a basic principle that we prove in Theorem 1 — the main result, and • Lemma 4 — a useful lemma derived from the above theorem, that we will use to efficiently manage the distributed data structures in solving problem DOOR. Any algorithms to solve DOOR can leverage this principle. 2. The paper proposes two distributed on-line algorithms to solve problem DOOR, based on the above formulated conditions to determine when an interval no longer needs to be considered as a candidate for a solution. Let n be the number of processes, ms be the total number of messages sent, mr be the total number of messages received, and p be the maximum number of intervals at any process. For unicast communication, ms = mr . The first distributed algorithm uses O(min(np, 2ms+2mr )) number of messages with a message size of O(n2). The second algorithm uses O(n · min(np, 2ms + 2mr )) number of messages with a message size of O(n). For both the algorithms, the total space complexity across all the processes is min(4n2 p − 2np, 6msn+4mrn), and the total time complexity across all processes is O(n · min(np, 2ms + 2mr )). The performance of the algorithms is compared in Table 1. 3. Global state observation and predicate detection are fundamental problems in distributed systems. This paper provides an understanding of interval-based global states in terms of the causal communication patterns induced by the message-passing interactions in an execution [25]. algorithm was presented without any formal discussion or analysis of the theoretical basis — embodied here in el ms n + m 4 )) 2 2m gives the theory used to determine which of two given intervals at each process are the durations during which an application- at different processes can never be part of a solution set. Section 4 shows how to track intervals. Sections 5 and 6 present the distributed algorithms to solve Problem DOOR based on the results derived in Section 3. Section 7 gives an extended specification of DOOR. Section 8 gives concluding remarks. 2. System model and background System model. We assume an asynchronous distributed system in which n processes communicate by reliable message passing over logical FIFO channels [20,29]. The execution is modeled as (E,≺), where ≺ is an irreflexive partial ordering specific local predicate is true. See Fig. 2. An interval begins when the predicate becomes true, and ends when/just before the predicate becomes false. Each interval defines an abstract event of coarser granularity at a process, as studied by Lamport [29], Helary et al. [18], and Kshemkalyani [27]. Such higher-level abstract events can be used to identify a global state [27]. For example, in the context of checkpointing [4,5,15,33,34], ψi can be used to describe the kth checkpoint interval at Pi . To characterize deadlock [26,25], ψi denotes the interval from the time of incoming wait-for dependency to the time of an outgoing wait-for dependency, or vice versa; such intervals at different processes capture how dependency chains grow. For P. Chandra, A.D. Kshemkalyani / J. Parall Table 1 Summary of space, time, and message complexities Metric Algorithm 1 Total space complexity O(min(2np(2n − 1), 6 Space per process (worst case) O(min(4np − 2p, 4ms Total time complexity O(n ·min(np, 2ms + 2 Time per process (worst case) O(n ·min(10p, 20ms + Total number of messages O(min(np, 2ms + 2mr Message size (average case) O(n2) Message size (worst case) O(n ·min(2(n − 1)p + Total message space O(n2 ·min(np, 2ms + Fig. 2. Intervals at processes. Messages and send and receive events are not shown to simplify the diagram. Theorems 1–3 and Lemma 4. That paper also showed how to perform predicate detection in the context of traditional modalities on predicates [12]. This paper gives the theory behind the results of [7] and then gives two distributed algorithms for DOOR. Distributed algorithms are more elegant than centralized algorithms and do not require a central server. Distributed algorithms result in a better workload and space complexity distribution as compared to a centralized one. The space complexity and the time complexity get distributed linearly with the number of processes. Thus, a distributed algorithm is more scalable than a centralized algorithm. Also, the network traffic gets distributed more uniformly in a distributed algorithm compared to the centralized approach where a traffic bottleneck gets created at the central process. Preliminary versions of this paper appeared in [6,8,9]. Section 2 gives the system model and background. Section 3 representing the causality or the “happens before” relation [28] on the event set E . Distrib. Comput. 68 (2008) 729–751 731 Algorithm 2 n + 4mrn)) O(min(2np(2n − 1), 6msn + 4mrn)) 2ms )) O(min(4np − 2p, 4msn + 2ms )) r )) O(n ·min(np, 2ms + 2mr )) mr )) O(n ·min(10p, 20ms + 4mr )) O(n ·min(np, 2ms + 2mr )) O(n) n, 2ms + 2n)) O(min(2(n − 1)p + 2n, 2ms + 2n)) r )) average O(n2 ·min(4np − 2p, 6ms + 4mr )) Definition 1 (Causality or the “happens before” Relation ≺). Event e happens before event e′, denoted e ≺ e′, if: 1. event e occurs before event e′ at the same process, or 2. event e is the send of a message m and event e′ is the receive of m, or 3. there is an event e′′ such that e happens before e′′ and e′′ happens before e′. E is partitioned into local executions at each process. Let Ei denote the linearly ordered set of events executed by process Pi . An event e executed by Pi is denoted ei and may be of three types — an internal event, a send event, or a receive event. N denotes the set of all processes. A cut C is a subset of E such that if ei ∈ C then (∀e′i ) e′i ≺ ei H⇒ e′i ∈ C . A consistent cut is a downward-closed subset of E , i.e., if e ∈ C then (∀e′) e′ ≺ e H⇒ e′ ∈ C . A consistent cut denotes an execution prefix. Two special consistent cuts ↓ e and e ↑ can be defined for any event e. Definition 2 (Past and Future Cuts). For any event e, cut ↓e is the set of events {e′|e′ ≺ e} that happen before e. Cut e↑ is the set of events {e′|e′ 6� e}⋃{ei , i = 1, . . . , n | ei � e∧ (∀e′i ≺ ei , e′i 6� e)} up to and including the earliest events at each process for which e happens before the events. The system state after the events in a cut is a global state [14]; if the cut is consistent, the corresponding system state is a consistent global state. We assume that vector clocks are available [16,35]. Intervals. As introduced in Section 1, the intervals of interest conjunctive predicate detection [7,12], ψi is the predicate on the local variables that the application wants to detect. ra ec , Y −1 IC (= IV ) 0 0 1 1 1 0 0 0 0 0 0 0 ID (= IX−1) 0 0 1 1 1 1 0 1 0 1 0 0 ID′(= IU−1) 0 0 1 1 0 1 0 1 0 1 0 1 IE (= IW−1) 0 0 1 1 1 1 0 0 0 1 0 0 IE′(= IT−1) 0 0 1 1 0 1 0 0 0 1 0 1 IF (= IS−1) 0 1 1 1 0 1 0 0 0 1 0 1 IG (= IG−1) 0 0 0 0 1 0 0 0 0 0 1 0 IH (= IK−1) 0 0 0 1 1 0 0 0 0 0 1 0 II (= IJ−1) 0 1 0 1 0 0 0 0 0 0 1 0 IL (= IO−1) 0 0 0 1 1 1 0 1 0 1 0 0 IL′ (= IP−1) 0 0 0 1 0 1 0 1 0 1 0 1 IM (= IM−1) 0 0 0 1 1 0 0 0 0 1 1 0 IN (= IM′−1) 0 0 0 1 1 1 0 0 0 1 0 0 IN′ (= IN′−1) 0 0 0 1 0 1 0 0 0 1 0 1 ID′′ (= (IUX)−1) 0 0 1 1 0 1 0 1 0 1 0 0 IE′′ (= (ITW)−1) 0 0 1 1 0 1 0 0 0 1 0 0 IL′′ (= (IOP)−1) 0 0 0 1 0 1 0 1 0 1 0 0 IM′′ (= (IMN)−1) 0 0 0 1 0 0 0 0 0 1 1 0 IN′′ (= (IMN′)−1) 0 0 0 1 0 1 0 0 0 1 0 0 IMN′′ (= (IMN′′)−1) 0 0 0 1 0 0 0 0 0 1 0 0 The upper part gives the 29 interaction types for dense time. The lower part gives 11 additional interaction types for non-dense time. In our discrete event system model, an interval X i at process Pi is identified by the (totally ordered) subset of adjacent events of Ei , beginning from the event that makes the predicate true up to the event that precedes the event that makes the predicate false. Intervals are denoted by capitals such as X , Y , and Z . The subscripts are omitted when not necessary or when the context is clear. Lower-case alphabet x denotes an individual event in an abstract event X . Definition 3 (Interval). For a predicate ψi defined on process Pi , an interval X i (ψi ) ⊆ Ei , satisfies the following. • ∀ei ∈ Ei , if min(X i ) ≺ ei ≺ max(X i ) then ei ∈ X i • ψi becomes true at min(X i ) and becomes false at next(max(X i )) • ∀xi such that min(X i ) ≺ xi ≺ max(X i ), xi does not falsify ψi . The execution history at Pi is 〈s0i , e1i , s1i , e2i , s2i . . . eki , k k k true. ψi is application-specific and so we henceforth refer to intervals without using this parameter. Orthogonal interaction types/relationships. There are 29 or 40 possible mutually orthogonal ways in which any two durations can be related to each other, depending on whether the dense or the non-dense time model is assumed [20]. Informally, with dense time, ∀z1, z2 in interval Z , z1 ≺ z2 H⇒ ∃z ∈ Z | z1 ≺ z ≺ z2. These orthogonal interaction types were identified by first using the six dependent relations defined in the first two columns of Table 2. Relations R1 (strong precedence), R2 (partially strong precedence), R3 (partially weak precedence), R4 (weak precedence) define causality conditions; S1 and S2 define coupling conditions. The tests in the third column are explained later in this section. • (Dense time:) The 29 orthogonal interaction types between a pair of intervals are given in the upper part of Table 3. For any intervals X and Y , the orthogonal interaction types 732 P. Chandra, A.D. Kshemkalyani / J. Pa Table 2 Dependent relations for interactions between intervals [20] Relation r Expression for r(X, Y ) R1 ∀x ∈ X∀y ∈ Y, x ≺ y R2 ∀x ∈ X∃y ∈ Y, x ≺ y R3 ∃x ∈ X∀y ∈ Y, x ≺ y R4 ∃x ∈ X∃y ∈ Y, x ≺ y S1 ∃x ∈ X∀y ∈ Y, x 6� y∧ y 6� x S2 ∃x1, x2 ∈ X∃y ∈ Y, x1 ≺ y ≺ x2 The second column defines the relations. The third column gives the tests using v Table 3 Definitions of the 40 orthogonal interaction types inR [20] Orthogonal interaction type (on intervals X and Y ) Dependent relation r(X R1 R2 R3 IA (= IQ−1) 1 1 1 IB (= IR−1) 0 1 1 si , . . .〉, where si is the state after event ei . So the definition of X i attempts to capture the time duration in which ψi remains llel Distrib. Comput. 68 (2008) 729–751 Test for r(Xi , Y j ) using vector timestamps V−j (Y j )[i] ≥ V+i (Xi )[i] V+j (Y j )[i] ≥ V+i (Xi )[i] V−j (Y j )[i] ≥ V−i (Xi )[i] V+j (Y j )[i] ≥ V−i (Xi )[i] ∃x0 ∈ Xi : V−j (Y j )[ j] 6≤ V x 0 i [ j] ∧ V x 0 i [i] 6≤ V+j (Y j )[i] ∃y0 ∈ Y j : V+i (Xi )[ j] 6< V y 0 j [ j] ∧ V y 0 j [i] 6< V−i (Xi )[i] tor timestamps. ) Dependent relation r(Y, X) R4 S1 S2 R1 R2 R3 R4 S1 S2 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 listed in the first column are specified using boolean vectors of length 12 on the dependent relations R1–R4 and S1–S2 lel es IC(X i , Y j ), IA(Zk, Y j ), and IX(Zk, X i ), or alternately, in t O s A s a I a E i r a i v c t U e theorem in the form of Lemma 4 will be used in practice by erms of inverses, IV(Y j , X i ), IQ(Y j , Zk), and ID(X i , Zk). bserve using Fig. 3 that the intervals in Fig. 1(b) satisfy this pecification on the interval-based global state. nother example specification of DOOR: Detect a global tate satisfying IE(X i , Y j ), IP(Zk, Y j ), and IX(Zk, X i ),
本文档为【Data Stream Based Global Event Monitoring Using Pairwise Interactions】,请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑, 图片更改请在作品中右键图片并更换,文字修改请直接点击文字进行修改,也可以新增和删除文档中的内容。
该文档来自用户分享,如有侵权行为请发邮件ishare@vip.sina.com联系网站客服,我们会及时删除。
[版权声明] 本站所有资料为用户分享产生,若发现您的权利被侵害,请联系客服邮件isharekefu@iask.cn,我们尽快处理。
本作品所展示的图片、画像、字体、音乐的版权可能需版权方额外授权,请谨慎使用。
网站提供的党政主题相关内容(国旗、国徽、党徽..)目的在于配合国家政策宣传,仅限个人学习分享使用,禁止用于任何广告和商用目的。
下载需要: 免费 已有0 人下载
最新资料
资料动态
专题动态
is_036759
暂无简介~
格式:pdf
大小:2MB
软件:PDF阅读器
页数:0
分类:互联网
上传时间:2010-11-06
浏览量:10