首页 10.陈启峰《Size Balanced Tree》

10.陈启峰《Size Balanced Tree》

举报
开通vip

10.陈启峰《Size Balanced Tree》 Size Balanced Tree Chen Qifeng (Farmer John) Zhongshan Memorial Middle School, Guangdong, China Email:344368722@QQ.com December 29, 2006 Abstract This paper presents a unique strategy for maintaining balance in dynamically changing Bin...

10.陈启峰《Size Balanced Tree》
Size Balanced Tree Chen Qifeng (Farmer John) Zhongshan Memorial Middle School, Guangdong, China Email:344368722@QQ.com December 29, 2006 Abstract This paper presents a unique strategy for maintaining balance in dynamically changing Binary Search Trees that has optimal expected behavior at worst. Size Balanced Tree is, as the name suggests, a Binary Search Tree (abbr. BST) kept balanced by size. It is simple, efficient and versatile in every aspect. It is very easy to implement and has a straightforward description and a surprisingly simple proof of correctness and runtime. Its runtime matches that of the fastest BST known so far. Furthermore, it works much faster than many other famous BSTs due to the tendency of a perfect BST in practice. It supports not only typical primary operations but also Select and Rank. Key Words And Phrases Size Balanced Tree SBT Maintain This paper is dedicated to the memory of Heavens. 1 Introduction Before presenting Size Balanced Trees it is necessary to explicate Binary Search Trees and rotations on BSTs, Left-Rotate and Right-Rotate. 1.1 Binary Search Tree Binary Search Tree is a significant kind of advanced data structures. It supports many dynamic-set operations, including Search, Minimum, Maximum, Predecessor, Successor, Insert and Delete. It can be used both as a dictionary and as a priority queue. A BST is an organized binary tree. Every node in a BST contains two children at most. The keys for compare in a BST are always stored in such a way as to satisfy the binary-search-tree property: Let x be a node in a binary search tree. Then the key of x is not less than that in left subtree and not larger than that in right subtree. For every node t we use the fields of left[t] and right[t] to store two pointers to its children. And we define key[t] to mean the value of the node t for compare. In addition we add s[t], the size of subtree rooted at t, to keep the number of the nodes in that tree. Particularly we call 0 the pointer to an empty tree and s[0]=0. 1.2 Rotations In order to keep a BST balanced (not degenerated to be a chain) we usually change the pointer structure through rotations to change the configuration, which is a local operation in a search tree that preserves the binary-search-tree property. Figure1.1: The operation Left-Rotate(x) transforms the configuration of the two nodes on the right into the configuration on the left by changing a constant number of pointers. The configuration on the left can be transformed into the configuration on the right by the inverse operation, Right-Rotate(y). 1.2.1 Pseudocode Of Right-Rotate The Right-Rotate assumes that the left child exists. Right-Rotate (t) 1 k←left[t] 2 left[t] ←right[k] 3 right[k] ←t 4 s[k] ←s[t] 5 s[t] ←s[left[t]]+s[right[t]]+1 6 t←k 1.2.2 Pseudocode Of Left-Rotate The Left-Rotate assumes that the right child exists. Left-Rotate (t) 1 k←right[t] 2 right[t] ←left[k] 3 left[k] ←t 4 s[k] ←s[t] 5 s[t] ←s[left[t]]+s[right[t]]+1 6 t←k 2 Size Balanced Tree Size Balanced Tree (abbr. SBT) is a kind of Binary Search Trees kept balanced by size. It supports many dynamic primary operations in the runtime of O(logn): Insert(t,v) Inserts a node whose key is v into the SBT rooted at t. Delete(t,v) Deletes a node whose key is v from the SBT rooted at t. In the case that no such a node in the tree, deletes the node searched at last. Find(t,v) Finds the node whose key is v and returns it. Rank(t,v) Returns the rank of v in the tree rooted at t. In another word, it is one plus the number of the keys which are less than v in that tree. Select(t,k) Returns the node which is ranked at the kth position. Apparently it includes operations of Get-max and Get-min because Get-min is equivalent to Select(t,1) and Get-max is equivalent to Select(t,s[t]) Pred(t,v) Returns the node with maximum key which is less than v. Succ(t,v) Returns the node with minimum key which is larger than v. Commonly every node of a SBT contains key, left, right and extra but useful updated field: its size, which has been defined in the former introduction. The kernel of a SBT is divided into two restrictions on size: For every node pointed by t in a SBT, we guarantee that Property(a): ]]][[[]]],[[[]][[ tleftrightstleftleftstrights ≥ Property(b): ]]][[[]]],[[[]][[ trightleftstrightrightstlefts ≥ Figure 2.1:The nodes L and R are left and right children of the node T. The Subtrees A and B, C and D are left and right subtrees of the nodes L and R respectively. Correspond to properties (a) and (b), ][][],[&][][],[ LsDsCsRsBsAs ≤≤ 3 Maintain Assume that we need to insert a node whose key is v into a BST. Generally we use the following procedure to accomplish the mission. Simple-Insert (t,v) 1 If t=0 then 2 t←NEW-NODE(v) 3 Else 4 s[t] ←s[t]+1 5 If vs[right[t]] In this case that s[A]>s[R] correspond to Figure 2.1 after Insert(left[t],v), we can execute the following instructions to repair the SBT: (1) First perform Right-Rotate(T). This operation transforms Figure 2.1 into Figure 3.1; Figure 3.1: All the nodes are defined the same as in Figure 2.1. (2) And then sometimes it occurs that the tree is still not a SBT because s[C]>s[B] or s[D]>s[B] by any possibility. So it is necessary to implement Maintain(T). (3) In succession the configuration of the right subtree of the node L might be changed. Because of the possibility of violating the properties it is needful to run Maintain(L) again. Case 2: s[right[left[t]]>s[right[t]] In this case that s[B]>s[R] correspond to Figure 3.2 after Insert(left[t],v), the maintaining is kind of complicated than that in case 1. We can carry out the following operations for repair. Figure 3.2: All the nodes are defined the same as in Figure 2.1 except E, F and B. E and F are the subtrees of the node B. (1) First perform Left-Rotate(L). After that Figure 3.2 was transformed to Figure 3.3 shown below; Figure 3.3: All the nodes are defined the same as in Figure 3.2. (2) And then perform Right-Rotate(T). This performance results in the transformation from Figure 3.3 to following Figure 3.4. Figure 3.4: All the nodes are defined the same as in Figure 3.2. (3) After (1) and (2), the tree becomes to be very unpredictable. But luckily, in Figure 3.4 subtrees A, E, F and R are still SBTs. So we can perform Maintain(L) and Maintain(T) to repair the subtrees of the node B. (4) After step (3), those subtrees are already SBTs. But properties (a) and (b) may be still violated on the node B. Thus we need perform Maintain(B) again. Case 3: s[right[right[t]]>s[left[t]] This case is symmetrical with case 1. Case 4: s[left[right[t]]>s[left[t]] This case is symmetrical with case 2. 3.1 Pseudocode Of Standard Maintain According to the previous analysis it is easy to implement a normal Maintain. Maintain (t) 1 If s[left[left[t]]>s[right[t]] then 2 Right-Rotate(t) 3 Maintain(right[t]) 4 Maintain(t) 5 Exit 6 If s[right[left[t]]>s[right[t]] then 7 Left-Rotate(left[t]) 8 Right-Rotate(t) 9 Maintain(left[t]) 10 Maintain(right[t]) 11 Maintain(t) 12 Exit 13 If s[right[right[t]]>s[left[t]] then 14 Left-Rotate(t) 15 Maintain(left[t]) 16 Maintain(t) 17 Exit 18 If s[left[right[t]]>s[left[t]] then 19 Right-Rotate(right[t]) 20 Left-Rotate(t) 21 Maintain(left[t]) 22 Maintain(right[t]) 23 Maintain(t) 3.2 Pseudocode Of Faster And Simpler Maintain The standard pseudocode is a little complicated and slow. Generally we can ensure that property (a) or (b) has been satisfied. Therefore we only need to check cases 1 and 2 or case 3 and 4 to speed up. In that case we can add a boolean variant, flag, to avoid meaningless checking. If flag is false cases 1 and 2 will be checked, otherwise cases 3 and 4 will be checked. Maintain (t,flag) 1 If flag=false then 2 If s[left[left[t]]>s[right[t]] then 3 Right-Rotate(t) 4 Elseif s[right[left[t]]>s[right[t]] then 5 Left-Rotate(left[t]) 6 Right-Rotate(t) 7 Else exit 8 Elseif s[right[right[t]]>s[left[t]] then 9 Left-Rotate(t) 10 Elseif s[left[right[t]]>s[left[t]] then 11 Right-Rotate(right[t]) 12 Left-Rotate(t) 13 Else exit 14 Maintain(left[t],false) 15 Maintain(right[t],true) 16 Maintain(t,false) 17 Maintain(t,true) Why Maintain(left[t],true) and Maintain(right[t],false) are expelled? What is the runtime of Maintain? You can find the answers in part 6, analysis. 4 Insertion The insertion on a SBT is very simple. Here’s the pseudocode of Insert on a SBT. 4.1 Pseudocode Of Insert Insert (t,v) 1 If t=0 then 2 t←NEW-NODE(v) 3 Else 4 s[t] ←s[t]+1 5 If vkey[t])and(right[t]=0) then 3 Delete ←key[t] 4 If (left[t]=0)or(right[t]=0) then 5 t ←left[t]+right[t] 6 Else 7 key[t] ←Delete(left[t],v[t]+1) 8 Else 9 If v = = ⎪⎩ ⎪⎨ ⎧ +−+− = h h h hfhf hf 6.1.1 Proof (1) It is easy to work out that 1]0[ =f and 2]1[ =f . (2) First of all, )1(1]2[]1[][ >+−+−≥ hhfhfhf . For each h>1, let’s assume that t points to a SBT whose height is h. And then this SBT contains a subtree at the height of h-1. It doesn’t matter to suppose it as the left subtree. According to the previous definition of ][hf , we have that ]1[ . And there is an h-2-tall subtree of the left subtree. In another word there is a subtree whose size is at least ]2[ ]][[ −≥ hftlefts −hf of the left subtree. Owing to the property (b) we have that ]2[]][[ −≥trights hf . Therefore we draw the conclusion that 1]2[]1[][ +−+−≥ hfhfts . On the other hand, )1(1]2[]1[][ >+−+−≤ hhfhfhf . We can construct a SBT with exact node(s) and its height is h. We call this kind of SBT tree[h]. ][hf )1( )1( )0( subtrees twoas 2]- tree[hand 1]- tree[hcontaining BST a nodes with twoBSTany node one with BST a ][ > = = ⎪⎩ ⎪⎨ ⎧ = h h h htree Hence that )1(1]2[]1[][ >+−+−= hhfhfhf is obtained by summing up two upper aspects. 6.1.2 The Worst Height In fact is exponential function. Its precise value can be calculated from the recursion. ][hf ][1]2[1 5 1 5 ][ 0 333 iacciFibonhFibonaccihf h i hhh ∑ = +++ =−+=−=−−= αβα where 2 51+=α and 2 51−=β Some usual values of f[h] H 13 15 17 19 21 23 25 27 29 31 F[h] 986 2583 6764 17710 46367 121392 317810 832039 2178308 5702886 Lemma The worst height of an n-node SBT is the maximum h subjected to . nhf ≤][ Assume that maxh is the worst height of a SBT at the size of n. According to the lemma above, we have 33.1log44.1maxh 3logmaxh 5.1 5 1 5 ]maxh[ 1.5n 2 )5.1(5 3maxh 3maxh −≤ ⇒−≤ ⇒+≤ ⇒≤−= + + + + n n nf α α α Now it is clear that the height of a SBT is O(logn). 6.2 Analysis Of Maintain We can easily prove that Maintain works quite efficiently using the previous conclusions. There is a very important value to estimate how well a BST is, the average of all nodes’ depths. It is the quotient obtained by SD, the sum of the depths of all nodes, dividing n. Generally the less it is, the better a BST is. Because of the constant n, SD is expected to be as small as possible. Now we need to concentrate on SD. Its significance is the ability to restrict the times of Maintain. Recalling the conditions to perform rotations in Maintain it is surprising that SD always decreases after rotations. In case 1, for example, comparing Figure 2.1 with Figure 3.1, SD increases a negative value, s[right[t]]-s[left[left[t]]. And for instance comparing Figure 3.2 to Figure 3.4, SD increases a value less than -1, s[right[t]]-s[right[left[t]]]-1. Due to the height of O(logn) SD is always kept O(nlogn). And SD just increases O(logn) after an insertion on a SBT. Therefore )log( )(log)(log nnOT nOTnOnSD = ⇒=−×= where T is the number of Maintains running rotations. The total number of Maintains is T plus the number of Maintains without rotations. Since the latter is O(nlogn)+O(T), the amortized runtime for Maintain is )1( log )()log()( O nn TOnnOTO =++ 6.3 Analysis Of Each Operation Now that the height of SBT is O(logn) and Maintain is O(1), the runtime for all primary operations are O(logn). 6.4 Analysis Of Faster And Simpler Delete We call the statement P(n*) that a BST with the faster and simpler Delete and n* standard Inserts is at the height of O(logn*). We prove P(n*) is true for arbitrary positive integer n* by mathematical induction. 6.4.1 Proof Here I just give a rough proof. Assume that a node t is checked by Maintain(t,false), we have 3 1][2]][[ ]][[ 2 1]][[ −≤ ⇒≤− tstlefts trightstlefts Therefore if all nodes on the path from a node to the root are checked by Maintain the depth of this node is O(logn). (1) For n*=1, P(n*) is true apparently; (2) Assume that P(n*) is true for n*
本文档为【10.陈启峰《Size Balanced Tree》】,请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑, 图片更改请在作品中右键图片并更换,文字修改请直接点击文字进行修改,也可以新增和删除文档中的内容。
该文档来自用户分享,如有侵权行为请发邮件ishare@vip.sina.com联系网站客服,我们会及时删除。
[版权声明] 本站所有资料为用户分享产生,若发现您的权利被侵害,请联系客服邮件isharekefu@iask.cn,我们尽快处理。
本作品所展示的图片、画像、字体、音乐的版权可能需版权方额外授权,请谨慎使用。
网站提供的党政主题相关内容(国旗、国徽、党徽..)目的在于配合国家政策宣传,仅限个人学习分享使用,禁止用于任何广告和商用目的。
下载需要: 免费 已有0 人下载
最新资料
资料动态
专题动态
is_720256
暂无简介~
格式:pdf
大小:225KB
软件:PDF阅读器
页数:18
分类:互联网
上传时间:2010-11-15
浏览量:9