`
sunwinner
  • 浏览: 203286 次
  • 性别: Icon_minigender_1
  • 来自: 上海
社区版块
存档分类
最新评论

基础数据结构和算法十:2-3 search tree

阅读更多

 

Binary search tree works well for a wide variety of applications, but they have poor worst-case performance. Now we introduce a type of binary search tree where costs are guaranteed to be logarithmic, no matter what sequence of keys is used to construct them. Ideally, we would like to keep our binary search trees perfectly balanced. In an N-node tree, we would like the height to be ~lg N so that we can guarantee that all searches can be completed in ~lg N compares, just as for binary search. Unfortunately, maintaining perfect balance for dynamic insertions is too expensive. In this section, we consider a data structure that slightly relaxes the perfect balance requirement to provide guaranteed logarithmic performance not just for the insert and search operations in our symbol-table API but also for all of the ordered operations (except range search).

 

 

The primary step to get the flexibility that we need to guarantee balance in search trees is to allow the nodes in our trees to hold more than one key. Specifically, referring to the nodes in a standard BST as 2-nodes (they hold two links and one key), we now also allow 3-nodes, which hold three links and two keys. Both 2-nodes and 3-nodes have one link for each of the intervals subtended by its keys. A 2-3 search tree is a tree that is either empty or

■ A 2-node, with one key (and associated value) and two links, a left link to a 2-3 search tree with smaller keys, and a right link to a 2-3 search tree with larger keys

■ A 3-node, with two keys (and associated values) and three links, a left link to a 2-3 search tree with smaller keys, a middle link to a 2-3 search tree with keys between the node’s keys, and a right link to a 2-3 search tree with larger keys

 

As usual, we refer to a link to an empty tree as a null link. A perfectly balanced 2-3 search tree is one whose null links are all the same distance from the root.

 

 

Search. The search algorithm for keys in a 2-3 tree directly generalizes the search al- gorithm for BSTs. To determine whether a key is in the tree, we compare it against the keys at the root. If it is equal to any of them, we have a search hit; otherwise, we follow the link from the root to the subtree corresponding to the interval of key values that could contain the search key. If that link is null, we have a search miss; otherwise we recursively search in that subtree.

 

 

Insert into a 2-node. To insert a new node in a 2-3 tree, we might do an unsuccessful search and then hook on the node at the bottom, as we did with BSTs, but the new tree would not remain perfectly balanced. The primary reason that 2-3 trees are useful is that we can do insertions and still main- tain perfect balance. It is easy to accomplish this task if the node at which the search terminates is a 2-node: we just replace the node with a 3-node containing its key and the new key to be inserted. If the node where the search terminates is a 3-node, we have more work to do.

 

Insert into a tree consisting of a single 3-node. As a first warmup before considering the general case, suppose that we want to insert into a tiny 2-3 tree consisting of just a single 3-node. Such a tree has two keys, but no room for a new key in its one node. To be able to perform the insertion, we temporarily put the new key into a 4-node, a natural extension of our node type that has three keys and four links. Creating the 4-node is convenient because it is easy to convert it into a 2-3 tree made up of three 2-nodes, one with the middle key (at the root), one with the smallest of the three keys (pointed to by the left link of the root), and one with the largest of the three keys (pointed to by the right link of the root). Such a tree is a 3-node BST and also a perfectly balanced 2-3 search tree, with all the null links at the same distance from the root. Before the insertion, the height of the tree is 0; after the insertion, the height of the tree is 1. This case is simple, but it is worth considering because it illustrates height growth in 2-3 trees.

 

 

Insert into a 3-node whose parent is a 2-node. As a second warmup,suppose that the search ends at a 3-node at the bottom whose parent is a 2-node. In this case, we can still make room for the new key while maintaining perfect balance in the tree, by making a temporary 4-node as just described, then splitting the 4-node as just described, but then, instead of creating a new node to hold the middle key, moving the middle key to the node’s parent. You can think of the transformation as replacing the link to the old 3-node in the parent by the middle key with links on either side to the new 2-nodes. By our assumption, there is room for doing so in the parent: the parent was a 2-node (with one key and two links) and becomes a 3-node (with two keys and three links). Also, this transformation does not affect the defining properties of (perfectly balanced) 2-3 trees. The tree remains or- dered because the middle key is moved to the parent, and it remains perfectly balanced: if all null links are the same distance from the root before the insertion, they are all the same distance from the root after the insertion. Be certain that you understand this transformation—it is the crux of 2-3 tree dynamics.

 

 

Insert into a 3-node whose parent is a 3-node. Now suppose that the search ends at a node whose parent is a 3-node. Again, we make a temporary 4-node as just described, then split it and insert its middle key into the parent. The parent was a 3-node, so we replace it with a temporary new 4-node containing the middle key from the 4-node split. Then, we perform precisely the same transformation on that node. That is, we split the new 4-node and insert its middle key into its parent. Extending to the general case is clear: we continue up the tree, splitting 4-nodes and inserting their middle keys in their parents until reaching a 2-node, which we replace with a 3-node that does not need to be further split, or until reaching a 3-node at the root.

 

 

Splitting the root. If we have 3-nodes along the whole path from the insertion point to the root, we end up with a temporary 4-node at the root. In this case we can proceed in precisely the same way as for insertion into a tree consisting of a single 3-node. We split the temporary 4-node into three 2-nodes, increasing the height of the tree by 1. Note that this last transformation preserves perfect balance because it is performed at the root.

 

 

Local transformations. Splitting a temporary 4-node in a 2-3 tree involves one of six transformations, summarized at the bottom of the next page. The 4-node may be the root; it may be the left child or the right child of a 2-node; or it may be the left child, middle child, or right child of a 3-node. The basis of the 2-3 tree insertion algorithm is that all of these transformations are purely local: no part of the tree needs to be examined or modified other than the specified nodes and links. The number of links changed for each transformation is bounded by a small constant. In particular, the transformations are effective when we find the specified patterns anywhere in the tree, not just at the bottom. Each of the transformations passes up one of the keys from a 4-node to that node’s parent in the tree and then restructures links accordingly, without touching any other part of the tree.

 

Global properties. Moreover, these local transformations preserve the global properties that the tree is ordered and perfectly balanced: the number of links on the path from the root to any null link is the same. For reference, a complete diagram illustrating this point for the case that the 4-node is the middle child of a 3-node is shown above. If the length of every path from a root to a null link is h before the transformation, then it is h after the transformation. Each transformation preserves this property, even while splitting the 4-node into two 2-nodes and while changing the parent from a 2-node to a 3-node or from a 3-node into a temporary 4-node. When the root splits into three 2-nodes, the length of every path from the root to a null link increases by 1. Another important properties of 2-3 search tree is, unlike standard BSTs, which grow down from the top, 2-3 trees grow up from the bottom. Search and insert operations in a 2-3 tree with N keys are guaranteed to visit at most lg N nodes.

 

 

Thus, we can guarantee good worst-case performance with 2-3 trees. The amount of time required at each node by each of the operations is bounded by a constant, and both operations examine nodes on just one path, so the total cost of any search or insert is guaranteed to be logarithmic.

 

 

However, we are only part of the way to an implementation. Although it is possible to write code that performs transformations on distinct data types representing 2- and 3-nodes, most of the tasks that we have described are inconvenient to implement in this direct representation because there are numerous different cases to be handled. We would need to maintain two different types of nodes, compare search keys against each of the keys in the nodes, copy links and other information from one type of node to another, convert nodes from one type to another, and so forth. Not only is there a substantial amount of code involved, but the overhead incurred could make the algorithms slower than standard BST search and insert. The primary purpose of balancing is to provide insurance against a bad worst case, but we would prefer the overhead cost for that insurance to be low. Fortunately, as you will see, we can do the transformations in a uniform way using little overhead with Red-Black tree.

分享到:
评论

相关推荐

    前端开源库-binary-search-tree

    二进制搜索树(Binary Search Tree,简称BST)是一种数据结构,它在计算机科学中扮演着重要的角色,尤其是在数据存储和检索方面。这种结构是基于排序的,每个节点包含一个键(key)、一个关联的值、一个指向左子树的...

    some-of-the-binary-tree-algorithm..zip_The Tree

    **二叉树算法基础与经典实现:深入理解与代码...无论是初级开发者还是高级工程师,深入理解二叉树算法都是构建高效数据结构和算法库的关键一步。在分析和调试过程中,这些经典算法的理解和实践将极大地提升编程能力。

    基于R-Tree的DBSCAN算法的改进(Java版)

    2. 邻域查询优化:使用R-Tree的查询接口,如`search`方法,来查找给定点的邻域,而不是遍历整个数据集。 3. DBSCAN核心逻辑:根据R-Tree的查询结果,判断每个点的邻居数量,识别核心点、边界点和噪声点,逐步扩展...

    leetcode-tag-Tree

    "leetcode-tag-Tree" 指的是 LeetCode 上与“树”相关的标签,这通常意味着这些题目涉及到数据结构中的树型结构。树是一种非线性的数据结构,由若干个节点(或称为顶点)和连接这些节点的边组成,每个节点都可能有零...

    java数据结构和算法学习笔记

    8. **2-3-4树** - **优点**:与红黑树类似,始终保持树的平衡,适合用于磁盘存储。 - **缺点**:实现复杂。 9. **哈希表(Hash Table)** - **优点**:当已知关键字时,存取速度极快;插入操作速度快。 - **缺点**...

    算法分析和数据结构 Python描述版

    - **强大的内置数据结构**:Python提供了一系列内置的数据结构,如列表(list)、元组(tuple)、字典(dict)等,这些数据结构可以直接用于实现复杂的算法和数据结构。 2. **Python在算法实现中的优势**: - **代码...

    微软等公司数据结构+算法面试第1-100题汇总(前20题答案)

    数据结构是计算机科学的基础之一,它涉及到如何组织和存储数据,以便能够高效地访问和修改它们。而算法则是解决特定问题的一系列步骤或规则。掌握这两方面的知识不仅有助于解决实际工作中遇到的问题,也是通过大厂...

    数据结构与算法教程(C++版)实验和课程设计

    - **性能评估**:对不同的数据结构和算法进行时间和空间复杂度的分析,评估其效率。 - **团队合作**:在项目开发过程中,培养团队协作能力,共同完成一个大型项目的设计与实现。 通过本教程的学习,不仅能够掌握...

    算法与数据结构-停车场问题程序

    这可能涉及更复杂的数据结构,如散列表(Hash Table)或树形结构(如二叉搜索树Binary Search Tree),以便于快速检索和更新。 #### 5. **时间复杂度分析** 在设计算法时,考虑时间复杂度是必要的。对于栈和队列...

    数据结构与算法分析-C++描述

    《数据结构与算法分析-C++描述》是一本深入讲解数据结构和算法的权威教材,尤其适合于使用C++编程语言的计算机科学专业学生和技术人员。本书由Mark Allen Weiss撰写,第三版更是对前两版进行了全面的更新和完善,...

    浙江大学2018-19秋冬《数据结构基础》期末模拟练习1

    数据结构基础是计算机科学中...总结这些知识点,数据结构基础课程要求学生理解和掌握各种数据结构(如链表、树、图和栈),以及相关的操作算法(如排序、搜索和路径查找)。这些知识对于理解和优化程序的性能至关重要。

    search-the-tree.zip_The Tree

    在IT领域,数据结构与算法是编程的基础,而树作为一种重要的数据结构,广泛应用于各种问题的解决中。这里我们关注的是“搜索树”的概念,它通常指的是二叉搜索树(Binary Search Tree,简称BST),是一种特殊的...

    数据结构与算法分析C++语言描述

    标题“数据结构与算法分析C++语言描述”表明了这本书的主要内容是关于数据结构和算法的,并且使用C++编程语言进行实现和描述。 #### 描述解析 描述部分简短地提到了这本书为扫描版,希望对读者有所帮助。这暗示了...

    Optimal-binary-search-tree-algorithm.rar_tree

    最优二叉搜索树(Optimal Binary Search Tree, OBST)是一种在二叉搜索树的基础上,通过优化节点的排列顺序以达到高效查找性能的数据结构。它不仅保持了二叉搜索树的特性,即左子节点的值小于根节点,右子节点的值...

    数据结构与算法题解

    ### 数据结构与算法题解概览 #### 一、基础知识篇 **1. 基础数据结构** ...以上涵盖了从基础数据结构和算法到具体的编程实现细节,这些知识点不仅适用于初学者,也适合有一定基础的学习者深入理解和应用。

    java-二叉树binaryTree

    在IT领域,二叉树(Binary Tree)是一种基础的数据结构,尤其在计算机科学中有着广泛的应用。二叉树是每个节点最多有两个子节点的树结构,通常分为左子节点和右子节点。在这个"java-二叉树binaryTree"主题中,我们将...

    数据结构通用树tree的实现-可用demo

    在计算机科学中,数据结构是组织、存储和处理数据的方式,它是编程的基础。其中,树是一种非线性数据结构,模拟了自然界中的层次关系,它由节点(也称为顶点)和边(连接节点的关系)组成。在这个"数据结构通用树...

Global site tag (gtag.js) - Google Analytics