`
leonzhx
  • 浏览: 793716 次
  • 性别: Icon_minigender_1
  • 来自: 上海
社区版块
存档分类
最新评论

Hash Tables and Application

阅读更多

 

1.  Purpose of Hash Tables : maintain a (possibly evolving) set of stuff.

    Supported Operations:

    a)  Insert: add new record

    b)  Delete: delete existing record

    c)  Lookup: check for a particular record

    All operations in O(1) time

 

2.  Application: De-Duplication

    Given: a “stream” of objects. (Linear scan of a huge file or objects arriving in real time)

    Goal: remove duplicates

    Solution: when new object x arrives -­- lookup x in hash table H if not found, Insert x into H

 

3.  Application: The 2-SUM Problem

    Input: unsorted array A of n integers. Target sum t.

    Goal: determine whether or not there are two numbers x,y in A with x + y = t

    Naive Solution: O(n^2) time via exhaustive search

    Better Solution: 1) sort A ( O(nlogn) time) 

                              2) for each x in A, look for t-x in A via binary search (O(nlogn) time) 

    Amazing: 1) insert all elements of A to hash table H (O(n) time) 

                   2) for each x in A, Lookup t-x in H ( O(n) time)

 

4.  High-Level Idea of Implementation:

  Setup: universe U (generally, really big)

  Goal: want to maintain evolving set S subset of U (generally, of reasonable size)

  Solution: 1) pick n = # of “buckets” with n ~ |S| (for simplicity assume |S| doesn’t vary much)

                 2) choose a hash function

                 3) use array A of length n, store x in A[h(x)]

 

5.  Resolving Collisions

   Collision: distinct x, y in U such that h(x) = h(y)

   Solution #1 : (separate) chaining 

            - keep linked list in each bucket

            - ­given a key/object x, perform Insert/Delete/Lookup in the list in A[h(x)]

   Solution #2 : open addressing (only one object per bucket)

            -­ Hash function now specifies probe sequence h1(x),h2(x),.. (keep trying till find open slot)

            -­ Examples : linear probing (look consecutively), double hashing

 

6.  Properties of a “Good” Hash function

  1) Should lead to good performance => i.e., should “spread data out” 

      (gold standard – completely random hashing)

  2) Should be easy to store/ very fast to evaluate.

 

7.  Quick-­and-Dirty Hash Functions:

     Object U --> (hash code) Integers --> (compression function like mod n function) buckets {0, 1, ..., n-1}

     How to choose n = # of buckets:

     a) Choose n to be a prime ( within constant factor of # of objects in table)

     b) Not too close to a power of 2

     c) Not too close to a power of 10

 

分享到:
评论

相关推荐

    Disk Based Hash Tables and Quantified Numbers (24th March, 2014)-计算机科学

    Disk Based Hash Tables and Quantified NumbersEdscott Wilson GarciaMarch 24, 2014AbstractThis paper describes the design of Disk Based Hash Tables: n–dimen- sional self–indexed data tables. The ...

    hash table spell checking

    Goals: This assignment is designed to reinforce the student's understanding of the use of hash tables as searchable containers. Outcomes: Students successfully completing this assignment would master...

    数据结构作业Hash表

    Goals: This assignment is designed to reinforce the student's understanding of the use of hash tables as searchable containers. Outcomes: Students successfully completing this assignment would master...

    C++ Data Structures and Algorithms

    Build enhanced applications by using hashtables, dictionaries, and sets Implement searching algorithms such as linear search, binary search, jump search, exponential search, and more Have a positive ...

    微软内部资料-SQL性能优化3

    Contents Overview 1 Lesson 1: Concepts – Locks and Lock Manager 3 Lesson 2: Concepts – Batch and Transaction 31 Lesson 3: Concepts – Locks and...The hash value consists of all the key components and ...

    ARM® Compiler v5.06 for µVision® armasm User Guide

    4.26 Exception tables and Unwind tables 4.27 Assembly language changes after RVCT v2.1 5 Condition Codes 5.1 Conditional instructions 5.2 Conditional execution in ARM state 5.3 Conditional execution ...

    tcl and the tk toolkit PART3

    然后讨论了如何创建和删除哈希表(Creating and deleting hash tables)。接下来,介绍了创建条目(Creating entries)的方法以及如何查找现有条目(Finding existing entries)。此外,还讨论了搜索(Searching)技术以及...

    Clever Internet Suite (SRC) v9.1.0.0

    All these components make the application development process easy and clean. You can use these components separately from the main protocol components with any other library and even with your own ...

    python3.6.5参考手册 chm

    PEP 456: Secure and Interchangeable Hash Algorithm PEP 436: Argument Clinic Other Build and C API Changes Other Improvements Significant Optimizations Deprecated Deprecations in the Python API ...

    BobBuilder_app

    In this version I have done away with the b+tree and hash index in favor of my own MGIndex structure which for all intents and purposes is superior and the performance numbers speak for themselves....

    pjsip开发者指南中文版-全章节(1-16章).docx

    PJSIP使用PJLIB’s数据结构,例如lists、arrays、hash tables、strings和memory pools。这些结构都不是线程安全的,它们的安全性是由包含它们的对象来确定的。如果数据结构是线程安全的,那么会对协议栈的性能产生...

    sqlmap (懂的入)

    exploiting and query syntax obviously depend upon the web application database management system back-end; * It recognizes valid queries by false ones based upon HTML output page hashes comparison ...

    Bochs - The cross platform IA-32 (x86) emulator

    - Bochs now can be compiled as native Windows x86-64 application (tested with Mingw gcc 4.5.1 and Microsoft Visual Studio Express 2010) - Added ability to configure CPUID stepping through .bochsrc....

    OCP.Oracle.11g.New.Features.for.Administrators.Exam.Guide.

    - **List and Hash Partitioning Enhancements**:提高了列表和哈希分区的效率和灵活性。 10. **DML Enhancements**: - **Row Movement**:允许在分区之间移动行,简化了分区维护。 - **SQL*Loader ...

    PHP内核扩展教程及文档_ppt_code等

    1. **哈希表(Hashtables)**:PHP内核中广泛使用哈希表存储数组、对象和变量。这是一种高效的数据结构,通过键值对进行查找和存储。哈希表的实现涉及到哈希函数、冲突解决策略以及内存管理。 2. **SAPI(Server ...

    Java语言常用的方法名.pdf

    “Hashtables”和“Vectors”分别是指哈希表和向量,它们都是集合框架的一部分。 “Collectioninterface”指的是集合接口,在Java集合框架中,定义了集合操作的通用行为。 “Constructor”是指构造器,用于创建和...

    oracle学习文档 笔记 全面 深刻 详细 通俗易懂 doc word格式 清晰 连接字符串

     删除HKEY_LOCAL_MACHINE/SYSETM/CurrentControlSet/Services/Eventlog/application中所有以oracle开头的键。  删除HKEY_CLASSES_ROOT目录下所有以Ora、Oracle、Orcl或EnumOra为前缀的键。  删除HKEY_CURRENT_...

    Oracle运维最佳实践-下.pdf 带书签

    - 使用`USE_HASH`、`INDEX`等提示来指导优化器选择特定类型的执行计划。 - 通过`DBMS_STATS.SET_TABLE_PREFS`设置特定表的统计偏好,影响优化器决策。 - **2.1.5 可以nologging执行的操作** - 在某些情况下,...

Global site tag (gtag.js) - Google Analytics