
1.  String symbol table implementations cost summary (in terms of how many characters are compared)



 

2.  Tries [from retrieval, but pronounced "try"]

    --  Store characters in nodes (not keys).

    --  Each node has R children, one for each possible character.

    --  For now, we do not draw null links.

    --  Search in a trie: Follow links corresponding to each character in the key.

        -  Search hit: node where search ends has a non-null value.

        -  Search miss: reach null link or node where search ends has null value.

    --  Insertion into a trie: Follow links corresponding to each character in the key.

        -  Encounter a null link: create new node.

        -  Encounter the last character of the key: set value in that node.

    --  To delete a key-value pair:

        -  Find the node corresponding to key and set value to null.

        -  If node has null value and all null links, remove that node (and recur).

 

 

    --  Java implementation:

 

public class TrieST<Value>
{
    private static final int R = 256;    // alphabet size (extended ASCII)
    private Node root = new Node();

    private static class Node
    {
        private Object val;                  // Object rather than Value: Java has no generic array creation
        private Node[] next = new Node[R];
    }
    
    public void put(String key, Value val)
    { root = put(root, key, val, 0); }

    private Node put(Node x, String key, Value val, int d)
    {
        if (x == null) x = new Node();
        if (d == key.length()) { x.val = val; return x; }
        char c = key.charAt(d);
        x.next[c] = put(x.next[c], key, val, d+1);
        return x;
    }

    public boolean contains(String key)
    { return get(key) != null; }

    public Value get(String key)
    {
        Node x = get(root, key, 0);
        if (x == null) return null;
        return (Value) x.val;
    }

    private Node get(Node x, String key, int d)
    {
        if (x == null) return null;
        if (d == key.length()) return x;
        char c = key.charAt(d);
        return get(x.next[c], key, d+1);
    }

    public void delete(String key) {
        root = delete(root, key, 0);
    }

    private Node delete(Node x, String key, int d) {
        if (x == null) return null;
        if (d == key.length()) {
            x.val = null;
        } else {
            char c = key.charAt(d);
            x.next[c] = delete(x.next[c], key, d+1);
        }
        // Remove the node if it has a null value and all null links.
        if (empty(x)) x = null;
        return x;
    }

    private boolean empty(Node x) {
        if ( x.val != null ) return false;
        for ( int i = 0 ; i < x.next.length ; i ++ ) {
            if ( x.next[i] != null ) return false;
        }
        return true;
    } 
}

 

 

 

    --  Performance:

        --  Search hit. Need to examine all L characters for equality, where L is the key length.

        --  Search miss.

            -  Could have mismatch on first character.

            -  Typical case: examine only a few characters (sublinear).

        --  Space. R null links at each leaf.

            (but sublinear space possible if many short strings share common prefixes)
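The hit/miss costs above can be checked with a small instrumented trie. This is only a sketch: the class name CountingTrie and the examined counter are made up for illustration, values are fixed to Integer, and keys are assumed to be extended ASCII.

```java
// Sketch: count how many key characters an R-way trie search examines.
class CountingTrie {
    private static final int R = 256;          // assumes extended-ASCII keys

    private static class Node {
        Integer val;                           // null means no key ends at this node
        Node[] next = new Node[R];
    }

    private final Node root = new Node();
    int examined;                              // characters examined by the last get()

    void put(String key, int val) {
        Node x = root;
        for (int d = 0; d < key.length(); d++) {
            char c = key.charAt(d);
            if (x.next[c] == null) x.next[c] = new Node();
            x = x.next[c];
        }
        x.val = val;
    }

    Integer get(String key) {
        examined = 0;
        Node x = root;
        for (int d = 0; d < key.length(); d++) {
            examined++;                        // one more key character looked at
            x = x.next[key.charAt(d)];
            if (x == null) return null;        // search miss: reached a null link
        }
        return x.val;                          // hit if non-null, miss otherwise
    }
}
```

With the keys she, sells and shells inserted, get("shells") examines all 6 characters, while get("xyz") stops after 1 and get("shore") after 3.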

 

3.  Ternary search tries

    --  Store characters and values in nodes (not keys).

    --  Each node has 3 children: smaller (left), equal (middle), larger (right).


    --  Search in a TST: Follow links corresponding to each character in the key.

        --  If less, take left link; if greater, take right link.

        --  If equal, take the middle link and move to the next key character.

        --  Search hit. Node where search ends has a non-null value.

        --  Search miss. Reach a null link or node where search ends has null value.

    --  Java Implementation:

 

public class TST<Value>
{
    private Node root;

    private class Node
    {
        private Value val;
        private char c;
        private Node left, mid, right;
    }

    public void put(String key, Value val)
    { root = put(root, key, val, 0); }

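    private Node put(Node x, String key, Value val, int d)
    {
        // Assumes key is non-empty: key.charAt(0) throws on an empty string.
        char c = key.charAt(d);
        if (x == null) { x = new Node(); x.c = c; }
        if (c < x.c) x.left = put(x.left, key, val, d);                     // smaller: go left, same character
        else if (c > x.c) x.right = put(x.right, key, val, d);              // larger: go right, same character
        else if (d < key.length() - 1) x.mid = put(x.mid, key, val, d+1);   // equal: go down, next character
        else x.val = val;                                                   // last character: store the value
        return x;
    }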

    public boolean contains(String key)
    { return get(key) != null; }

    public Value get(String key)
    {
        Node x = get(root, key, 0);
        if (x == null) return null;
        return x.val;
    }

    private Node get(Node x, String key, int d)
    {
        if (x == null) return null;
        char c = key.charAt(d);
        if (c < x.c) return get(x.left, key, d);
        else if (c > x.c) return get(x.right, key, d);
        else if (d < key.length() - 1) return get(x.mid, key, d+1);
        else return x;
    }
}

 

 

4.  TST with R^2 branching at root: Hybrid of R-way trie and TST.

    --  Do R^2-way branching at root.

    --  Each of R^2 root nodes points to a TST.
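One way the hybrid could look is sketched below. It is not the course's implementation: the class name HybridTST, the slot helper, and the two side arrays for 1- and 2-character keys are assumptions made for this example, and keys are assumed to be non-empty extended-ASCII strings.

```java
// Sketch: R^2-way branching on the first two characters, then one TST per slot.
class HybridTST<Value> {
    private static final int R = 256;                       // assumes extended-ASCII keys

    private static class Node<V> {
        char c;
        V val;
        Node<V> left, mid, right;
    }

    private final Object[] shortVals = new Object[R];       // values of 1-character keys
    private final Object[] pairVals  = new Object[R * R];   // values of 2-character keys
    @SuppressWarnings("unchecked")
    private final Node<Value>[] roots = (Node<Value>[]) new Node[R * R];

    // Index of the TST that holds all keys sharing this two-character prefix.
    private static int slot(String key) { return key.charAt(0) * R + key.charAt(1); }

    public void put(String key, Value val) {
        if (key.isEmpty()) throw new IllegalArgumentException("key must be non-empty");
        if (key.length() == 1) { shortVals[key.charAt(0)] = val; return; }
        int i = slot(key);
        if (key.length() == 2) { pairVals[i] = val; return; }
        roots[i] = put(roots[i], key, val, 2);               // TST starts at the third character
    }

    private Node<Value> put(Node<Value> x, String key, Value val, int d) {
        char c = key.charAt(d);
        if (x == null) { x = new Node<>(); x.c = c; }
        if      (c < x.c) x.left  = put(x.left,  key, val, d);
        else if (c > x.c) x.right = put(x.right, key, val, d);
        else if (d < key.length() - 1) x.mid = put(x.mid, key, val, d + 1);
        else x.val = val;
        return x;
    }

    @SuppressWarnings("unchecked")
    public Value get(String key) {
        if (key.isEmpty()) return null;
        if (key.length() == 1) return (Value) shortVals[key.charAt(0)];
        int i = slot(key);
        if (key.length() == 2) return (Value) pairVals[i];
        Node<Value> x = roots[i];
        int d = 2;
        while (x != null) {
            char c = key.charAt(d);
            if      (c < x.c) x = x.left;
            else if (c > x.c) x = x.right;
            else if (d < key.length() - 1) { x = x.mid; d++; }
            else return x.val;
        }
        return null;                                         // reached a null link
    }
}
```

The side arrays exist because a key shorter than three characters leaves nothing for the per-slot TST to store; other designs for those short keys are possible.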

 

 

5.  String symbol table implementation cost summary:



 

6.  TST vs. hashing

    --  Hashing:

        --  Need to examine entire key.

        --  Search hits and misses cost about the same.

        --  Performance relies on hash function.

        --  Does not support ordered symbol table operations.

    --  TSTs:

        --  Works only for strings (or digital keys).

        --  Examines just enough key characters.

        --  Search miss may involve only a few characters.

        --  Supports ordered symbol table operations (plus others!).

    --  Bottom line. TSTs are:

        --  Faster than hashing (especially for search misses).

        --  More flexible than red-black BSTs. 

 

7.  Ordered iteration: To iterate through all keys in sorted order:

    --  Do inorder traversal of trie; add keys encountered to a queue.

    --  Maintain sequence of characters on path from root to node.

public Iterable<String> keys()
{
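    // Queue is assumed to be the course library's FIFO queue class (with enqueue()), not java.util.Queue.
    Queue<String> queue = new Queue<String>();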
    collect(root, "", queue);
    return queue;
}

private void collect(Node x, String prefix, Queue<String> q)
{
    if (x == null) return;
    if (x.val != null) q.enqueue(prefix);
    for (char c = 0; c < R; c++)
        collect(x.next[c], prefix + c, q);
}
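The collect method above walks an R-way trie. For a TST, a possible analogue is an inorder walk over left, middle and right links. The sketch below assumes Integer values and non-empty keys, and uses java.util.List in place of the course's Queue; the class name TSTKeys is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: sorted key iteration over a ternary search trie.
class TSTKeys {
    private static class Node {
        char c;
        Integer val;
        Node left, mid, right;
    }

    private Node root;

    void put(String key, int val) {
        if (key.isEmpty()) throw new IllegalArgumentException("key must be non-empty");
        root = put(root, key, val, 0);
    }

    private Node put(Node x, String key, int val, int d) {
        char c = key.charAt(d);
        if (x == null) { x = new Node(); x.c = c; }
        if      (c < x.c) x.left  = put(x.left,  key, val, d);
        else if (c > x.c) x.right = put(x.right, key, val, d);
        else if (d < key.length() - 1) x.mid = put(x.mid, key, val, d + 1);
        else x.val = val;
        return x;
    }

    List<String> keys() {
        List<String> out = new ArrayList<>();
        collect(root, new StringBuilder(), out);
        return out;
    }

    private void collect(Node x, StringBuilder prefix, List<String> out) {
        if (x == null) return;
        collect(x.left, prefix, out);                    // keys with a smaller character here
        prefix.append(x.c);
        if (x.val != null) out.add(prefix.toString());   // key ending at this node
        collect(x.mid, prefix, out);                     // longer keys sharing this prefix
        prefix.deleteCharAt(prefix.length() - 1);
        collect(x.right, prefix, out);                   // keys with a larger character here
    }
}
```

A key that ends at a node is emitted before the keys below its middle link, since it is a proper prefix of all of them.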

 

8.  Character-based operations:

    --  Prefix match: Find all keys in a symbol table starting with a given prefix.

        --  Ex. Keys with prefix sh: she, shells, and shore.

        --  Application:

            - Autocomplete in a cell phone, search bar, text editor, or shell.

            - User types characters one at a time.

            - System reports all matching strings.

public Iterable<String> keysWithPrefix(String prefix)
{
    Queue<String> queue = new Queue<String>();
    Node x = get(root, prefix, 0);
    collect(x, prefix, queue);
    return queue;
}

    --  Wildcard match. Keys that match .he: she and the.
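The notes give no code for wildcard match, so here is one way it might be written for an R-way trie, treating '.' as "any single character". The class name WildcardTrie and the method name keysThatMatch are assumptions for this sketch, as are the Integer values and extended-ASCII keys.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: wildcard match in an R-way trie, where '.' matches any one character.
class WildcardTrie {
    private static final int R = 256;              // assumes extended-ASCII keys

    private static class Node {
        Integer val;
        Node[] next = new Node[R];
    }

    private final Node root = new Node();

    void put(String key, int val) {
        Node x = root;
        for (int d = 0; d < key.length(); d++) {
            char c = key.charAt(d);
            if (x.next[c] == null) x.next[c] = new Node();
            x = x.next[c];
        }
        x.val = val;
    }

    List<String> keysThatMatch(String pattern) {
        List<String> out = new ArrayList<>();
        collect(root, new StringBuilder(), pattern, out);
        return out;
    }

    private void collect(Node x, StringBuilder prefix, String pattern, List<String> out) {
        if (x == null) return;
        int d = prefix.length();
        if (d == pattern.length()) {               // pattern used up: report a key if one ends here
            if (x.val != null) out.add(prefix.toString());
            return;
        }
        char p = pattern.charAt(d);
        for (char c = 0; c < R; c++) {
            if (p == '.' || p == c) {              // wildcard tries every link, otherwise only one
                prefix.append(c);
                collect(x.next[c], prefix, pattern, out);
                prefix.deleteCharAt(prefix.length() - 1);
            }
        }
    }
}
```

With she, the, sells and shells in the table, keysThatMatch(".he") yields she and the, in that order.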

    --  Longest prefix: Find longest key in symbol table that is a prefix of query string.

        --  Ex. Key that is the longest prefix of shellsort: shells.

public String longestPrefixOf(String query)
{
    int length = search(root, query, 0, 0);
    return query.substring(0, length);
}

private int search(Node x, String query, int d, int length)
{
    if (x == null) return length;
    if (x.val != null) length = d;
    if (d == query.length()) return length;
    char c = query.charAt(d);
    return search(x.next[c], query, d+1, length);
}
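To try the two prefix operations above on the example keys, they can be dropped into a small stand-alone trie. The sketch below does that under a few assumptions: the class name PrefixTrie is made up, values are Integer, java.util.List replaces the course's Queue, and keys are extended ASCII.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: keysWithPrefix and longestPrefixOf on a minimal R-way trie.
class PrefixTrie {
    private static final int R = 256;              // assumes extended-ASCII keys

    private static class Node {
        Integer val;
        Node[] next = new Node[R];
    }

    private final Node root = new Node();

    void put(String key, int val) {
        Node x = root;
        for (int d = 0; d < key.length(); d++) {
            char c = key.charAt(d);
            if (x.next[c] == null) x.next[c] = new Node();
            x = x.next[c];
        }
        x.val = val;
    }

    private Node get(Node x, String key, int d) {
        if (x == null) return null;
        if (d == key.length()) return x;
        return get(x.next[key.charAt(d)], key, d + 1);
    }

    List<String> keysWithPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        collect(get(root, prefix, 0), prefix, out);    // collect the subtrie below the prefix
        return out;
    }

    private void collect(Node x, String prefix, List<String> out) {
        if (x == null) return;
        if (x.val != null) out.add(prefix);
        for (char c = 0; c < R; c++)
            collect(x.next[c], prefix + c, out);
    }

    String longestPrefixOf(String query) {
        int length = search(root, query, 0, 0);
        return query.substring(0, length);
    }

    private int search(Node x, String query, int d, int length) {
        if (x == null) return length;
        if (x.val != null) length = d;                 // remember the longest key seen so far
        if (d == query.length()) return length;
        return search(x.next[query.charAt(d)], query, d + 1, length);
    }
}
```

With she, shells, shore, sea and by inserted, keysWithPrefix("sh") gives she, shells, shore and longestPrefixOf("shellsort") gives shells. When no key is a prefix of the query, this version returns the empty string.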

 

9.  String symbol tables summary:

    --  Red-black BST.

        -  Performance guarantee: log N key compares.

        -  Supports ordered symbol table API.

    --  Hash tables.

        -  Performance guarantee: constant number of probes.

        -  Requires good hash function for key type.

    --  Tries. R-way, TST.

        -  Performance guarantee: log N characters accessed.

        -  Supports character-based operations.
