`
leonzhx
  • 浏览: 799695 次
  • 性别: Icon_minigender_1
  • 来自: 上海
社区版块
存档分类
最新评论
阅读更多

1.  Binary Codes:

    -- Maps each character of an alphabet Sigma to a binary string.

    -- Example: Sigma = a-z and various punctuation (size 32 overall, say)

                       Obvious encoding: Use the 32 5-bit binary strings to encode this (a xed-length code)

 

2.  Variable Length Encoding :

    -- if some characters of Sigma are much more frequent than others, using a variable-length code.

    -- Problem: With variable-length codes, not clear where one character ends + the next one begins.

    -- Solution: Prex-free codes -- make sure that for every pair i , j in Sigma, neither of the encodings f (i) , f (j) is a prex of the other.

 

3.  Codes as Trees

    -- Useful fact: Binary codes <==> Binary trees


 

4.  Prex-Free Codes as Trees

    -- Goal: Best binary prex-free encoding for a given set of character frequencies.

    -- Left child edges <--> "0", right child edges <--> "1"

    -- For each i in Sigma , exactly one node labeled "i"

    -- Encoding of i in Sigma <--> Bits along path from node to the node "i"

    -- Prex-free <--> Labelled nodes = the leaves [since prefixes <--> one node an ancestor of another]

    -- To decode: Repeadetly follow path from root until you hit a leaf.

    -- Encoding length of i in Sigma  = depth of i in tree.

 

5.  Problem Denition

    -- Input: Probability pi for each character i in Sigma.

    -- Notation: If T = tree with leaves <--> symbols of Sigma , then average encoding length

       L(T) = Sum(i in Sigma){ pi * [depth of i in T] }

    -- Output: A binary tree T minimizing the average encoding length L(T).

 

6.  A Greedy Approach:

    -- Build tree bottom-up using successive mergers.

    -- Final encoding length of i in Sigma = # of mergers its subtree endures.

       [Each merger increases encoding length of participating symbols by 1]

    -- Greedy heuristic: In first iteration, merge the two symbols with the smallest frequencies. Replace symbols a , b by a new "meta-symbol" ab and the frequency Pab = Pa + Pb

    -- Recusively apply the approach on less characters



 

7.   Huffman's Algorithm:

    -- If |Sigma| = 2 return A --> 0, B -->1

    -- Otherwise, Let a , b in Sigma have the smallest frequencies.

        -- Let Sigma' = Sigma with a , b replaced by new symbol ab.

        -- Define Pab = Pa + Pb.

        -- Recursively compute T' (for the alphabet Sigma')

        -- Extend T' (with leaves <--> Sigma') to a tree T with leaves <--> Sigma by splitting leaf ab into two leaves a & b.

 

8.  Correctness of Huffman's Algorithm:

    -- By induction on n = |Sigma|. (Can assume n >= 2.)

    -- Base case: When n = 2, algorithm outputs the optimal tree. (Needs 1 bit per symbol)

    -- Inductive step: Fix input with n = |Sigma| > 2. By inductve hypothesis: Algorithm solves smaller subproblems (for Sigma') optimally. Let Sigma' = Sigma with a , b (symbols with smallest frequencies) replaced by meta-symbol ab. Define Pab = Pa + Pb.

    -- For every such pair T and T', L(T) - L(T') = Pa [a's depth in T] +Pb [b's depth in T] -Pab [ab's depth in T'] = Pa(d +1)+Pb(d +1)-(Pa +Pb)d = Pa + Pb, Independent of T , T'!

    -- The output of Huffman's Algorithm T for Sigma minimize L(T) for Sigma over all trees T' where a&b are siblings at the deepest level.

   -- Intuition: Can make an optimal tree better by pushing a & b as deep as possible (since a , b have smallest frequencies).

   -- By exchange argument. Let T* be any tree that minimizes L(T) for Sigma . Let x , y be siblings at the deepest level of T*. The exchange: Obtain T' from T* by swapping a <--> x, b <--> y (T' is the tree where a&b are siblings at the deepest level)

   -- L(T*) - L(T') = (Px - Pa) * [x's depth in T* - a's depth in T*] + (Py - Pb) [y's depth in T* - b's depth in T*] >= 0



 

 

9.  Running Time

    -- Naive implementation: O(n2) time, where n = |Sigma|.

    -- Speed ups:

        -- Use a heap! [to perform repeated minimum computations]

        -- Use keys = frequencies

        -- After extracting the two smallest-frequency symbols, re-Insert the new meta-symbol [new key = sum of the 2 old ones]

        => Iterative, O(n log n) implementation.

    -- Even faster: Sorting + O(n) additional work. -- manage (meta-)symbols using two queues.

  • 大小: 19.6 KB
  • 大小: 27.3 KB
  • 大小: 4.7 KB
分享到:
评论

相关推荐

    陈越、何钦铭-数据结构作业17:Huffman Codes哈夫曼编码

    As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string "aaaxuaxz", we can observe that the ...

    pta题库答案c语言之树结构9HuffmanCodes.zip

    在本压缩包“pta题库答案c语言之树结构9HuffmanCodes.zip”中,主要包含的是关于C语言编程及树结构的知识,特别是Huffman编码的相关解答。Huffman编码是一种用于数据压缩的算法,它利用了字符出现频率的不同来创建更...

    Huffman Codes (30).zip

    Huffman Codes (30)"中,可能包含了一些关于如何构建和使用哈夫曼编码的详细教程或实例,比如通过编程语言实现哈夫曼编码的代码示例,或者是对特定文本的哈夫曼编码分析。通过学习和理解这些内容,可以更深入地掌握...

    HuffmanCode

    c++实现的哈夫曼树的构造和哈夫曼编码,有详细的注释。

    Huffman Coding

    The Huffman codes will be sorted in the same manner as the one above i.e. frequency, highest to lowest. Design your program to read the input from the input file "infile.dat". Your program must ...

    HuffmanCode.java

    赫夫曼树实现文件和字符串压缩及解压源码

    c_code_哈夫曼编码的C++实现_

    std::unordered_map, std::string&gt; huffmanCodes; void generateHuffmanCodes(HuffmanNode* node, std::string code) { if (node-&gt;charValue != '\0') { huffmanCodes[node-&gt;charValue] = code; } if (node-&gt;...

    一个用huffman算法实现的 压缩源代码

    - `compress(input, huffmanCodes)`:对输入的文本进行编码,替换字符为对应的哈夫曼编码。 - `decompress(compressedData, huffmanTree)`:根据压缩后的数据和哈夫曼树结构进行解压缩。 通过理解并分析这个源代码...

    用matlab语言实现编解码

    function huffmanCodes = generateHuffmanCodes(huffmanTree) % ... 生成哈弗曼编码的代码 ... end ``` 4. **图像编码**: 使用生成的哈弗曼编码将图像像素值转换为二进制串。 ```matlab function encodedImage ...

    霍夫曼编码C++实现

    void generateCodes(HuffmanNode* root, std::string code, std::map, std::string&gt;& huffmanCodes) { if (root == nullptr) return; if (root-&gt;data != '\0') { huffmanCodes[root-&gt;data] = code; return; ...

    Java 实现哈夫曼编码(源代码)

    public static void generateCodes(Node root, String str, Map, String&gt; huffmanCodes) { if (root.left == null && root.right == null) { // 如果是叶子节点 huffmanCodes.put(root.data, str); } else { ...

    赫夫曼编码的Java实现

    void generateCodes(HuffmanNode node, String code, Map, String&gt; huffmanCodes) { if (node == null) return; if (node.ch != '\0') { huffmanCodes.put(node.ch, code); } generateCodes(node.left, code +...

    计算模型与算法技术:9-Greedy Technique.ppt

    * 最优解决方案:change making、minimum spanning tree、single-source shortest paths、simple scheduling problems和Huffman codes等。 * 近似解决方案:traveling salesman problem、knapsack problem和其他组合...

    Huffman_coding_(Java).zip_The English_huffman coding java_huffma

    Given an arbitrary set of symbols (the english alphabet ... There are other interesting properties of Huffman codes as well as some gotchas with respect to their use, so the above link is worth reading.

    哈夫曼编码系统(C语言实现)

    利用哈夫曼编码进行通信可以大大提高信道利用率,缩短信息传输时间,降低传输成本。但是,这要求在发送端通过一个编码系统对待传输数据预先编码,在接收端将传来的数据进行译码(复原)。对于双工信道(即可以双向...

    PTA(拼题A)-浙江大学中国大学mooc数据结构全AC代码与问题解析(C语言).zip

    课程回答题目集中部分问题解决方法分析02- 线性结构3 反向链表03-树2 列表叶子03-树3树遍历再次最优解法——实时输出法分析04-树6 完全二叉搜索树 完全二叉搜索树 中序遍历法05-树9 Huffman Codes 问题分析及建树...

    哈夫曼编码译码代码,有注释且能直接运行

    在"哈夫曼编码译码代码,有注释且能直接运行"这个项目中,`HuffmanCodes.java`文件很可能是包含完整实现的源代码。下面将详细解释哈夫曼编码的关键步骤和相关知识点: 1. **哈夫曼树构建**: - **构造哈夫曼树**:...

    信息论基础(英文版)

    5.7 Some Comments on Huffman Codes 5.8 Optimality of Huffman Codes 5.9 Shannon–Fano–Elias Coding 5.10 Competitive Optimality of the Shannon Code 5.11 Generation of Discrete Distributions from Fair ...

Global site tag (gtag.js) - Google Analytics