赫夫曼編碼演算法
赫夫曼編碼是一種無損資料壓縮演算法。在此演算法中,可變長編碼被分配給輸入的不同字元。編碼的長度與字元的使用頻率相關。最頻繁的字元具有最短的編碼,而頻率最低的字元具有最長的編碼。
主要有以下兩個部分。第一個是建立赫夫曼樹,另一個是遍歷樹以找到編碼。
例如,考慮一些字串“YYYZXXYYX”,字元 Y 的頻率大於 X,而字元 Z 的頻率最低。因此,Y 的編碼長度小於 X,而 X 的編碼將小於 Z。
根據各自的頻率為每個字元分配編碼的時間複雜度為 O(n log n)
輸入和輸出
Input: A string with different characters, say “ACCEBFFFFAAXXBLKE” Output: Code for different characters: Data: K, Frequency: 1, Code: 0000 Data: L, Frequency: 1, Code: 0001 Data: E, Frequency: 2, Code: 001 Data: F, Frequency: 4, Code: 01 Data: B, Frequency: 2, Code: 100 Data: C, Frequency: 2, Code: 101 Data: X, Frequency: 2, Code: 110 Data: A, Frequency: 3, Code: 111
演算法
huffmanCoding(string)
輸入:一個包含不同字元的字串。
輸出:每個字元的編碼。
Begin define a node with character, frequency, left and right child of the node for Huffman tree. create a list ‘freq’ to store frequency of each character, initially, all are 0 for each character c in the string do increase the frequency for character ch in freq list. done for all type of character ch do if the frequency of ch is non zero then add ch and its frequency as a node of priority queue Q. done while Q is not empty do remove item from Q and assign it to left child of node remove item from Q and assign to the right child of node traverse the node to find the assigned code done End
traverseNode(n: node, code)
輸入:赫夫曼樹的節點 n,以及從上一次呼叫中分配的編碼
輸出:每個字元分配的編碼
if a left child of node n ≠φ then traverseNode(leftChild(n), code+’0’) //traverse through the left child traverseNode(rightChild(n), code+’1’) //traverse through the right child else display the character and data of current node.
示例
#include #include#include using namespace std; struct node { int freq; char data; const node *child0, *child1; node(char d, int f = -1) { //assign values in the node data = d; freq = f; child0 = NULL; child1 = NULL; } node(const node *c0, const node *c1) { data = 0; freq = c0->freq + c1->freq; child0=c0; child1=c1; } bool operator<( const node &a ) const { //< operator performs to find priority in queue return freq >a.freq; } void traverse(string code = "")const { if(child0!=NULL) { child0->traverse(code+'0'); //add 0 with the code as left child child1->traverse(code+'1'); //add 1 with the code as right child }else { cout << "Data: " << data<< ", Frequency: "< qu; int frequency[256]; for(int i = 0; i<256; i++) frequency[i] = 0; //clear all frequency for(int i = 0; i1) { node *c0 = new node(qu.top()); //get left child and remove from queue qu.pop(); node *c1 = new node(qu.top()); //get right child and remove from queue qu.pop(); qu.push(node(c0, c1)); //add freq of two child and add again in the queue } cout << "The Huffman Code: "<
輸出
The Huffman Code: Data: K, Frequency: 1, Code: 0000 Data: L, Frequency: 1, Code: 0001 Data: E, Frequency: 2, Code: 001 Data: F, Frequency: 4, Code: 01 Data: B, Frequency: 2, Code: 100 Data: C, Frequency: 2, Code: 101 Data: X, Frequency: 2, Code: 110 Data: A, Frequency: 3, Code: 111
廣告