霍夫曼編碼
霍夫曼編碼是一種無損資料壓縮演算法。在此演算法中,為輸入的不同字元分配可變長度的程式碼。程式碼長度與字元的使用頻率相關。最常出現的字元具有最短的程式碼,而最不常出現的字元則具有更長的程式碼。
主要有兩個部分。第一個是建立霍夫曼樹,另一個是遍歷樹以查詢程式碼。
例如,考慮一些字串“YYYZXXYYX”,字元 Y 的頻率大於 X,而字元 Z 的頻率最小。因此,Y 的程式碼長度小於 X,而 X 的程式碼將小於 Z。
根據字元頻率分配每個字元程式碼的複雜度為 O(n log n)
輸入 - 包含不同字元的字串,例如“ACCEBFFFFAAXXBLKE”
輸出 - 不同字元的程式碼
Data: K, Frequency: 1, Code: 0000 Data: L, Frequency: 1, Code: 0001 Data: E, Frequency: 2, Code: 001 Data: F, Frequency: 4, Code: 01 Data: B, Frequency: 2, Code: 100 Data: C, Frequency: 2, Code: 101 Data: X, Frequency: 2, Code: 110 Data: A, Frequency: 3, Code: 111
演算法
huffmanCoding(字串)
輸入 - 包含不同字元的字串。
輸出 - 每個字元的程式碼。
Begin define a node with character, frequency, left and right child of the node for Huffman tree. create a list ‘freq’ to store frequency of each character, initially all are 0 for each character c in the string do increase the frequency for character ch in freq list. done for all type of character ch do if the frequency of ch is non zero then add ch and its frequency as a node of priority queue Q. done while Q is not empty do remove item from Q and assign it to left child of node remove item from Q and assign to the right child of node traverse the node to find the assigned code done End
traverseNode(n: 節點, 程式碼)
輸入 - 霍夫曼樹的節點 n,以及先前呼叫分配的程式碼
輸出 - 分配給每個字元的程式碼
if left child of node n ≠ φ then traverseNode(leftChild(n), code+’0’) //traverse through the left child traverseNode(rightChild(n), code+’1’) //traverse through the right child else display the character and data of current node.
示例
#include<iostream>
#include<queue>
#include<string>
using namespace std;
struct node{
int freq;
char data;
const node *child0, *child1;
node(char d, int f = -1){ //assign values in the node
data = d;
freq = f;
child0 = NULL;
child1 = NULL;
}
node(const node *c0, const node *c1){
data = 0;
freq = c0->freq + c1->freq;
child0=c0;
child1=c1;
}
bool operator<( const node &a ) const { //< operator performs to find priority in queue
return freq >a.freq;
}
void traverse(string code = "")const{
if(child0!=NULL){
child0->traverse(code+'0'); //add 0 with the code as left child
child1->traverse(code+'1'); //add 1 with the code as right child
}else{
cout << "Data: " << data<< ", Frequency: "<<freq << ", Code: " << code<<endl;
}
}
};
void huffmanCoding(string str){
priority_queue<node> qu;
int frequency[256];
for(int i = 0; i<256; i++)
frequency[i] = 0; //clear all frequency
for(int i = 0; i<str.size(); i++){
frequency[int(str[i])]++; //increase frequency
}
for(int i = 0; i<256; i++){
if(frequency[i]){
qu.push(node(i, frequency[i]));
}
}
while(qu.size() >1){
node *c0 = new node(qu.top()); //get left child and remove from queue
qu.pop();
node *c1 = new node(qu.top()); //get right child and remove from queue
qu.pop();
qu.push(node(c0, c1)); //add freq of two child and add again in the queue
}
cout << "The Huffman Code: "<<endl;
qu.top().traverse(); //traverse the tree to get code
}
main(){
string str = "ACCEBFFFFAAXXBLKE"; //arbitray string to get frequency
huffmanCoding(str);
}輸出
The Huffman Code: Data: K, Frequency: 1, Code: 0000 Data: L, Frequency: 1, Code: 0001 Data: E, Frequency: 2, Code: 001 Data: F, Frequency: 4, Code: 01 Data: B, Frequency: 2, Code: 100 Data: C, Frequency: 2, Code: 101 Data: X, Frequency: 2, Code: 110 Data: A, Frequency: 3, Code: 111
廣告
資料結構
網路
關係資料庫管理系統
作業系統
Java
iOS
HTML
CSS
Android
Python
C 程式設計
C++
C#
MongoDB
MySQL
Javascript
PHP