新聞中心
這篇文章將為大家詳細(xì)講解有關(guān)java如何實(shí)現(xiàn)哈夫曼壓縮與解壓縮功能,小編覺(jué)得挺實(shí)用的,因此分享給大家做個(gè)參考,希望大家閱讀完這篇文章后可以有所收獲。

一哈夫曼樹(shù)以及文件壓縮原理:
1.哈夫曼樹(shù) :
給定N個(gè)權(quán)值作為N個(gè)葉子結(jié)點(diǎn),構(gòu)造一棵二叉樹(shù),若該樹(shù)的帶權(quán)路徑長(zhǎng)度達(dá)到最小,稱(chēng)這樣的二叉樹(shù)為最優(yōu)二叉樹(shù),也稱(chēng)為哈夫曼樹(shù)。哈夫曼樹(shù)是帶權(quán)路徑長(zhǎng)度短的樹(shù),權(quán)值較大的結(jié)點(diǎn)離根較近(頻率越高的結(jié)點(diǎn)離根越進(jìn))。
以 下數(shù)組為例,構(gòu)建哈夫曼樹(shù)
int a[] = {0,1,2,3,4,5,6,7,8}
我們可以發(fā)現(xiàn)以下規(guī)律
1:9個(gè)數(shù)構(gòu)成的哈夫曼樹(shù)一共有17個(gè)結(jié)點(diǎn),也就是可以n個(gè)數(shù)可以生產(chǎn)2*n-1個(gè)結(jié)點(diǎn)
2:數(shù)字越大的數(shù)離根節(jié)點(diǎn)越近,越小的數(shù)離根節(jié)點(diǎn)越近。
2.如何利用haffman編碼實(shí)現(xiàn)文件壓縮:
比如abc.txt文件中有以下字符:aaaabbbccde,
1.進(jìn)行字符統(tǒng)計(jì)
aaaabbbccde a : 4次b : 3次c : 2次d : 1次e : 1次
2.用統(tǒng)計(jì)結(jié)果構(gòu)建哈夫曼樹(shù)
3.用哈夫曼樹(shù)生成哈夫曼編碼(從根結(jié)點(diǎn)開(kāi)始,路徑左邊記為0,右邊記為1):
a的編碼:1b的編碼:01c的編碼:000d的編碼:0011e的編碼:0010
4.哈夫曼編碼代替字符,進(jìn)行壓縮。
源文件內(nèi)容為:aaaabbbccde將源文件用對(duì)應(yīng)的哈夫曼編碼(haffman code)替換,則有:11110101 01000000 00110010 (總共3個(gè)字節(jié))
由此可見(jiàn),源文件一共有11個(gè)字符,占11字節(jié)的內(nèi)存,但是經(jīng)過(guò)用haffman code替換之后,只占3個(gè)字節(jié),這樣就能達(dá)到壓縮的目的
二主要技術(shù)點(diǎn):
1.哈夫曼樹(shù)算法(哈夫曼壓縮的基本算法)
2.哈希算法(字符統(tǒng)計(jì)時(shí)候會(huì)用到,也可以直接用HashMap統(tǒng)計(jì))
3.位運(yùn)算(涉及到將指定位,置0或置1)
4.java文件操作,以及緩沖操作。
5.存儲(chǔ)模式(大端存儲(chǔ),小端存儲(chǔ),能看懂文件16進(jìn)制的形式)
7.設(shè)置壓縮密碼,解壓輸入密碼解壓(小編自己加的內(nèi)容)
三實(shí)現(xiàn)過(guò)程:
以上述aaaabbbccde為例
1.字符統(tǒng)計(jì):
public class FreqHuf {public static int BUFFER_SIZE = 1 << 18;int freq[] = new int[256];File file;int count;List
//統(tǒng)計(jì)每個(gè)字符和其頻率的類(lèi)public class HuffmanFreq {char character;int freq;HuffmanFreq() {}HuffmanFreq(int character,int freq) {this.character = (char)character;this.freq = freq;} char getCharacter() {return character;} void setCharacter(int character) {this.character = (char)character;} int getFreq() {return freq;} void setFreq(int freq) {this.freq = freq;}byte[] infoToByte(){byte[] bt = new byte[6];byte[] b1 = ByteAnd8Types.charToByte(character);for(int i= 0; i < b1.length;i++){bt[i] = b1[i];}byte[] b2 = ByteAnd8Types.intToBytes2(freq);int index = 2;for(int i= 0; i < b2.length;i++){bt[index++] = b2[i];}return bt;} @Overridepublic String toString() {return "Huffman [character=" + character + ", freq=" + freq + "]";}}
2.用統(tǒng)計(jì)結(jié)果構(gòu)建哈夫曼樹(shù):
//treeSize為總節(jié)點(diǎn)數(shù)private void creatTree(int treeSize){int temp;treeList = new ArrayList
3.用哈夫曼樹(shù)生成哈夫曼編碼(從根結(jié)點(diǎn)開(kāi)始,路徑左邊記為0,右邊記為1):
private void bulidhuftreecode(int root, String str){if(treeList.get(root).getLeft() != -1 && treeList.get(root).getRight() != -1){bulidhuftreecode(treeList.get(root).getLeft(), str+"0");bulidhuftreecode(treeList.get(root).getRight(), str + "1");}else{treeList.get(root).code = str;}}
4.哈夫曼編碼代替字符,進(jìn)行壓縮,壓縮前首先要將文件頭(文件標(biāo)志,字符數(shù)量,最后一個(gè)字節(jié)有效位,密碼)字符和其頻率的那張表格寫(xiě)入文件,以便于解壓縮
public void creatCodeFile(String path) throws Exception{byte value = 0;int index = 0;int arr[] = new int[256];int intchar;for(int i = 0; i < charCount; i++){arr[treeList.get(i).freq.character] = i;}File file = new File(path); if(!file.exists()){ if(!file.createNewFile()){ throw new Exception("創(chuàng)建文件失敗"); } }int count = charList.size();HuffmanHead head = new HuffmanHead(count, howlongchar(count), password); //將文件頭信息寫(xiě)入文件this.write = new RandomAccessFile(file, "rw");write.write(head.InfoToByte()); //將字符及其頻率的表寫(xiě)入文件for(HuffmanFreq freq : charList){byte[] bt = freq.infoToByte();write.write(bt);}//將字符用哈夫曼編碼進(jìn)行壓縮,這里讀寫(xiě)都是采用緩存機(jī)制byte[] readBuffer = new byte[BUFFER_SIZE];while((intchar = read.read(readBuffer))!= -1){ProgressBar.SetCurrent(read.getFilePointer());for(int i = 0; i < intchar;i++){int temp = readBuffer[i]& 0xff; String code = treeList.get(arr[temp]).code;char[] chars = code.toCharArray();for(int j = 0; j < chars.length; j++){if(chars[j] == '0'){value = CLR_BYTE(value, index);}if(chars[j] == '1'){value = SET_BYTE(value, index);}if(++index >= 8){index = 0;writeInBuffer(value);}}}}//此方法速度較慢//while((intchar = is.read()) != -1){//String code = treeList.get(arr[intchar]).code;//char[] chars = code.toCharArray();////for(int i = 0; i < chars.length; i++){//if(chars[i] == '0'){//value = CLR_BYTE(value, index);//}//if(chars[i] == '1'){//value = SET_BYTE(value, index);//}//if(++index >= 8){//index = 0;//oos.write(value);//}//}//}if(index != 0){writeInBuffer(value);} byte[] Data = Arrays.copyOfRange(writeBuffer, 0, writeBufferSize); write.write(Data); write.close();read.close();} //指定位,置1 byte SET_BYTE(byte value, int index){return (value) |= (1 << ((index) ^ 7));} //指定位,置0byte CLR_BYTE(byte value, int index){ return (value) &= (~(1 << ((index) ^ 7)));} //判斷指定位是否為0,0為false,1為trueboolean GET_BYTE(byte value, int index){ return ((value) & (1 << ((index) ^ 7))) != 0;}
如果一個(gè)字節(jié)一個(gè)字節(jié)往文件里寫(xiě),速度會(huì)極慢,為了提高效率,寫(xiě)也采用緩存,先寫(xiě)到緩存區(qū),緩存區(qū)滿(mǎn)了后寫(xiě)入文件,
private void writeInBuffer(byte value) throws Exception {if(writeBufferSize < BUFFER_SIZE){writeBuffer[writeBufferSize] = value;if(++writeBufferSize >= BUFFER_SIZE){write.write(writeBuffer);writeBufferSize = 0;}} else{throw new Exception("寫(xiě)入文件出錯(cuò)");}}
到這里壓縮就完成了,以下為解壓縮方法
1.從寫(xiě)入文件中的字符統(tǒng)計(jì)的表讀出放入list里
public void init() throws Exception{char isHUf = read.readChar(); //驗(yàn)證文件頭信息if(isHUf != '哈'){throw new Exception("該文件不是HUFFMAN壓縮文件");}this.charCount = read.readChar();this.treeSize = 2*charCount -1;this.lastIndex = read.readChar();int password = read.readInt();if(password != this.password.hashCode()){System.out.println("密碼錯(cuò)誤");} else{System.out.println("密碼正確,正在解壓");} //從文件中將字符統(tǒng)計(jì)的表讀出byte[] buffer = new byte[charCount * 6];read.seek(10);read.read(buffer, 0, charCount * 6);ProgressBar.SetCurrent(read.getFilePointer());for(int i = 0; i < buffer.length; i+=6){byte[] buff = Arrays.copyOfRange(buffer, i, i+2);ByteBuffer bb = ByteBuffer.allocate (buff.length); bb.put (buff); bb.flip (); CharBuffer cb = cs.decode (bb); byte[] buff1 = Arrays.copyOfRange(buffer, i+2, i+6); int size = ByteAnd8Types.bytesToInt2(buff1, 0); HuffmanFreq freq = new HuffmanFreq(cb.array()[0], size); charList.add(freq);}}
2.用統(tǒng)計(jì)結(jié)果構(gòu)建哈夫曼樹(shù)(和以上代碼一樣)
3.用哈夫曼樹(shù)生成哈夫曼編碼(從根結(jié)點(diǎn)開(kāi)始,路徑左邊記為0,右邊記為1)(和以上代碼一樣)
4.遍歷文件每個(gè)字節(jié),根據(jù)哈夫曼編碼找到對(duì)應(yīng)的字符,將字符寫(xiě)入新文件
public void creatsourcefile(String pathname) throws Exception{int root = treeList.size() - 1;int fininsh = 1;long len;File file = new File(pathname);if(!file.exists()){ if(!file.createNewFile()){ throw new Exception("創(chuàng)建文件失敗"); }}write = new RandomAccessFile(file, "rw");int intchar;byte[] bytes = new byte[1<<18];int index = 0;while((intchar = read.read(bytes))!= -1){len = read.getFilePointer();ProgressBar.SetCurrent(len);for(int i = 0; i < intchar;i++){for(;index < 8 && fininsh != 0;){if(GET_BYTE(bytes[i], index)){root = treeList.get(root).right;} else{root = treeList.get(root).left;}if(treeList.get(root).right== -1 && treeList.get(root).left == -1){byte temp = (byte)treeList.get(root).freq.character;writeInBuffer(temp);root = treeList.size() - 1;}index++;if(len == this.goalfilelenth && i == intchar-1){if(index >= this.lastIndex){fininsh = 0;}}}index = 0;}}byte[] Data = Arrays.copyOfRange(writeBuffer, 0, writeBufferSize);write.write(Data);write.close();write.close();read.close();}
關(guān)于“java如何實(shí)現(xiàn)哈夫曼壓縮與解壓縮功能”這篇文章就分享到這里了,希望以上內(nèi)容可以對(duì)大家有一定的幫助,使各位可以學(xué)到更多知識(shí),如果覺(jué)得文章不錯(cuò),請(qǐng)把它分享出去讓更多的人看到。
名稱(chēng)欄目:java如何實(shí)現(xiàn)哈夫曼壓縮與解壓縮功能-創(chuàng)新互聯(lián)
當(dāng)前URL:http://www.dlmjj.cn/article/gpipi.html


咨詢(xún)
建站咨詢(xún)
