Resources » .NET programming » .NET Framework

Compression using LZW method


Posted Date: 18-Jan-2005  Last Updated:   Category: .NET Framework    
Author: Member Level: Bronze    Points: 10


Compression using LZW method in .NET. How to compress file using .NET. Explanation of Lempel, Ziv, Welch method. What is LZW Compression Technique. Introduction on LZW. Explain LZW with simple example. LZW technique for beginners. Tutorials on LZW method.



Introduction


LZW is one of the data compression method developed by Lempel, Ziv and Welch. The below is its algorithm explained in simple plain word :

Compression


Compression
Given an uncompressed string, text(q)text(p) where text(q) is the longest prefix of the uncompressed string found in dictionary and text(p) is the suffix of the uncompressed string.

To begin compress the string,
replaced the text(q) with q code where the dictionary key for q code must already be defined.
add new code inside the dictionary with key text(q)fc(p)where fc(p)represents the first character of the uncompressed string.
It sounds complicated. Let's take a look on the below example which will make your mind clear.
Consider the below uncompressed string and its predefined dictionary.

PPPQPPQQQ
Code 0 1
Key P Q


Here they go by using LZW compression method...

0 PPQPPQQQ - New Key "PP" was added since the 1st compressed P is followed by the 2nd P
Code 0 1 2
Key P Q PP

0 2 QPPQQQ - New Key "PPQ" was added since the 2nd compressed PP is followed by the 4th Q
Code 0 1 2 3
Key P Q PP PPQ


0 2 1 PPQQQ - New Key "QP" was added since the 4th compressed Q is followed the 5th P
Code 0 1 2 3 4
Key P Q PP PPQ QP


0 2 1 3 QQ - New Key "PPQQ" was added since the 5rd compressed PPQ is followed the 8th Q
Code 0 1 2 3 4 5
Key P Q PP PPQ QP PPQQ


0 2 1 3 1 Q - New Key "QQ" was added since the 8th compressed Q is followed the 9th Q
Code 0 1 2 3 4 5 6
Key P Q PP PPQ QP PPQQ QQ

Finally, the whole string PPPQPPQQQ is compressed to 0 2 1 3 1 1
Originally, the size of the string PPPQPPQQQ is 9 x 8 bits (size of ASCII) = 72 bits.

For the compressed string, if we define the code size as 4 bits in the beginning (of course, the 4 bits is too small in the real world application but sufficient enough for our example), the size of the compressed string 0 2 1 3 1 1 will be 6 x 4 bits = 24 bits. The compressed ration is 3 which will make everyone happy on it !

It is important to note that the dictionary is built during the execution of the compressor. It will be removed from the memory after the compressor finish its execution. Only 0 2 13 1 1 will be written on the disk and not the dictionary.

Decompression


To decompress the compressed the string, consider the qp code.

Case 1 : P code is found in dictionary. Replaced p with text(p). Add new code into dictionary based on key text(q)fc(p)
Case 2 : P code is not found in dictionary. Replaced p with text(q)fc(q). Add new code into dictionary based on key text(q)fc(q)
Consider the below code and its predefined dictionary.
0 2 13 1 1
Code 0 1
Key P Q

Here they go by using LZW decompression method...


It is noted that the first code will always defined in the dictionary. Thus, we will get
P 2 13 1 1 - Since this is the first code that we decompress. We will just left it without adding any new code to dictionary.
Code 0 1
Key P Q


PPP 1 3 1 1 - Since 2 is not found in dictionary and it is case 2. We will replace 2 with text(0)fc(0) which is equivalent to PP. New code added into dictionary based on key text(0)fc(0) which is equivalent to PP.
Code 0 1 2
Key P Q PP


PPPQ 3 1 1 - Since 1 is found in dictionary and it is case 1. We will replace 1text(1)which is equivalent to Q. New code added into dictionary based on key text(2)fc(1) which is equivalent to PPQ.
Code 0 1 2 3
Key P Q PP PPQ


PPPQPPQ 1 1 - Since 3 is found in dictionary and it is case 1. We will replace 3text(3)which is equivalent to PPQ. New code added into dictionary based on key text(1)fc(3) which is equivalent to QP.
Code 0 1 2 3 4
Key P Q PP PPQ QP



PPPQPPQQ 1 - Since 1 is found in dictionary and it is case 1. We will replace 1text(1)which is equivalent to Q. New code added into dictionary based on key text(3)fc(1) which is equivalent to PPQQ.
Code 0 1 2 3 4 5
Key P Q PP PPQ QP PPQQ



PPPQPPQQQ - Since 1 is found in dictionary and it is case 1. We will replace 1text(1)which is equivalent to Q. New code added into dictionary based on key text(1)fc(1) which is equivalent to QQ.
Code 0 1 2 3 4 5 6
Key P Q PP PPQ QP PPQQ QQ


Compress.cs



using System;

public class Compress
{
private static System.String[] Files
{
set
{
System.String inputFile, outputFile;
if (value.Length >= 1)
{
inputFile = value[0];
in_Renamed = new System.IO.BufferedStream(new System.IO.FileStream(inputFile, System.IO.FileMode.Open, System.IO.FileAccess.Read));
outputFile = inputFile + ".zzz";
out_Renamed = new System.IO.BufferedStream(new System.IO.FileStream(outputFile, System.IO.FileMode.Create));
}
else
{
System.Console.Out.Write("usage:java Compress ");
System.Environment.Exit(1);
}
}

}
internal const int MAX_CODES = 4096;
internal const int BYTE_SIZE = 8;
internal const int EXCESS = 4;
internal const int ALPHA = 256;
internal const int MASK1 = 255;
internal const int MASK2 = 15;

internal static int leftOver;
internal static bool bitsLeftOver;
internal static System.IO.BufferedStream in_Renamed;
internal static System.IO.BufferedStream out_Renamed;

private static void output(int pcode)
{
int c, d;
if (bitsLeftOver)
{
d = pcode & MASK1;
c = (leftOver << EXCESS) + (pcode >> BYTE_SIZE);
out_Renamed.WriteByte((System.Byte) c);
out_Renamed.WriteByte((System.Byte) d);
bitsLeftOver = false;
}
else
{
leftOver = pcode & MASK2;
c = pcode >> EXCESS;
out_Renamed.WriteByte((System.Byte) c);
bitsLeftOver = true;
}
}

private static void compress()
{
System.Collections.Hashtable table = new System.Collections.Hashtable();
for (int i = 0; i < ALPHA; i++)
SupportClass.PutElement(table, i, i);

int codeUsed = ALPHA;

int c = in_Renamed.ReadByte();
if (c != - 1)
{
int pcode = c;
c = in_Renamed.ReadByte();
while (c != - 1)
{
int k = (pcode << BYTE_SIZE) + c;
System.Int32 e = (System.Int32) table[k];
if ((System.Object) e == null)
{
output(pcode);
if (codeUsed < MAX_CODES)
SupportClass.PutElement(table, (pcode << BYTE_SIZE) + c, codeUsed++);
pcode = c;
}
else
pcode = e;
c = in_Renamed.ReadByte();
}

output(pcode);
if (bitsLeftOver)
out_Renamed.WriteByte((System.Byte) (leftOver << EXCESS));
}
//UPGRADE_TODO: Method 'java.io.BufferedInputStream.close' was not converted. 'ms-help://MS.VSCC.2003/commoner/redir/redirect.htm?keyword="jlca1095"'
//System.IO.BufferedStream.Null ;
in_Renamed.Close();
out_Renamed.Close();
}

[STAThread]
public static void Main(System.String[] args)
{
Files = args;
compress();
}
}


Decompress.cs



using System;

public class Decompress
{
private static System.String[] Files
{
set
{
System.String inputFile, outputFile;
if (value.Length >= 1)
{
inputFile = value[0];
if (!inputFile.EndsWith(".zzz"))
{
System.Console.Out.WriteLine("The filename must end with \"zzz\" extension");
System.Environment.Exit(1);
}
in_Renamed = new System.IO.BufferedStream(new System.IO.FileStream(inputFile, System.IO.FileMode.Open, System.IO.FileAccess.Read));
outputFile = inputFile.Substring(0, (inputFile.Length - 4) - (0));
out_Renamed = new System.IO.BufferedStream(new System.IO.FileStream(outputFile, System.IO.FileMode.Create));
}
else
{
System.Console.Out.Write("usage:java Decompress ");
System.Environment.Exit(1);
}
}

}
private static int Code
{
get
{
int c = in_Renamed.ReadByte();
if (c == - 1)
return - 1;

int code;
if (bitsLeftOver)
code = (leftOver << BYTE_SIZE) + c;
else
{
int d = in_Renamed.ReadByte();
code = (c << EXCESS) + (d >> EXCESS);
leftOver = d & MASK;
}
bitsLeftOver = !bitsLeftOver;
return code;
}

}

internal const int MAX_CODES = 4096;
internal const int BYTE_SIZE = 8;
internal const int EXCESS = 4;
internal const int ALPHA = 256;
internal const int MASK = 15;

internal static int[] s;
internal static int size;
internal static Element[] h;
internal static int leftOver;
internal static bool bitsLeftOver;
internal static System.IO.BufferedStream in_Renamed;
internal static System.IO.BufferedStream out_Renamed;

private static void output(int codes)
{
size = - 1;
while (codes >= ALPHA)
{
s[++size] = h[codes].suffix;
codes = h[codes].prefix;
}
s[++size] = codes;
for (int i = size; i >= 0; i--)
out_Renamed.WriteByte((System.Byte) s[i]);
}

private static void decompress()
{
int codeUsed = ALPHA;
s = new int[MAX_CODES];
h = new Element[MAX_CODES];

int pcode = Code, ccode;
if (pcode >= 0)
{
s[0] = pcode;
out_Renamed.WriteByte((System.Byte) s[0]);
size = 0;

do
{
ccode = Code;
if (ccode < 0)
break;
if (ccode < codeUsed)
{
output(ccode);
if (codeUsed < MAX_CODES)
h[codeUsed++] = new Element(pcode, s[size]);
}
else
{
h[codeUsed++] = new Element(pcode, s[size]);
output(ccode);
}
pcode = ccode;
}
while (true);
}
out_Renamed.Close();
//UPGRADE_TODO: Method 'java.io.BufferedInputStream.close' was not converted. 'ms-help://MS.VSCC.2003/commoner/redir/redirect.htm?keyword="jlca1095"'
in_Renamed.Close();
}

[STAThread]
public static void Main(System.String[] args)
{
Files = args;
decompress();
}
}


SupportClass.cs



//
// In order to convert some functionality to Visual C#, the Java Language Conversion Assistant
// creates "support classes" that duplicate the original functionality.
//
// Support classes replicate the functionality of the original code, but in some cases they are
// substantially different architecturally. Although every effort is made to preserve the
// original architecture of the application in the converted project, the user should be aware that
// the primary goal of these support classes is to replicate functionality, and that at times
// the architecture of the resulting solution may differ somewhat.
//

using System;

///
/// Contains conversion support elements such as classes, interfaces and static methods.
///

public class SupportClass
{
///
/// Adds a new key-and-value pair into the hash table
///

/// The collection to work with
/// Key used to obtain the value
/// Value asociated with the key
/// The old element associated with the key
public static System.Object PutElement(System.Collections.IDictionary collection, System.Object key, System.Object newValue)
{
System.Object element = collection[key];
collection[key] = newValue;
return element;
}

}



Element.cs



using System;
class Element
{
internal int prefix;
internal int suffix;

public Element(int prefix, int suffix)
{
this.prefix = prefix;
this.suffix = suffix;
}
}


Did you like this resource? Share it with your friends and show your love!

Responses to "Compression using LZW method"
Author: Ramon Hernandez    05 Jul 2005Member Level: Bronze   Points : 0
Hello
I have not been able to compile Decompress.cs. The code must be incomplete or with some error.
It could say to me if there is some difference between algiritmo Lempel-Ziv (LZ) and the Lempel, Ziv, Welch (LZW)
Thank you very much



Author: Suteu Cezar    19 Jan 2009Member Level: Bronze   Points : 2
Indeed Decompress.cs is flawed. There is a small mistake due to the formatting of the frames.

s[++size] = h[ code ].suffix;
code = h[ code ].prefix;

Also System.Int32 e = (System.Int32) table[k]; might not work, i have a workaround just use this block of code instead:

if (!table.ContainsKey(k))
{
output(pcode);
if (codeUsed < MAX_CODES)
SupportClass.PutElement(table, (pcode << BYTE_SIZE) + c, codeUsed++);
pcode = c;
}
else
{
System.Int32 e = (System.Int32)table[k];
pcode = e;
}

Great solid code otherwise!



Author: Raj Krishna    16 Apr 2009Member Level: Gold   Points : 0
http://rajkrishnaj.blogspot.com/


Feedbacks      

Post Comment:




  • Do not include your name, "with regards" etc in the comment. Write detailed comment, relevant to the topic.
  • No HTML formatting and links to other web sites are allowed.
  • This is a strictly moderated site. Absolutely no spam allowed.
  • Name:   Sign In to fill automatically.
    Email: (Will not be published, but required to validate comment)



    Type the numbers and letters shown on the left.


    Submit Article     Return to Article Index

    Subscribe to Subscribers
    Active Members
    TodayLast 7 Daysmore...

    Awards & Gifts
    Talk to Webmaster Tony John
    Copyright © SpiderWorks Technologies Pvt Ltd., Kochi, India