Data Compression

Views:
 
Category: Entertainment
     
 

Presentation Description

No description available.

Comments

By: SardarBilal (35 month(s) ago)

Thanks every body. if any one need presentation about data compression or any other topic related to database and data mining then email me plzzzzzzzzz (smbilal_84@yahoo.com).

By: baba1r1 (35 month(s) ago)

Aoa..Hi....please give me permission to download it......... or please mail it to me on my mail id babarshazad06@gmail.com.com thanku

By: indu_sharma (37 month(s) ago)

hey..Hi....please give me permission to download it......... or please mail it to me on my mail id frnd_200827@yahoo.com thanku

By: ajmeera (37 month(s) ago)

its very easy

By: sughoshbs (42 month(s) ago)

hi,It is very easy to understand ,please give the permission to download this file or else mail to me my mail-id.:-sughosh26@gmail.com............ thank u

See all

Presentation Transcript

Data Compression : 

12/4/2009 Department of Computer Science, UOP 1 Data Compression Bilal Khan

Data Compression : 

12/4/2009 Department of Computer Science, UOP 2 Data Compression Definition It is the process of reducing the amount of data required to represent a given quantity of information.

Data Compression : 

12/4/2009 Department of Computer Science, UOP 3 Data Compression If it contains data that either provide no relevant information or simply which is already known then it is said to contain data redundancy. Data compression is achieved when redundancies are reduced or eliminated.

Data Compression : 

12/4/2009 Department of Computer Science, UOP 4 Data Compression Redundancy Adjacent audio samples are similar. In digital image, neighboring samples on a scanning line are normally similar. In digital video, in addition to spatial redundancy, neighboring images in a video sequence may be similar.

Data Compression : 

12/4/2009 Department of Computer Science, UOP 5 Data Compression Objectives To reduce the volume of data to be transmitted (text, fax, images). Reduce storage space required. Reduce length of data transmission time over the network.

Categories of Data Compression : 

12/4/2009 Department of Computer Science, UOP 6 Categories of Data Compression Lossless Compression If the compressed data is reconstructed and the original data is obtain without any loss of information then this technique is called Lossless compression. Popular algorithms: LZW(Lempel-Ziv-Welch), RLE(Run Length Encoding), Huffman coding, Arithmetic Coding, Delta Encoding.

Categories of Data Compression : 

12/4/2009 Department of Computer Science, UOP 7 Categories of Data Compression Lossy Compression If the compressed data is reconstructed and the approximation of original data is obtain and loss of information occur then this technique is called Lossy compression. The original message can never be recovered exactly as it was before it was compressed.

Categories of Data Compression : 

12/4/2009 Department of Computer Science, UOP 8 Categories of Data Compression Original Data Compressed Data lossless Original Data Approximated lossy

Data Compression : 

12/4/2009 Department of Computer Science, UOP 9 Data Compression Applications Data Compression can be used in every where. Image Compression (JPG Images) Audio Compression (MP3’s Audio) Video Compression (DVD’s) General Data Compression (Zip File)

Data Compression Algorithm : 

12/4/2009 Department of Computer Science, UOP 10 Data Compression Algorithm Lossless Data Huffman encoding Run Length Encoding Lempel-Ziv-Welch Encoding

Data Compression Algorithm : 

12/4/2009 Department of Computer Science, UOP 11 Data Compression Algorithm Huffman Encoding ASCII encoding is the example of fix length encoding, where each character is representing the same number of bits. Variable length encodings used to reduce the number of bits, where different length for different character.

Data Compression Algorithm : 

12/4/2009 Department of Computer Science, UOP 12 Data Compression Algorithm Huffman Encoding Example Text: Java Encoding: a = “0”, j = “11”, v= “10” Encoded text: 110100 (6 bit) Decode Problem Encoding: a = “0”, j = “01”, v= “00” Encoded text: 010000 (6 bits) Could be java, jvv, jaaaa

Data Compression Algorithm : 

12/4/2009 Department of Computer Science, UOP 13 Data Compression Algorithm Huffman Encoding To prevent ambiguities in decoding, we require that the encoding satisfies the prefix rules: no code is prefix to other a = “0”, j = “11”, v = “10” Satisfied the rule a = “0”, j = “11”, v = “10” does not Satisfied the rule

Data Compression Algorithm : 

12/4/2009 Department of Computer Science, UOP 14 Data Compression Algorithm Huffman Encoding Example A = 010 B =11 C = 00 D = 10 R = 011 C D B A R 0 0 1 1 0 1 0 1

Slide 15: 

12/4/2009 Department of Computer Science, UOP 15 Huffman Encoding Example of Decoded Encoded Text: 01011011010000101001011011010 Text: ABRACADABRA

Slide 16: 

12/4/2009 Department of Computer Science, UOP 16 Huffman Encoding: Text: ABRACADABRA Character Frequencies: A B R C D 5 2 2 1 1 5 2 2 1 1 2 5 2 2 1 1 2 4 D C

Huffman Encoding : 

12/4/2009 Department of Computer Science, UOP 17 Huffman Encoding Character Frequencies: A B R C D 2 2 1 1 2 4 D C 5 2 2 1 1 2 4 D C 5 6 2 2 1 1 2 4 D C 5 6 11 R B A 0 1 1 1 1 0 0 0

Data Compression Algorithm : 

12/4/2009 Department of Computer Science, UOP 18 Data Compression Algorithm Run Length Encoding Repeated occurrence of the same character is called a run Number of repetition is called the length of the run. It is called run-length because a run is made for repeated bits and coded in lesser bits by only stating how many bits were there.

Data Compression Algorithm : 

12/4/2009 Department of Computer Science, UOP 19 Data Compression Algorithm Run Length Encoding Example A file with 0 as repeating character Two characters in the compressed file replace each run of zeros. For the first 3 repeating 0’s in original file, the first encoded stream In compressed file is showing that ’0’ was repeating ’3’ times.

Data Compression Algorithm : 

12/4/2009 Department of Computer Science, UOP 20 Data Compression Algorithm Run Length Encoding Original Data Stream: 8 54 0 0 0 0 97 5 16 0 45 0 0 0 0 0 0 67 0 0 8 8 54 0 3 97 5 16 0 1 45 23 0 6 67 0 2 8 Run Length Encoded

Data Compression Algorithm : 

12/4/2009 Department of Computer Science, UOP 21 Data Compression Algorithm Lempel-Ziv-Welch (LZW) Encoding Uses a dictionary or code table. Done by constructing a "dictionary" of words or parts of words in a message, and then using pointers to the words in the dictionary. LZW to compress text, executable code, and similar data files to about one-half their original size.

Data Compression Algorithm : 

12/4/2009 Department of Computer Science, UOP 22 Data Compression Algorithm Lempel-Ziv-Welch (LZW) Encoding Example The rain in Spain fall mainly on the plain The string "ain" can be stored in the dictionary and then pointed to when it repeats.

Data Compression Algorithm : 

12/4/2009 Department of Computer Science, UOP 23 Data Compression Algorithm Lempel-Ziv-Welch (LZW) Encoding Example

Wavelet Transform : 

12/4/2009 Department of Computer Science, UOP 24 Wavelet Transform It is a lossy data compression technique. This technique is applied when one data vector D is transform to an other vector D’ of the same size. Compressed approximation: store only a small fraction of the strongest of the wavelet coefficients Similar to discrete Fourier transform (DFT), but better lossy compression, localized in space.

Wavelet Transform : 

12/4/2009 Department of Computer Science, UOP 25 Wavelet Transform Length L, must be an integer power of 2 Each transform has 2 functions: smoothing, difference Applies to pairs of data, resulting in two set of data of length L/2 Applies two functions recursively, until reaches the desired length

1D 4-pixel Image : 

12/4/2009 Department of Computer Science, UOP 26 1D 4-pixel Image [ 9 7 3 5] (9 + 7)/2 (3 + 5)/2 [ 8 4 ] (9 - 7)/2 (3 - 5)/2 [ 1 -1 ] Average (smoothing) Detail coefficients (edge detection) (8 + 4)/2 6 (8 – 4)/2 2 [ 6 2 1 -1 ] WT

Wavelet Application : 

12/4/2009 Department of Computer Science, UOP 27 Wavelet Application Fingerprint Record database The FBI digitizing the fingerprint database A single fingerprint card contain If 200 million cards then size of data is 10 MB of data! 2,000 terabytes!

Wavelet Application : 

12/4/2009 Department of Computer Science, UOP 28 Wavelet Application Fingerprint Record database The FBI decided to adopt a wavelet-based image coding algorithm as a national standard for digitized fingerprint records.

Image compression steps : 

12/4/2009 Department of Computer Science, UOP 29 Image compression steps Original image Source encoder linear transform to decorrelate the image data (lossless) Quantization of basis functions coefficients (lossy) Entropy Coding of the resulting quantized values(lossless) Compressed image (reconstructed) (inverse T) (dequantization) (decoding)

Image compression steps : 

12/4/2009 Department of Computer Science, UOP 30 Image compression steps Linear transformation We change the coordinate system in which we represent a signal in order to make it much better suited for processing (compression). We should be able to represent all the useful signal features and important phenomena in as compact manner as possible.

Image compression steps : 

12/4/2009 Department of Computer Science, UOP 31 Image compression steps Quantization A quantizer simply reduces the number of bits needed to store the transformed coefficients by reducing the precision of those values. It is a lossy process.

Image compression steps : 

12/4/2009 Department of Computer Science, UOP 32 Image compression steps Entropy encoding and decoding Once the quantization process is completed, the last encoding step is to use entropy coding to achieve the entropy rate of quantizer.

Image compression steps : 

12/4/2009 Department of Computer Science, UOP 33 Image compression steps Fingerprint Compression Result Original image 541832 bites 768x768 Reconstructed image (compressed file size 32702 bit)

Slide 34: 

12/4/2009 Department of Computer Science, UOP 34 THANKS