MPEG-1 Compression :MPEG-1 Compression
Ubiquitous Use of Digital Video :Ubiquitous Use of Digital Video Application domains
video conferencing, home video production, multimedia e-mail, electronic art, and more
Main obstacle is the raw size of digital video and the lack of a standard compression format
e.g., 640 x 480 video, 24 bits per pixel, 30 fps requires about 27.6 MB/sec (roughly 221 Mb/sec), or about 100 GB/hr
Many compression schemes, but a standard is needed to facilitate adoption, interoperability
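The raw-size figure above can be checked with a short calculation. This is a minimal sketch; the function name `raw_video_rate` is made up for illustration, and the result is in bytes (megabytes per second, gigabytes per hour).

```python
def raw_video_rate(width, height, bits_per_pixel, fps):
    """Return (bytes per second, bytes per hour) for uncompressed video."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    bytes_per_second = bytes_per_frame * fps
    return bytes_per_second, bytes_per_second * 3600

per_sec, per_hour = raw_video_rate(640, 480, 24, 30)
print(f"{per_sec / 1e6:.1f} MB/s, {per_hour / 1e9:.1f} GB/hr")
```

At 1.5 Mb/s, a compressed stream carries well under 1% of this raw rate.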
Requirements :Requirements Allow balance of compression and quality
Apply to almost any digital video content
Tractable computational complexity
Robustness to errors
Synchronization
Support user interactions within a stream
editing, search, access
MPEG History :MPEG History Began in 1988
grew from 15 to 150 participants
17 companies; ATT, NEC, Intel, IBM, Sony, …
MPEG and JPEG part of same ISO group
MPEG was thought of as JPEG, but with removal of temporal redundancy
Wanted good quality at 1.5 Mb/s (CD data rate)
driving focus of the standard
MPEG-1 Components :MPEG-1 Components Video: describes compression of frames
Audio: describes compression of audio frames
System: describes synchronization and multiplexing
Supports several aspect ratios
1:1 (CRT) ; 4:3 (NTSC); 16:9 (HDTV)
And refresh frequencies
23.976, 24, 25, 29.97, 50, 59.94, 60 Hz
Compression Techniques :Compression Techniques Video is a temporal series of still images
Spatial compression
transform blocks to frequency domain and remove high-frequency detail
Temporal compression
predict motion between frames and encode pointers to previous and future frames
provides majority of the compression
Temporal Compression Insight :Temporal Compression Insight [Figure: four panels, (a) through (d)]
Three Types of Frames :Three Types of Frames Intra frames (same as JPEG)
typically about 12 frames between I frames
Predictive frames
encode from previous I or P reference frame
Bi-directional frames
encode from previous and future I or P frames
[Figure: example sequence of I, P, and B frames]
Frame Order :Frame Order Prediction causes dependency issues
B-frames depend on future frames
Decode frames out of order
[Figure: two orderings of frames I1, I2, P1 to P3, and B1 to B8]
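One way to picture the reordering: a B frame cannot be decoded until the reference frame that follows it in display order has been decoded. The sketch below is an illustration only; the function `decode_order` and the sample frame labels are assumptions, not part of the MPEG-1 specification.

```python
def decode_order(display_order):
    """Reorder frames so each B frame follows both of its reference frames."""
    out, pending_b = [], []
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)   # hold until the next reference arrives
        else:
            out.append(frame)         # I or P frame: emit it first
            out.extend(pending_b)     # then the B frames that needed it
            pending_b = []
    return out + pending_b            # trailing B frames have no later reference

display = ["I1", "B1", "B2", "P1", "B3", "B4", "P2"]
print(decode_order(display))
```

The decoder then has to buffer decoded reference frames and present everything in display order again.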
Selecting I, P, or B Frames :Selecting I, P, or B Frames Heuristics
change of scenes should generate I frame
limit B and P frames between I frames
B frames are computationally intense
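These heuristics can be sketched as a simple frame-type scheduler. Everything here is an assumption for illustration: the function name, the default of 12 frames per group (taken from the "about 12 frames between I frames" rule of thumb), and the choice of two B frames between references.

```python
def assign_frame_types(num_frames, scene_changes=(), gop_size=12, b_run=2):
    """Return a string of 'I', 'P', 'B' types, one per frame in display order."""
    cuts = set(scene_changes)
    types, pos = [], 0
    for n in range(num_frames):
        # Start a new group at the first frame, at a scene change,
        # or once the group has reached its size limit.
        if n == 0 or n in cuts or pos >= gop_size:
            pos = 0
        if pos == 0:
            types.append("I")
        elif pos % (b_run + 1) == 0:
            types.append("P")
        else:
            types.append("B")
        pos += 1
    return "".join(types)

print(assign_frame_types(13))
print(assign_frame_types(8, scene_changes=(5,)))
```

A real encoder would also avoid B frames that predict across a scene cut; this sketch does not handle that case.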
Compression Steps :Compression Steps Prepare the image for compression
transform color space (YUV)
down-sample color components
partition into macro-blocks (16x16) and blocks (8x8)
Further steps take one of two paths:
2D DCT encoding, or
motion compensation
Transform Color Space :Transform Color Space MPEG requires YUV
same space as YIQ, but rotated by 33 degrees
[Figure: U-V plane at Y=0.5]
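A minimal sketch of the color transform, assuming normalized RGB inputs in the range 0 to 1 and the classic analog YUV weights. MPEG bitstreams store a scaled and offset variant (YCbCr); the function name `rgb_to_yuv` and the use of unscaled U and V here are simplifications for illustration.

```python
def rgb_to_yuv(r, g, b):
    """Convert normalized RGB (0..1) to YUV using the 601 luma weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                     # blue color difference
    v = 0.877 * (r - y)                     # red color difference
    return y, u, v

print(rgb_to_yuv(1.0, 1.0, 1.0))   # white: full luminance, no chrominance
print(rgb_to_yuv(1.0, 0.0, 0.0))   # pure red
```

Separating luminance from chrominance is what makes the next step possible: the eye is less sensitive to chrominance detail, so U and V can be stored at lower resolution.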
Down-sample :Down-sample MPEG optimized for 352 x 240 at 30 fps
derived from CCIR-601 digital television standard used in professional equipment
higher resolutions supported
Requires 4:2:0 down-sampling (chrominance halved both horizontally and vertically)
176 x 120 pixels in chrominance components
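Halving a chrominance plane in both directions, which is what takes 352 x 240 down to 176 x 120, can be sketched as a 2x2 average. The averaging filter and the function name `downsample_2x2` are assumptions; an encoder is free to use a different filter.

```python
def downsample_2x2(plane):
    """Average each 2x2 neighborhood, halving both width and height."""
    h, w = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, w - 1, 2)
        ]
        for y in range(0, h - 1, 2)
    ]

u_plane = [[0] * 352 for _ in range(240)]   # a blank full-resolution chroma plane
small = downsample_2x2(u_plane)
print(len(small[0]), "x", len(small))
```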
Blocks and Macro-blocks :Blocks and Macro-blocks Block
Y,U, and V cut into 8x8 pixel regions
Macro-block
In Y component, a macro-block refers to a 16x16 pixel region, or four 8x8 blocks
In U and V components, a macro-block covers one 8x8 block in each, or two chrominance blocks in total
Organized in row order; top to bottom
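The partitioning can be sketched as follows, assuming the chrominance planes are already at half resolution in both directions, so each macro-block yields four Y blocks plus one U and one V block. The generator name `macroblocks` and the list-of-lists plane layout are illustrative choices.

```python
def macroblocks(y_plane, u_plane, v_plane):
    """Yield (mb_row, mb_col, blocks) in row order, top to bottom.

    Each `blocks` list holds six 8x8 blocks: four luminance, then U, then V.
    """
    def block(plane, top, left):
        return [row[left:left + 8] for row in plane[top:top + 8]]

    for my in range(len(y_plane) // 16):
        for mx in range(len(y_plane[0]) // 16):
            ty, tx = 16 * my, 16 * mx
            luma = [block(y_plane, ty + dy, tx + dx)
                    for dy in (0, 8) for dx in (0, 8)]
            chroma = [block(u_plane, 8 * my, 8 * mx),
                      block(v_plane, 8 * my, 8 * mx)]
            yield my, mx, luma + chroma

y = [[0] * 352 for _ in range(240)]
u = [[0] * 176 for _ in range(120)]
v = [[0] * 176 for _ in range(120)]
print(sum(1 for _ in macroblocks(y, u, v)), "macro-blocks")
```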
DCT Encoding :DCT Encoding Applied to
each block in an I frame
each macro-block in a P or B frame that has no match in a reference frame
Identical to JPEG encoding
perform a DCT transform
quantize resulting coefficients
perform a zig-zag ordering
apply entropy encoding
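The four steps above can be sketched end to end for a single 8x8 block. This is a simplified illustration: the flat quantization table is an assumption rather than the MPEG intra default, and a plain run-length pass stands in for the real entropy coder. All function names are made up for the example.

```python
import math

N = 8  # blocks are 8x8

def dct_2d(block):
    """Naive 8x8 DCT-II; written for clarity, not speed."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = sum(
                block[x][y]
                * math.cos((2 * x + 1) * u * math.pi / 16)
                * math.cos((2 * y + 1) * v * math.pi / 16)
                for x in range(N) for y in range(N)
            )
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out

def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(N)]
            for u in range(N)]

def zigzag(block):
    """Read the block along anti-diagonals, low frequencies first."""
    order = sorted(
        ((u, v) for u in range(N) for v in range(N)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [block[u][v] for u, v in order]

def run_length(seq):
    """Collapse zero runs into (zeros_skipped, value) pairs, then end-of-block."""
    pairs, zeros = [], 0
    for value in seq:
        if value == 0:
            zeros += 1
        else:
            pairs.append((zeros, value))
            zeros = 0
    return pairs + ["EOB"]

flat_table = [[16] * N for _ in range(N)]   # assumed table for illustration
gray_block = [[128] * N for _ in range(N)]  # a uniform block
print(run_length(zigzag(quantize(dct_2d(gray_block), flat_table))))
```

A uniform block has energy only in the DC coefficient, so after zig-zag ordering the whole block collapses to one value followed by an end-of-block marker.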
Motion Compensation :Motion Compensation Assumes each object moves from frame to frame while maintaining the same color value
Provides most of MPEG’s compression
Asymmetric process: encoding costs far more than decoding
Requires search in both time and space
Evaluating if blocks match is difficult
Motion Compensation :Motion Compensation Point (x,y) in frame n with intensity I_n(x,y) corresponds to point (x',y') in frame n-1
I_n(x,y) = I_{n-1}(x',y')
Displacement (motion vector)
d = (dx,dy) = (x,y) - (x',y')
Extend idea to macro-blocks
Search Restrictions :Search Restrictions Space
P and B frames have no spatial restriction
Time
P restricted to previous I or P frame
B restricted to previous/subsequent I or P
Employ different algorithms
balance performance with quality
Exhaustive Search :Exhaustive Search An obvious, brute force solution
Computationally expensive
for a fixed search window, linear with respect to frame resolution; cost grows quadratically with the search range
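A brute-force search can be sketched as below, using the sum of absolute differences as the match error. The function names, the default window radius, and the sign convention (the returned offset points from the current block to its match in the reference frame) are assumptions for illustration.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a block of cur and a block of ref."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))

def exhaustive_search(cur, ref, cx, cy, size=16, radius=7):
    """Try every offset in the window; return (best_error, (dx, dy))."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= w - size and 0 <= ry <= h - size:
                err = sad(cur, ref, cx, cy, rx, ry, size)
                if best is None or err < best[0]:
                    best = (err, (dx, dy))
    return best

# Synthetic test frames: cur is ref with its content displaced.
ref = [[(3 * x * x + 5 * y * y + x * y) % 256 for x in range(32)] for y in range(32)]
cur = [[ref[y + 2][x + 3] if y + 2 < 32 and x + 3 < 32 else 0
        for x in range(32)] for y in range(32)]
print(exhaustive_search(cur, ref, 8, 8, size=8, radius=4))
```

Each macro-block costs (2 * radius + 1)^2 block comparisons, which is why faster search strategies are attractive.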
Logarithmic Search :Logarithmic Search Search a sample of the search window
Restrict search to more promising areas
could result in false negatives (didn’t find a matching block, even though it existed)
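One common variant of this idea samples the eight neighbors of the current best position at a given step, moves to the best of them, and halves the step when nothing improves. The sketch below follows that scheme; the function name, defaults, and the use of the sum of absolute differences are assumptions, and other logarithmic search variants differ in the sampling pattern.

```python
def log_search(cur, ref, cx, cy, size=16, radius=8):
    """Coarse-to-fine block search; returns (error, (dx, dy))."""
    h, w = len(ref), len(ref[0])

    def cost(dx, dy):
        rx, ry = cx + dx, cy + dy
        outside = not (0 <= rx <= w - size and 0 <= ry <= h - size)
        if abs(dx) > radius or abs(dy) > radius or outside:
            return float("inf")   # candidate is not allowed
        return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
                   for j in range(size) for i in range(size))

    best, best_err = (0, 0), cost(0, 0)
    step = max(1, radius // 2)
    while True:
        candidates = [(best[0] + sx * step, best[1] + sy * step)
                      for sy in (-1, 0, 1) for sx in (-1, 0, 1)
                      if (sx, sy) != (0, 0)]
        err, cand = min((cost(dx, dy), (dx, dy)) for dx, dy in candidates)
        if err < best_err:
            best, best_err = cand, err    # move toward the better match
        elif step == 1:
            return best_err, best         # finest step and no improvement
        else:
            step //= 2                    # refine around the current best

ref = [[(3 * x * x + 5 * y * y + x * y) % 256 for x in range(32)] for y in range(32)]
cur = [[ref[y][x + 4] if x + 4 < 32 else 0 for x in range(32)] for y in range(32)]
print(log_search(cur, ref, 8, 8, size=8, radius=8))
```

Because only a few positions are sampled at each step, the search can settle on a local minimum and miss a better block elsewhere in the window, which is the false-negative risk noted above.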
Predictive Search :Predictive Search Use previous macroblock’s motion vector as a starting point for search (space)
Or, use motion vector from same block in previous frame as starting point (time)
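Error Metrics :Error Metrics SSD metric: sum of squared differences between corresponding pixels
SAD metric: sum of absolute differences between corresponding pixels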
Minimum error represents best match
must be below a specified threshold
error and perceptual similarity not always correlated
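Both metrics reduce two equally sized blocks to a single error value. A minimal sketch follows; the function names and the threshold value in `is_match` are hypothetical.

```python
def ssd(a, b):
    """Sum of squared differences between two equally sized blocks."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def is_match(a, b, threshold=512):
    """Accept the candidate only if its error is below the threshold."""
    return sad(a, b) < threshold

block_a = [[1, 2], [3, 4]]
block_b = [[2, 2], [1, 4]]
print(ssd(block_a, block_b), sad(block_a, block_b))
```

SSD penalizes large differences more heavily, while SAD avoids multiplications and is cheaper to compute.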
Macroblock (MB) Found :Macroblock (MB) Found Compute motion vector between MBs
encode as (x,y) offset from top left of current MB
positive values indicate right and down
Compute error difference between the two MBs
results in a matrix of difference values (mostly 0)
Apply DCT to difference matrix
Add the motion vector and transformed error difference to the resulting bit stream
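The steps for a matched macro-block can be sketched as follows: form the difference matrix against the matched block, and show that the decoder can rebuild the macro-block from the reference frame, the motion vector, and that difference. The DCT and quantization of the difference are omitted here, and the function names are made up for the example. The offset follows the convention above: measured from the top left of the current macro-block, positive meaning right and down.

```python
def encode_match(cur, ref, cx, cy, dx, dy, size=16):
    """Return (motion_vector, difference matrix) for a matched macro-block."""
    diff = [[cur[cy + j][cx + i] - ref[cy + dy + j][cx + dx + i]
             for i in range(size)] for j in range(size)]
    return (dx, dy), diff

def decode_match(ref, cx, cy, vector, diff):
    """Rebuild the macro-block from the reference block plus the difference."""
    dx, dy = vector
    size = len(diff)
    return [[ref[cy + dy + j][cx + dx + i] + diff[j][i]
             for i in range(size)] for j in range(size)]

ref = [[(x + 2 * y) % 256 for x in range(48)] for y in range(48)]
# Current frame: content displaced by (3, 2), with a small brightness change.
cur = [[ref[min(y + 2, 47)][min(x + 3, 47)] + 1 for x in range(48)] for y in range(48)]
vector, diff = encode_match(cur, ref, 16, 16, 3, 2)
rebuilt = decode_match(ref, 16, 16, vector, diff)
print(vector, diff[0][:4])
```

With a good match the difference values cluster near zero, so their DCT quantizes to very few non-zero coefficients.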
Macroblock Not Found :Macroblock Not Found Apply standard DCT encoding to the blocks within the macroblock
If no motion vector is present, then motion vector is understood to be (0,0)
Adaptive Quantization :Adaptive Quantization After DCT transform, quantize coefficients
Controlled by two parameters:
quantization table and quantization factor
Use quantization factor to adapt bitrate
adjust factor to scale bitrate in real-time; larger value means lower bitrate
Use this to achieve constant-bit-rate encoding
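A minimal sketch of how the quantization factor can steer the bitrate. The arithmetic is simplified and does not reproduce the exact MPEG-1 quantization formulas; the function names and the one-step adjustment rule are assumptions. The 1 to 31 range matches the 5-bit quantizer scale carried in the bitstream.

```python
def quantize(coeffs, table, qscale):
    """Quantize coefficients; a larger qscale gives coarser values, fewer bits."""
    return [[round(c / (t * qscale)) for c, t in zip(crow, trow)]
            for crow, trow in zip(coeffs, table)]

def adapt_qscale(qscale, bits_used, bits_target, lo=1, hi=31):
    """Nudge the factor up when over budget, down when under budget."""
    if bits_used > bits_target:
        qscale += 1
    elif bits_used < bits_target:
        qscale -= 1
    return max(lo, min(hi, qscale))

coeffs = [[160, 32], [-48, 8]]
table = [[16, 16], [16, 16]]
print(quantize(coeffs, table, 1))
print(quantize(coeffs, table, 2))
print(adapt_qscale(8, bits_used=1200, bits_target=1000))
```

Running the adjustment once per slice or macro-block keeps the output close to a target rate at the cost of varying quality.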
Quantization Tables :Quantization Tables Two tables; intra and non-intra blocks
Non-intra table
default is “16” in all coefficient positions
little relation between frequency and quality
[Figure: Intra Quantization Table]
Slide 28: From Ramin Zabih’s lecture notes for CS 631
Video: Seq, Seq, Seq, …, Seq
Sequence Layer: SeqSC, Video Param, Bitstream Param, QT, misc, GOP, GOP, ...
GOP Layer: GOPSC, GOP Param, Time Code, Pict, Pict, ...
Picture Layer: PSC, Type, Buffer Param, Encode Param, Slice, Slice, ...
Slice Layer: SSC, Vert Pos, QScale, MB, MB, ...
Macro-block Layer: Addr, Type, Motion Vector, QScale, CBP, b0 ... b5
Block Layer: the blocks b0 ... b5 of a macro-block
Synchronization :Synchronization Interleave audio and video packets; insert time stamp into each “frame” of data
Terminology
SCR: system clock reference
DTS: decoding time stamp
PTS: presentation time stamp
SCR defined by a 90 kHz clock
Synchronization :Synchronization During encoding:
insert SCR values into system stream
stamp each “frame” with PTS and DTS
use encode time to approx. decode time
During decoding:
initialize local decoder clock with start value
compare PTS to the value of local clock
periodically synchronize local clock to SCR
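Time stamps are counted in ticks of the 90 kHz clock, so one frame at 29.97 fps spans 3003 ticks. The sketch below shows how an encoder might stamp frames and how a decoder might compare a PTS with its local clock. The function names, the tolerance of 10 ms, and the wait/drop/present policy are assumptions for illustration.

```python
from fractions import Fraction

CLOCK_HZ = 90_000  # time-stamp resolution in ticks per second

def pts_for_frame(n, fps=Fraction(30000, 1001)):
    """Presentation time stamp, in clock ticks, of the n-th displayed frame."""
    return int(n * CLOCK_HZ / fps)

def schedule(pts, local_clock, tolerance=CLOCK_HZ // 100):
    """Decide what to do with a frame, given the decoder's local clock in ticks."""
    if pts > local_clock + tolerance:
        return "wait"      # frame is early: hold it
    if pts < local_clock - tolerance:
        return "drop"      # frame is late: skip it to catch up
    return "present"

print(pts_for_frame(1), pts_for_frame(30))
print(schedule(pts_for_frame(30), local_clock=90_000))
```

Using exact fractions avoids drift from representing 29.97 fps as a floating-point number.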