A short introduction to MPEG 2 and digital television

1. Overview of digital television
1.1. MPEG 2 compression
1.2. Basis of Digital Video
1.2.1. The pictures
1.2.2. Luminance and chrominance
1.2.3. Video sequence
1.2.4. Interlacing
1.2.5. Standard TV formats
1.2.6. Compression in MPEG 2
1.3. Set top box
2. Miscellaneous
Teletext
Widescreen: general information
2.1.2. Summary of design manual
2.1.3. A more complete approach on Teletext
The encoder
Issues
Bibliography

  1. Overview of digital television
    1.1. MPEG 2 compression

In every digital television broadcast standard, the MPEG 2 norm is used, at least for video. For audio, things are not that simple.

Standard      ATSC                DVB-S/C/T
Countries     USA                 Europe
Typical use   Terrestrial, cable  Satellite, cable, terrestrial
Video coding  MPEG2 Video         MPEG2 Video
Audio coding  AC3                 MPEG2 Audio
Multiplex     ATSC                MPEG2 System

Historical perspective

In the eighties, it was generally assumed that digital broadcasting of television would not happen before the end of the twentieth century, because the bandwidth needed was very large (108 to 270 Mb/s for a display of 525 or 625 lines). Besides, Japan, Europe and the USA thought that customers would want higher image quality, so the IDTV and HDTV standards (1050 to 1250 lines) were developed. They would have required 1 Gbit/s. The High Definition systems proposed were therefore analogue (MUSE in Japan, HD-MAC in Europe, and some proposals in the USA) with digital assistance.

At the end of the eighties, many things changed:

So the first digital broadcasts began in 1994 in the United States (DirecTV project). In Europe, the ELG (European Launching Group) gave birth to the DVB project, which uses MPEG2. Canal + first, and then many others, began to broadcast digitally.

The MPEG 2 norm makes it possible to compress a video signal with very little loss of quality. For example, for standard television, a signal of 124 Mbits/s is compressed to between 4 and 8 Mbits/s.
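
As a rough check of these orders of magnitude, the sketch below computes the raw bit rate of a 720 x 576 picture at 25 frames per second with 8-bit samples and 4:2:0 chrominance. The choice of this format is an assumption (the text does not say how the 124 Mbits/s figure was obtained), but it gives a value close to it. The function name and parameters are illustrative only.

```python
def raw_bit_rate(width, height, frame_rate, bits_per_sample=8,
                 chroma_samples_per_pixel=0.5):
    """Uncompressed bit rate in bits per second.

    Each pixel carries one luminance sample plus, on average,
    chroma_samples_per_pixel chrominance samples (0.5 for 4:2:0).
    """
    samples_per_frame = width * height * (1 + chroma_samples_per_pixel)
    return samples_per_frame * bits_per_sample * frame_rate


rate = raw_bit_rate(720, 576, 25)
print(f"raw bit rate: {rate / 1e6:.1f} Mbit/s")

# Compression ratios for the two ends of the 4 to 8 Mbit/s range.
for target in (4e6, 8e6):
    print(f"{target / 1e6:.0f} Mbit/s -> about {rate / target:.1f}:1")
```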

The MPEG transport stream

Contrary to the MPEG 1 and MPEG 2 program streams, which are intended for error-free media (CD-ROM, hard disks), the MPEG transport stream is intended for error-prone media (television). For this reason the length of the transport packets is fixed at 188 bytes. A transport packet divides into:
[Figure: description of a transport packet]

An MPEG transport stream carries several PES (packetized elementary streams), each of which is divided into transport packets. Each PES is identified by its PID (Packet IDentifier). We will not describe the different tables and information of the MPEG norm here, as it would be too complex and is better covered elsewhere. See [7] and [8].
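
To make the role of the PID more concrete, here is a minimal sketch that reads the 4-byte header of one 188-byte transport packet. The header layout (sync byte 0x47, 13-bit PID, 4-bit continuity counter) comes from the MPEG-2 Systems specification rather than from this document, and the function name is only illustrative.

```python
SYNC_BYTE = 0x47
PACKET_SIZE = 188


def parse_ts_header(packet: bytes) -> dict:
    """Extract a few header fields from one transport packet."""
    if len(packet) != PACKET_SIZE:
        raise ValueError("a transport packet is 188 bytes long")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing sync byte 0x47")
    return {
        "transport_error": bool(packet[1] & 0x80),
        "payload_unit_start": bool(packet[1] & 0x40),
        # The PID is the low 5 bits of byte 1 followed by byte 2.
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "continuity_counter": packet[3] & 0x0F,
    }


# Example: a hand-built packet with PID 0x1FFF and an all-zero payload.
packet = bytes([0x47, 0x1F, 0xFF, 0x10]) + bytes(184)
print(parse_ts_header(packet))
```

A demultiplexer would typically use the PID returned here to route each packet to the audio, video or teletext decoder.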

    1.2. Basis of Digital Video
      1.2.1. The pictures

      A picture is a two-dimensional array of pixels. Each pixel is characterised by its colour. In the computer world, the colour is usually stored as a three-element vector, which gives the pixel intensity for each primary colour: red, green and blue (R, G and B components). The intensity of each component is usually quantised on 8 bits in consumer products.
      1.2.2. Luminance and chrominance



        In the video world, colours are not stored in the same way, in order to save bandwidth. Our eyes are less sensitive to colour information than to luminance, which characterises the brightness of pixels. So one way to compress the video signal is to allocate more bandwidth (i.e. spatial resolution) to the luminance information and less to the chrominance information. The luminance Y and the chrominance values Cb and Cr are deduced from the RGB values by a linear transformation defined by a 3x3 matrix M.

        Cb (respectively Cr) is, up to a scale factor, the difference between the blue (respectively red) component and Y.

        The MPEG2 norm also uses the YUV components. Here are the formulas:

        Y = 0.299R + 0.587G + 0.114B

        Cb = 0.564(B-Y) or U = 0.493(B-Y)

        Cr = 0.713(R-Y) or V = 0.877(R-Y)
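
        The formulas above can be tried out with a minimal sketch such as the following, where G denotes the green component. The function name is illustrative, the results are centred on zero for Cb and Cr, and any offset or rounding applied when the values are stored is outside the scope of this sketch.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel to (Y, Cb, Cr) with the coefficients above."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.564 * (b - y)
    cr = 0.713 * (r - y)
    return y, cb, cr


# A grey pixel has no colour information: Cb and Cr are (almost) zero.
print(rgb_to_ycbcr(128, 128, 128))
# A pure red pixel has a low luminance and a large positive Cr.
print(rgb_to_ycbcr(255, 0, 0))
```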

        Table 1: Chrominance subsampling formats
        Format name  Horizontal subsampling  Vertical subsampling  Compression ratio  Application
        4:4:4        1                       1                     1                  High quality video
        4:2:2        2                       1                     1.5                Standard TV format
        4:2:0        2                       2                     2                  Standard MPEG format
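
        The compression ratios of Table 1 follow from counting samples: without subsampling each pixel carries three samples (Y, Cb, Cr), and subsampling divides the number of Cb and Cr samples by the product of the horizontal and vertical factors. A short sketch of this arithmetic (the function name is illustrative):

```python
def chroma_compression_ratio(h_sub, v_sub):
    """Ratio between 4:4:4 data and subsampled data, per pixel."""
    full = 3.0                                  # Y + Cb + Cr
    subsampled = 1.0 + 2.0 / (h_sub * v_sub)    # Y + reduced Cb and Cr
    return full / subsampled


for name, h_sub, v_sub in (("4:4:4", 1, 1), ("4:2:2", 2, 1), ("4:2:0", 2, 2)):
    print(name, chroma_compression_ratio(h_sub, v_sub))
# prints 1.0, 1.5 and 2.0, as in Table 1
```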


      1.2.3. Video sequence

      A video sequence is simply a succession of pictures. Each picture is shown on the screen for a fixed amount of time. In order to avoid flickering, the frequency at which the pictures are shown must be high enough (typically 60 Hz for NTSC and 50 Hz for PAL and SECAM). The cinema rate of 24 Hz is clearly not enough to reconstitute smooth motion.
      1.2.4. Interlacing

      In analogue television, interlacing is used to save bandwidth. It consists of transmitting (and displaying) the even lines, then the odd lines, of each picture. The bandwidth used is thus divided by two.

        According to the MPEG vocabulary, a field is a picture where the vertical resolution has been divided by two. The top field contains the even lines (the first line being line 0) and the bottom field contains the odd lines. A frame consists of a top field and a bottom field. Note that a frame is not necessarily a coherent picture, because the top and bottom fields are usually sampled at different times.

        The word picture refers to either a field or a frame, depending on the context. In MPEG2, a picture corresponds to the compressed data associated with a picture header. It can be a field or a frame. An interlaced sequence is a field-based sequence, i.e. each picture displayed is a field. A progressive sequence is a sequence where each picture displayed is a frame.
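
        The relation between a frame and its two fields can be sketched as follows, with a frame represented as a plain list of lines (line 0 first). The function names are illustrative, and the sketch assumes an even number of lines.

```python
def split_fields(frame):
    """Return (top_field, bottom_field) for a frame given as a list of lines."""
    top = frame[0::2]      # even lines: 0, 2, 4, ...
    bottom = frame[1::2]   # odd lines: 1, 3, 5, ...
    return top, bottom


def weave(top, bottom):
    """Rebuild a frame by interleaving the lines of two fields."""
    frame = []
    for top_line, bottom_line in zip(top, bottom):
        frame.extend([top_line, bottom_line])
    return frame


frame = [f"line {n}" for n in range(6)]
top, bottom = split_fields(frame)
print(top)     # ['line 0', 'line 2', 'line 4']
print(bottom)  # ['line 1', 'line 3', 'line 5']
print(weave(top, bottom) == frame)  # True
```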

      1.2.5. Standard TV formats

      Table 2 - Standard TV formats
        Standard                Horizontal resolution  Vertical resolution  Colour format  Interlaced  Frame rate (Hz)
        CCIR601-50 (PAL/SECAM)  720                    576                  4:2:2          Yes         25
        CCIR601-60 (NTSC)       720                    480                  4:2:2          Yes         30


      1.2.6. Compression in MPEG 2
MPEG is a compression standard for digital video, suitable for television broadcasting. It handles a wide range of video formats, interlaced or progressive, with different chrominance formats. Its main advantage is that the receiver has much less to do than the transmitter. It uses two compression methods, and there are three kinds of pictures (I, P and B). For example, pictures may be transmitted in this order:
  I0 P3 B1 B2 P6 B4 P5

They would be displayed in this order:
  I0 B1 B2 P3 B4 P5 P6
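
The reordering between the two lists can be reproduced with a few lines of code. The sketch assumes that the digit after each letter is the display position of the picture, which is consistent with the two lists above but is not stated explicitly in the text.

```python
def display_order(transmitted):
    """Sort pictures named like 'P3' by their display position."""
    return sorted(transmitted, key=lambda name: int(name[1:]))


transmitted = ["I0", "P3", "B1", "B2", "P6", "B4", "P5"]
print(" ".join(display_order(transmitted)))
# prints: I0 B1 B2 P3 B4 P5 P6
```

A real decoder does not sort pictures by name, of course; it holds back the reference pictures until the pictures that depend on them have been decoded.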


    1.3. Set top box
A set top box is, in its most general sense, a peripheral for a television; in this document a set top box means a decoder able to receive MPEG2 sequences via terrestrial, cable or satellite transmission, and to output NTSC, PAL or SECAM signals suitable for any classic analogue television.

Inside the set top box, you generally find (here we take the example of the LSI Logic SDP-1000):

Basically, the transport chip parses the MPEG transport stream, sends the audio and video PES to the A/V decoder, and deals with the other PES too (for example, teletext PES).

On such boards, there are many constraints: everything must be affordable for the customer. So it is like a small PC, except that you have little memory, no hard disk, no very fast CPU, and so on.


    An MPEG sequence is usually decoded into a YCrCb colour space (see section 1.2.2), so we need an encoder to produce an NTSC, PAL or SECAM signal.

    So the encoder system encodes digital YCrCb video data into an NTSC, PAL-CVBS or S-Video signal, and also into RGB. The system has two separate digital video input channels. Digital video data input on the first channel is encoded into RGB format, while data input on the second channel is encoded into PAL/NTSC-CVBS and S-Video formats.

    Table 3 - Different TV formats and characteristics
    NTSC-M              525/60  3.57954545          858
    PAL-M               525/60  3.57561149          864
    PAL-COMBINATION-N*  625/50  3.58205625          864
    PAL-BDGHI           625/50  4.43361875          864
    SECAM               625/50  4.250 and 4.406250  864

    In the oldest PAL, NTSC and SECAM standards, some lines, which are not part of the active video, are left at blanking level; later modifications of the standards allow information to be carried in them instead. This makes it possible to include teletext, widescreen signalling and so on. See also 2.1.3. Table 4, Table 5 and Figure 4 show what is in these lines.

    Table 4 - NTSC lines
    Lines    Length  Output from encoder
    1-3      3       Equalization pulses
    4-6      3       Broad vertical pulse
    7-9      3       Equalization pulses
    10-20    11      Normally at blanking level (1)
    21       1       Field I closed-caption data or blanking level (1)
    22       1       Blanking level (1)
    23-262   240     Active video (2)
    263      1       Hsync, burst, and equalization pulse
    264-265  2       Equalization pulses
    266      1       Half line equalization pulse, half line broad vertical pulse
    267-268  2       Broad vertical pulse
    269      1       Half line broad vertical pulse, half line equalization pulse
    270-271  2       Equalization pulses
    272      1       Half line equalization pulse, half line blanking level
    273-283  11      Blanking level (1)
    284      1       Field II closed-caption data or blanking level*
    285      1       Blanking level (1)
    286-525  240     Active video (2)

    Table 5 - PAL lines
    Lines    Length  Output from encoder
    1-2      2       Broad vertical pulse
    3        1       Half line broad vertical pulse, half line equalization pulse
    4-5      2       Equalization pulses
    6        1       Blanking level (1); burst in odd frames only
    7-21     15      Blanking level (1) (or teletext)
    22       1       Field I closed-caption data or blanking level (1)
    23       1       Blanking level (2) (or widescreen)
    24-309   286     Active video (3)
    310      1       Active video (3); burst in odd frames only
    311-312  2       Equalization pulses
    313      1       Half line equalization pulse, half line broad vertical pulse
    314-315  2       Broad vertical pulse
    316-317  2       Equalization pulses
    318      1       Half line equalization pulse, half line blanking level
    319      1       Blanking level (1); burst in even frames only
    320-334  15      Blanking level (1) (or teletext)
    335      1       Field II closed-caption data or blanking level (1)
    336-621  286     Active video (3)
    622      1       Active video (3); burst in odd frames only
    623      1       Half line blanking level (2), half line equalization pulse
    624-625  2       Equalization pulses


    Table 6 - Different widescreen aspect ratios
                          Number  Ratio  Aspect ratio  Format     Position
    WIDESCREEN_NO  = 0x8  1       000    4:3           full       N/A
    WIDESCREEN_001 = 0x1  0       001    14:9          letterbox  centre
    WIDESCREEN_010 = 0x2  0       010    14:9          letterbox  top
    WIDESCREEN_011 = 0xB  1       011    16:9          letterbox  centre
    WIDESCREEN_100 = 0x4  0       100    16:9          letterbox  top
    WIDESCREEN_101 = 0xD  1       101    >16:9         letterbox  centre
    WIDESCREEN_110 = 0xE  1       110    14:9          full       centre
    WIDESCREEN_111 = 0x7  0       111    16:9          full       N/A


        1. Teletext



          This part describes the requirements for teletext. Some of the registers concerned cannot be changed at arbitrary times, because the teletext information is sent during the vertical blanking interval (VBI), and changing the parameters during this time causes problems. The registers in question are those that enable teletext and those that give the first and last lines in the odd and even fields. These parameters may change during the sequence, so we need to be able to write them for each image displayed, but not during the VBI.

          As a reference, we have used [10], produced by the European Broadcasting Union in October 1996. Teletext data are conveyed in Packetized Elementary Streams (PES). Each PES data field contains several data_field structures, and each data_field carries the line_offset of the line on which it shall be displayed (from 7 to 22 for field 1 and from 320 to 335 for field 2). The toggling of field_parity from one data_field to the next indicates a new field.

        3. Widescreen: general information
         According to [3], there are 14 bits to set to configure widescreen:

        Table 7 - Meaning of 14 widescreen bits
        bits Description
        3..0 Aspect ratio label, letterbox and position code.
        4 Camera mode or film mode.
        7..5 reserved, set to 0.
        8 subtitles within teletext bit
        10..9 mode of subtitle
        13..11 reserved, set to 0
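
        As an illustration of Table 7, the sketch below packs the fields into a 14-bit word. It assumes that bit 0 is the least significant bit of the word and that bit 4 set to 1 selects film mode; neither point is stated in the text, and the function and parameter names are hypothetical.

```python
def pack_wss(aspect_code, film_mode=False, teletext_subtitles=False,
             subtitle_mode=0):
    """Build the 14 widescreen bits described in Table 7."""
    if not 0 <= aspect_code <= 0xF:
        raise ValueError("aspect_code must fit in 4 bits")
    if not 0 <= subtitle_mode <= 3:
        raise ValueError("subtitle_mode must fit in 2 bits")
    word = aspect_code                        # bits 3..0
    word |= int(film_mode) << 4               # bit 4 (polarity assumed)
    # bits 7..5 are reserved and stay at 0
    word |= int(teletext_subtitles) << 8      # bit 8
    word |= subtitle_mode << 9                # bits 10..9
    # bits 13..11 are reserved and stay at 0
    return word


print(f"{pack_wss(0x8):014b}")
print(f"{pack_wss(0xB, film_mode=True, subtitle_mode=2):014b}")
```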


        Aspect ratio

        Table 8 - Aspect ratio for widescreen
        Number  Ratio  Aspect ratio  Format     Position
        1       000    4:3           full       N/A
        0       001    14:9          letterbox  centre
        0       010    14:9          letterbox  top
        1       011    16:9          letterbox  centre
        0       100    16:9          letterbox  top
        1       101    >16:9         letterbox  centre
        1       110    14:9          full       centre
        0       111    16:9          full       N/A
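
        Looking at the values in the table, the first column behaves like an odd-parity bit computed over the three ratio bits: it is 1 whenever the ratio bits contain an even number of ones. This is an observation drawn from the table, not something the text states, and the sketch below only checks it. The function name is illustrative.

```python
def aspect_code_with_parity(ratio_bits):
    """Return the 4-bit code: odd-parity bit (bit 3) + 3 ratio bits."""
    if not 0 <= ratio_bits <= 7:
        raise ValueError("ratio_bits must fit in 3 bits")
    ones = bin(ratio_bits).count("1")
    parity = 1 if ones % 2 == 0 else 0   # make the total number of ones odd
    return (parity << 3) | ratio_bits


for ratio_bits in range(8):
    print(f"{ratio_bits:03b} -> 0x{aspect_code_with_parity(ratio_bits):X}")
```

        The eight values produced are 0x8, 0x1, 0x2, 0xB, 0x4, 0xD, 0xE and 0x7, which match the WIDESCREEN_ constants listed earlier.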

        The values of bits 3 to 0 are rather complex and may depend on the MPEG sequence. The basic idea is that a 16:9 widescreen TV (as opposed to an older 4:3 one) sometimes needs extra information. In television broadcasting you may, for example, convey an image with a 16:9 ratio inside a 4:3 broadcast frame (see [13] or [3]); you will then have black bands at the top and bottom, as in the figure below. It is therefore useful for the receiver to know this: a 4:3 receiver will display the image with black horizontal bands, while a 16:9 receiver will adapt the image thanks to the widescreen information.

        Note that some modern 4:3 televisions are able to decode the widescreen information, too.

        In an MPEG sequence, we have to take two parameters into account: the full screen size and the active region, which can be different; the widescreen signalling depends on this information.

        Only the simplest case is shown here. See [2], [3], [7], and [13] for a complete explanation.

        Camera mode/film mode

        The film mode should not be of much interest as far as digital television is concerned. In this mode (contrary to the default one), the 2 fields of a frame come from the same image; in television, the 2 fields of a frame are captured at different times and are therefore independent of each other.


        Subtitles can be present; you can choose whether to put them inside the active image area or outside it (in this case, in the black bottom band). See [2], [3].


      2.1.3. A more complete approach on Teletext



        This part deals with teletext.

        The encoder

        At each frame (every 40 ms in PAL) the encoder sends a certain number of lines, determined for each field in the frame. TTX data are required for both fields.

          So if we look more precisely for one line:

          And for a frame:


Teletext: understanding the MPEG2 standard

The PES teletext data field has this syntax:
Syntax                                                       Nb of bits
PES_data_field() {
  data_identifier (0x10, EBU)                                8
  for (i = 0; i < N; i++) {
    data_unit_id: can be 0x02 (EBU teletext),
      0x03 (EBU subtitle) or 0xFF (stuffing)                 8
    data_unit_length                                         8
    data_field()                                             352 (for EBU)
  }
}

N is determined by the PES_packet_length (equal to N*184-6).
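
As a minimal sketch of how this syntax could be walked in software, the generator below yields one (data_unit_id, data_field) pair per data unit. It assumes that the data units simply follow each other after the data_identifier byte and that data_unit_length counts the bytes of data_field (44 bytes for the 352 bits above). The function name is hypothetical.

```python
def iter_data_units(pes_data_field: bytes):
    """Yield (data_unit_id, data_field) pairs from a PES_data_field."""
    if len(pes_data_field) < 1:
        raise ValueError("empty PES_data_field")
    pos = 1  # skip the data_identifier byte
    while pos + 2 <= len(pes_data_field):
        data_unit_id = pes_data_field[pos]
        data_unit_length = pes_data_field[pos + 1]
        start = pos + 2
        end = start + data_unit_length
        if end > len(pes_data_field):
            raise ValueError("truncated data unit")
        yield data_unit_id, pes_data_field[start:end]
        pos = end


# Example: one teletext unit followed by one stuffing unit.
example = (bytes([0x10])
           + bytes([0x02, 44]) + bytes(44)
           + bytes([0xFF, 44]) + bytes([0xFF] * 44))
for unit_id, field in iter_data_units(example):
    print(hex(unit_id), len(field))
```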

I do not think that we need to consider any value of data_identifier other than 0x10.

Each data_field has the following layout (there is one data_field for each line sent to the encoder):
Description                           Number of bits
reserved_future_use                   2
field_parity                          1
line_offset (0x00, or 0x07 to 0x16)   5
framing_code                          8
magazine_and_packet_address           16
data_block                            320

We do not care about the framing_code, magazine_and_packet_address and data_block; they are just sent to the encoder.
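
The layout above can be unpacked with a short sketch like this one. It assumes that the fields are packed most significant bit first, so that reserved_future_use occupies the two top bits of the first byte; the function name and the example byte values are arbitrary.

```python
def parse_data_field(data: bytes) -> dict:
    """Split a 44-byte (352-bit) data_field into its components."""
    if len(data) != 44:
        raise ValueError("an EBU teletext data_field is 44 bytes long")
    first = data[0]
    return {
        "field_parity": (first >> 5) & 0x01,   # 1 bit after 2 reserved bits
        "line_offset": first & 0x1F,           # low 5 bits
        "framing_code": data[1],
        "magazine_and_packet_address": data[2:4],
        "data_block": data[4:44],
    }


# Arbitrary example bytes: 0xE7 = 11 1 00111 -> field_parity 1, line_offset 7.
fields = parse_data_field(bytes([0xE7, 0xE4, 0x12, 0x34]) + bytes(40))
print(fields["field_parity"], fields["line_offset"], len(fields["data_block"]))
```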

Line offset

For a data_field with a data_unit_id of 0x02 (EBU Teletext for non-subtitle data), the line_offset is between 0x7 and 0x16. This offset corresponds to the line of the vertical blanking interval in which the information is sent, which explains the range of values. It must be indicated to the encoder, which needs to know the start and end lines, so we need to analyse the data a little.
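
A possible way to derive the line numbers and the start and end lines from the offsets is sketched below. It uses the ranges given earlier (lines 7 to 22 for field 1, 320 to 335 for field 2), which amounts to adding 313 for the second field. How field_parity maps to the first or second field is not stated here, so the caller passes that information as a boolean; treating a line_offset of 0x00 as "no specific line" is also an assumption. All names are hypothetical.

```python
FIELD2_LINE_SHIFT = 313  # 320 - 7, from the two line ranges given above


def vbi_line(line_offset, first_field):
    """Return the VBI line number for a line_offset, or None for 0x00."""
    if line_offset == 0x00:
        return None  # assumption: no specific line requested
    if not 0x07 <= line_offset <= 0x16:
        raise ValueError("line_offset must be 0x00 or 0x07..0x16")
    return line_offset if first_field else line_offset + FIELD2_LINE_SHIFT


def line_window(line_offsets):
    """Return (start, end) offsets to program for one field, or None."""
    used = [offset for offset in line_offsets if offset != 0x00]
    return (min(used), max(used)) if used else None


print(vbi_line(0x07, first_field=True))    # 7
print(vbi_line(0x16, first_field=False))   # 335
print(line_window([0x0A, 0x07, 0x00, 0x10]))  # (7, 16)
```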


There can be subtitles if there is a data_unit_id of 0x03. I think that we should send them to the encoder. However, the data_unit_id is not transmitted to the encoder, so I wonder how the encoder distinguishes subtitle data from non-subtitle data. I also wonder whether the 64108 makes any distinction.

Stuffing data

There is also the case of stuffing data. We think that we should get rid of it.

Stuffing data have a data_unit_id of 0xFF. We need to know what the 64108 does with them.

Data sent to the encoder

Basically, for one line you have in the MPEG sequence:
                                                                      Bytes  Destination
data_unit_id                                                          1      skipped
data_unit_length                                                      1      skipped
reserved_future_use / field_parity / line_offset (0x00, 0x07 to 0x16) 1      skipped
framing_code                                                          1      sent to encoder
magazine_and_packet_address                                           2      sent to encoder
data_block                                                            40     sent to encoder

According to [11], page 18, each teletext packet consists of 45 bytes sent to the television, divided into:
                                                                       Bytes  Origin
clock run-in, used for synchronisation (1010101010101010, i.e. 0xAAAA) 2      added by 64108
framing_code                                                           1      comes from MPEG
magazine_and_packet_address                                            2      comes from MPEG
data_block                                                             40     comes from MPEG
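
The two tables above can be summarised in a short sketch: the first three bytes of each data unit are skipped, the remaining 43 bytes go to the encoder, and the 2 clock run-in bytes bring the line to 45 bytes. In the real system the clock run-in is added by the encoder chip; it is added in software here only to show the byte count, and the function names are hypothetical.

```python
CLOCK_RUN_IN = bytes([0xAA, 0xAA])   # value quoted in the table above
SKIPPED_BYTES = 3                    # data_unit_id, data_unit_length, offset byte


def encoder_payload(data_unit: bytes) -> bytes:
    """Return the 43 bytes of a 46-byte data unit that go to the encoder."""
    if len(data_unit) != 46:
        raise ValueError("expected 2 header bytes + 44-byte data_field")
    return data_unit[SKIPPED_BYTES:]


def teletext_line(data_unit: bytes) -> bytes:
    """Return the 45 bytes sent to the television for one line."""
    return CLOCK_RUN_IN + encoder_payload(data_unit)


unit = bytes([0x02, 44, 0xE7]) + bytes(range(43))
print(len(encoder_payload(unit)), len(teletext_line(unit)))  # 43 45
```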



LSI Logic documentation

[1] L64208 Encoder User Specification, LSI Logic, June 2, 1998

ITU documentation

[2] ITU-R Recommendations, BT-R 1118, Enhanced compatible widescreen television based on conventional television systems.

[3] ITU-R Recommendations Wide Screen Signalling (WSS) encoding, BT-R 1119.

[4] ITU-R BT.653-2, on Teletext system-B encoding

Philips documentation

[5] Philips datasheet for SAA7182A, Digital video encoder, 1996 Sep 11, http://www-semiconductors.philips.com/acrobat/datasheets/SAA7182A_83A_2.pdf

[6] Philips Application Note, AN96055, programming tables for SAA7111, SAA7182/83 and SAA7184/85B/88A.

General explaining documentation

[7] Video demystified, Second Edition, Keith Jack, HighText, see http://www.video-demystified.com

[8] La télévision numérique: MPEG-1, MPEG-2 et système européen DVB Application, 2ème édition, Hervé Benoit, Dunod.

European Telecommunication Standard (ETS) documentation

[9] ETS 300 468, on Service Information in DVB, from European Telecommunication Standard.

[10] ETS 300 472, Specification for conveying ITU-R System B Teletext in DVB bitstreams, from European Telecommunication Standard, October 1996.

[11] ETS 300 706, on Enhanced Teletext Specification, from European Telecommunication Standard.


[13] Digital Terrestrial Television, Requirements for interoperability, DTG, June 1997

[14] EIA-608-1994 - Closed Caption encoding