A short introduction to MPEG 2 and digital television

1. Overview of digital television
1.1. MPEG 2 compression
1.2. Basis of Digital Video
1.2.1. The pictures
1.2.2. Luminance and chrominance
1.2.3. Video sequence
1.2.4. Interlacing
1.2.5. Standard TV formats
1.2.6. Compression in MPEG 2
1.3. Set top box
2. Miscellaneous
2.1.1.1. Teletext
2.1.1.2. Widescreen: general information
2.1.2. Summary of design manual
2.1.3. A more complete approach on Teletext
2.1.3.1. The encoder
2.1.3.3. Issues
Bibliography

Overview of digital television

MPEG 2 compression

Every digital television broadcast standard uses the MPEG 2 standard, at least for video. For audio, things are not that simple.

Standard	ATSC	DVB-S/C/T
Countries	USA	Europe
Typical use	Terrestrial, cable	Satellite, cable, terrestrial
Video coding	MPEG2 Video	MPEG2 Video
Audio coding	AC3	MPEG2 Audio
Multiplex	ATSC	MPEG2 System
Modulation	VSB	QPSK/QAM/COFDM

Historical perspective

In the eighties, it was generally assumed that digital broadcasting of television would not happen before the end of the century, because the bandwidth needed was very large (108 to 270 Mb/s for a display of 525 or 625 lines). Besides, Japan, Europe and the USA thought that customers would want higher image quality, so the IDTV and HDTV standards (1050 to 1250 lines) were developed. They would have required 1 Gbit/s. So the High Definition systems proposed were analogue (MUSE in Japan, HD-MAC in Europe, and some proposals in the USA) with digital assistance.

At the end of the eighties, many things changed:

Video compression became more and more efficient (JPEG, MPEG), so the rate needed for moving pictures was only between 1.5 and 30 Mb/s.
It became possible to produce cheap integrated circuits to decode these images.
The cost of a high definition television would remain too high for the average customer.
Customers were asking for more quality and quantity in programmes rather than quality in image.

So the first digital broadcasts began in 1994 in the United States (DirecTV project). In Europe, the ELG (European Launching Group) gave birth to the DVB project, which uses MPEG2. Canal+ first, and then many others, began to broadcast digitally.

The MPEG 2 standard allows a video signal to be compressed with very low quality loss. For example, for standard television, a signal of 124 Mbit/s is compressed to between 4 and 8 Mbit/s.
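The 124 Mbit/s figure can be checked with a little arithmetic. The text does not say which picture format it assumes; the sketch below assumes a 720x576 picture at 25 frames per second with 4:2:0 sampling and 8 bits per sample, which happens to give the same number. The function name is only illustrative.

```python
def raw_bitrate(width, height, frame_rate, bits_per_pixel):
    """Uncompressed video bitrate in bits per second."""
    return width * height * frame_rate * bits_per_pixel

# Assumption: 4:2:0 sampling with 8-bit samples.
# Each pixel carries 8 bits of luminance plus 2 * 8 / 4 bits of chrominance.
BITS_PER_PIXEL_420 = 8 + 2 * 8 // 4  # 12 bits

raw = raw_bitrate(720, 576, 25, BITS_PER_PIXEL_420)
print(f"raw bitrate: {raw / 1e6:.1f} Mbit/s")

# Compression ratios for the 4 and 8 Mbit/s targets quoted in the text.
ratios = {mbps: raw / (mbps * 1e6) for mbps in (4, 8)}
for mbps, ratio in ratios.items():
    print(f"{mbps} Mbit/s -> about {ratio:.0f}:1")
```

Under these assumptions the compression ratio lies roughly between 15:1 and 31:1.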

The MPEG transport stream

Unlike the MPEG 1 and MPEG 2 program streams, which are intended for error-free media (CD-ROM, hard disks), the MPEG transport stream is intended for error-prone media (television broadcast). So the length of the transport packets is fixed at 188 bytes. A transport packet divides into:
[Figure: description of a transport packet]

An MPEG transport stream carries several PES (Packetized Elementary Streams), which are divided into transport packets. Each PES is identified by its PID (Packet IDentifier). We will not describe here the different tables and information in the MPEG standard, as that would be too complex and is covered better elsewhere. See [7] and [8].
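As a rough illustration of how a PID is read from a 188-byte transport packet, here is a minimal sketch. The header layout (sync byte 0x47, 13-bit PID spread over the second and third bytes, 4-bit continuity counter) comes from the MPEG-2 Systems standard, not from this document; the function name and the example packet are made up.

```python
TS_PACKET_SIZE = 188  # fixed transport packet length, as stated in the text
SYNC_BYTE = 0x47      # first byte of every transport packet (MPEG-2 Systems)

def parse_ts_header(packet):
    """Extract a few fields from the 4-byte transport packet header."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a transport packet must be 188 bytes long")
    if packet[0] != SYNC_BYTE:
        raise ValueError("sync byte not found")
    return {
        # Set when a new PES packet starts in this transport packet.
        "payload_unit_start": bool(packet[1] & 0x40),
        # 13-bit PID: low 5 bits of byte 1, then all of byte 2.
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        # 4-bit counter used to detect lost packets.
        "continuity_counter": packet[3] & 0x0F,
    }

# Made-up packet: PID 0x100, start of a PES packet, counter 7, empty payload.
example = bytes([0x47, 0x41, 0x00, 0x17]) + bytes(184)
header = parse_ts_header(example)
print(header)
```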

Basis of Digital Video

The pictures

A picture is a two-dimensional array of pixels. Each pixel is characterised by its colour. In the computer world, the colour is always stored as a three-element vector, which gives the pixel intensity for each of the primary colours red, green and blue (R, G and B components). The intensity of each component is usually quantised on 8 bits in consumer products.

Luminance and chrominance

In the video world, colours are not stored in the same way, in order to save bandwidth. Our eyes are less sensitive to the actual colour information than to the luminance intensity, which characterises the brightness of pixels. So a possible way to compress the video signal is to allocate more bandwidth (i.e. spatial resolution) to the luminance information and less bandwidth to the chrominance information. The luminance Y and the chrominance values C_b and C_r are deduced from the RGB values by a linear transformation defined by a 3x3 matrix M:

(Y, C_b, C_r)^T = M (R, G, B)^T

C_b (respectively C_r) is usually a scaled difference between the blue (respectively red) component and Y.

The MPEG2 standard also uses the YUV components. Here are the formulas:

Y = 0.299R + 0.587G + 0.114B

C_b = 0.564(B-Y) or U = 0.493(B-Y)

C_r = 0.713(R-Y) or V = 0.877(R-Y)
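The formulas above can be sketched as a small conversion function. It assumes that the second term of Y is the green component G, and it leaves out the offsets and clamping that a real 8-bit implementation would add; the function name is illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel to (Y, Cb, Cr) using the formulas in the text."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.564 * (b - y)
    cr = 0.713 * (r - y)
    return y, cb, cr

# A white pixel has full luminance and (almost exactly) zero chrominance.
white = rgb_to_ycbcr(255, 255, 255)
# A pure blue pixel has low luminance and a large positive Cb.
blue = rgb_to_ycbcr(0, 0, 255)
print(white, blue)
```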

Table 1: Chrominance subsampling formats

Format name	Horizontal subsampling	Vertical subsampling	Compression ratio	Application
4:4:4	1	1	1	High quality video
4:2:2	2	1	1.5	Standard TV format
4:2:0	2	2	2	Standard MPEG format
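The compression ratios in Table 1 follow directly from the subsampling factors: the luminance keeps full resolution, while each of the two chrominance components is reduced by the product of the horizontal and vertical factors. A minimal sketch (the function name is illustrative):

```python
def chroma_compression_ratio(h_sub, v_sub):
    """Ratio between 4:4:4 data volume and the subsampled data volume."""
    full = 3.0                                # Y, Cb and Cr at full resolution
    subsampled = 1.0 + 2.0 / (h_sub * v_sub)  # full Y plus two reduced chroma planes
    return full / subsampled

formats = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}
table_ratios = {name: chroma_compression_ratio(h, v) for name, (h, v) in formats.items()}
print(table_ratios)
```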

Video sequence

A video sequence is simply a succession of pictures. Each picture is shown on the screen for a fixed amount of time. In order to avoid flickering, the frequency at which the pictures are shown must be high enough (typically 60 Hz for NTSC and 50 Hz for PAL and SECAM). The cinema rate of 24 Hz is clearly not enough to reproduce smooth motion.

Interlacing

In analogue television, interlacing is used to save some bandwidth. It consists of transmitting (and displaying) the even lines, then the odd lines, of each picture. So the bandwidth used is divided by two.

According to the MPEG vocabulary, a field is a picture where the vertical resolution has been divided by two. The top field contains the even lines (the first line being line 0) and the bottom field contains the odd lines. A frame consists of a top field and a bottom field. Note that a frame is not necessarily a coherent picture, because the top and bottom fields are usually sampled at different times.

The word picture refers to either a field or a frame, depending on the context. In MPEG2, a picture corresponds to the compressed data associated with a picture header. It can be a field or a frame. An interlaced sequence is a sequence which is field-based, i.e. each picture displayed is a field. A progressive sequence is a sequence where each picture displayed is a frame.
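The relation between a frame and its two fields can be sketched in a few lines, with a frame represented as a list of lines numbered from 0. The function names are illustrative.

```python
def split_fields(frame):
    """Split a frame (list of lines) into its top and bottom fields."""
    top = frame[0::2]     # even lines: 0, 2, 4, ...
    bottom = frame[1::2]  # odd lines: 1, 3, 5, ...
    return top, bottom

def weave(top, bottom):
    """Rebuild a frame by interleaving the lines of the two fields."""
    frame = []
    for top_line, bottom_line in zip(top, bottom):
        frame.append(top_line)
        frame.append(bottom_line)
    return frame

frame = [f"line {n}" for n in range(6)]
top, bottom = split_fields(frame)
print(top)     # lines 0, 2, 4
print(bottom)  # lines 1, 3, 5
```

Weaving the two fields back together only gives a coherent picture when both fields were sampled at the same instant.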

Standard TV formats

Table 2 - Standard TV formats

Standard	Horizontal resolution	Vertical resolution	Colour format	Interlaced	Frame rate (Hz)
CCIR601-50 (PAL/SECAM)	720	576	4:2:2	Yes	25
CCIR601-60 (NTSC)	720	480	4:2:2	Yes	30

Compression in MPEG 2

MPEG is a compression standard for digital video, suitable for television broadcasting. It handles a wide range of video formats, interlaced or progressive, with different chrominance formats. Its main advantage is that the receiver has much less to do than the transmitter. It uses two compression methods:

Motion compensation: a picture is divided into blocks of 16x16 pixels called macroblocks. A macroblock is predicted from at most two anchor frames already decoded. Motion vectors give the co-ordinates of the prediction in the two anchor frames. The two predictions are averaged to form the predicted macroblock.
Transform coding: the difference between the original picture and the prediction is transformed with an orthogonal transform called the Discrete Cosine Transform (DCT). It is essentially a representation of the picture in the frequency domain. Since our eyes are less sensitive to high frequencies, the high frequency coefficients are quantised with less accuracy than the low frequency ones. The DCT is applied to 8x8 blocks.
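To make the transform coding step more concrete, here is a minimal sketch of an 8x8 DCT followed by a frequency-dependent quantisation. The DCT is the standard orthonormal DCT-II; the quantisation step sizes (`base_step`, `slope`) are purely hypothetical and are not the MPEG 2 quantisation matrices.

```python
import math

N = 8  # the DCT is applied to 8x8 blocks

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an NxN block given as a list of rows."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs

def quantise(coeffs, base_step=8, slope=4):
    """Divide each coefficient by a step that grows with frequency.

    The step sizes are hypothetical: they only illustrate that high
    frequencies are kept with less accuracy than low ones.
    """
    return [[round(coeffs[u][v] / (base_step + slope * (u + v)))
             for v in range(N)] for u in range(N)]

# A perfectly flat block has all its energy in the DC coefficient.
flat_block = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat_block)
quantised = quantise(coeffs)
print(quantised[0])
```

For a flat block, every coefficient except the first one quantises to zero, which is why smooth areas compress so well.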

There are three kinds of pictures:

The I (intra) pictures: they do not depend on other pictures, and no prediction is made; only DCT compression is used. An I picture must be sent at least twice per second; otherwise, any error in a picture would persist for a long period of time.
The P (predictive-coded) pictures: predicted only from the previous P or I picture (also called the anchors).
The B (bidirectionally predicted) pictures: these are predicted from the previous and the next P or I pictures, so they are the most compressed. Since a B picture depends on the next anchor, the order in which the pictures are transmitted is not the same as the order in which they are actually displayed. Example:

I₀ P₃ B₁ B₂ P₆ B₄ B₅

This would be displayed in this order:

I₀ B₁ B₂ P₃ B₄ B₅ P₆
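The reordering can be sketched as follows: B pictures are displayed as soon as they are decoded, while each anchor (I or P) is held back until the next anchor arrives. This is a simplified model for illustration, not a description of a particular decoder; it uses the first four pictures of the example, written as plain strings.

```python
def display_order(transmitted):
    """Reorder pictures from transmission order to display order."""
    displayed = []
    held_anchor = None  # last I or P picture received, not yet displayed
    for picture in transmitted:
        if picture.startswith("B"):
            # B pictures are displayed immediately.
            displayed.append(picture)
        else:
            # A new anchor releases the previous one for display.
            if held_anchor is not None:
                displayed.append(held_anchor)
            held_anchor = picture
    if held_anchor is not None:
        displayed.append(held_anchor)
    return displayed

order = display_order(["I0", "P3", "B1", "B2"])
print(order)
```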

Set top box.

A set top box is, in its most general sense, a peripheral for a television; in this document, a set top box means a decoder able to receive MPEG2 sequences via terrestrial, cable or satellite links, and to output NTSC, PAL or SECAM signals suitable for any classic analogue television.

Inside the set top box, you generally find (here we take the example of the LSI Logic SDP-1000):

Receivers for cable or satellite.
The transport chip (here the L64008), which combines a 32-PID transport demultiplexer, a 32-bit RISC CPU and a DVB descrambler. It also includes an IEEE 1284 parallel port, a Philips I²C bus, UARTs and a teletext interface.
The audio/video decoder (here the L64005), which allows the display of PAL, NTSC or SECAM using only 16 Mbits of DRAM.
Other useful peripherals: Smart Card, modem connection, Ethernet, IEEE 1394 FireWire, and so on.

Basically, the transport chip parses the MPEG transport stream, sends the audio and video PES to the A/V decoder, and deals with the other PES too (for example, teletext PES).
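The routing done by the transport chip can be pictured as a small table that maps PIDs to destinations. The sketch below is only a software model under stated assumptions: the class, the destination names and the PID values are hypothetical, the 32-entry limit mirrors the "32 PID" demultiplexer mentioned above, and the PID position in the packet comes from the MPEG-2 Systems standard.

```python
MAX_PIDS = 32  # mirrors the 32-PID transport demultiplexer mentioned in the text

class PidRouter:
    """Toy model of a PID filter: maps PIDs to named destinations."""

    def __init__(self):
        self.routes = {}

    def add_route(self, pid, destination):
        if pid not in self.routes and len(self.routes) >= MAX_PIDS:
            raise ValueError("no free PID filter left")
        self.routes[pid] = destination

    def route(self, packet):
        """Return the destination for a transport packet, or None to drop it."""
        pid = ((packet[1] & 0x1F) << 8) | packet[2]
        return self.routes.get(pid)

router = PidRouter()
router.add_route(0x100, "video decoder")   # hypothetical PID values
router.add_route(0x101, "audio decoder")
router.add_route(0x102, "teletext handler")

video_packet = bytes([0x47, 0x01, 0x00, 0x10]) + bytes(184)
unknown_packet = bytes([0x47, 0x1F, 0xFE, 0x10]) + bytes(184)
print(router.route(video_packet), router.route(unknown_packet))
```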

On such boards, you have many constraints: everything must be affordable for the customer. So it is like a small PC, except that you have little memory, no hard disk, no very fast CPU, and so on.

An MPEG sequence is usually decoded into a YC_rC_b colour space (see section 1.2.2), so we need an encoder to modulate NTSC, PAL or SECAM.

So the encoder system encodes digital YC_rC_b video data into an NTSC, PAL-CVBS or S-Video signal, and also RGB. The system has two separate digital video input channels. Digital video data input on the first channel is encoded into RGB format, while that on the second channel is encoded into PAL/NTSC-CVBS and S-Video formats.

Table 3 - Different TV formats and characteristics

FORMAT	LINE/FIELD	BURST FREQ (MHz)	PIXELS
NTSC-M	525/60	3.57954545	858
PAL-M	525/60	3.57561149	864
PAL-COMBINATION-N*	625/50	3.58205625	864
PAL-BDGHI	625/50	4.43361875	864
SECAM	625/50	4.250000 and 4.406250	864

In the oldest PAL, NTSC and SECAM standards, some lines which are not part of the active video are left at blanking level; later modifications of the standards allow you to carry information in them instead. This allows you to include teletext, widescreen signalling and so on. See 2.1.1.1, 2.1.1.2 and 2.1.3. Table 4, Table 5 and Figure 4 show what is in these lines.

Table 4 - NTSC lines

Lines	Length	Output from encoder
1-3	3	Equalization pulses
4-6	3	Broad vertical pulse
7-9	3	Equalization pulses
10-20	11	Normally at blanking level¹
21	1	Field I closed-caption data or blanking level¹
22	1	Blanking level¹
23-262	240	Active video²
263	1	Hsync, burst, and equalization pulse
264-265	2	Equalization pulses
266	1	Half line equalization pulse, half line broad vertical pulse
267-268	2	Broad vertical pulse
269	1	Half line broad vertical pulse, half line equalization pulse
270-271	2	Equalization pulses
272	1	Half line equalization pulse, half line blanking level
273-283	11	Blanking level¹
284	1	Field II closed-caption data or blanking level*
285	1	Blanking level¹
286-525	240	Active video²

Table 5 - PAL lines

Lines	Length	Output from encoder
1-2	2	Broad vertical pulse
3	1	Half line broad vertical pulse, half line equalization pulse
4-5	2	Equalization pulses
6	1	Blanking level¹; burst in odd frames only
7-21	15	Blanking level¹ (or teletext)
22	1	Field I closed-caption data or blanking level¹
23	1	Blanking level² (or widescreen)
24-309	286	Active video³
310	1	Active video³; burst in odd frames only
311-312	2	Equalization pulses
313	1	Half line equalization pulse, half line broad vertical pulse
314-315	2	Broad vertical pulse
316-317	2	Equalization pulses
318	1	Half line equalization pulse, half line blanking level
319	1	Blanking level¹; burst in even frames only
320-334	15	Blanking level¹ (or teletext)
335	1	Field II closed-caption data or blanking level¹
336-621	286	Active video³
622	1	Active video³; burst in odd frames only
623	1	Half line blanking level², half line equalization pulse
624-625	2	Equalization pulses

Table 6 - Different widescreen aspect ratios

Name and value	Bits 3..0	Aspect ratio	Format	Position
WIDESCREEN_NO = 0x8	1 000	4:3	full	N/A
WIDESCREEN_001 = 0x1	0 001	14:9	letterbox	centre
WIDESCREEN_010 = 0x2	0 010	14:9	letterbox	top
WIDESCREEN_011 = 0xB	1 011	16:9	letterbox	centre
WIDESCREEN_100 = 0x4	0 100	16:9	letterbox	top
WIDESCREEN_101 = 0xD	1 101	>16:9	letterbox	centre
WIDESCREEN_110 = 0xE	1 110	14:9	full	centre
WIDESCREEN_111 = 0x7	0 111	16:9	full	N/A

Teletext

This part describes the requirements for teletext. Some of the registers concerned cannot be changed at any time, because the teletext information is output during the vertical blanking interval (VBI), and changing the parameters during this time causes trouble. The registers in question are those that enable teletext and those that give the first and last teletext lines in the odd and even fields. These parameters may change during the sequence, so we need to be able to write them for each picture displayed, but not during the VBI.

As a reference, we have used [10], published by the European Broadcasting Union in October 1996. Teletext data are conveyed in Packetized Elementary Streams (PES). Each PES data field contains several data_field structures, and each data_field carries the line_offset of the line in which it shall be inserted (from 7 to 22 for field 1 and from 320 to 335 for field 2). The toggling of field_parity between data_fields indicates a new field.

Widescreen: general information

According to [3], there are 14 bits to set to configure widescreen:

Table 7 - Meaning of the 14 widescreen bits

Bits	Description
3..0	Aspect ratio label, letterbox and position code
4	Camera mode or film mode
7..5	Reserved, set to 0
8	Subtitles within teletext bit
10..9	Mode of subtitle
13..11	Reserved, set to 0

Aspect ratio

Table 8 - Aspect ratio for widescreen

Bits 3..0	Aspect ratio	Format	Position
1 000	4:3	full	N/A
0 001	14:9	letterbox	centre
0 010	14:9	letterbox	top
1 011	16:9	letterbox	centre
0 100	16:9	letterbox	top
1 101	>16:9	letterbox	centre
1 110	14:9	full	centre
0 111	16:9	full	N/A
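Looking at the table, bit 3 always makes the number of 1 bits in bits 3..0 odd, so it behaves as an odd parity bit over the 3-bit ratio label. The sketch below relies on that observation and packs the 14 widescreen bits according to the bit positions listed earlier. The function names are illustrative, and the polarity of bit 4 (1 for film mode) is an assumption, not something stated in this document.

```python
def aspect_ratio_code(ratio_label):
    """Return bits 3..0: the 3-bit ratio label plus an odd parity bit in bit 3."""
    label = ratio_label & 0x7
    ones = bin(label).count("1")
    parity = 1 if ones % 2 == 0 else 0  # make the total number of 1 bits odd
    return (parity << 3) | label

def widescreen_word(ratio_label, film_mode=False,
                    subtitles_in_teletext=False, subtitle_mode=0):
    """Pack the 14 widescreen bits; reserved bits 7..5 and 13..11 stay at 0."""
    word = aspect_ratio_code(ratio_label)        # bits 3..0
    word |= int(film_mode) << 4                  # bit 4 (assumed: 1 = film mode)
    word |= int(subtitles_in_teletext) << 8      # bit 8
    word |= (subtitle_mode & 0x3) << 9           # bits 10..9
    return word

codes = [aspect_ratio_code(label) for label in range(8)]
print([hex(code) for code in codes])
print(hex(widescreen_word(0b011, film_mode=True)))
```

The eight values produced by `aspect_ratio_code` match the constants 0x8, 0x1, 0x2, 0xB, 0x4, 0xD, 0xE and 0x7 listed for the widescreen modes.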

The values of bits 3 to 0 are rather complex and may depend on the MPEG sequences. The basic idea is that a 16:9 widescreen TV (as opposed to an older 4:3 one) sometimes needs extra information. If, for example, it receives an image with a 16:9 ratio inside a 4:3 broadcast frame (see [13] or [3]), there will be black bands at the top and bottom, as in the figure below. This case does arise in television broadcasting, so it is useful for the receiver to know about it: a 4:3 receiver will display the image with black horizontal bands, while a 16:9 receiver will adapt the image thanks to the widescreen information.

Note that some modern 4:3 televisions are able to decode the widescreen information, too.

In an MPEG sequence, we will have to take two parameters into account: the full screen size and the active region, which can be different; the widescreen setting will depend on this information.

We have shown only the simplest case here. See [2], [3], [7], and [13] for a complete explanation.

Camera mode/film mode

Film mode should not be very relevant as far as digital television is concerned. In this mode (unlike the default one), the two fields of a frame come from the same image; in television, the two fields of a frame are captured at different times and are therefore independent of each other.

Subtitles

Subtitles are possible; you can choose whether to put them inside the active image area or outside it (i.e. in this case in the black bottom band). See [2], [3].

A more complete approach on Teletext

This part deals with teletext.

The encoder

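For each frame (every 40 ms in PAL, that is, one field every 20 ms), the encoder sends a certain number of teletext lines, determined separately for each field of the frame. TTX data are required for both fields.

So if we look more precisely at one line:

[Figure: teletext data for one line]

And for a frame:

[Figure: teletext data for a frame]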

Teletext: understanding the MPEG2 standard

The PES teletext data field has this syntax:

Syntax	Number of bits
PES_data_field(){
data_identifier = 0x10 (EBU)	8
for(i=0;i<N;i++){
data_unit_id: can be 0x02 (EBU teletext), 0x03 (EBU subtitle) or 0xFF (stuffing)	8
data_unit_length	8
data_field()	352 (for EBU)
}
}

N is determined by the PES_packet_length (equal to N*184-6).

I do not think that we need to consider any value for data_identifier other than 0x10.

Each data_field looks like this (there is one data_field for each line sent to the encoder):

Description	Number of bits
data_field(){
reserved_future_use	2
field_parity	1
line_offset (0x00, or 0x07 to 0x16)	5
framing_code	8
magazine_and_packet_address	16
data_block	320
}

We do not care about the framing_code, magazine_and_packet_address and data_block; they are just sent to the encoder.
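The two syntax tables above can be turned into a small parser. This is a minimal sketch that only follows the field sizes given in this document; the function names are illustrative. The mapping of field_parity to a field (1 for the first field) is an assumption to check against [10], and the 313-line shift for the second field is simply deduced from the line ranges quoted earlier (7 to 22 and 320 to 335).

```python
DATA_UNIT_EBU_TELETEXT = 0x02
DATA_UNIT_EBU_SUBTITLE = 0x03
DATA_UNIT_STUFFING = 0xFF
DATA_FIELD_BYTES = 44  # 352 bits

def parse_pes_data_field(payload):
    """Return (data_identifier, list of parsed teletext data units)."""
    data_identifier = payload[0]
    units = []
    pos = 1
    while pos + 2 <= len(payload):
        unit_id = payload[pos]
        unit_length = payload[pos + 1]
        body = payload[pos + 2:pos + 2 + unit_length]
        pos += 2 + unit_length
        # Skip stuffing, unknown unit types and truncated units.
        if unit_id not in (DATA_UNIT_EBU_TELETEXT, DATA_UNIT_EBU_SUBTITLE):
            continue
        if len(body) != DATA_FIELD_BYTES:
            continue
        units.append({
            "data_unit_id": unit_id,
            "field_parity": (body[0] >> 5) & 0x1,  # 1 bit after 2 reserved bits
            "line_offset": body[0] & 0x1F,         # low 5 bits
            "framing_code": body[1],
            "magazine_and_packet_address": body[2:4],
            "data_block": body[4:44],
        })
    return data_identifier, units

def vbi_line(field_parity, line_offset):
    """Map a line_offset to a line number (assumes parity 1 = first field)."""
    return line_offset if field_parity == 1 else line_offset + 313

# Made-up payload: one teletext unit on line 7 of field 1, then one stuffing unit.
teletext_unit = bytes([0x02, 44, 0xE7, 0xE4, 0x12, 0x34]) + bytes(40)
stuffing_unit = bytes([0xFF, 44]) + bytes([0xFF] * 44)
identifier, units = parse_pes_data_field(bytes([0x10]) + teletext_unit + stuffing_unit)
print(hex(identifier), len(units), vbi_line(units[0]["field_parity"], units[0]["line_offset"]))
```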

Line offset

For a data_field with a data_unit_id of 0x02 (EBU Teletext for non-subtitle data), line_offset is between 0x07 and 0x16. This offset must be passed to the encoder, which needs to know the start and end lines. This means that we will need to parse the data a little. The offset corresponds to the line of the vertical blanking interval in which the information is sent (so it is not surprising to find such values).

Subtitles

There can be subtitles if there is a data_unit_id of 0x03. I think that we should send them to the encoder. However, the data_unit_id is not transmitted to the encoder, so I wonder how the encoder tells subtitle data from non-subtitle data. I wonder too whether the 64108 makes any distinction.

Stuffing data

There is also the case of stuffing data, which have a data_unit_id of 0xFF. We think that we should get rid of them. We need to find out what the 64108 does with them.

Data sent to the encoder

Basically, you have in the MPEG sequence for one line:

Field	Bytes	Destination
data_unit_id	1	skipped
data_unit_length	1	skipped
reserved_future_use / field_parity / line_offset (0x00, or 0x07 to 0x16)	1	skipped
framing_code	1	sent to encoder
magazine_and_packet_address	2	sent to encoder
data_block	40	sent to encoder

According to [11], page 18, 45 bytes are sent to the television for each teletext packet, divided into:

Field	Bytes	Destination
clock run-in, used for synchronisation (1010101010101010, i.e. 0xAAAA)	2	added by 64108
framing_code	1	comes from MPEG
magazine_and_packet_address	2	comes from MPEG
data_block	40	comes from MPEG
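The relation between the 44-byte data_field and the 45 bytes sent to the television can be sketched as follows. This is only a byte-counting model, assuming the clock run-in is the 0xAAAA pattern quoted above; it ignores bit ordering on the wire, and the function name is illustrative.

```python
CLOCK_RUN_IN = bytes([0xAA, 0xAA])  # 1010101010101010, added outside the MPEG data

def teletext_line_bytes(data_field):
    """Build the 45 bytes of one teletext line from a 44-byte data_field."""
    if len(data_field) != 44:
        raise ValueError("an EBU data_field is 44 bytes (352 bits) long")
    # Byte 0 (reserved bits, field_parity, line_offset) is skipped;
    # framing_code, magazine_and_packet_address and data_block are kept.
    return CLOCK_RUN_IN + data_field[1:]

example_field = bytes([0xE7, 0xE4, 0x12, 0x34]) + bytes(40)
line = teletext_line_bytes(example_field)
print(len(line))
```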

Bibliography

LSI Logic documentation

[1] L64208 Encoder User Specification, LSI Logic, June 2, 1998

ITU documentation

[2] ITU-R Recommendations, BT-R 1118, Enhanced compatible widescreen television based on conventional television systems.

[3] ITU-R Recommendations Wide Screen Signalling (WSS) encoding, BT-R 1119.

[4] ITU-R BT.653-2, on Teletext system-B encoding

Philips documentation

[5] Philips datasheet for SAA7182A, Digital video encoder, 1996 Sep 11, http://www-semiconductors.philips.com/acrobat/datasheets/SAA7182A_83A_2.pdf

[6] Philips Application Note, AN96055, programming tables for SAA7111, SAA7182/83 and SAA7184/85B/88A.

General explaining documentation

[7] Video demystified, Second Edition, Keith Jack, HighText, see http://www.video-demystified.com

[8] La télévision numérique: MPEG-1, MPEG-2 et système européen DVB Application, 2nd edition, Hervé Benoit, Dunod.

European Telecommunication Standard (ETS) documentation

[9] ETS 300 468, on Service Information in DVB, from European Telecommunication Standard.

[10] ETS 300 472, Specification for conveying ITU-R System B Teletext in DVB bitstreams, from European Telecommunication Standard, October 1996.

[11] ETS 300 706, on Enhanced Teletext Specification, from European Telecommunication Standard.

Others

[13] Digital Terrestrial Television, Requirements for interoperability, DTG, June 1997

[14] EIA-608-1994 - Closed Caption encoding
