Technical Specifications for High-Definition Programmes

VERSION 1.07 Classified December 2011 Technical Specifications for High-Definition Programmes TV serials




Technical Specification for High Definition Programmes V 1.07


Document history

Title: Specifications for purchasing high-definition programmes

Subject: TV serials

Classification: Classified

Issuing body/Structure: Rai Radiotelevisione Italiana S.p.A.

Contacts:

Review  Date        Notes
0       02/03/09
0.1
0.2
0.3
0.4     13/03/09    Dolby E section removed
0.5     19/03/09    Loudness measure included; subtitles regulation indicated (description to be included); metadata delivery indicated in XLS format, too – to be specified at a following release; reference to cinema genre eliminated
0.6     23/03/09    Loudness paragraph changes; glossary terms specified
0.7     27/03/09    Glossary extended
0.8     31/03/09    Introduction of EBU Recommendation R95 replacing SMPTE RP 128. Change in paragraph 5.
0.9     03/04/09    Delivery and production format correction
0.95    27/04/09    Paragraphs 2.4 and 5 review
0.98    27/04/09    Paragraph 5 layout and review
1.01    28/05/09    Paragraph 4.8 review
1.02    03/06/2009  Paragraph 4.11 – Target Loudness Level adjustment
1.03    15/06/09    Paragraph 3 review
1.04    29/07/2009  Paragraphs 2.7, 4.8 and 4.11 review (Rai Fiction remarks)
1.05    29/09/2011  Paragraphs 3, 3.2, 4.8, 4.10, 4.11 and 6 review
1.06    15/12/2011  Paragraphs 4.9
1.07    21/12/2011  Paragraphs 4.9 and editorial formatting


Table of contents

1 PREFACE
2 PRODUCTION
2.1 PRODUCTION VIDEO FORMATS
2.2 MOVING CAPTIONS
2.3 ANTI-PSE
2.4 SAFE AREA
2.5 PRESENCE OF CONTENT WITH DIFFERENT QUALITIES AND FORMATS
2.6 CONVERSIONS FROM OTHER FORMATS
2.7 AUDIO CHARACTERISTICS
3 DELIVERY FORMAT
3.1 LONG PROGRAMMES
3.2 ACCOMPANYING PAPERWORK AND LABELLING
3.3 METADATA
3.4 SUBTITLES
4 PACKAGE OF RECORDED MATERIAL
4.1 PROTECTION SEQUENCE
4.2 ALIGNMENT SEQUENCE
4.3 IDENTIFICATION SEQUENCE
4.4 START SEQUENCE
4.5 PROGRAMME
4.6 END SEQUENCE
4.7 TIMECODE
4.8 AUDIO TRACK ASSIGNMENT
4.9 AUDIO LEVEL AND CHANNEL IDENTIFICATION
4.10 MAXIMUM AUDIO LEVEL
4.11 PROGRAMME LOUDNESS
4.12 LIPSYNC
5 QUALITY REQUIREMENTS
6 SUMMARY OF TECHNICAL REGULATIONS AND RECOMMENDATIONS
7 GLOSSARY


1 Preface

This document sets out the guidelines for supplying high-definition (HDTV) content in the serial genre.

The sections below describe the technical aspects that complement the international regulations on HD video signals.

All technical parameters defined in the international regulations shall nevertheless be followed.

For the sake of convenience, paragraph 6 includes references to the main regulations and recommendations.

2 Production

HD content means content produced, post-produced and distributed in one of the formats currently identified as “high definition”.

2.1 Production video formats

The HD production chain shall respect EBU recommendation R124 for choice of compression algorithms and bitrates for the acquisition, production and distribution of HD content.

Rai accepts HD content produced in the format:

1920x1080 25p as preferential

In agreement with editorial needs, special productions can also be made in the formats:

1920x1080 25i

1920x1080 50p

1280x720 50p

Or with higher resolution (4K).

1920x1080 video signals shall comply with the SMPTE 274M standard and Recommendation ITU-R BT.709-5.

Video signals in the 1280x720 format shall comply with the SMPTE 296M standard.

The colorimetric reference for each format is the one specified in Recommendation ITU-R BT.709-5.

If the acquisition is made on film, use a 35 mm or larger medium, preferably with an aspect ratio of 1.78 (16:9).

However, at no point in the chain shall the content be encoded with a horizontal or vertical resolution lower than the one required by Rai, except as provided in paragraph 2.5.

2.2 Moving Captions

Moving captions shall be added in the native format, making sure they remain readable after any subsequent conversion to interlaced formats.


2.3 Anti-PSE

Flashes, intermittent lights and certain types of repetitive visual patterns can cause problems for viewers with photosensitive epilepsy (PSE). By its nature, television is an intermittent source of light, so the risk of triggering seizures in people with this form of epilepsy cannot be eliminated completely; however, some precautions can be taken to reduce the risk, above all when it is gratuitous and unnecessary. For basic guidelines on this problem, see the website of the Independent Television Commission (since superseded by Ofcom): www.ofcom.org.uk.

2.4 Safe area

The safe area for high-definition formats must be the one described in EBU Recommendation R95.

Programmes produced in high definition will be broadcast on standard-definition platforms in 16:9 or 4:3 letterbox format; 16:9 protected shooting is therefore not necessary.

The EBU Recommendation provides for:

The main action shall be included in the central area up to 90% of height and 93% of width of HD image;

Captions and graphics shall be included in the central area up to 90% of the height and 80% of the width of the HD image.
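The percentages above can be turned into pixel rectangles for a given raster. The following is a minimal sketch, assuming the safe areas are centred in the frame and using the preferred 1920x1080 format; the function name `safe_area` is illustrative and does not come from the EBU Recommendation.

```python
def safe_area(width, height, width_fraction, height_fraction):
    """Return (left, top, right, bottom) of a centred safe area, in pixels."""
    safe_w = round(width * width_fraction)
    safe_h = round(height * height_fraction)
    left = (width - safe_w) // 2
    top = (height - safe_h) // 2
    return left, top, left + safe_w, top + safe_h


# Fractions quoted in this paragraph: 93% x 90% for the main action,
# 80% x 90% for captions and graphics.
action = safe_area(1920, 1080, 0.93, 0.90)
graphics = safe_area(1920, 1080, 0.80, 0.90)

print("main action area:", action)
print("captions/graphics area:", graphics)
```

For a 1280x720 production the same function can be called with the smaller raster; the fractions stay the same.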

2.5 Presence of content with different qualities and formats

The high-definition product can include a portion of original content in standard definition, if justified by editorial requirements; that portion must have been upconverted to high definition through state-of-the-art technology. In the same way, for specific editorial requirements, content encoded through non-professional codecs can be present. In case of 16:9 native format content, the upconversion operation shall neither introduce any aspect changes, nor alter geometric proportions.

Any 4:3 format content shall be converted to 16:9 using methods compliant with editorial requirements, without altering geometric proportions, taking care to keep the main elements of the 4:3 native content (graphics, action, etc.).


2.6 Conversions from other formats

For productions made in cinematographic aspect ratios (1.66:1, 1.85:1, 2.35:1), the aspect ratio shall likewise be converted, as for 4:3 content, keeping the main elements of the native content and without altering geometric proportions. The following example shows conversion to 16:9 simply by adding black bars, without removing any part of the original video image:

Fig. 1. Conversion of 1.85:1, 1.66:1 and 2.39:1 images to 16:9 by adding black bars (figure not reproduced)
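The size of the black bars follows directly from the source aspect ratio. The sketch below is one way to compute it, assuming a 1920x1080 target frame and rounding the active picture to an even number of lines or pixels; the function name `fit_into_16x9` and the even-rounding choice are assumptions made for illustration, not requirements of this specification.

```python
def fit_into_16x9(source_aspect, width=1920, height=1080):
    """Scale a picture into the frame without cropping or stretching.

    Returns (active_width, active_height, bar_size, bar_kind), where bar_size
    is the thickness in pixels of each of the two black bars.
    """
    frame_aspect = width / height
    if source_aspect >= frame_aspect:
        # Wider than the frame: use the full width, bars above and below.
        active_h = round(width / source_aspect / 2) * 2
        return width, active_h, (height - active_h) // 2, "letterbox"
    # Narrower than the frame: use the full height, bars left and right.
    active_w = round(height * source_aspect / 2) * 2
    return active_w, height, (width - active_w) // 2, "pillarbox"


for aspect in (1.85, 1.66, 2.39):
    print(aspect, fit_into_16x9(aspect))
```

Because the picture is only scaled uniformly and padded, geometric proportions are preserved and no part of the original image is removed.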

2.7 Audio characteristics

Audio content must be produced in compliance with audio/video industry standards, internationally shared regulations and best practices. The audio source must be produced with the necessary precautions, making sure that noise, radio interference, interruptions and distortions are not present.

Audio levels shall be consistent throughout the programme.

Audio tracks must not show alterations in dynamics and/or bandwidth due to the action of noise reduction systems, or of coding/decoding systems with an insufficient number of bits.

Audio acquisition shall use a 48 kHz sampling rate with 24-bit depth.

Rai recommends that all HD programmes be produced with multichannel audio (5.1), unless editorial requirements call for stereophonic audio (2.0).


3 Delivery format

The HD content must be delivered to Rai on a 120 mm re-writable optical medium, “Professional Disc” XDCAM HD 4:2:2.

The delivered product shall originate from the master with the best possible quality, minimising the number of generations.

As an alternative to the optical medium, Rai may require an MXF file compatible with “Sony MPEG Long GOP products”. In this respect, refer to document SMPTE RDD9-2009 – MXF Interoperability Specification of Sony MPEG Long GOP Products.

The supplier's procedure for delivering the MXF file shall be agreed with Rai.

For high-quality productions, possibly produced in formats over 2K, Rai may also require supply of a medium in the format:

HDCAM-SR at 440 Mb/s (Standard Quality) or 880 Mb/s (High Quality).

3.1 Long programmes

If a programme is longer than the maximum recording capacity of the medium, it shall be recorded on two or more media.

In that case, the programme images at the end of one medium must join those at the beginning of the next with no overlap. Media are numbered in sequence starting from 1.

The programme timecode must run continuously across the media, so that there is no discontinuity between the last useful frame of the programme portion recorded on one medium and the first frame of the programme portion on the next (paragraph 4.7).

The recording structure on each medium shall nevertheless comply with paragraph 4.

No medium shall contain more than one programme, except for compilations of homogeneous and very short programmes.

3.2 Accompanying paperwork and labelling

Each medium shall be accompanied by all the information necessary for its identification. In particular, each medium shall carry a label (placed in the space provided) that must include:

Production Company and contact data

Rai agreement number

Programme full title

Secondary Title and/or number of the series

Medium number (written in the format 1/N, 2/N, etc., where N is the total number of media making up the programme); if the programme is recorded on a single medium, write “1/1”

Audio signal format (stereo, multichannel)

Programme Loudness Level (ref. EBU R128)

Loudness Range (ref. EBU R128)

Maximum True Peak Level (ref. EBU R128)


Video image format

Original image format

The same information shall be present on the label of the medium's case (container).

Moreover, a technical sheet shall be attached, including the aforesaid information as well as:

Overall programme time

Medium programme time (in case of several media)

Timecode of the first useful frame of the programme

Presence of upconverted content or content with lower quality for definite editorial needs

Loudness value (see paragraph 4.11)

Special notes, relevant for correct broadcasting

3.3 Metadata

Rai reserves the right to prepare a list of metadata to be processed in IT format (e.g. Excel® spreadsheet) to be supplied together with the Master. In that case, delivery procedure, formats and structure of metadata required will also be specified.

3.4 Subtitles

If Italian and/or English subtitles are present, they must be supplied in a file whose format complies with EBU Recommendations Tech N19 and Tech 3264-E. The file shall be delivered to the requesting department on a CD or by e-mail. In addition, it can be stored in the user data space of the XDCAM disc.

4 Package of recorded material

4.1 Protection sequence

For the HDCAM-SR medium only, the recording shall open with a protection sequence consisting of at least 10 seconds of unrecorded tape.

4.2 Alignment sequence

The reference signals are to be recorded immediately after the protection sequence (present only on the HDCAM-SR medium). The timecode track starts together with the alignment sequence. The alignment signals are:

Video: SMPTE colour bars with 75% saturation

Audio: continuous 1 kHz tone at alignment level. For a stereophonic signal, the left channel (channel 1) is identified by interrupting the 1 kHz tone for 250 ms every 3 seconds (see EBU Tech 3304 – 2005; former EBU Tech R49 – 1999).


4.3 Identification sequence

The programme must be identified by a static image lasting at most 15 seconds; the image must show the essential information about the programme (consistent with the accompanying paperwork and the disc label). The essential information is:

Production house

Title

Subtitle and/or series number

Medium number (written in the format 1/N, 2/N, etc., where N is the total number of tapes making up the programme); if the programme is recorded on a single tape, write “1/1”

Audio format

Video format

Original video format

Duration

4.4 Start sequence

The start sequence must last 10 seconds and show a countdown on a round clock. The countdown stops 2 seconds before the programme starts. The aspect ratio should be 1.78 (16:9). The audio tracks should carry silence for the whole length of the sequence. A timecode track consistent with the previous parts should nevertheless be present.

4.5 Programme

The timecode track must be consistent with the previous recording portions without gaps or interruptions for the whole programme.

4.6 End sequence

At the programme end at least 30 seconds of black video and silence must be present; the timecode track is nevertheless present and consistent with the programme.

The recording structure is summarised in Table 1.

Table 1

Tape section                Time (s)            Video                            Audio
Protection sequence (1)     10 (minimum)        not recorded                     not recorded
Alignment sequence          60 (minimum)        75% SMPTE colour bars            1 kHz at reference level
Identification sequence     15 (maximum)        programme visual identification  silence or sound identification
Start sequence              10                  countdown                        silence
Programme                   programme duration  programme video                  programme audio
End sequence                30 (minimum)        black                            silence

(1) Present only for an HDCAM-SR tape


4.7 Timecode

All media (including files) should be supplied with timecode.

The timecode signal must comply with SMPTE 12M or EBU N12-1994 regulations.

As for 50 Hz progressive formats, an SMPTE recommendation is expected; in the meantime, refer to the current regulations (SMPTE 12M or EBU N12-1994).

The timecode corresponding to the first frame of the programme must be 01:00:00:00 or 10:00:00:00.

The following table includes an example of timecode values in case of first programme frame with 01:00:00:00 timecode, with reference to the recording structure described in paragraph 4.

Table 2. Example of timecode for a programme lasting 90 minutes

Timecode start (medium 1)   Tape section              Time
No timecode                 protection sequence       10 s
00:58:35:00                 alignment sequence        60 s
00:59:35:00                 identification sequence   15 s
00:59:50:00                 start sequence            10 s
01:00:00:00                 programme                 90 min
02:30:00:00                 end sequence              30 s

Different timecode values may be accepted by prior agreement with Rai. In any case, the timecode signal within a single medium must be continuous, consistent and free of errors, and shall not pass through zero at any point of the whole recording (including start and end sequences).
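The timecode values in the example can be derived by counting back from the first programme frame. The sketch below assumes 25 frames per second, non-drop-frame counting, and the section durations used in the example (60, 15 and 10 seconds); the function names are illustrative only.

```python
FPS = 25  # assumed frame rate (25p / 25i material)

# Lead-in sections recorded with timecode, using the durations of the example.
LEAD_IN = [
    ("alignment sequence", 60),
    ("identification sequence", 15),
    ("start sequence", 10),
]


def from_timecode(timecode, fps=FPS):
    """Convert HH:MM:SS:FF to a frame count."""
    hh, mm, ss, ff = (int(part) for part in timecode.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff


def to_timecode(frames, fps=FPS):
    """Convert a frame count to HH:MM:SS:FF."""
    seconds, ff = divmod(frames, fps)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def section_starts(programme_start, programme_seconds, fps=FPS):
    """Return [(start timecode, section name), ...] for one medium."""
    position = from_timecode(programme_start, fps)
    position -= sum(duration for _, duration in LEAD_IN) * fps
    if position < 0:
        raise ValueError("the lead-in would make the timecode pass through zero")
    rows = []
    for name, duration in LEAD_IN + [("programme", programme_seconds)]:
        rows.append((to_timecode(position, fps), name))
        position += duration * fps
    rows.append((to_timecode(position, fps), "end sequence"))
    return rows


for start, name in section_starts("01:00:00:00", 90 * 60):
    print(start, name)
```

The explicit check against a negative start position reflects the rule that the timecode shall not pass through zero anywhere in the recording.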

4.8 Audio track assignment

Audio tracks 1 and 2 of the medium shall include the stereo downmix (possibly encoded in Dolby Surround) of the multichannel audio, in compliance with EBU R48 regulation.

The multichannel audio will be allocated on tracks 3-8 pursuant to the following scheme (ref: R48 scheme 11b):

Table 3

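Track  Channel  Description              Content
1      Lt       Left mix (total)         Left downmix, Italian language
2      Rt       Right mix (total)        Right downmix, Italian language
3      L        Left                     5.1 audio, Italian language
4      R        Right                    5.1 audio, Italian language
5      C        Centre                   5.1 audio, Italian language
6      LFE      Low Frequency Effects    5.1 audio, Italian language
7      LS       Left Surround            5.1 audio, Italian language
8      RS       Right Surround           5.1 audio, Italian language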
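A track allocation of this kind is easy to check mechanically before delivery. The following sketch copies the track-to-channel mapping from Table 3 and compares it with a delivered layout; the function name and the swapped-channel example are hypothetical and only illustrate the idea.

```python
# Track-to-channel assignment copied from Table 3 (5.1 programme, one language).
TABLE_3 = {1: "Lt", 2: "Rt", 3: "L", 4: "R", 5: "C", 6: "LFE", 7: "LS", 8: "RS"}


def layout_mismatches(found, expected=TABLE_3):
    """Compare a {track: channel} mapping with the expected layout.

    Returns a list of (track, expected_channel, found_channel) tuples in track
    order; found_channel is None when the track is missing.
    """
    problems = []
    for track in sorted(expected):
        actual = found.get(track)
        if actual != expected[track]:
            problems.append((track, expected[track], actual))
    return problems


# Hypothetical delivery in which centre and LFE have been swapped.
delivered = dict(TABLE_3)
delivered[5], delivered[6] = "LFE", "C"
print(layout_mismatches(delivered))
```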


Since the XDCAM medium is limited to 8 audio tracks, in case of programmes with a second language, a separate medium for the second language should be prepared, with the same audio track allocation.

In case of programmes with a second language but with stereophonic audio, the second language should be allocated on tracks 3 and 4.

A later version of this document will address the possibility of delivering media with multichannel audio encoded in Dolby E.

In case of programmes provided with audio description, a new medium with audio tracks arranged as per Table 4 must be produced.

Table 4

Track  Channel  Description                Content
1      Lt       Left mix (total)           Left downmix, Italian language
2      Rt       Right mix (total)          Right downmix, Italian language
3-6    DE       Silence                    Not used
7-8    Mono     Single track or dual mono  Teleaudio: audio description for blind/visually impaired

In case of programmes with just stereophonic audio, the teleaudio track can be allocated in the same medium.

4.9 Audio level and channel identification

Audio levels are defined in document ITU-R BS.1726. The alignment (reference) level must be -18 dBFS (EBU Rec. R68 – PPM 4 on a BBC PPM instrument – IEC type IIa).

Tones should be at Alignment Level (AL), equal to -18 dBFS, with a level tolerance of no more than +/- 0.1 dB. The frequency of the alignment tones should be 1 kHz +/- 100 Hz for the main channels L, C, R and the surround channels LS and RS, and 80 Hz +/- 5 Hz for the LFE channel.
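The tolerances above can be expressed as a simple check on measured tone values. This is a minimal sketch: the measured frequency and level are assumed to come from a separate measuring instrument, the dBFS figure is assumed to refer to the sine-wave peak relative to digital full scale, and all names are illustrative.

```python
ALIGNMENT_LEVEL_DBFS = -18.0
LEVEL_TOLERANCE_DB = 0.1
# Channel kind -> (nominal frequency in Hz, tolerance in Hz), from this paragraph.
TONE_FREQUENCIES = {"main": (1000.0, 100.0), "lfe": (80.0, 5.0)}
EPS = 1e-9  # absorbs floating-point rounding at the edges of the tolerance


def peak_sample_value(level_dbfs=ALIGNMENT_LEVEL_DBFS, bit_depth=24):
    """Peak sample value of a tone at the given level.

    Assumption: dBFS is taken relative to the largest positive sample value.
    """
    full_scale = 2 ** (bit_depth - 1) - 1
    return full_scale * 10 ** (level_dbfs / 20)


def tone_within_tolerance(channel_kind, frequency_hz, level_dbfs):
    """Check a measured alignment tone against the stated tolerances."""
    nominal_hz, tolerance_hz = TONE_FREQUENCIES[channel_kind]
    frequency_ok = abs(frequency_hz - nominal_hz) <= tolerance_hz + EPS
    level_ok = abs(level_dbfs - ALIGNMENT_LEVEL_DBFS) <= LEVEL_TOLERANCE_DB + EPS
    return frequency_ok and level_ok


print(tone_within_tolerance("main", 1000.0, -18.05))  # inside both tolerances
print(tone_within_tolerance("lfe", 86.0, -18.0))      # frequency too high
```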

Channel identification shall comply with EBU Tech 3304: on all channels, 3 seconds of 1 kHz tone are followed by 0.5 seconds of silence; the channels are then identified clockwise, starting from the L channel. The identification signal is a 1 kHz tone lasting 0.5 seconds, followed by 0.5 seconds of silence before the tone on the next channel; after a final pause of 0.5 seconds, the sequence is repeated, starting from the 3 seconds of continuous tone on all the main channels.

The time necessary for each identification sequence depends on the number of channels of the chosen format (for example, 6.0 seconds for 5.1 or 5.0 audio), and thus it indirectly reveals the multichannel format chosen.

The whole identification sequence should be repeated at least 4 times; in the interval between the end of the last sequence and the end of colour bars, the 1 kHz tone shall be kept active for all the main channels.

The 80 Hz tone for the LFE channel is continuous for the whole length of the sequence. Although the tone is recorded at alignment level, the LFE channel is conventionally reproduced at a 10 dB higher level with respect to the main channels; as a consequence, a certain degree of balancing in the perceived sound intensity is respected.


4.10 Maximum audio level

The maximum audio level is based on the concept of “True-Peak Audio Level” described in Recommendation ITU-R BS.1770.

The Maximum True Peak Audio Level of the programme shall respect provisions in EBU recommendation R128.

The Maximum True Peak Audio Level shall be measured through an instrument complying with the procedures described in Recommendation ITU-R BS.1770.

4.11 Programme Loudness

Programme Loudness – average value of programme loudness – is defined in EBU Recommendation R128 and shall be measured by means of instruments in compliance with the procedure described in Recommendation ITU-R BS.1770.

Programme Loudness shall be measured over the whole length of the programme, starting from the first useful picture and excluding the technical signals at the beginning and at the end.

The measured Programme Loudness Level shall comply with the Target Level defined in EBU Recommendation R128.

The same Programme Loudness Level shall be ensured for the different soundtracks associated with the video (e.g. stereo and multichannel).
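A delivery check for paragraphs 4.10 and 4.11 could look like the sketch below. This document does not state numeric limits; the values used here (-23.0 LUFS target, +/- 0.5 LU tolerance, -1 dBTP maximum true peak) are the figures commonly quoted from EBU R128 and are assumptions that should be verified against the edition of R128 in force. The measured values are assumed to come from an ITU-R BS.1770 compliant meter.

```python
# Assumed limits, quoted from EBU R128 rather than from this document.
TARGET_LEVEL_LUFS = -23.0
TOLERANCE_LU = 0.5
MAX_TRUE_PEAK_DBTP = -1.0


def loudness_issues(soundtracks):
    """Check measured values for each soundtrack of a programme.

    soundtracks maps a name to (programme_loudness_lufs, max_true_peak_dbtp).
    Returns a list of human-readable problems; an empty list means no issue.
    """
    issues = []
    for name, (loudness, true_peak) in soundtracks.items():
        if abs(loudness - TARGET_LEVEL_LUFS) > TOLERANCE_LU:
            issues.append(f"{name}: programme loudness {loudness:.1f} LUFS is off target")
        if true_peak > MAX_TRUE_PEAK_DBTP:
            issues.append(f"{name}: true peak {true_peak:.1f} dBTP is too high")
    return issues


# Hypothetical measurements for the two soundtracks of one programme.
measured = {"stereo": (-23.0, -1.5), "5.1": (-22.8, -2.0)}
print(loudness_issues(measured))
```

Checking every soundtrack against the same target also covers the requirement that stereo and multichannel versions share the same Programme Loudness Level.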

4.12 Lipsync

The time relationship (synchronisation) between the audio signal and video image shall not show perceptible errors.

Lipsync shall comply with EBU Recommendation R37: the audio signal shall neither precede the video image by more than 5 ms, nor follow it by more than 15 ms.
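The asymmetric window can be written as a one-line check. In this sketch the sign convention (negative offset when audio precedes the picture, positive when it follows) is an assumption chosen for illustration, and the offset is assumed to have been measured elsewhere.

```python
MAX_AUDIO_LEAD_MS = 5.0   # audio may precede the picture by at most 5 ms
MAX_AUDIO_LAG_MS = 15.0   # audio may follow the picture by at most 15 ms


def lipsync_ok(audio_offset_ms):
    """Return True if the audio/video offset is inside the permitted window.

    Convention used here: negative values mean the audio precedes the picture,
    positive values mean the audio follows it.
    """
    return -MAX_AUDIO_LEAD_MS <= audio_offset_ms <= MAX_AUDIO_LAG_MS


for offset in (-6.0, -5.0, 0.0, 15.0, 16.0):
    print(offset, lipsync_ok(offset))
```

Both limits are shorter than one 25 Hz frame period (40 ms), so an error of a whole frame in either direction falls outside the window.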

5 Quality requirements

Rai requires sound and image quality equal to grade 5 (“Excellent”) on the evaluation scale of Recommendation ITU-R BT.500.

The medium shall be intact (with no abrasions, breaks or mechanical defects).

The medium must not have any imperfections causing incorrect reproduction of the audio/video content.

As for the audio/video content:

The image shall be framed with respect to the TV frame, in compliance with paragraph 2.4;

The image shall have no defects such as loss of resolution and scrambling noise; the audio shall have no distortions.


6 Summary of technical regulations and recommendations

Identification Title

ITU-R BT.709 Parameter values for the HDTV standards for production and international programme exchange

ITU-R BS.1726 Signal level of digital audio accompanying television in international programme exchange

ITU-R BS.775 Multichannel stereophonic sound system with and without accompanying picture

ITU-R BS.1738 Identification and ordering of multiple audio channels carried on international contribution circuits

ITU-R BR.1384 Parameters for international exchange of multi-channel sound recordings with or without accompanying picture

ITU-R BT.500 Methodology for the subjective assessment of the quality of television pictures

ITU-R BS.1770 Algorithms to measure audio programme loudness and true-peak audio level

ITU-R BS.1771 Requirements for loudness and true-peak indicating meters

ITU-R BT.1702 Guidance for the reduction of photosensitive epileptic seizures caused by television

SMPTE 292M Bit-Serial Interfaces for High-Definition Television Systems

SMPTE 274M 1920x1080 Scanning and Analog and Parallel Digital Interfaces for Multiple Picture Rates

SMPTE 296M 1280x720 Progressive Image Sample Structure — Analog and Digital Representation and Analog Interface

SMPTE 377M Material Exchange Format (MXF) – File Format Specification

SMPTE RP 218 Specifications for Safe Action and Safe Title Areas for Television Systems

SMPTE 12M Transmission of Time Code in the Ancillary Data Space

EBU R124 Choice of HDTV Compression Algorithm and Bitrate for Acquisition, Production & Distribution

EBU R37 Relative timing of the sound and vision components of a television signal

EBU R68 Alignment level in digital audio production equipment and in digital audio recorders

EBU R128 Loudness normalisation and permitted maximum level of audio signals

EBU Tech 3341 Loudness Metering: ‘EBU Mode’ metering to supplement loudness normalisation in accordance with EBU R 128

EBU Tech 3342 Loudness Range: A measure to supplement loudness normalisation in accordance with EBU R 128

EBU Tech 3320 User requirements for Video Monitors in Television Production


7 Glossary

EBU: European Broadcasting Union

ITU: International Telecommunication Union

SMPTE: Society of Motion Picture and Television Engineers

HD: High Definition

VBI: Vertical Blanking Interval

XDCAM HD 422: Sony apparatus recording HD 4:2:2 video formats in the MPEG-2 @ 50 Mbit/s format

HDCAM-SR ® (Superior Resolution): Sony MPEG-4 SP (Studio Profile) format/medium

Dolby ® E: Dolby proprietary format of audio encoding with transport onto digital interface AES/EBU

PML (Permitted Maximum Level): maximum audio level permitted within a TV programme

dBFS: decibels relative to full scale

LKFS: Loudness, K weighted, relative to nominal full scale: it is the loudness unit of measurement in Recommendation ITU-R BS 1770.