Audio -...

Page 1: Audio - userspages.uob.edu.bhuserspages.uob.edu.bh/mangoud/mohab/Courses_files/ho_multimedia7.… · digital form using an audio signal encoder. ... again into its analog form using

AUDIO REPRESENTATION

Audio

• Speech signals, as used in a variety of interpersonal applications including telephony and video telephony.

• Music-quality audio, as used in applications such as CD-on-demand and broadcast television.

• Audio can also be produced by a synthesizer; in that case the audio is created in digital form and hence can be readily stored within the computer memory.

• A microphone, however, generates a time-varying analog signal and in order to store such signals in the memory of a computer, and to transmit them over a digital network, they must first be converted into a digital form using an audio signal encoder.

• Also, since loudspeakers operate using an analog signal, on output of all digitized audio signals the stream of digitized values must be converted back again into its analog form using an audio signal decoder.

• We will explain the digitization of both speech and music produced by a microphone.

• We shall then discuss the format of synthesized audio.

• The bandwidth of a typical speech signal is from 50 Hz through to 10kHz and that of a music signal from 15Hz through to 20kHz.

• Hence the sampling rate used for the signals must be in excess of their Nyquist rate which is 20ksps (2 x 10kHz) for speech and 40ksps (2 x 20kHz) for music.

• The number of bits per sample must be chosen so that the quantization noise generated by the sampling process is at an acceptable level relative to the minimum signal level. Example values are 12 bits per sample for speech and 16 bits per sample for music.

• In addition, since in most applications involving music stereophonic (stereo) sound is utilized (and hence two such signals must be digitized) this results in a bit rate double that of a monaural (mono) signal.
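The bit rates implied by the figures above can be checked with a short calculation. This is a minimal sketch that assumes sampling exactly at the Nyquist rate and uses the bandwidths and sample sizes quoted on this slide; the function name `pcm_bit_rate` is illustrative only.

```python
def pcm_bit_rate(max_freq_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Bit rate in bits per second when sampling at the Nyquist rate."""
    nyquist_rate = 2 * max_freq_hz          # samples per second
    return nyquist_rate * bits_per_sample * channels


# Speech: 10 kHz upper band edge, 12 bits per sample, mono
speech_bps = pcm_bit_rate(10_000, 12)

# Music: 20 kHz upper band edge, 16 bits per sample, mono and stereo
music_mono_bps = pcm_bit_rate(20_000, 16)
music_stereo_bps = pcm_bit_rate(20_000, 16, channels=2)

print(f"speech:         {speech_bps / 1000:.0f} kbps")
print(f"music (mono):   {music_mono_bps / 1000:.0f} kbps")
print(f"music (stereo): {music_stereo_bps / 1000:.0f} kbps")
```

With these inputs the stereo music rate is exactly double the mono rate, as the last bullet states.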

Example 2.4

PCM speech

• Most interpersonal applications involving speech use the public switched telephone network (PSTN).

• The bandwidth of a speech circuit in this network was limited to 200 Hz through to 3.4 kHz. Although the Nyquist rate is 6.8 ksps, a sampling rate of 8 kHz was required to avoid aliasing.

• In order to minimize the resulting bit rate:
– 7 bits per sample were selected for use in North America and Japan
– 8 bits per sample were selected in Europe
– The resulting bit rates are 56 kbps and 64 kbps respectively

• More modern systems have moved to using 8 bits per sample in each case, giving a much improved performance over early 7-bit systems.

PCM Principles: Signal Encoding and Decoding Schematic

• To reduce the effect of quantization noise the quantization intervals are made non-linear (unequal) with narrower intervals used for smaller amplitude signals than for larger signals.

• This is achieved by means of the compressor circuit.

• At the destination, the reverse operation is performed by the expander circuit.

• The overall operation is known as Companding.

PCM Principles: Companding

• International standard:
– ITU-T Recommendation G.711

• Companding employs a non-linear (unequal) set of quantization steps
– Linear quantization produces quantization noise that is independent of signal level
– The ear is more sensitive to noise on quiet signals than on loud ones
– Finer quantization at lower levels provides an increased signal quality, especially with 8-bit PCM

PCM Principles: Compressor Characteristic

PCM Principles: Expander Characteristic

• In practice, for historical reasons, there are two different compression-expansion characteristics in use:

• µ-law, which is used in North America and Japan, and

• A-law, which is used in Europe and some other countries.
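The compressor/expander idea can be sketched with the continuous µ-law formula. This is an illustration under stated assumptions, not the G.711 codec itself: G.711 uses a piecewise-linear (segmented) approximation, whereas the code below uses the smooth curve with µ = 255, samples normalized to the range -1 to 1, and a simple uniform 8-bit quantizer. All function names are hypothetical.

```python
import math

MU = 255  # assumed mu value; continuous formula, not the G.711 segment tables


def compress(x: float, mu: int = MU) -> float:
    """Compressor: boosts small amplitudes before uniform quantization."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def expand(y: float, mu: int = MU) -> float:
    """Expander: exact inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize(v: float, bits: int = 8) -> float:
    """Uniform quantizer on [-1, 1]; returns the midpoint of the chosen interval."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(int((v + 1.0) / step), levels - 1)
    return -1.0 + (index + 0.5) * step


quiet_sample = 0.01  # a small-amplitude input

# Error with plain uniform 8-bit quantization
uniform_error = abs(quantize(quiet_sample) - quiet_sample)

# Error when the sample is compressed, quantized, then expanded
companded = expand(quantize(compress(quiet_sample)))
companded_error = abs(companded - quiet_sample)

print(f"uniform error:   {uniform_error:.6f}")
print(f"companded error: {companded_error:.6f}")
```

For this quiet sample the companded path gives the smaller error, which is the effect of using narrower quantization intervals at low amplitudes.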

2.5.2 CD-quality audio

• The discs used in CD players and CD-ROMs are digital storage devices for stereophonic music and more general multimedia information streams.

• There is a standard associated with these devices which is known as the CD-digital audio (CD-DA) standard.

CD-Quality Audio

• Music bandwidth = 15 Hz to 20 kHz
– Minimum sampling rate = 40 ksps
– Actual sampling rate = 44.1 ksps
– 16 bits/sample
– Bit rate per channel = 44.1 x 10^3 x 16 = 705.6 kbps
– Stereo means that the total bit rate is 1.411 Mbps

• Much greater than the 64 kbps of a PCM telephone channel.
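The CD-quality figures can be reproduced with a few lines of arithmetic. This is a sketch using only the sampling rate, sample size and channel count given on this slide; the storage helper `storage_bytes` is a hypothetical addition for illustration.

```python
SAMPLING_RATE = 44_100      # samples per second per channel
BITS_PER_SAMPLE = 16
CHANNELS = 2                # stereo
PCM_TELEPHONE_BPS = 64_000  # 8 kHz x 8 bits, for comparison

per_channel_bps = SAMPLING_RATE * BITS_PER_SAMPLE
stereo_bps = per_channel_bps * CHANNELS


def storage_bytes(duration_s: int, bit_rate_bps: int = stereo_bps) -> int:
    """Bytes needed to store duration_s seconds of audio at the given bit rate."""
    return duration_s * bit_rate_bps // 8


print(f"per channel: {per_channel_bps / 1e3:.1f} kbps")
print(f"stereo:      {stereo_bps / 1e6:.4f} Mbps")
print(f"ratio to a PCM telephone channel: {stereo_bps / PCM_TELEPHONE_BPS:.2f}")
print(f"one hour of stereo audio: {storage_bytes(3600) / 1e6:.2f} MB")
```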

Example 2.5

Synthesized audio

• Synthesized audio is often used in multimedia applications.

• The amount of memory required is less than that required to store the equivalent digitized waveform version.

• In addition, it is much easier to edit synthesized audio and to mix several passages together.

• The main components that make up an audio synthesizer are shown in Figure 2.18.

Audio/Sound Synthesizer Schematic

• The secondary storage interface allows the sequence of messages relating to a particular piece of audio to be saved on secondary storage.

• There are also programs that allow the user to edit a previously entered passage and, if required, to mix several stored passages together.

• In addition to the (piano) keyboard, there is a range of other possible inputs from instruments such as an electric guitar, all of which generate messages similar to those produced by the keyboard.

• In order to discriminate between the inputs from the different possible sources, a standard known as the Musical Instrument Digital Interface (MIDI) is used. As the name implies, this does not just define the format of the standardized set of messages used by a synthesizer, but also the type of connectors, cables, and electrical signals that are used to connect any type of device to the synthesizer.
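To give a feel for why a stored message sequence is so compact, here is a minimal sketch of two MIDI channel messages. The slides do not show the message layout; the three-byte Note On / Note Off layout used below (a status byte carrying the message type and a 4-bit channel number, followed by two 7-bit data bytes) is taken from the MIDI 1.0 specification as commonly described, and the helper names are hypothetical.

```python
NOTE_OFF = 0x80  # status nibble for Note Off
NOTE_ON = 0x90   # status nibble for Note On


def channel_message(status: int, channel: int, data1: int, data2: int) -> bytes:
    """Build a three-byte channel message: status|channel, data1, data2."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be in 0..15")
    if not (0 <= data1 <= 127 and 0 <= data2 <= 127):
        raise ValueError("data bytes must be in 0..127")
    return bytes([status | channel, data1, data2])


def note_on(channel: int, note: int, velocity: int) -> bytes:
    return channel_message(NOTE_ON, channel, note, velocity)


def note_off(channel: int, note: int, velocity: int = 0) -> bytes:
    return channel_message(NOTE_OFF, channel, note, velocity)


# Middle C (note number 60) pressed and released on the first channel
passage = [note_on(0, 60, 100), note_off(0, 60)]
total_bytes = sum(len(message) for message in passage)

print([message.hex() for message in passage])
print(f"{total_bytes} bytes describe the note, whatever its duration")
```

The same note stored as a digitized waveform would need tens of kilobytes per second, which is the memory saving the first bullets refer to.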

VIDEO REPRESENTATION

Interlaced Scanning Principles

• NTSC employs interlaced scanning

• This is due to bandwidth limitations in the first half of the 20th century

Color Source Properties

• Brightness
– The amount of energy that hits the eye

• Hue
– The actual color of the source

• Saturation
– Vividness of the color
– Pastels have a lower saturation, e.g. pink has a lower saturation level than red

• Television transmission does not employ an RGB color space

Luminance and Chrominance

• To transmit RGB requires three times the bandwidth of a black & white video signal

• Transform the RGB to a luminance-chrominance color space, as most of the bandwidth is in the luminance plane
– Driven by the limited bandwidth available
– A B&W TV can receive and directly display a color composite video broadcast
– Very clever

• Two analog luminance-chrominance color spaces
– NTSC (YIQ)
– PAL (YUV)
– Computer (YCbCr) (digital)

Color Transformations

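$$
\begin{aligned}
\text{PAL:}\quad Y &= 0.299R + 0.587G + 0.114B \\
U &= 0.493\,(B - Y) \\
V &= 0.877\,(R - Y)
\end{aligned}
$$

$$
\begin{aligned}
\text{NTSC:}\quad Y &= 0.299R + 0.587G + 0.114B \\
I &= 0.74\,(R - Y) - 0.27\,(B - Y) \\
Q &= 0.48\,(R - Y) + 0.41\,(B - Y)
\end{aligned}
$$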

Transformations are different as the PAL & NTSC primaries are not the same
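The two transformations can be sketched in code. This is a minimal illustration that assumes R, G and B are normalized to the range 0 to 1 and uses the coefficients from the Color Transformations slide (Y = 0.299R + 0.587G + 0.114B for both systems; U = 0.493(B - Y) and V = 0.877(R - Y) for PAL; I = 0.74(R - Y) - 0.27(B - Y) and Q = 0.48(R - Y) + 0.41(B - Y) for NTSC). The function names are illustrative only.

```python
def luminance(r: float, g: float, b: float) -> float:
    """Luminance Y, shared by the PAL and NTSC transformations."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def rgb_to_yuv(r: float, g: float, b: float) -> tuple[float, float, float]:
    """PAL: Y plus the two scaled color-difference signals U and V."""
    y = luminance(r, g, b)
    return y, 0.493 * (b - y), 0.877 * (r - y)


def rgb_to_yiq(r: float, g: float, b: float) -> tuple[float, float, float]:
    """NTSC: Y plus I and Q, each a mix of the two color differences."""
    y = luminance(r, g, b)
    i = 0.74 * (r - y) - 0.27 * (b - y)
    q = 0.48 * (r - y) + 0.41 * (b - y)
    return y, i, q


white = (1.0, 1.0, 1.0)
red = (1.0, 0.0, 0.0)

print("white YUV:", rgb_to_yuv(*white))
print("white YIQ:", rgb_to_yiq(*white))
print("red   YUV:", rgb_to_yuv(*red))
print("red   YIQ:", rgb_to_yiq(*red))
```

For white (and any gray) both color-difference signals come out as essentially zero, so only Y carries information; this is why a black & white receiver can use the luminance signal on its own.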

Example 2.6

Baseband Spectrum of Color Television Signals: NTSC System

In NTSC, the eye is more sensitive to I than to Q, hence I is allocated more bandwidth

I & Q are modulated in quadrature to occupy the same spectrum

Baseband Spectrum of Color Television Signals: PAL System

Sample Positions with 4:2:2 Digitization Format

CCIR-601

• 4:2:2 is the original digitization format used in CCIR-601

• RGB bandwidths each up to 6 MHz

• Sampling rates
– Minimum of 12 Msps for Y and 6 Msps for Cb and Cr
– Rates used: 13.5 Msps for Y and 6.75 Msps for Cb and Cr

• 13.5 Msps is the nearest rate to 12 Msps that results in a whole number of samples/line
– 625-line system: 64 µs line sweep, of which 12 µs is blanking
– (64 - 12) x 10^-6 x 13.5 x 10^6 = 702 samples/line
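The samples-per-line figure can be verified numerically. This sketch assumes a 64 µs total line sweep time with 12 µs of blanking for the 625-line system (these are the values that give 702 with a 13.5 Msps luminance rate); the chrominance count is derived the same way from the 6.75 Msps rate quoted above.

```python
LINE_SWEEP_US = 64      # total line time in microseconds (assumed, 625-line system)
BLANKING_US = 12        # horizontal blanking in microseconds
Y_RATE_MSPS = 13.5      # luminance sampling rate, mega-samples per second
C_RATE_MSPS = 6.75      # sampling rate of each of Cb and Cr

active_us = LINE_SWEEP_US - BLANKING_US

# microseconds x mega-samples/second = samples
y_samples_per_line = active_us * Y_RATE_MSPS
c_samples_per_line = active_us * C_RATE_MSPS

print(f"active line time: {active_us} us")
print(f"Y samples per line:     {y_samples_per_line:.0f}")
print(f"Cb/Cr samples per line: {c_samples_per_line:.0f}")
```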

Example 2.7

Sample Positions in 4:2:0 Digitization Format

HDTV Formats: SIF

• Source Intermediate Format: SIF

• Source quality comparable to VCRs

• Resolutions
– 525-line system: Y = 360 x 240, Cb = Cr = 180 x 120
– 625-line system: Y = 360 x 288, Cb = Cr = 180 x 144

• Worst-case bit rate
– 6.75 x 10^6 x 8 + 2(1.6875 x 10^6 x 8) = 81 Mbps

HDTV Formats: CIF

• Common Intermediate Format: CIF
– Y = 360 x 288
– Cb = Cr = 180 x 144
– Progressive scan
– 30 Hz

• 4CIF
– Y = 720 x 576
– Cb = Cr = 360 x 288

• 16CIF
– Y = 1440 x 1152
– Cb = Cr = 720 x 576

HDTV Formats: CIF

• Quarter CIF: QCIF
– Y = 180 x 144
– Cb = Cr = 90 x 72

• Data rate
– 3.375 x 10^6 x 8 + 2(0.84375 x 10^6 x 8) = 40.5 Mbps
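The bit-rate expressions on the SIF and QCIF slides all follow the same pattern: the luminance sampling rate times the bits per sample, plus twice the chrominance rate times the bits per sample. A minimal sketch, assuming 8 bits per sample throughout and taking the sampling rates directly from the slides (the function name is illustrative):

```python
def video_bit_rate_mbps(y_msps: float, c_msps: float, bits_per_sample: int = 8) -> float:
    """Bit rate in Mbps for one Y signal plus two chrominance signals (Cb, Cr)."""
    return y_msps * bits_per_sample + 2 * (c_msps * bits_per_sample)


rates = {
    "4:2:2 (CCIR-601)": video_bit_rate_mbps(13.5, 6.75),
    "SIF": video_bit_rate_mbps(6.75, 1.6875),
    "QCIF": video_bit_rate_mbps(3.375, 0.84375),
}

for name, mbps in rates.items():
    print(f"{name:18s} {mbps:6.1f} Mbps")
```

Evaluated this way, the SIF expression gives 81 Mbps and the QCIF expression gives 40.5 Mbps, half the SIF figure, since every sampling rate is halved.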

Multimedia Communications Standards and Applications

Sample Positions for SIF and CIF

Sample Positions for QCIF