Multimedia Systems
Speech I
Mahdi Amiri
February 2011
Sharif University of Technology
Course Presentation
Page 1 Multimedia Systems, Speech I
Sound
Sound is a sequence of waves of pressure which propagates through
compressible media such as air or water.
Digital Representation of
an Analog Signal
Sampling and Quantization
Parameters:
Sampling Rate (Samples per Second)
Quantization Levels (Bits per Sample)
This is a form of coding too:
Pulse-code modulation (PCM)
Basics
Pulse-code Modulation (PCM): Why Call it PCM?
4-bit PCM
Pulse-code Modulation (PCM)
How to choose proper…
Sampling Rate
8 kHz?
Quantization Level
8 bit/sample?
Bit rate for 8000 Hz, 8-bit PCM
8000 samples/s × 8 bit/sample = 64 kbit/s
Bit per Second (bit/s or bps)
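The 64 kbit/s figure is the sampling rate multiplied by the bits per sample (and by the number of channels). A minimal Python sketch of that arithmetic follows; the function name `pcm_bit_rate` and the `channels` parameter are illustrative and do not come from the slides. The Audio CD line assumes the usual 16-bit stereo format.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Return the uncompressed PCM bit rate in bit/s."""
    return sample_rate_hz * bits_per_sample * channels

# Telephone-quality PCM from the slide: 8000 Hz, 8 bit/sample, mono.
telephone = pcm_bit_rate(8000, 8)

# Assumed for comparison: Audio CD as 44,100 Hz, 16 bit/sample, 2 channels.
audio_cd = pcm_bit_rate(44_100, 16, channels=2)

print(telephone / 1000, "kbit/s")   # 64.0 kbit/s
print(audio_cd / 1000, "kbit/s")    # 1411.2 kbit/s
```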
Pulse-code Modulation (PCM)
Human Hearing Frequency Range
20 Hz to 20 kHz
Play with “Audacity”
tone generator to test
your hearing
Most people will find that their hearing is
most sensitive around 1-4 kHz and that it is less
sensitive at high and low frequencies.
Sampling Rate
Pulse-code Modulation (PCM): Hearing Range
Ferret
Porpoise
Check out
Freq. Alloc. Table
Pulse-code Modulation (PCM)
Human Vocal Range
Normal: 80 Hz to 1100 Hz
Charles Kellogg (14 kHz, not verified)
Guinness Book of Records
Female: Georgia Brown
(Eight octaves, 25,087 Hz)
Male: Tim Storms
(Six octaves)
Sampling Rate
Pulse-code Modulation (PCM)
8,000 Hz - Telephone; adequate for human speech.
11,025 Hz - Lower-quality PCM (one quarter the sampling rate of audio CDs).
22,050 Hz - Radio (one half the sampling rate of audio CDs).
32,000 Hz - miniDV digital video camcorder, DAT (LP mode).
44,100 Hz - Audio CD; also most commonly used with MPEG-1 audio (VCD, SVCD, MP3). Originally chosen by Sony, 1979.
48,000 Hz - Digital sound used for miniDV, digital TV, DVD, DAT, films and professional audio.
96,000 or 192,000 Hz - DVD-Audio, some LPCM DVD tracks, BD-ROM (Blu-ray Disc) audio tracks, and HD-DVD (High-Definition DVD) audio tracks.
2.8224 MHz - Super Audio CD (SACD); 1-bit sigma-delta modulation process known as Direct Stream Digital (DSD), co-developed by Sony and Philips (64 * 44100 Hz).
5.6448 MHz - Double-Rate DSD; 1-bit Direct Stream Digital at 2x the rate of the SACD. Used in some professional DSD recorders (128 * 44100 Hz).
DXD - 24-bit PCM sampled at 352.8 kHz, suited for editing; equivalent in bit rate to 1-bit DSD at 8.4672 MHz.
Common Sampling Rates
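Several of the rates listed here are integer multiples or fractions of the 44,100 Hz Audio CD rate. A small Python check of those relationships follows; the constant and variable names are illustrative only.

```python
CD_RATE_HZ = 44_100

quarter_rate_hz = CD_RATE_HZ // 4    # 11,025 Hz, lower-quality PCM
half_rate_hz = CD_RATE_HZ // 2       # 22,050 Hz
dsd_rate_hz = 64 * CD_RATE_HZ        # 2.8224 MHz, SACD / DSD
double_dsd_hz = 128 * CD_RATE_HZ     # 5.6448 MHz, Double-Rate DSD
dxd_rate_hz = 8 * CD_RATE_HZ         # 352.8 kHz, DXD

# 24-bit DXD carries as many bits per second as a 1-bit stream at 8.4672 MHz.
dxd_bit_rate = dxd_rate_hz * 24

for name, value in [("DSD", dsd_rate_hz), ("Double-Rate DSD", double_dsd_hz),
                    ("DXD bit rate", dxd_bit_rate)]:
    print(f"{name}: {value / 1e6} M")
```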
Pulse-code Modulation (PCM)
The trademark name used by Sony and Philips for the 1-bit audio encoding of the Super Audio CD (SACD).
Uses pulse-density modulation (PDM) encoding.
Direct Stream Digital
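Pulse-density modulation represents amplitude by the density of 1s in a 1-bit stream. The sketch below uses a textbook first-order sigma-delta loop to illustrate the idea; it is not the actual DSD encoder, which the slides do not describe, and the function name `pdm_encode` is hypothetical.

```python
def pdm_encode(samples):
    """Encode samples in [-1, 1] as a 1-bit stream (first-order sigma-delta)."""
    bits = []
    error = 0.0                                # accumulated quantization error
    for x in samples:
        error += x
        level = 1.0 if error > 0 else -1.0     # 1-bit quantizer output (+1 or -1)
        error -= level                         # feed the output back
        bits.append(1 if level > 0 else 0)
    return bits

# A constant input of 0.5 should give a stream whose average level is about 0.5,
# i.e. roughly three 1s for every 0.
bits = pdm_encode([0.5] * 1000)
density = sum(bits) / len(bits)
print("fraction of 1s:", density)
print("average level:", 2 * density - 1)
```

A decoder recovers the signal by low-pass filtering the bit stream, which is why the 1-bit stream has to run at a very high rate.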
Pulse-code Modulation (PCM): Quantization, Images
Pulse-code Modulation (PCM)
Simple and popular
midtread
odd number of
reconstruction levels
Uniform Quantizer, Midtread
Quantization error for bounded input
Pulse-code Modulation (PCM)
Simple and popular
midrise
even number of
reconstruction levels
Uniform Quantizer, Midrise
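The two uniform quantizers differ only in where the reconstruction levels sit relative to zero. A minimal Python sketch follows; the function names are illustrative, and both quantizers are left unbounded (no clipping) for brevity. With a symmetric bounded input range, the midtread form gives an odd number of levels and the midrise form an even number.

```python
import math

def midtread(x, step):
    """Uniform midtread quantizer: zero is a reconstruction level."""
    return step * math.floor(x / step + 0.5)

def midrise(x, step):
    """Uniform midrise quantizer: zero is a decision boundary, not a level."""
    return step * (math.floor(x / step) + 0.5)

for x in (-0.6, -0.1, 0.1, 0.6):
    print(x, midtread(x, 1.0), midrise(x, 1.0))
```

In both cases the quantization error stays within half a step as long as the input is inside the quantizer range.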
Pulse-code Modulation (PCM)
Want to prevent human ear fatigue by
minimizing quantization noise
Signal-to-Noise Ratio = 6.02*B dB
SNR is approximately 6 dB per bit.
16-bit => 96 dB
Above 36 dB is required
Quantization Levels
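Pulse-code Modulation (PCM): 6 dB per bit rule of thumb
Quantization error: $e[n] = \hat{x}[n] - x[n]$, with $-\Delta/2 < e[n] \le \Delta/2$
Bounded input: $-X_m \le x[n] \le X_m$
Assumption: $e[n]$ is uniform over $(-\Delta/2, \Delta/2]$
[Figure: the probability density function of e[n]]
Step size of a $B$-bit quantizer: $\Delta = \frac{2 X_m}{2^B}$
Average power of a process or signal:
$\sigma_x^2 = \int_{-\infty}^{\infty} (x - \mu_x)^2 \, p(x) \, dx$
$p(x)$: probability density function; $\mu_x$: mean; $\sigma_x^2$: variance
$SNR_{dB}(B) = 10 \log_{10} \frac{\sigma_x^2}{\sigma_e^2}$
$\sigma_e^2 = \int_{-\Delta/2}^{\Delta/2} e^2 \, p(e) \, de = \frac{1}{\Delta} \int_{-\Delta/2}^{\Delta/2} e^2 \, de = \frac{\Delta^2}{12} = \frac{X_m^2}{3 \cdot 2^{2B}}$
$SNR_{dB}(B) = 10 \log_{10} \left( 2^{2B} \cdot \frac{3 \sigma_x^2}{X_m^2} \right) = 20 \log_{10} 2^B + 10 \log_{10} \frac{3 \sigma_x^2}{X_m^2}$
$SNR_{dB}(B) = 6.02 B + 10 \log_{10} \frac{3 \sigma_x^2}{X_m^2}$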
Pulse-code Modulation (PCM): 6 dB per bit rule of thumb
General result: $SNR_{dB}(B) = 6.02 B + 10 \log_{10} \frac{3 \sigma_x^2}{X_m^2}$
Example 1: SNR for Uniform Quantization of Uniformly-Distributed Input
$\sigma_x^2 = X_m^2 / 3 \Rightarrow SNR_{dB}(B) = 6.02 B$
Example 2: SNR for Uniform Quantization of Sinusoidal Input
$\sigma_x^2 = X_m^2 / 2 \Rightarrow SNR_{dB}(B) = 6.02 B + 1.76$
Example 3: SNR for Uniform Quantization of Gaussian Input
$\sigma_x^2 = X_m^2 / 16$ (that is, $X_m = 4 \sigma_x$) $\Rightarrow SNR_{dB}(B) = 6.02 B - 7.27$
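The first two examples can be checked numerically. The sketch below quantizes a uniformly distributed signal and a full-scale sinusoid with a B-bit uniform quantizer and compares the measured SNR with the rule of thumb. The helper names, the choice of a midrise quantizer, the sample count, and the sinusoid frequency are all assumptions made for this illustration; the measured values should land close to the predictions but not exactly on them.

```python
import math
import random

def quantize(x, bits, x_max=1.0):
    """Uniform midrise quantizer with 2**bits levels on [-x_max, x_max]."""
    step = 2 * x_max / 2 ** bits
    half = 2 ** (bits - 1)
    index = max(-half, min(half - 1, math.floor(x / step)))  # clip to valid levels
    return (index + 0.5) * step

def snr_db(signal, bits):
    """Measured SNR in dB for a zero-mean signal after quantization."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = sum((quantize(x, bits) - x) ** 2 for x in signal) / len(signal)
    return 10 * math.log10(signal_power / noise_power)

random.seed(0)
N = 50_000
uniform_input = [random.uniform(-1.0, 1.0) for _ in range(N)]
sine_input = [math.sin(2 * math.pi * math.sqrt(2) / 100 * k) for k in range(N)]

B = 8
print("uniform: measured", round(snr_db(uniform_input, B), 2),
      "predicted", round(6.02 * B, 2))
print("sine:    measured", round(snr_db(sine_input, B), 2),
      "predicted", round(6.02 * B + 1.76, 2))
```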
Pulse-code Modulation (PCM)
The average person cannot tell the difference between audio encoded at a bit rate above 192 kbit/s and the original CD/WAV.
Even if your headphones seal really well around your ears, they will probably only give you about 20 to 25 dB of insulation from external sound.
Good to Know
Noise level for 192 kbit/s audio is under -125 dB and certainly inaudible.
Pulse-code Modulation (PCM): Nonuniform Quantization
(a) Uniform and (b) non-uniform quantization Q(x) and quantization error q(x)
Typical speech signal and its histogram
Pulse-code Modulation (PCM)
Nonuniform quantizers: difficult to make, expensive.
Solution: companding (compress the signal, apply a uniform quantizer, then expand).
u-law, a-law
Pulse-code Modulation (PCM): u-law, a-law
u-law
North America and Japan
a-law
Europe
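A sketch of companding with the continuous μ-law characteristic (the "u-law" of the slides), using μ = 255 as in the North American and Japanese telephone standard. This illustrates the compress, quantize, expand chain only; it is not the segmented 8-bit codec that telephone equipment actually uses, and all function names are hypothetical.

```python
import math

MU = 255  # mu value of the North American / Japanese standard

def mu_compress(x, mu=MU):
    """Compress x in [-1, 1]: small amplitudes are stretched, large ones squeezed."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_expand(y, mu=MU):
    """Inverse of mu_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)

def uniform_quantize(y, bits):
    """Uniform midrise quantizer on [-1, 1]."""
    step = 2 / 2 ** bits
    half = 2 ** (bits - 1)
    index = max(-half, min(half - 1, math.floor(y / step)))
    return (index + 0.5) * step

def companded_quantize(x, bits=8):
    """Compress, quantize uniformly, then expand."""
    return mu_expand(uniform_quantize(mu_compress(x), bits))

# Quiet samples, which dominate speech, are reproduced far more accurately.
quiet = [k / 10_000 for k in range(1, 101)]           # 0.0001 .. 0.01
err_uniform = max(abs(uniform_quantize(x, 8) - x) for x in quiet)
err_companded = max(abs(companded_quantize(x, 8) - x) for x in quiet)
print("max error, uniform 8-bit:  ", err_uniform)
print("max error, companded 8-bit:", err_companded)
```

The price is a coarser step at large amplitudes, which is acceptable because large samples are rare in speech and the error stays small relative to the signal.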
Differential PCM (DPCM): Idea
Take advantage of data redundancy
Samples:     [… 110 112 111 112 112 114 115 115 114 114 …]
Differences: [… +2 -1 +1 0 +2 +1 0 -1 0 …]
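The example above can be reproduced in a few lines of Python. This is a lossless sketch of the idea only (first sample plus successive differences, with no quantization of the differences); the function names are illustrative.

```python
from itertools import accumulate

def dpcm_encode(samples):
    """Return the first sample and the list of successive differences."""
    diffs = [b - a for a, b in zip(samples, samples[1:])]
    return samples[0], diffs

def dpcm_decode(first, diffs):
    """Rebuild the samples by accumulating the differences."""
    return list(accumulate([first] + diffs))

samples = [110, 112, 111, 112, 112, 114, 115, 115, 114, 114]
first, diffs = dpcm_encode(samples)
print(diffs)                                  # [2, -1, 1, 0, 2, 1, 0, -1, 0]
print(dpcm_decode(first, diffs) == samples)   # True
```

The differences span a much smaller range than the samples themselves, so they can be coded with fewer bits.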
Differential PCM (DPCM): Basic Scheme
1Delta Modulation (DM): i n ia x z
Problem?
General Predictive Coding
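A sketch of delta modulation in its simplest textbook form: one bit per sample, telling the decoder to move its running approximation up or down by a fixed step. The slide does not spell out the answer to "Problem?"; one well-known limitation of fixed-step DM, shown here as an assumption about what is meant, is slope overload when the input changes faster than one step per sample. Function names and test signals are hypothetical.

```python
def dm_encode(samples, step=1.0, start=0.0):
    """1-bit delta modulation: emit 1 to step up, 0 to step down."""
    bits = []
    approx = start                      # encoder tracks what the decoder will see
    for x in samples:
        bit = 1 if x >= approx else 0
        approx += step if bit else -step
        bits.append(bit)
    return bits

def dm_decode(bits, step=1.0, start=0.0):
    """Rebuild the staircase approximation from the bit stream."""
    out = []
    approx = start
    for bit in bits:
        approx += step if bit else -step
        out.append(approx)
    return out

def max_error(samples, step=1.0):
    decoded = dm_decode(dm_encode(samples, step), step)
    return max(abs(d - x) for d, x in zip(decoded, samples))

slow_ramp = [0.5 * k for k in range(20)]    # slope below the step size
fast_ramp = [3.0 * k for k in range(20)]    # slope above the step size

print("slow ramp max error:", max_error(slow_ramp))
print("fast ramp max error:", max_error(fast_ramp))   # decoder falls far behind
```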
Differential PCM (DPCM): Better Structure
Thank You
1. http://ce.sharif.edu/~m_amiri/
2. http://www.dml.ir/
FIND OUT MORE AT...
Multimedia Systems
Speech I
Next Session: Speech II