Glossary
There is a signal processing glossary on a page
of its own.
For a more exhaustive list of English-Finnish translations, see the
Audiosignaalinkäsittelyn sanasto by Vesa Välimäki.
Primitive Signals
Sine wave
- The sine wave is more or less the building block of all signals, musical
or not.
- There is exactly one frequency present in a signal with one steady sine
wave.
- Three parameters, the frequency, the amplitude and the
initial phase, characterize every steady sine wave completely.
- The Fourier transform can be used to inspect what kinds of sine waves
there are in a signal.
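fs = 44100;                  % sample rate [Hz]
t = 0:1/fs:0.001;            % time instants covering 1 ms
s = sin(2 * pi * 1700 * t);  % steady 1700 Hz sine wave
subplot(211), stem(abs(fft(s))), title('abs(fft(s))')  % magnitude spectrum
subplot(212), stem(s), title('s')                      % time-domain samples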
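As a rough illustration of the three parameters, the following sketch synthesizes a steady sine wave and reads its frequency, amplitude and initial phase back from the FFT. The values of fs, f0, A and phi are arbitrary assumptions, chosen so that the sine falls exactly on an FFT bin; for other frequencies the estimates would only be approximate.

```matlab
fs = 8000;                 % sample rate [Hz] (assumed)
N = 8000;                  % one second of signal, so the bins are 1 Hz apart
n = (0:N-1)';
f0 = 440;                  % frequency [Hz] (assumed)
A = 0.5;                   % amplitude (assumed)
phi = pi / 3;              % initial phase [rad] (assumed)
x = A * sin(2 * pi * f0 * n / fs + phi);

X = fft(x);
[peak, k] = max(abs(X(1:N/2)));   % strongest positive-frequency bin
f_est = (k - 1) * fs / N;         % bin index -> frequency in Hz
A_est = 2 * peak / N;             % half of the energy sits at the negative frequency
phi_est = angle(X(k)) + pi / 2;   % the FFT phase refers to a cosine; add pi/2 for a sine
```

With these values f_est, A_est and phi_est should agree with f0, A and phi up to rounding error.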
Noise
- white noise has, on average, an equal amount of energy
at every frequency
- in music, there is often band-limited noise present
fs = 44100;
n = randn(fs, 1);
n = n / max(abs(n));
subplot(211)
plot(n), axis tight
subplot(212)
specgram(n)
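The two statements about noise can be checked numerically. The sketch below compares the energy of white noise in the lower and upper halves of the spectrum, and then makes band-limited noise by zeroing FFT bins outside a band. The 1 kHz to 4 kHz band and the variable names are assumptions made for illustration; zeroing FFT bins is only one simple way to band-limit a signal.

```matlab
fs = 44100;
N = 2^16;
w = randn(N, 1);                     % white Gaussian noise
W = fft(w);
f = (0:N-1)' * fs / N;               % bin frequencies from 0 up to fs
P = abs(W) .^ 2;                     % energy per bin

low = sum(P(f < fs / 4));                  % energy between 0 and fs/4
high = sum(P(f >= fs / 4 & f < fs / 2));   % energy between fs/4 and fs/2
ratio = low / high;                        % close to 1 for white noise

% Keep only 1-4 kHz, together with the mirrored negative frequencies,
% so that the inverse transform stays real.
keep = (f >= 1000 & f <= 4000) | (f >= fs - 4000 & f <= fs - 1000);
bl = real(ifft(W .* double(keep)));  % band-limited noise
```

Because the noise is random, ratio is only approximately 1 and varies a little from run to run.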
Speech
- The speech sample shown is the Finnish word
"seitsemän" ("seven").
- You can listen to the speech sample.
[s, fs] = wavread('seiska.wav');
plot(s),axis tight,grid on
The following is the spectrogram of the above speech sound.
specgram(s, 512, fs);
colorbar
Piano
- The piano sample shown is middle C, whose fundamental
frequency is approximately 261.6 Hz.
- The piano sample is an example of a harmonic sound: it consists of
sine waves whose frequencies are integer multiples of the
fundamental frequency. (Strictly speaking, the piano is slightly inharmonic.)
- You can listen to the piano sample.
[s, fs] = wavread('pia60.wav');
plot(s),axis tight,grid on
The following is the spectrogram of the above piano sound, resampled
to 16000 Hz sample rate. Here the overtones can be seen clearly.
s2 = resample(s, 1, 3);
specgram(s2, 512, fs / 3);
colorbar
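To make the idea of a harmonic sound concrete, the following sketch builds an idealized harmonic tone from five sine waves and locates the peaks of its spectrum. It is not a model of a real piano: the rounded fundamental of 261 Hz, the number of partials K and the 1/k amplitudes are all assumptions.

```matlab
fs = 16000;
N = fs;                        % one second, so the bins are 1 Hz apart
t = (0:N-1)' / fs;
f0 = 261;                      % fundamental [Hz], rounded (assumed)
K = 5;                         % number of partials (assumed)

x = zeros(N, 1);
for k = 1:K
    x = x + (1 / k) * sin(2 * pi * k * f0 * t);   % k-th partial at k * f0
end

X = abs(fft(x));
X = X(1:N/2);                            % positive frequencies only
[~, idx] = sort(X, 'descend');           % strongest bins first
peak_freqs = sort((idx(1:K) - 1) * fs / N);   % frequencies of the K strongest bins [Hz]
```

The entries of peak_freqs are the integer multiples of f0, which is what the evenly spaced horizontal lines in a spectrogram of a harmonic sound correspond to.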
Snare drum
- The snare drum sample has neither a fundamental frequency nor
overtones.
- You can listen to the snare drum sample.
[s, fs] = wavread('snareHit.wav');
plot(s),axis tight,grid on
And here is the spectrogram of the snare drum hit, without resampling.
Notice the lack of harmonic content.
specgram(s, 512, fs);
colorbar
Linear Filters
- There are broadly two kinds of digital linear filters: finite impulse
response (FIR) and infinite impulse response (IIR) filters.
- Comparison of FIR vs. IIR filters:
| | FIR | IIR |
| linear phase response possible | yes | no |
| overall frequency response control | good | bad |
| nearly-"brickwall" response possible | no | yes |
| efficiency (multiplications required) | bad | good |
- Conclusion: in audio applications where the phase response isn't
critical, IIR filters are often the better choice because of their
efficiency.
fir_b = remez(30,[0 0.2 0.3 1], [1 1 0 0]);
[iir_b, iir_a] = butter(10, 0.2);
subplot(211)
impz(fir_b, 1, 41)
title('An FIR filter'), axis([0 40 -0.1 0.25]), grid on
subplot(212)
impz(iir_b, iir_a, 41)
title('An IIR filter'), axis([0 40 -0.1 0.25]), grid on
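The difference between the two filter types can also be sketched with the built-in filter function alone. The coefficients below are hypothetical toy values, not the remez and butter designs above: a symmetric 5-tap FIR filter (symmetric coefficients give linear phase) and a one-pole IIR lowpass.

```matlab
imp = [1; zeros(49, 1)];           % unit impulse, 50 samples long

fir_b = [1 2 3 2 1] / 9;           % symmetric taps -> linear phase (toy values)
h_fir = filter(fir_b, 1, imp);     % impulse response equals the taps, then exact zeros

iir_b = 0.1;                       % y[n] = 0.1 * x[n] + 0.9 * y[n-1] (toy values)
iir_a = [1 -0.9];
h_iir = filter(iir_b, iir_a, imp); % decays as 0.1 * 0.9^n but never reaches zero

fir_len = find(h_fir ~= 0, 1, 'last');   % number of samples before the response ends
```

Here the FIR filter needs five multiplications per output sample and its response ends after fir_len samples, while the IIR filter needs two multiplications and its response continues indefinitely.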
Time and Frequency Domains
- The Fourier transform can be used to find out the frequency domain
representation of a time domain signal. The inverse Fourier transform
converts a frequency domain representation into time domain.
- When the MATLAB FFT function is used to compute the Fourier
transform, the resulting vector contains amplitude and phase
information for both positive and negative frequencies. The values at a
positive frequency and at the corresponding negative frequency are complex
conjugates of each other (equal amplitude, opposite phase) if and only if
the time-domain signal is real.
- The Fourier transform decomposes a signal into a sum of stationary
sinusoids. When a whole sound signal is transformed at once, changes in its
frequency content over time therefore cannot be observed. For this reason,
a short-time windowed FFT is usually used to observe the instantaneous
frequency content.
[s, fs] = wavread('snareHit.wav');
subplot(211), plot(abs(fft(s))), title('abs(fft(s))')
subplot(212), plot(s), title('s')
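The relation between positive and negative frequencies, and the inverse transform, can be checked with a small numerical experiment. The sketch below uses short random test signals (an assumption made for illustration) and measures how far fft output is from conjugate symmetry for a real and for a complex signal.

```matlab
N = 16;
k = 1:N-1;                          % frequency indices, excluding DC

x = randn(N, 1);                    % real test signal
X = fft(x);
% MATLAB indices start at 1: X(k+1) is frequency k, X(N-k+1) is frequency -k.
sym_err = max(abs(X(k + 1) - conj(X(N - k + 1))));    % about 0 for real x

z = x + 1i * randn(N, 1);           % complex test signal
Z = fft(z);
asym = max(abs(Z(k + 1) - conj(Z(N - k + 1))));       % clearly nonzero

x_rec = real(ifft(X));              % inverse transform back to the time domain
rec_err = max(abs(x - x_rec));      % about 0: the transform pair is lossless
```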
- The following spectrogram is computed as above, using 11.6 ms windows
that overlap by 50%.
- The spectrum displayed below the spectrogram is taken at time 0.2 s.
n0 = round(0.2 * fs);                 % sample index at 0.2 s (must be an integer)
u = s(n0:n0 + 511) .* hanning(512);   % one Hann-windowed frame of 512 samples
U = fft(u);
f = (0:256) / 256 * fs / 2;
plot(f, 20 * log10(abs(U(1:257))))
axis tight,grid on
xlabel('frequency [Hz]')
ylabel('amplitude [dB]')
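The short-time windowed FFT itself can be written out by hand in a few lines. The following is a minimal sketch, not the implementation behind specgram: the test signal (500 Hz followed by 2000 Hz), the 8 kHz sample rate and the hand-written periodic Hann window are assumptions, and the window differs slightly from hanning(512).

```matlab
fs = 8000;
t = (0:fs/2 - 1)' / fs;
x = [sin(2 * pi * 500 * t); sin(2 * pi * 2000 * t)];   % frequency jumps at 0.5 s

L = 512;                                       % frame length
hop = L / 2;                                   % 50 % overlap
w = 0.5 - 0.5 * cos(2 * pi * (0:L-1)' / L);    % periodic Hann window
nframes = floor((length(x) - L) / hop) + 1;

S = zeros(L / 2 + 1, nframes);                 % one magnitude spectrum per column
for m = 1:nframes
    frame = x((m - 1) * hop + (1:L)) .* w;     % cut out a frame and window it
    F = fft(frame);
    S(:, m) = abs(F(1:L/2 + 1));               % keep the positive frequencies
end

[~, kmax] = max(S);                            % strongest bin in every frame
fpeak = (kmax - 1) * fs / L;                   % peak frequency per frame [Hz]
```

A single FFT of the whole signal would show both frequencies but not when they occur; fpeak starts at 500 Hz and ends at 2000 Hz, which reveals the change over time.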
Windowing
- short-time signal processing is practically always done using
windowing
- in short-time signal processing, signals are cut into small pieces called
frames, which are processed one at a time
- frames are windowed with a window function in order to
improve the frequency-domain representation
- what windowing essentially means is multiplying the signal frame with the
window function point-by-point
[s, fs] = wavread('pia60.wav', [5000 6000]);
subplot(131)
plot(s(1:512))
title('s(1:512)'), axis tight, grid on
subplot(132)
plot(hanning(512),'r')
title('hanning(512)'), axis tight, grid on
subplot(133)
plot(s(1:512) .* hanning(512))
title('s(1:512) .* hanning(512)'), axis tight, grid on
S1 = fft(s(1:512));
S2 = fft(s(1:512) .* hanning(512));
f = (0:256) / 256 * fs / 2;
plot(f, 20 * log10(abs(S1(1:257))))
hold on
plot(f, 20 * log10(abs(S2(1:257))), 'r')
axis tight,grid on
xlabel('frequency [Hz]')
ylabel('amplitude [dB]')
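How windowing improves the frequency-domain representation can be quantified with a synthetic example. In the sketch below, a sine wave that does not fall on an FFT bin is analysed with and without a window, and the spectrum level far away from the sine is compared. The 1007 Hz test frequency, the 3000 Hz measurement point and the hand-written periodic Hann window are assumptions.

```matlab
fs = 8000;
L = 512;
n = (0:L-1)';
x = sin(2 * pi * 1007 * n / fs);             % 1007 Hz lies between two FFT bins
w = 0.5 - 0.5 * cos(2 * pi * n / L);         % periodic Hann window

S1 = abs(fft(x));                            % no window (rectangular)
S2 = abs(fft(x .* w));                       % point-by-point multiplication with the window

k = round(3000 / fs * L) + 1;                % bin near 3000 Hz, far from the sine
leak_rect_dB = 20 * log10(S1(k) / max(S1));  % leakage relative to the peak, no window
leak_hann_dB = 20 * log10(S2(k) / max(S2));  % leakage relative to the peak, Hann window
```

The leakage with the Hann window is far lower than without it, at the cost of a somewhat wider main peak.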
Correlation
- the cross-correlation between two signals measures how similar
the signals are
- in other words, if there is correlation between the signals, then the
signals are more or less dependent on each other
- for example, the correlation between two sine waves with different periods
is zero when it is computed over a whole number of periods of both
t = 0:1/fs:0.2;
a = 2 * sin(2 * pi * 20 * t);
b = 2 * sin(2 * pi * 30 * t);
ep(a, b)
sum(a .* b)
ans =
-4.7622e-12
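The result is (numerically) zero because 0.2 seconds contains a whole number of periods of both sine waves: four periods at 20 Hz and six at 30 Hz. The following sketch compares this with a shorter segment of 0.13 seconds, a length picked arbitrarily for illustration, over which the same two sine waves no longer cancel.

```matlab
fs = 44100;
t = (0:fs/5 - 1) / fs;               % exactly 0.2 s: 4 periods of 20 Hz, 6 of 30 Hz
a = 2 * sin(2 * pi * 20 * t);
b = 2 * sin(2 * pi * 30 * t);

full = sum(a .* b);                  % whole periods of both: practically zero

np = round(0.13 * fs);               % truncate to 0.13 s (arbitrary choice)
part = sum(a(1:np) .* b(1:np));      % incomplete periods: clearly nonzero
```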
- normally the correlation value is computed with different alignments,
called lags, between the signals
- the correlation value is computed between a[n] and b[n - l], a[n] and
b[n - l + 1], a[n] and b[n - l + 2], ..., a[n] and b[n - 1], a[n] and
b[n], a[n] and b[n + 1], ..., a[n] and b[n + l - 1] and finally between
a[n] and b[n + l]
- i.e. one signal is held fixed while the other is shifted one
sample at a time, and the correlation value is computed at every shift
- autocorrelation means the cross-correlation of a signal with
itself
- the autocorrelation value at lag 0 is equal to the energy of the signal
subplot(211)
n = randn(4000, 1);
[ac, l] = xcorr(n, n, 1000);
plot(l, ac), axis tight, grid on
title('gaussian noise autocorrelation')
subplot(212)
[s, fs] = wavread('pia60.wav');
[ac, l] = xcorr(s, s, 1000);
plot(l, ac), axis tight, grid on
title('piano autocorrelation')
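The lag-by-lag computation described above can also be written as an explicit loop. The following is a minimal sketch rather than the way xcorr is implemented: it uses the convention c(l) = sum over n of a[n + l] * b[n], which is assumed here to match xcorr(a, b), and the signal length, the delay d and maxlag are arbitrary illustrative values.

```matlab
N = 200;
a = randn(N, 1);
d = 7;
b = [zeros(d, 1); a(1:N - d)];       % b is a copy of a, delayed by d samples

maxlag = 20;
lags = -maxlag:maxlag;
c = zeros(size(lags));
for i = 1:length(lags)
    l = lags(i);
    if l >= 0
        c(i) = sum(a(1 + l:N) .* b(1:N - l));   % a[n + l] * b[n], overlapping part only
    else
        c(i) = sum(a(1:N + l) .* b(1 - l:N));   % same sum for negative lags
    end
end

[~, imax] = max(c);
best_lag = lags(imax);               % lag at which the signals line up best

energy = sum(a .^ 2);                % what the autocorrelation of a gives at lag 0
```

With this convention the correlation peaks at best_lag = -d, so the location of the peak reveals the delay between the two signals.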
Frequency quiz
- here are 6 sine waves:
sine1,
sine2,
sine3,
sine4,
sine5,
sine6
- what frequency belongs to which sine:
(a) 61 Hz, (b) 688 Hz, (c) 1364 Hz, (d) 4539 Hz, (e) 8200 Hz, (f) 13954 Hz?
fs = 44100;
f = 2 .^ (10 * rand(1, 6) + 4.2);   % six random frequencies, about 18 Hz to 18.8 kHz
t = 0:1/fs:1;
for i = 1:length(f)
  s = 0.2 * sin(2 * pi * f(i) * t);
  wavwrite(s, fs, 16, ['sine' num2str(i) '.wav']);
end
[s, fs] = wavread('helmi.wav');
f = 2 .^ (7 * rand(1, 6) + 6);      % six random center frequencies, 64 Hz to 8192 Hz
for i = 1:length(f)
  ff = [0.6 0.9 1.1 1.4] * f(i) * 2 / fs;        % band edges, normalized so that 1 = fs/2
  ff = [0 ff 1];
  b = remez(300, ff, [0 0 1 1 0 0], [10 1 10]);  % FIR bandpass, passband 0.9*f to 1.1*f
  freqz(b, 1, 512, fs); drawnow
  ss = 4 * filter(b, 1, s);                      % filter the sample, then apply a gain of 4
  wavwrite(ss, fs, 16, ['filter' num2str(i) '.wav']);
end
- correct answers to the sine frequencies:
1-c (1364 Hz), 2-d (4539 Hz), 3-f (13954 Hz), 4-b (688 Hz), 5-e (8200 Hz)
and 6-a (61 Hz)
- correct answers to the filter bandpass frequencies:
1-e (2676 Hz), 2-b (552 Hz), 3-c (1300 Hz), 4-f (6480 Hz), 5-d (1428 Hz)
and 6-a (212 Hz)
References
- Pohlmann, Ken, "Principles of Digital Audio," 3rd Edition, McGraw-Hill, Inc., 1995, ISBN 0-07-050468-7.
- Oppenheim, Alan V. and Schafer, Ronald W., "Discrete-Time Signal Processing."
- Ifeachor and Jervis, "Digital Signal Processing."
http://www.cs.tut.fi/sgn/arg/intro/
Last modified: Tue Jun 1 11:58:55 1999