SGN-1650 Signal Processing laboratory (Signaalinkäsittelyn Työkurssi SGN-1650)

Digital audio work

2009 – 2010

News and Updates:

Prerequisites

Vinyl audio denoising

Organizer: Mikko Parviainen mikko(dot)p(dot)parviainen(at)tut(dot)fi, room TF313. Please use email to ask questions.

1 Requirements

Before you submit your work check that you have obeyed the instructions explained in Requirements. You asked to correct and re-submit if there is a problem.

Make a written report with the following contents:

  1. Answers to the questions and exercises. While answering to a question please indicate to which exercise you answered. 1
  2. Well commented MATLAB M-files you wrote in this work2
  3. Pictures and figures asked
  4. The report must contain a list of reference literature and citations to this material.

1 The answers to the questions should be analytical. That is, you should not just answer "yes" or "no". For example, try to think of the reasons why a filter affects a signal at a certain manner. Remember to justify any arguments you had made properly or cite the reference literature or other (scientific) material.

2 Comment your code so that the idea of the algorithm is explained. You must provide also a test script which uses the algorithms that you have implemented.

You can return the report on paper or by email. However, you should also return the m-files you have written by email. If you decide to deliver a paper version of the report, please return it to a locked box labeled 311 on third floor, C-corridor in Tietotalo. Use ps- or pdf-format, if you decide to return also your report by email. Use address mikko(dot)p(dot)parviainen(at)tut(dot)fi to return your work.

The deadline: Firstly, you should contact me by email if you are interested to do Digital Audio Work, and I will fix a deadline for you. The period of this registration is limited to guarantee that you have time do this work.

Notice: Include your name(s), student number(s) and email address(es) in the report. I will acknowledge by email whether your report has been accepted.



2 Introduction

2.1 Vinyl audio denoising

In 2009 – 2010 digital audio work for the course SGN-1650 concern the task of vinyl audio denoising. This is a specific subject in past and current research in digital audio signal processing that came apparent with the birth of the CD.

The aim of vinyl audio denoising, as you may know, is to remove the clicks, cracks and glitches from audio which is coming from a vinyl record being played on a turntable. Dust, scratches and electrostatic charges are the primary causes for these cracks.

In an automatic vinyl audio denoising system it would be preferable to digitize an LP record as whole, run the digital sound through the denoiser device and then save the processed sound for later listening. This would require that the denoising algorithm would have to work unsupervised.

Exercise 2.1.1: What issues are related to this kind of an automatic denoising algorithm? What different alternatives are there? (See reference literature)

2.2 Sound samples

These are the sound samples which will be used throughout this work. This consists of four excerpts of music:

  1. genuine vinyl music with cracks: vinyl1.wav and vinyl2.wav (30 seconds) and
  2. excerpt of CD-quality music without cracks: cd1.wav (4 seconds) and
  3. the CD excerpt (2) with superimposed cracks cd2.wav (4 seconds).

The file cd2.wav is also stored in MATLAB mat format in the file cd2.mat, intended to be used in the wavelet denoising exercise (see section 6).

The music samples were selected to be more and less easy to denoise. This means that some music has got impulsive content in itself and some music has not.

The CD samples 2 and 3 are intended for numerical evaluation of the different algorithms. This is done via SNR comparisons (see section 2.5).

2.3 MATLAB help

We have included tips about which MATLAB commands to use in the exercises. These tips are labeled with "MATLAB help" in this page. You may type help command in MATLAB in order to get a verbose description and usage information about the commands.

You can use shorter excerpts of the signals while you are implementing and testing the algorithms. E.g. a 4-second excerpt from around the beginning of vinyl1.wav would be suitable for algorithm parameter testing purposes.

You can also plot multiple signals from within an algorithm into the same figure in order to examine them. You could, for example, plot x[n], w[n], b[n] and y[n] from the conditional median filtering algorithm (see section 5.3) in order to gain some information on the algorithm.

2.4 Segmentated processing

The linear and nonlinear filtering exercises are done in MATLAB. Because of limited memory or real time processing requirements applications processing digital audio must handle the audio in frames. This is also the case in this audio work. With a frame we mean a sequence of some hundreds or thousands of samples. The procedure of splitting up a longer (sound) signal into many short frames is usually called segmentation. The frames can also be called segments or windows.

In other words you must implement the linear and nonlinear filtering operations so that you

  1. read in a frame of input data (use frames of 1000 samples)
  2. filter these samples, producing an output frame of the same length as the input frame
  3. write the output frame to disk
Note: You should not read more than 1000 samples at a time. This is to say that you are not allowed to read the whole file.

MATLAB help:

Here are two functions for opening a WAV file for reading and reading data from a WAV file one segment at a time.

This is the file piwaopen.m:

function fid = piwaopen (filename)
%PIWAOPEN Open a WAV file for piecewise reading
%   FID = PIWAOPEN (FILENAME) opens a mono WAV file for reading with
%   PIWAREAD.  The opened file is closed with FCLOSE.

% The data format is 16-bit integer (2's complement, little endian)
[fid, msg] = fopen (filename, 'rb', 'l');
if (fid == -1), error (msg), end

% There's 44 bytes of header information in the 'cd1.wav', 'cd2.wav',
% 'vinyl1.wav' and 'vinyl2.wav' files.  The header is discarded.
fseek (fid, 44, 'bof');

This is the file piwaread.m:

function y = piwaread (fid, samples)
%PIWAREAD Piecewise WAV file reading
%   Y = PIWAREAD (FID, SAMPLES) reads a number of samples from a mono WAV
%   file opened with PIWAOPEN.  FID is a file descriptor as returned by
%   PIWAOPEN and SAMPLES is the number of samples to read.  Y may not be of
%   length SAMPLES if there are not enough samples left in the file.

% The data format is 16-bit integer (2's complement, little endian)
y = fread (fid, samples, 'short');

You can write data to a headerless ('raw') data file one segment at a time with the fopen and fwrite commands as follows:

fid = fopen ('output.raw', 'w', 'l');
for ... %your segmentation loop
  ... %processing
  fwrite (fid, y, 'short'); %y is the data vector
end
fclose (fid)

The resulting file will not be a WAV file though, but a raw 16-bit integer (2's complement, little endian) file.

2.5 Signal-to-noise ratio (SNR) evaluation

The signal-to-noise ratio (in decibels)


                        L-1
                        ___        2
                        \    (x[i])
                        /__
                        i=0
SNR = 10 log   ---------------------- dB          (1)
            10          L-1
                        ___        2
                        \    (e[i])
                        /__
                        i=0

where e[i] is the error signal between the processed noisy signal z[n] and the clean signal x[i]:
e[i] = x[i] - z[i]                                (2)
and L is the length of the signals x[n], z[n] and e[n]. Remember to compensate for the delay possibly caused by filtering before calculating the SNR.

In other words, x[n] is the signal cd1.wav, i.e. the clean CD signal, and z[n] is the noisy CD signal cd2.wav after processing. With an ideal denoising algorithm x[n] and z[n] should be identical and therefore the error e[n] should be all zeros.

Exercise 2.5.1: You are supposed to use the above SNR equations (1) and (2) for all the forthcoming denoising algorithms and list the SNR values in your report. That is, report the SNR values for the denoising of cd2.wav for each of the following algorithms:

  1. Manual denoising (section 3)
  2. Mean filtering (section 4.1)
  3. Optimal linear filtering (section 4.2)
  4. Median filtering (section 5.1)
  5. Double median filtering (section 5.2)
  6. Conditional median filtering (section 5.3)
  7. Wavelet denoising (section 6)

3 Manual denoising

The first part of the work is meant to be an introduction to the problem. In this part you need to load a sound with vinyl cracks into a sound editor, select a single crack and remove it by hand.

Exercise 3.1: Select a crack in the sound and remove it by hand. You should consider what to do with the crack and how to remove the crack best. Write a short description of this in the report.

Exercise 3.2: Plot an example of the crack before and after modifications. Some possible tools: MATLAB as well as some powerful audio analysis tools e.g. Adobe Audition (commercial). Free tools for Linux (Snd, GNUSound, ReZound, etc.) can be found at http://linux-sound.org/ (->Soundfile Editors & Utilities). Note that not all the editors have tools for this type of processing. Also the sound editors may include some algorithms that are meant to audio denoising, but may not produce the desired quality.


4 Linear filtering

4.1 Mean filtering

In this part your job is to implement the mean filter and test its suitability to vinyl audio denoising. The mean filter is defined in eq. (1), and is essentially a running mean of a number of samples.

y[n] = mean(x[n-N+1], x[n-N+2], ..., x[n-1], x[n])

              N-1
              ___
     = 1/N *  \     x[n-i]       (1)
              /__
              i=0

In the above equation N is the length of the filter.

Exercise 4.1.1: Implement a mean filter in MATLAB. Segmentate the input into frames. Use a few (3 is enough) different filter lengths.

MATLAB help:

4.2 Optimal filtering

Here you need to design and use a more capable linear filter, an optimal (minimax) filter. The design is carried out with the McClellan-Parks-Rabiner (MPR) algorithm.

The filter coefficients b[i] are entered into the difference equation (2):

        N-1
        ___
y[n] =  \     b[i] x[n-i]       (2)
        /__
        i=0

Exercise 4.2.1: Design an optimal linear filter in MATLAB and filter the signals with it. Use segmentation. What should the frequency response of the filter be like? Report your opinion on this and the motivation for it.

Exercise 4.2.2: Experiment with filter lengths and try to find a best filter length. Describe it in the report.

Exercise 4.2.3: Plot the frequency and impulse responses of the ultimate filter.

MATLAB help:


5 Nonlinear filtering

5.1 Median filtering

In this part you are required to implement a median filter. This is a nonlinear filter similar to the mean filter discussed above. In the median filter you implement an otherwise similar algorithm to the mean filter, except that the mean(...) operation is replaced with a median(...) operation.

y[n] = median(x[n-N+1], x[n-N+2], ..., x[n-1], x[n])    (x)

Here as well N is the length of the filter.

Exercise 5.1.1: Implement a median filter in MATLAB and filter the signals with it. Use segmentation. Use a few (3 is enough) different filter lengths and report the best filter length in your opinion.

Exercise 5.1.2: How does the median filter perform in removing the cracks when compared to the linear mean and optimal filters?

MATLAB help:

5.2 Double median filtering

The double median filter is composed of a median filter as in section 5.1 and a second median filter that filters the error signal. The error signal is the difference between the input signal and the filtered signal after the first median filter.

The double median filter
Figure 1. The double median filter.

The block diagram of a double median filter is illustrated in figure 1. In this figure there are the following blocks:

and the following signals:

The idea behind the double median filter is that the output signal quality should be enhanced with the summation of the median of the difference signal.

The lengths for the median filters can be changed; experiment with a few different combinations.

The double median filter is described in more detail in Rabiner et al. [1].

Exercise 5.2.1: Implement a double median filter in MATLAB and filter the signals with it. Use segmentation. Use 2 different filter lengths for both of the median filters and report the best combination of filter lengths in your opinion.

Exercise 5.2.2: How does the double median filter perform in removing the cracks when compared to the single median filter?

5.3 Conditional median filtering

The theory behind the conditional median filter is to

  1. detect the vinyl cracks and
  2. to filter only the cracks
and in this way to leave the rest of the music completely untouched.

The cracks resemble impulses and thus contain mostly high frequencies. On the other hand, a musical signal contains mostly relatively low frequencies; this is especially the case with audio from vinyl records. This frequency characterization is exploited in detecting the vinyl cracks and the interpolation is performed by a median filter.

The conditional median filter
Figure 2. The conditional median filter.

The block diagram of a conditional median filter is shown in figure 2. In this figure there are the following blocks:

and the following signals:
High pass filter

The high pass filter is a cascade of a second-order derivative and a root-mean-square (RMS) filter. The second order derivative is

z[n] = x[n-1] - 2 x[n] + x[n+1]               (x)
and the RMS filter
      	      ----------------------
      	     /        N  
      	_   /    1   ___          2
w[n] =	 | /   ----  \    (z[n+i])            (x)
      	 |/    2N+1  /__ 
      	 /           i=-N
where 2N+1 is the length of the RMS filter. The resulting signal w[n] has large values if there is a crack and is close to zero otherwise.
Recursive median filter and decimation

w[n] doesn't go to exactly 0 in the case there are no cracks in the input signal x[n] and the high pass signal w[n]. This is because there is a certain amount of background noise in w[n]. The level of background noise is estimated by taking the median of a large number of samples of w[n]. The best results are accomplished using a recursive median filter, defined as

y[n] = median(y[n-M], ..., y[n-1], x[n], x[n+1], ..., x[n+M])      (x)
where x[n] is the input of the filter and y[n] is the output. In order to reduce the computational complexity, the signal w[n] is decimated prior to the recursive median filter. Decimation means that only one of every K samples is considered. The estimated background level resulting from decimation and recursive median filtering is b[n].
Gate generator

The gate generator is essentially a comparator. The generator compares the signals w[n] and b[n] and if there is enough difference between them, the output is set to '1' (meaning enable); otherwise it is '0' (meaning disable).

            /            w[n] - b[n]
            |   1,  if   -----------  >  C
            |                b[n]
    g[n] = <                                     (x)
            |
            |   0,  otherwise
            \ 
where C is a parametric factor.

The output of the gate generator, g[n], controls the median filter (see figure 2). If the gate signal g[n] is 1, the median filter is enabled, and if g[n] is 0, the median filter is disabled, i.e. the signal x[n] is passed through unchanged.

Parameters

There are several variable parameters in the conditional median filter:

The conditional median filter is described in detail in Kasparis et al. [2].

Exercise 5.3.1: Implement a conditional median filter in MATLAB and filter the signals with it. Use segmentation. Start with the default parameter values as specified above but try to experiment with the parameters a bit eventually. You should start with vinyl1.wav, since there are evident cracks in the beginning of the file (i.e. it's more easy to denoise). Write some comments on the different parameters and their effects.

Exercise 5.3.2: How does the conditional median filter perform in removing the cracks in general?

MATLAB help:

Hint: the reference material describes this type of a filter. Studying the material carefully helps in implementation and commenting the results.


6 Wavelet denoising

Introduction

This section is intended more as a tour to the Wavelet Toolbox in MATLAB. Write comments on the performance of wavelets to the report.

Wavelet denoising is based on a principle similar to that of e.g. JPEG image compression: coefficient thresholding. The sound is first transformed with the Discrete Wavelet Transform (DWT) into a set of wavelet coefficients. Then the coefficients are inspected and a threshold value is decided. After this all the coefficients with values less than the threshold are set to zero. Then the inverse transform is carried out and we're done.

The wavelet transform needs you to decide the mother wavelet to use in the transforms. This will essentially be some kind of an elementary signal the forward transform will be looking for in the original signal. We will be using Daubechies wavelets for an example.

The tour

  1. type wavemenu at the MATLAB prompt
    (a Wavelet Toolbox Main Menu window opens)
  2. Select Wavelet Packet 1-D
    (a Wavelet Packets 1-D window opens)
  3. Select File/Load Signal: cd2.mat
  4. Select Wavelet e.g. db 2, Level 3
  5. Click Analyze
  6. Click De-noise
    (a De-noising window opens)
  7. Click De-noise
    (you can tune the parameters and click De-noise again)
  8. Select Close
  9. Answer `yes' to the question `Update the synthesized signal?'
  10. File/Save Synthesized Signal (e.g. as cd2let.mat)
You can then load cd2 and cd2let to MATLAB and hear them (help sound) or save them to WAV files (help wavwrite).

Exercise 6.1 How did the wavelet denoising perform? Evaluate the resulting quality subjectively. Compare the resulting quality to the other filters.


7 Conclusions

Answer the following questions

Exercise 7.1: How did the resulting music sound in each of the different filter algorithms? Were there cracks left in the filtered signal? Was the sound of the processed music colored somehow? How would you describe this distortion wrt. to nonprocessed and noncorrupted music?

Exercise 7.2: Compare the different filtered sounds with each other and make short comments if there are noticeable differences.

Exercise 7.3: Would cd2.wav sound at all like cd1.wav, after cd2.wav had been processed?

Exercise 7.4: What method was the best at removing the cracks without distorting the music in your opinion? Why?

Exercise 7.5: How would you improve/combine the presented methods? Ideally the filter would be run to whole LP records of music instead of just filtering the corrupted parts.


References

[1] Rabiner, L. et al. `Applications of a Nonlinear Smoothing Algorithm to Speech Processing', IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-23, No. 6, Dec 1975.
[2] Kasparis, T. et al., `Adaptive Scratch Noise Filtering', IEEE Transactions on Consumer Electronics, Vol. 39, No. 4, Nov 1993.

The references can be found on top of the "mailboxes" near room TC305 at Tietotalo. The material provides a view to techniques used in this type of audio signal processing. Please copy the material you need, and remember to return it where it was!