This course introduces algorithms and data representations for music content analysis and for automatic speech recognition. Environmental audio will also be discussed.
The person responsible for the course is Annamaria Mesaros (firstname.lastname @ tut.fi).
PROJECT ASSIGNMENT information:
Lecture slides will be added here in PDF format after each lecture.
Exercises are not obligatory, but by doing them you can earn bonus points for the exam (see below).
We have two different slots reserved on the same day; select one and attend it throughout the teaching period.
Two hours per week:
Exercise questions/problems will be published here every Friday before the exercise session. Exercises:
Exercises consist of math and Matlab problems related to the lectures. The project work will also be discussed during the exercises.
Attending the exercises and working out the math problems in advance is highly recommended, but not obligatory.
By completing the math problems in advance (usually one per exercise sheet) and by participating actively in the exercise sessions (Matlab problems are solved during the session), you earn bonus points for the exam as follows:
25% completed (6 exercise points) -> 1 bonus point
50% completed (12 exercise points) -> 2 bonus points
75% completed (18 exercise points) -> 3 bonus points (worth one mark in the exam)
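The thresholds listed above can be read as a simple lookup. The following is a minimal sketch in Python, not an official grading tool: the function name `bonus_points` and the constant `BONUS_THRESHOLDS` are hypothetical, and the sketch assumes that no more than 3 bonus points are awarded even when more than 18 exercise points are collected, since no higher threshold is listed.

```python
# Thresholds copied from the list above: (exercise points needed, bonus points).
# Assumption: 3 bonus points is the maximum, as no higher threshold is listed.
BONUS_THRESHOLDS = [(18, 3), (12, 2), (6, 1)]


def bonus_points(exercise_points):
    """Return the exam bonus points for a given number of exercise points."""
    for needed, bonus in BONUS_THRESHOLDS:
        if exercise_points >= needed:
            return bonus
    return 0  # fewer than 6 exercise points: no bonus


if __name__ == "__main__":
    for points in (5, 6, 11, 12, 18, 24):
        print(points, "exercise points ->", bonus_points(points), "bonus points")
```

For example, 11 exercise points fall between the 25% and 50% thresholds and therefore give 1 bonus point.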
The solutions should be presented at the exercise sessions. In exceptional situations where this is not possible, agree with the exercise assistant on the presentation method.
Exercise assistant: Thomas Barker.
Project work is required for passing the course. The work consists of implementing an algorithm related to the course material and writing a report about it.
The topic for the project work is the same for all students and will be announced later. The project is done in groups of two, if possible.
The project work will be marked only as pass/fail.
Slides with instructions for the project work.
Separately recorded tracks that you can use as ground truth for evaluation: here. Feel free to use your own audio and audio from the exercises as well (especially exercise 4). Note that it is not fair to use exactly the same originally separate target signal for training. Instead, use either a segment of the mixture in which only the target signal or only the background is present, or a completely different signal.
If you cannot find a partner, you can also do the project alone.
Deadline: 1.5.2017
SGN-14006 Audio and Speech Processing.
SGN-13006 Introduction to Pattern Recognition and Machine Learning
(or SGN-13000 Johdatus hahmontunnistukseen ja koneoppimiseen)
is useful, but not necessary.