Tools


audio & pattern recognition

Introduction

This page collects some of the essential audio-related frameworks and libraries. The list is not intended to be complete.

Frameworks

| Name | Language | Type | Licence | Description | URL |
|---|---|---|---|---|---|
| CLAM | C++ | Signal analysis | GPL | Analysis, synthesis and processing of audio signals | http://clam-project.org |
| CMU Sphinx | C | Speech recognizer | BSD | | http://cmusphinx.sourceforge.net |
| Julius | C | Speech recognizer | Own | Open-Source Large Vocabulary CSR Engine Julius | http://julius.sourceforge.jp/en_index.php |
| HTK | C | Speech recognizer | Own | | http://htk.eng.cam.ac.uk |
| Marsyas | C++ | Signal analysis | GPL v2 | Feature extraction, signal analysis, classification | http://marsyasweb.appspot.com |
| VAMP | C++ | Signal analysis | BSD | Feature extraction, signal analysis through plugins | http://vamp-plugins.org |
| WEKA | Java | Data mining | GPL | | http://www.cs.waikato.ac.nz/ml/weka/ |
| Essentia | C++ | Signal analysis | AGPL v3 | Feature extraction, signal analysis | http://essentia.upf.edu |
| Kaldi | C++ | Speech recognizer | Apache v2.0 | Kaldi Speech Recognition Toolkit | http://kaldi.sourceforge.net |

Libraries

This list collects numerous open source libraries that can be used when building audio-related software. It concentrates mainly on libraries written in C/C++ or Python.

| Name | Language | Type | Licence | Description | URL |
|---|---|---|---|---|---|
| Aquila | C++ | DSP | MIT | Generators, windowing, FFT, filtering, features | http://aquila-dsp.org/ |
| Armadillo | C++ | Math | LGPL v3 | Linear algebra library, Matlab-like notation | http://arma.sourceforge.net/ |
| BFilt | C++ | Machine learning | GPL v3 | Bayesian filtering | http://code.google.com/p/bfilt/ |
| dlib | C++ | Machine learning | Boost | | http://dlib.net/ |
| FFTW | C | DSP | GPL | FFT | http://www.fftw.org/ |
| FLANN | C++ | Machine learning | BSD | Fast approximate nearest neighbor searches in high dimensional spaces | http://www.cs.ubc.ca/~mariusm/index.php/FLANN/FLANN |
| GHMM | Python | Machine learning | LGPL | The General Hidden Markov Model library | http://ghmm.org/ |
| GSL | C | Math | GPL | GNU Scientific Library, numeric computation | http://www.gnu.org/software/gsl/ |
| IT++ | C++ | Math | GPL | Linear algebra library | http://itpp.sourceforge.net/current/ |
| LAPACK | C | Math | BSD | Linear algebra library | http://www.netlib.org/lapack/ |
| LibLinear | C | Machine learning | BSD | Large Linear Classification | http://www.csie.ntu.edu.tw/~cjlin/liblinear/ |
| Libmfcc | C | DSP | MIT | Features (MFCC) | http://code.google.com/p/libmfcc |
| LibNMF | C | Math | - | Non-negative Matrix Factorization (NMF) | http://www.univie.ac.at/rlcta/software/ |
| libsndfile | C | I/O | LGPL v3 | Audio file handling | http://www.mega-nerd.com/libsndfile/ |
| LibSVM | C | Machine learning | BSD | SVM | http://www.csie.ntu.edu.tw/~cjlin/libsvm/ |
| LibXtract | C | DSP | MIT | Feature extraction | https://github.com/jamiebullock/LibXtract |
| MLC++ | C | Machine learning | Own | Supervised Machine Learning | http://www.sgi.com/tech/mlc/ |
| mlpack | C++ | Machine learning | LGPL v3 | Scalable C++ machine learning library, GMM and HMM | http://www.mlpack.org/ |
| nimfa | Python | DSP | GPL v3 | Nonnegative Matrix Factorization Techniques | http://nimfa.biolab.si/ |
| openBliSSART | C++ | DSP | GPL | Blind Source Separation for Audio Recognition Tasks, NMF, SVM | http://openblissart.github.com/openBliSSART/ |
| openSMILE | C++ | DSP | GPL | Audio feature extractor. Audio I/O, windowing, FFT, filtering, features | http://opensmile.sourceforge.net/ |
| PLearn | C++ | Machine learning | BSD | | http://opensmile.sourceforge.net/ |
| PortAudio | C | I/O | MIT | Audio I/O | http://www.portaudio.com/ |
| SHARK | C++ | Machine learning | GPL v3 | General machine learning library | http://shark-project.sourceforge.net/ |
| SRC | C | DSP | GPL | Secret Rabbit Code (SRC), sample rate conversion | http://www.mega-nerd.com/SRC/ |
| STK | C++ | DSP | MIT | Synthesis | https://ccrma.stanford.edu/software/stk/ |
| Torch | C++ | Machine learning | BSD | | http://torch5.sourceforge.net/ |
| Waffles | C++ | Machine learning | LGPL | Command-line tools for machine learning and data mining | http://waffles.sourceforge.net/ |
| Yaafe | C++ | DSP | LGPL v3 | Feature extraction | http://yaafe.sourceforge.net/ |
| FFTreal | C++ | DSP | WTFPL | FFT & IFFT | http://ldesoras.free.fr/prod.html#src_fftreal |
| Kiss FFT | C | DSP | BSD | FFT & IFFT | http://kissfft.sourceforge.net/ |
| General purpose FFT | C | DSP | Own | FFT | http://www.kurims.kyoto-u.ac.jp/~ooura/fft.html |
| scikit-learn | Python | Machine learning | BSD | Machine learning algorithms | http://scikit-learn.org/stable/ |
| NMF-CUDA | CUDA | DSP | Own | CUDA implementation of non-negative matrix factorization for GPUs | https://github.com/ebattenberg/nmf-cuda |
| NMF | C++ | DSP | Own | Nonnegative Matrix Factorization (Approximation) | http://people.kyb.tuebingen.mpg.de/suvrit/work/progs/nnma.html |
| NMF framework for dimensionality reduction and unsupervised clustering | C++ | DSP | Creative Commons | Non-negative matrix factorization framework for dimensionality reduction and unsupervised clustering | http://www.insight-journal.org/browse/publication/152 |
| DSPFilters | C++ | DSP | MIT | Collection of useful C++ classes for digital signal processing | https://github.com/vinniefalco/DSPFilters |
| Essentia | C++ | Signal analysis | AGPL v3 | Feature extraction, signal analysis | http://essentia.upf.edu/ |
| python_speech_features | Python | DSP | MIT | Feature extraction (MFCC) | https://github.com/jameslyons/python_speech_features |
| simple-minded audio classifier | Python | DSP | Own | Audio classifier in Python (using MFCC and GMM) | https://github.com/danstowell/smacpy |
| Librosa | Python | Signal analysis | ISC | Feature extraction, signal analysis | https://github.com/librosa/librosa |
| Keras | Python | Machine learning | MIT | Deep Learning library for Theano and TensorFlow | http://keras.io/ |
| Theano | Python | Math | Own | Define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently (CPU / GPU) | https://github.com/Theano/Theano |
| CUDAMAT | Python | Machine learning | Own | GPU accelerated GMM training | https://github.com/cudamat/cudamat |
| ggmm | Python | Machine learning | Own | Python module to train GMMs using CUDA (via CUDAMat) | https://github.com/ebattenberg/ggmm |
| hmmlearn | Python | Machine learning | Own | Hidden Markov Models in Python, with scikit-learn like API | https://github.com/hmmlearn/hmmlearn |
| TensorFlow | Python & C++ | Machine learning | Apache 2.0 | | https://www.tensorflow.org/ |
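
Many of the DSP and machine learning entries above are typically combined into a two-stage pipeline: extract features from the audio signal, then classify the features. The following is only a minimal sketch of that idea using the Python standard library, not the API of any library in the list. The function names, the synthetic test tones, the choice of spectral centroid as the feature, and the nearest-mean classifier are all assumptions made for illustration; a real system would more likely use an FFT library for the transform, MFCC features, and a GMM or SVM classifier.

```python
import cmath
import math


def sine(freq_hz, sample_rate=8000, n_samples=256):
    """Synthesize a test tone (stand-in for audio read through an I/O library)."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(n_samples)]


def magnitude_spectrum(frame):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2; an FFT library does this faster."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, sample_rate=8000):
    """Magnitude-weighted mean frequency in Hz, a simple one-number audio feature."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0
    n = len(frame)
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def nearest_class(feature, class_means):
    """Nearest-mean classifier: pick the label whose mean feature is closest."""
    return min(class_means, key=lambda label: abs(class_means[label] - feature))


# "Training": one centroid per class, computed from hypothetical example tones.
class_means = {
    "low": spectral_centroid(sine(250)),
    "high": spectral_centroid(sine(2000)),
}

# "Testing": classify an unseen tone by its spectral centroid.
predicted = nearest_class(spectral_centroid(sine(1800)), class_means)
print(predicted)
```

The split into a feature function and a classifier function mirrors the DSP and Machine learning types in the table, so either stage could be swapped for one of the listed libraries without changing the other.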

Software licences

Below are short descriptions of the most commonly used licences and what they imply for usage.

MIT

Wikipedia, Plain English

Description

Permissive licence that permits reuse within proprietary software, as long as all copies include a copy of the MIT License terms.

Allows

  • Commercial use
  • Modification
  • Distribution

BSD

Wikipedia, Plain English

Description

Allows a high level of freedom, as long as the BSD copyright notice is included with the software.

Allows

  • Commercial use
  • Modification
  • Distribution

Obligates

  • Include BSD copyright notice

GPL

Wikipedia, Plain English

Description

One of the most widely used free software licences. It is a copyleft licence: derived works must be distributed under the same terms.

Allows

  • Commercial use
  • Modification
  • Distribution

Obligates

  • Source code has to be released when software is distributed.

LGPL

Wikipedia, Plain English

Description

Allows LGPL-licensed parts to be integrated into proprietary software without a requirement to release the source code of the proprietary parts of the software.

Allows

  • Commercial use
  • Modification
  • Distribution

Obligates

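  • The source code of the LGPL-licensed library, including any modifications made to it, has to be made available with the software.
  • The LGPL-licensed library should be dynamically linked; if it is statically linked, the object files of the proprietary parts have to be provided so that users can relink against a modified version of the library.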

Links