Welcome!
My name is Timo Erkkilä. I'm a computer scientist working in the life sciences, developing scalable analysis tools for large, heterogeneous
data sets. I work as a researcher at
Tampere University of Technology
(TUT), Tampere, Finland, in the
group of Computational Systems Biology. I'm also a PhD student at
Finnish Graduate School in Computational Sciences
(FICS), Helsinki, Finland, where I am pursuing a doctoral degree under the supervision of professors
Harri Lähdesmäki and
Olli Yli-Harja. I'm also affiliated with the
Shmulevich Group at the
Institute for Systems Biology, with whom I collaborate on
The Cancer Genome Atlas.
Look me up on
Facebook,
Flickr, and
LinkedIn.
Web Design
Various CSS style templates:
here
Regulome Explorer
Regulome Explorer is a web application for visualizing and exploring multivariate statistical associations inferred from a wide range of biological data. The application currently demonstrates how data as diverse as those offered by The Cancer Genome Atlas can be explored efficiently.
RF-ACE
Present-day systems biology studies often involve large quantities of mixed-type data (categorical and numerical variables) with missing values. These characteristics severely limit the modeling approaches one can consider when trying to uncover multivariate associations in the data. The same applies to many other real-world data analysis problems, where data sources are diverse and contain tens or even hundreds of thousands of candidate explanatory features. RF-ACE is an efficient feature selection algorithm well suited to such problems. It combines a series of decision-tree-based feature selection procedures with a statistical testing framework to perform dimensionality reduction, identify statistically significant correlates, and predict data.
RF-ACE Google Code Project
RF-ACE Google Groups
Mergedata
mergedata is a generic program for merging data tables with row and column headers into a single data table. mergedata makes no assumptions about the order or overlap of the data in the input tables. Rows and columns in the merged table nevertheless appear in input order:
Mergedata Google Code Project
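The merging behavior described above can be illustrated with a minimal Python sketch. It is not the mergedata program itself: the function name merge_tables, the (rows, columns, values) table representation, the "NA" fill value, and the rule that a later table overwrites an earlier one on overlapping cells are all assumptions made for illustration.

```python
def merge_tables(tables, missing="NA"):
    """Merge tables given as (row_headers, col_headers, values) tuples.

    Headers keep the order in which they are first seen in the input.
    Cells absent from every input table are filled with `missing`.
    """
    row_order, col_order = [], []  # headers in first-seen (input) order
    cells = {}                     # (row, col) -> value

    for rows, cols, values in tables:
        for r in rows:
            if r not in row_order:
                row_order.append(r)
        for c in cols:
            if c not in col_order:
                col_order.append(c)
        for r, line in zip(rows, values):
            for c, v in zip(cols, line):
                # Assumed policy: on overlap, the later table wins.
                cells[(r, c)] = v

    merged = [[cells.get((r, c), missing) for c in col_order]
              for r in row_order]
    return row_order, col_order, merged


if __name__ == "__main__":
    a = (["g1", "g2"], ["s1", "s2"], [[1, 2], [3, 4]])
    b = (["g3", "g1"], ["s2", "s3"], [[5, 6], [7, 8]])
    rows, cols, merged = merge_tables([a, b])
    print("\t".join([""] + cols))
    for r, line in zip(rows, merged):
        print("\t".join([r] + [str(v) for v in line]))
```

The header lists are scanned linearly to keep the sketch short; for large tables a set alongside each list would avoid the repeated membership checks.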
DSection
DSection is a model for reconstructing cell type specific gene expression profiles from measurements of heterogeneous tissues. No manual purification is needed, but the user must provide the cell type proportions of each tissue sample. Samples can also be categorized by experimental condition; the model then predicts expression profiles across both cell types and experimental conditions.
Matlab scripts:
DSection.m (main program)
MCMCSummary.m
visualizeMCData.m
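The idea of recovering cell type specific expression from mixed tissue can be sketched in Python under a simple assumption: the measured expression of a gene in a sample is the proportion-weighted sum of its cell type specific expression levels. The sketch below is not DSection, which is a probabilistic model implemented in the Matlab scripts above. It swaps in ordinary least squares for two cell types and a single gene, and the function name estimate_profiles is hypothetical.

```python
def estimate_profiles(proportions, measurements):
    """Least-squares estimate of two cell type specific expression levels.

    proportions:  list of (p0, p1) cell type proportions, one pair per sample
    measurements: list of measured expression values, one per sample
    Assumed mixing model: y_j = p0_j * x0 + p1_j * x1 (noise ignored).
    """
    # Entries of the normal equations (P^T P) x = P^T y for a 2-column P.
    a = sum(p[0] * p[0] for p in proportions)
    b = sum(p[0] * p[1] for p in proportions)
    d = sum(p[1] * p[1] for p in proportions)
    u = sum(p[0] * y for p, y in zip(proportions, measurements))
    v = sum(p[1] * y for p, y in zip(proportions, measurements))

    det = a * d - b * b
    if abs(det) < 1e-12:
        # Identical proportions in every sample carry no information
        # about the individual cell types.
        raise ValueError("proportions do not vary enough across samples")

    # Closed-form solution of the 2x2 linear system.
    return ((d * u - b * v) / det, (a * v - b * u) / det)


if __name__ == "__main__":
    props = [(0.2, 0.8), (0.5, 0.5), (0.9, 0.1)]
    true_x = (10.0, 2.0)
    # Simulate noise-free mixed measurements from the assumed model.
    ys = [p0 * true_x[0] + p1 * true_x[1] for p0, p1 in props]
    print(estimate_profiles(props, ys))
```

The sketch only works when proportions differ between samples, which is why the user-supplied proportion information matters in this kind of analysis.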
Curriculum Vitae
Check out my
LinkedIn Profile for the most up-to-date Curriculum Vitae.
Publications
Invited talks
Erkkilä et al., "RF-ACE for Uncovering Nonlinear Associations from Heterogeneous Cancer Data", The Cancer Genome Atlas' 1st Annual Scientific Symposium, Maryland, USA, September 2011
[slides] [talk]
Erkkilä et al., "Integrative Exploration of Cancer Data Reveals Regulatory Hotspots", WCSB, Zurich, Switzerland, June 2011
[slides]
Erkkilä et al., "Inferring genetic regulatory interactions from time-collapsed Boolean summary variables", WCSB, Luxembourg, June 2010
Journal articles
Erkkilä et al., "Probabilistic analysis of gene expression measurements from heterogeneous tissues", Bioinformatics, 2010
[article]
Dai et al., "A joint finite mixture model for clustering genes from independent Gaussian and beta distributed data", BMC Bioinformatics, 2009
Conference articles
Erkkilä et al., "A generic class density estimation framework using Dirichlet processes", WCSB, Aarhus, Denmark, 2009
Nikkilä et al., "Decomposing Gene Expression into Regulatory and Differential Parts with Bayesian Data Fusion", WCSB, Leipzig, 2008
Ruusuvuori et al., "Efficient automated method for image-based classification of microbial cells", ICPR, Tampa, Florida, USA, 2008