Plenary Talk #1
Raffaele Giancarlo: Dipartimento di Matematica ed Informatica, Università degli Studi di Palermo, Italy
Title: The Three Steps of Clustering in the Post-Genomic Era*
*Joint work with G. Lo Bosco, L. Pinello and F. Utro
--------------
Clustering is one of the best-known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. It can be summarized as a three-step process: (a) Choice of a Distance Function; (b) Choice of a Clustering Algorithm; (c) Choice of a Validation method. Although such a purist approach to Clustering is hardly seen in many areas of Science, genomic data require that level of attention if inferences made from Cluster Analysis are to be of some relevance to Biomedical research. Unfortunately, the high dimensionality of the data and their noisy nature make Cluster Analysis of genomic data particularly difficult. In this talk, the state of the art on the subject will be presented, discussing specific limitations of the steps involved in Clustering and possible ways to make progress.
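The three steps named in the abstract can be illustrated with a minimal sketch. The choices made here (Euclidean distance, average-linkage agglomerative clustering, mean silhouette width) and the toy data are assumptions picked for brevity; the abstract does not say which methods the talk covers.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Step (a): a distance function between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(points, k, dist):
    """Step (b): repeatedly merge the two closest clusters until k remain."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] += clusters[b]
        del clusters[b]  # a < b, so index a is unaffected
    return clusters


def silhouette(points, clusters, dist):
    """Step (c): mean silhouette width in [-1, 1]; needs at least 2 clusters."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for i in cluster:
            if len(cluster) == 1:
                scores.append(0.0)  # convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster.
            a = sum(dist(points[i], points[j]) for j in cluster if j != i)
            a /= len(cluster) - 1
            # b: mean distance to the nearest other cluster.
            b = min(
                sum(dist(points[i], points[j]) for j in other) / len(other)
                for cj, other in enumerate(clusters)
                if cj != ci
            )
            denom = max(a, b)
            scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = average_linkage(points, k=2, dist=euclidean)
score = silhouette(points, clusters, euclidean)
print(clusters, round(score, 3))
```

Each step is a separate function, so a different distance, algorithm or validation index can be swapped in without touching the other two.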
Plenary Talk #2
Paulo J. Lisboa, John Moores University, Liverpool, UK
Title: The continuum from bioinformatics to biostatistics*
*Joint work with D. Bacciu, I.H. Jarman, T.A. Etchells, S.J. Chambers, J. Whittaker and J. Garibaldi
--------------
The elucidation of biological networks regulating the metabolic basis of disease is critical for understanding disease progression and identifying therapeutic targets. This paper will highlight the need for multidisciplinary research across computational intelligence methods and traditional statistics, by reference to a data set of cytometric protein expression markers for breast cancer. In particular, it will focus on the interplay between robust clustering, visualisation by dimensionality reduction and modelling with directed acyclic graphs.
Plenary Talk #3
Gianluca Pollastri:
School of Computer Science and Informatics, University College Dublin, Ireland
Title: De Novo Protein Subcellular Localization Prediction by N-to-1 Neural Networks
--------------
Knowledge of the subcellular location of a protein provides valuable information about its function and possible interaction with other proteins. In the post-genomic era, fast and accurate predictors of subcellular location are required if this abundance of sequence data is to be fully exploited. We have developed a subcellular location predictor (SCL_pred) using high-throughput machine learning models trained on large non-redundant sets of protein sequences. The algorithm powering SCL_pred is a new Neural Network (N-to-1 Neural Network, or N1-NN) which is capable of mapping whole sequences into single properties (a functional class, in this work) without resorting to predefined transformations, but rather by adaptively compressing the sequence into a hidden feature vector. I will describe the model, and report on extensive benchmarking of SCL_pred against other state-of-the-art predictors of subcellular location. The results are favourable; moreover, the N1-NN algorithm is fully general and may be applied to a host of problems of similar shape, that is, in which a whole sequence needs to be mapped into a fixed-size array of properties. The adaptive compression operated by N1-NN may even shed light on the space of protein sequences.
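The idea of mapping a sequence of any length to a fixed-size vector can be sketched as follows. This is only an illustration of one possible scheme, assuming a small network shared across sliding windows whose outputs are averaged and then passed to an output layer; the window size, layer sizes, four output classes and untrained random weights are all hypothetical and are not taken from the abstract or from SCL_pred.

```python
import math
import random

AMINO = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def one_hot(residue):
    """Encode one residue; padding (None) becomes an all-zero vector."""
    return [1.0 if residue == a else 0.0 for a in AMINO]


def random_layer(rng, n_out, n_in):
    """Untrained placeholder weights: (weight matrix, bias vector)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(weights, biases, x, activation):
    """A fully connected layer applied to input vector x."""
    return [
        activation(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def predict(sequence, hidden, output, window=3):
    """Compress a whole sequence into one feature vector, then classify it."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    pad = window // 2
    padded = [None] * pad + list(sequence) + [None] * pad
    features = [0.0] * len(hidden[1])
    for i in range(len(sequence)):
        # The same hidden layer is applied to every window of residues.
        x = [v for r in padded[i:i + window] for v in one_hot(r)]
        h = dense(hidden[0], hidden[1], x, math.tanh)
        features = [f + v for f, v in zip(features, h)]
    # Averaging keeps the feature vector the same size for any length.
    features = [f / len(sequence) for f in features]
    return softmax(dense(output[0], output[1], features, lambda v: v))


rng = random.Random(0)
WINDOW, N_FEATURES, N_CLASSES = 3, 8, 4  # hypothetical sizes
hidden = random_layer(rng, N_FEATURES, WINDOW * len(AMINO))
output = random_layer(rng, N_CLASSES, N_FEATURES)

short = predict("MKV", hidden, output, WINDOW)
longer = predict("MKVLAAGIVGLLLAQPSW", hidden, output, WINDOW)
print(len(short), len(longer))
```

Because the window-level weights are shared and the window outputs are pooled, sequences of different lengths yield outputs of the same size; in a trained model both layers would be learned jointly rather than drawn at random.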