A clustering genetic algorithm for genomic data mining

In this chapter we summarize our work toward developing clustering algorithms based on evolutionary computing and their application to genomic data mining. We have focused on the reconstruction of protein-protein functional interactions from genomic data. The discovery of functional modules of proteins is formulated as an optimization problem in which proteins with similar genomic attributes are grouped together. By considering gene co-occurrence, gene directionality and gene proximity, clustering genetic algorithms can predict functional associations accurately. Moreover, clustering genetic algorithms eliminate the need for the a priori specification of clustering parameters (e.g., the number of clusters or the initial positions of centroids). Several methods for the reconstruction of protein interactions are described, including single-objective and multi-objective clustering genetic algorithms. We present our preliminary results on the reconstruction of bacterial operons and protein associations as specified by the DIP and ECOCYC databases. © 2009 Springer-Verlag Berlin Heidelberg.
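
The general idea of a clustering genetic algorithm that does not fix the number of clusters in advance can be illustrated with a minimal sketch. It is not the chapter's implementation: the label-based encoding, the Jaccard similarity on binary co-occurrence profiles, the 0.5 threshold, the fitness function, the genetic operators, and the toy data are all assumptions chosen for illustration, and only gene co-occurrence is modeled (directionality and proximity are left out).

```python
import random

def jaccard(a, b):
    """Jaccard similarity of two binary co-occurrence profiles."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0

def similarity_matrix(profiles):
    return [[jaccard(p, q) for q in profiles] for p in profiles]

def fitness(labels, sim, threshold=0.5):
    """Assumed objective: sum of (similarity - threshold) over same-cluster pairs.
    Grouping similar proteins is rewarded, grouping dissimilar ones is penalized,
    so the number of clusters emerges from the search instead of being preset."""
    n = len(labels)
    return sum(sim[i][j] - threshold
               for i in range(n) for j in range(i + 1, n)
               if labels[i] == labels[j])

def evolve(profiles, pop_size=40, generations=200, mutation_rate=0.1, seed=0):
    """Return (best_labels, best_score); labels[i] is the cluster of protein i."""
    rng = random.Random(seed)
    n = len(profiles)
    sim = similarity_matrix(profiles)
    # Start from the all-singletons partition plus random label assignments.
    pop = [list(range(n))]
    pop += [[rng.randrange(n) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(ind, sim), reverse=True)
        survivors = pop[:pop_size // 2]          # elitist truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]   # uniform crossover
            child = [rng.randrange(n) if rng.random() < mutation_rate else g
                     for g in child]                           # label mutation
            children.append(child)
        pop = survivors + children
    best = max(pop, key=lambda ind: fitness(ind, sim))
    return best, fitness(best, sim)

# Hypothetical toy data: rows are proteins, columns are genomes (1 = gene present).
PROFILES = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 1],
]

best_labels, best_score = evolve(PROFILES)
print(best_labels, best_score)
```

Proteins that share a label in `best_labels` form one predicted module. Because the best individual is always carried over to the next generation, the returned score can never fall below that of the all-singletons starting partition; a multi-objective variant would replace the single score with several objectives compared by Pareto dominance.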
