Cluster validation using an ensemble of supervised classifiers
                 
        Academic Article in Scopus
                    
                
        
            
    
    
     
        
    
         
     
    
 
        
        
            Overview
        
            
                    abstract   
                
    - 
    	© 2018. A cluster validity index is used to select which clustering algorithm to apply to a given problem. It works by evaluating the quality of a partition output by a candidate clustering algorithm, which compensates for the common lack of an expert in the given domain. Most existing validity indexes make assumptions, for example that each cluster of the partition has an underlying structure such as a hypersphere, and they yield incorrect evaluations when those assumptions do not hold. Here, we propose a new cluster validity index that attempts to avoid this bias by using an ensemble of distinct supervised classifiers; this way the bias is attributable not to a specific classifier but to a collection thereof, which alleviates the problem. The rationale behind our index is that a good partition should also induce a good classifier: the better the classification performance, the better the quality of the partition under evaluation. Note that we treat the partition under assessment as a labeled dataset, where each object is labeled with the cluster it belongs to. We have tested our index on 50 numerical datasets, grouped using six different clustering algorithms. In our experiments, our index outperforms five validity indexes, including the most popular ones.
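
The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names neither the classifiers in the ensemble nor the performance measure, so the three simple classifiers below (nearest centroid, 1-nearest-neighbour, Gaussian naive Bayes), the use of cross-validated accuracy, the plain averaging of scores, and the function name `ensemble_validity` are all assumptions chosen for illustration.

```python
import math
import random
from collections import defaultdict


def _group_by_label(train_X, train_y):
    groups = defaultdict(list)
    for row, label in zip(train_X, train_y):
        groups[label].append(row)
    return groups


def nearest_centroid(train_X, train_y, x):
    """Predict the label whose class mean is closest to x."""
    centroids = {lab: [sum(col) / len(rows) for col in zip(*rows)]
                 for lab, rows in _group_by_label(train_X, train_y).items()}
    return min(centroids, key=lambda lab: math.dist(centroids[lab], x))


def one_nn(train_X, train_y, x):
    """Predict the label of the single nearest training object."""
    nearest = min(range(len(train_X)), key=lambda j: math.dist(train_X[j], x))
    return train_y[nearest]


def gaussian_nb(train_X, train_y, x):
    """Predict with a Gaussian naive Bayes model fitted per class."""
    best_label, best_score = None, -math.inf
    for lab, rows in _group_by_label(train_X, train_y).items():
        score = math.log(len(rows) / len(train_X))  # class prior
        for col, value in zip(zip(*rows), x):
            mu = sum(col) / len(col)
            var = sum((c - mu) ** 2 for c in col) / len(col) + 1e-9
            score += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = lab, score
    return best_label


def ensemble_validity(X, cluster_labels, folds=5, seed=0,
                      classifiers=(nearest_centroid, one_nn, gaussian_nb)):
    """Hypothetical index: mean cross-validated accuracy over the ensemble.

    The partition under evaluation is used as a labeled dataset, so
    cluster_labels play the role of class labels.
    """
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    scores = []
    for clf in classifiers:
        correct = 0
        for f in range(folds):
            test = idx[f::folds]
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            train_X = [X[i] for i in train]
            train_y = [cluster_labels[i] for i in train]
            correct += sum(clf(train_X, train_y, X[i]) == cluster_labels[i]
                           for i in test)
        scores.append(correct / len(X))
    return sum(scores) / len(scores)


# Toy data (an assumption for the demo): two well-separated 2-D blobs.
rng = random.Random(1)
X = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(30)]
     + [[rng.gauss(10, 1), rng.gauss(10, 1)] for _ in range(30)])
good_partition = [0] * 30 + [1] * 30          # follows the blob structure
bad_partition = [i % 2 for i in range(60)]    # ignores the blob structure

good_score = ensemble_validity(X, good_partition)
bad_score = ensemble_validity(X, bad_partition)
print(good_score, bad_score)
```

Under these assumptions, a partition that follows the structure of the data is easy for every classifier to learn and scores higher than one that ignores it. Averaging over several classifiers with different inductive biases is what keeps the score from depending on the assumptions of any single one of them.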
    
 
                
             
            
                    
                
             
            
                    status   
                
             
            
                    publication date   
                
             
            
                    published in   
                
             
         
         
        
        
            Identity
        
            
                    Digital Object Identifier (DOI)   
                
             
         
         
        
        
            Additional document info
        
            
                    has global citation frequency   
                
             
            
                    start page   
                
             
            
                    end page   
                
             
            
                    volume