The global Minmax k-means algorithm
- Xiaoyan Wang^{1} and
- Yanping Bai^{2} (corresponding author)
Received: 7 June 2016
Accepted: 19 September 2016
Published: 27 September 2016
Abstract
The global k-means algorithm is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure from suitable initial positions, and employs k-means to minimize the sum of the intra-cluster variances. However, the global k-means algorithm sometimes produces singleton clusters, and the initial positions it selects are sometimes poor; after a bad initialization, k-means can easily converge to a poor local optimum. In this paper, we first modify the global k-means algorithm to eliminate the singleton clusters, and then apply the MinMax k-means clustering error method to the global k-means algorithm to overcome the effect of bad initialization, yielding the proposed global Minmax k-means algorithm. The proposed clustering method is tested on some popular data sets and compared to the k-means algorithm, the global k-means algorithm and the MinMax k-means algorithm. The experimental results show that the proposed algorithm outperforms the other algorithms considered in the paper.
Background
Clustering is one of the classic problems in pattern recognition, image processing, machine learning and statistics (Xu and Wunsch 2005; Jain 2010; Berkhin 2006). Its aim is to partition a collection of patterns into disjoint clusters, such that patterns in the same cluster are similar, whereas patterns belonging to different clusters are dissimilar.
One of the most popular clustering methods is the k-means algorithm, where clusters are identified by minimizing the clustering error. Despite its popularity, the k-means algorithm is sensitive to the choice of initial starting conditions (Celebi et al. 2013; Peña et al. 1999; Celebi and Kingravi 2012, 2014). To deal with this problem, the global k-means algorithm was proposed (Likas et al. 2003), followed by several modifications (Bagirov 2008; Bagirov et al. 2011). An extension to kernel space has also been developed (Tzortzis and Likas 2008, 2009), and a fuzzy clustering version is available (Zang et al. 2014). All of these are incremental approaches that start from one cluster and, at each step, deterministically add a new cluster to the solution according to an appropriate criterion. Incremental methods of this kind can also be used to learn the number of clusters (Kalogeratos and Likas 2012). Although the global k-means algorithm is deterministic and often performs well, the new cluster center may sometimes be an outlier, in which case some clusters contain only a single point and the resulting partition is poor. Another way to reduce the dependence on initial starting conditions is to use the k-means algorithm with multiple restarts (Murty et al. 1999; Arthur and Vassilvitskii 2007; Banerjee and Ghosh 2004). A recent method of this kind is the MinMax k-means clustering algorithm (Tzortzis and Likas 2014), which starts from a randomly picked set of cluster centers and tries to minimize the maximum intra-cluster error. Its application in Eslamnezhad and Varjani (2014) shows that the algorithm is efficient in intrusion detection.
In this paper, a modified version of the global k-means algorithm is proposed in order to avoid singleton clusters. In addition, the initial positions chosen by the global k-means algorithm are sometimes poor, and after a bad initialization k-means can easily converge to a poor local optimum. We therefore employ the MinMax k-means clustering error instead of the k-means clustering error in the global k-means algorithm, obtaining a deterministic algorithm called the global Minmax k-means algorithm. We run experiments on a range of data sets; the results show that the proposed algorithm performs better than the other algorithms considered in the paper.
The rest of the paper is organized as follows. We briefly describe the k-means, the global k-means and the MinMax k-means algorithms in the “Preliminaries” section. In “The proposed algorithm” section we present our algorithms. The experimental evaluation is presented in the “Experiment evaluation” section. Finally, the “Conclusions” section concludes our work.
Preliminaries
k-Means algorithm
Generally, we call \(\sum \nolimits _{i=1}^{N}I(x_i\in C_k)\Vert x_i-m_k\Vert ^2\) the intra-cluster error (variance) of cluster \(C_k\), where N is the number of data points. Obviously, the clustering error is the sum of the intra-cluster errors. Therefore, for brevity, we write \(E_{sum}\) instead of \(E(m_1,m_2,\ldots ,m_M)\), i.e. \(E_{sum}=E(m_1,m_2,\ldots ,m_M)\).
The k-means algorithm finds locally optimal solutions with respect to the clustering error. Its main disadvantage is its sensitivity to the initial positions of the cluster centers.
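To make the two error measures used in the tables concrete, the following minimal sketch computes the intra-cluster errors for a fixed set of centers, then derives \(E_{sum}\) as their sum and \(E_{max}\) as their maximum. It assumes the usual squared Euclidean distance and nearest-center assignment; the function names and the toy data are hypothetical and are not taken from the paper.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def cluster_errors(points, centers):
    """Assign each point to its nearest center and return the
    intra-cluster error (sum of squared distances) of every cluster."""
    errors = [0.0] * len(centers)
    for x in points:
        dists = [squared_distance(x, m) for m in centers]
        k = dists.index(min(dists))  # index of the nearest center
        errors[k] += dists[k]
    return errors


# Toy data (illustrative only): two clusters of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
centers = [(0.0, 1.0), (10.0, 2.0)]

errors = cluster_errors(points, centers)
e_sum = sum(errors)  # clustering error, E_sum
e_max = max(errors)  # largest intra-cluster error, E_max
print(errors, e_sum, e_max)
```

On this toy input the second cluster dominates both measures, which is the situation the MinMax criterion is meant to penalize.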
The global k-means algorithm
To deal with the initialization problem, the global k-means algorithm has been proposed, which is an incremental deterministic algorithm that employs k-means as a local search procedure. This algorithm obtains optimal or near-optimal solutions in terms of clustering error.
To solve a clustering problem with M clusters, Likas et al. (2003) proceed as follows. The algorithm starts with one cluster \((k=1)\) and finds its optimal position, which corresponds to the data set centroid. To solve the problem with two clusters \((k=2)\), the k-means algorithm is run N times (N is the size of the data set), each time starting from the following initial positions of the cluster centers: the first cluster center is always placed at the optimal position for the problem with \(k=1\), and the second, at execution n, is placed at the position of the data point \(x_n\) \((n=1,2,\ldots ,N)\). The solution with the lowest clustering error is kept as the solution of the 2-clustering problem. In general, let \((m_1^*,m_2^*,\ldots ,m_k^*)\) denote the final solution for the k-clustering problem. Once the solution for the \((k-1)\)-clustering problem has been found, the solution of the k-clustering problem is sought as follows: N executions of the k-means algorithm are performed with \((m_1^*,m_2^*,\ldots ,m_{(k-1)}^*,x_n)\) as initial cluster centers for the \(n\hbox {th}\) run, and the solution with the lowest clustering error is kept. Proceeding in this fashion finally yields a solution with M clusters, together with solutions for all k-clustering problems with \(k<M\).
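The incremental procedure described above can be sketched as follows. This is a minimal illustration rather than the reference implementation of Likas et al. (2003): the helper names are hypothetical, plain Lloyd iterations stand in for k-means, and an empty cluster simply keeps its previous center.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, centers, max_iter=100):
    """Lloyd iterations from the given initial centers.
    Returns (centers, clustering_error)."""
    centers = list(centers)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for x in points:
            dists = [squared_distance(x, m) for m in centers]
            clusters[dists.index(min(dists))].append(x)
        # Simplification: an empty cluster keeps its previous center.
        updated = [centroid(c) if c else m for c, m in zip(clusters, centers)]
        if updated == centers:
            break
        centers = updated
    error = sum(min(squared_distance(x, m) for m in centers) for x in points)
    return centers, error


def global_kmeans(points, n_clusters):
    """Add one center at a time, trying every data point as the start
    of the new center and keeping the run with the lowest error."""
    solution = kmeans(points, [centroid(points)])  # k = 1: data centroid
    for _ in range(2, n_clusters + 1):
        previous = solution[0]
        best = None
        for x in points:  # N runs of k-means per added cluster
            candidate = kmeans(points, previous + [x])
            if best is None or candidate[1] < best[1]:
                best = candidate
        solution = best
    return solution


# Toy data (illustrative only): three well-separated pairs of points.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0),
        (10.0, 1.0), (20.0, 0.0), (20.0, 1.0)]
centers, error = global_kmeans(data, 3)
print(sorted(centers), error)
```

Each added cluster costs N runs of k-means, which is the computational burden that motivates the fast variant based on \(b_n\).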
The quantity \(b_n\) measures the reduction in the error measure obtained by inserting a new cluster center at point \(x_n\). Clearly, the data point \(x_n\in R^d\) with the largest value of \(b_n\) is the best candidate starting point for the kth cluster center. Therefore, we compute \(i=\arg \max \nolimits _{n} b_n\) and select the data point \(x_i\) as the starting point for the kth cluster center.
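The selection of the candidate with the largest \(b_n\) can be sketched as below. The defining equation for \(b_n\) is not reproduced in this text, so the sketch assumes the bound used in the fast global k-means of Likas et al. (2003), \(b_n=\sum \nolimits _{j=1}^{N}\max (d_{k-1}^{j}-\Vert x_n-x_j\Vert ^2,0)\), where \(d_{k-1}^{j}\) is the squared distance from \(x_j\) to its closest center among the current \(k-1\) centers. The function names and the toy data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def insertion_gains(points, centers):
    """Return b_n for every data point x_n (assumed definition, see above):
    the guaranteed error reduction if a new center is placed at x_n."""
    # d_j: squared distance of x_j to its closest current center
    closest = [min(squared_distance(x, m) for m in centers) for x in points]
    gains = []
    for x_n in points:
        gain = sum(max(d_j - squared_distance(x_n, x_j), 0.0)
                   for x_j, d_j in zip(points, closest))
        gains.append(gain)
    return gains


# Toy data (illustrative only): one center, one far-away point.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
centers = [(0.0, 0.0)]

gains = insertion_gains(points, centers)
best_index = gains.index(max(gains))  # i = argmax_n b_n
print(gains, points[best_index])
```

In this toy example the isolated point has the largest gain, which illustrates how an outlier can be chosen as a starting center and end up as a singleton cluster.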
The MinMax k-means algorithm
The proposed algorithm
The modified global k-means algorithm
As is well known, the global k-means algorithm may produce singleton clusters if the initial centers are outliers. To avoid this, we propose the modified global k-means algorithm.
Algorithm 1: The modified global k-means algorithm 1.
Step 1 (Initialization) Compute the centroid \(m_1\) of the set X and set \(k=1\);
Step 2 (Stopping criterion) Set \(k=k+1\). If \(k>M\), then stop;
Step 3 Take the centers \(m_1,m_2,\ldots ,m_{k-1}\) from the previous iteration and consider each point \(x_i\) of X as a starting point for the kth cluster center, thus obtaining N initial solutions with k points \((m_1,m_2,\ldots ,m_{k-1},x_i)\);
Step 4 Apply the k-means algorithm to each of them; keep the best k-partition obtained and its centers \(y_1,y_2,\ldots ,y_k\);
Step 5 (Detect the singleton clusters) If the obtained partition contains a singleton cluster, delete the point \(y_k\) from the set of candidate initial centers X and go to Step 3; otherwise go to Step 6;
Step 6 Set \(m_i=y_i,\,i=1,2,\ldots ,k\,\) and go to Step 2.
Because of the high computational cost of the global k-means algorithm, we also propose a fast version. It is based on the same idea as the fast global k-means variant proposed in Likas et al. (2003).
Algorithm 2: The Modified global k-means Algorithm 2.
Steps 1, 2 and 6 are the same as in Algorithm 1.
Steps 3, 4 and 5 are modified as follows:
Step 3′ Take the centers \(m_1,m_2,\ldots ,m_{k-1}\) from the previous iteration and consider each point \(x_i\) of X as a starting point for the kth cluster center; calculate \(b_i\) using Eq. (2) and choose the starting point with the maximum \(b_i\) as the best initial solution;
Step 4′ Apply the k-means algorithm to the best initial solution; keep the k-partition obtained and its centers \(y_1,y_2,\ldots ,y_k\);
Step 5′ (Detect the singleton clusters) If the obtained partition contains a singleton cluster, set the corresponding \(b_i=0\) and go to Step 3′; otherwise go to Step 6;
In our numerical experiments we use Algorithm 2.
Table 1 Comparative results
Method | Clusters | \(E_{sum}\) | Number of points in each cluster |
---|---|---|---|
Global k-means | 4 | \(4.9175\times 10^{4}\) | (25, 14, 1, 1) |
Modified global k-means | 4 | \(4.0718\times 10^{4}\) | (12, 14, 13, 2) |
Table 1 shows that when the data are partitioned into four clusters, the global k-means algorithm produces two clusters that contain only one element each, i.e. two singleton clusters. We also find that the \(E_{sum}\) of the modified global k-means algorithm is lower than that of the global k-means algorithm.
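The singleton check of Step 5 amounts to counting the members of each cluster. The sketch below is a minimal illustration with a hypothetical helper name; the label lists only reproduce the cluster sizes reported in Table 1 and do not correspond to the actual data points.

```python
from collections import Counter


def singleton_clusters(labels):
    """Return the sorted labels of clusters that contain exactly one point."""
    sizes = Counter(labels)
    return sorted(label for label, size in sizes.items() if size == 1)


# Hypothetical label lists that only mimic the cluster sizes of Table 1.
global_labels = [0] * 25 + [1] * 14 + [2] * 1 + [3] * 1
modified_labels = [0] * 12 + [1] * 14 + [2] * 13 + [3] * 2

print(singleton_clusters(global_labels))    # two singleton clusters
print(singleton_clusters(modified_labels))  # no singleton cluster
```

When the returned list is non-empty, the modified algorithm discards the offending candidate and repeats the search.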
The global Minmax k-means algorithm
The global Minmax k-means algorithm is an incremental approach to clustering that, like the global k-means algorithm, dynamically adds one cluster center at a time through a deterministic global search procedure from suitable positions; this procedure was introduced in the preliminaries. After choosing the initial centers, we employ the MinMax k-means method to minimize the maximum intra-cluster variance. The MinMax k-means algorithm was described in the preliminaries. The complete procedure is given as Algorithm 3.
Algorithm 3: The global Minmax k-means algorithm.
Step 1 (Initialization) Compute the centroid \(m_1\) of the set X, using (7).
Step 2 (Stopping criterion) Set \(k=k+1\). If \(k>M\), then stop;
Step 3 Take the centers \(m_1,m_2,\ldots ,m_{k-1}\) from the previous iteration and consider each point \(x_i\) of X as a starting point for the kth cluster center, thus obtaining N initial solutions with k points \((m_1,m_2,\ldots ,m_{k-1},x_i)\);
Step 4 Apply the MinMax k-means algorithm to each of them; keep the best k-partition obtained and its centers \(y_1,y_2,\ldots ,y_k\);
Step 5 (Detect the singleton clusters) If the obtained partition contains a singleton cluster, delete the point \(y_k\) from the set of candidate initial centers and go to Step 3; otherwise go to Step 6;
Step 6 Set \(m_i=y_i,\,i=1,2,\ldots ,k\,\) and go to step 2.
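The MinMax k-means step used in Step 4 differs from k-means mainly in how clusters are weighted. The details are not reproduced in this text, so the sketch below assumes the weighted formulation of Tzortzis and Likas (2014): the objective \(\sum \nolimits _{k}w_k^{p}E_k\) with \(0\le p<1\), weights proportional to \(E_k^{1/(1-p)}\) and smoothed with the memory parameter \(\beta \), and assignment of each point to the cluster that minimizes \(w_k^{p}\Vert x_i-m_k\Vert ^2\). The function names and the numbers are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def update_weights(errors, old_weights, p, beta):
    """Assumed MinMax k-means weight update: clusters with a larger
    intra-cluster error receive a larger weight; beta keeps a fraction
    of the previous weights (memory). Requires 0 <= p < 1."""
    exponent = 1.0 / (1.0 - p)
    powered = [e ** exponent for e in errors]
    total = sum(powered)
    return [beta * w_old + (1.0 - beta) * (v / total)
            for w_old, v in zip(old_weights, powered)]


def weighted_assign(x, centers, weights, p):
    """Index of the cluster minimizing w_k^p * ||x - m_k||^2."""
    costs = [(w ** p) * squared_distance(x, m)
             for w, m in zip(weights, centers)]
    return costs.index(min(costs))


# Illustrative numbers only: cluster 1 has the larger intra-cluster error.
weights = update_weights([2.0, 8.0], [0.5, 0.5], p=0.5, beta=0.0)
centers = [(0.0, 0.0), (10.0, 0.0)]
point = (6.0, 0.0)

print(weights)
print(weighted_assign(point, centers, [0.1, 0.9], p=0.5))  # weighted choice
print(weighted_assign(point, centers, [0.1, 0.9], p=0.0))  # plain k-means choice
```

With \(p=0\) the weights have no effect and the assignment reduces to that of k-means, which is consistent with the \(p=0\) rows of the result tables matching the k-means and global k-means rows.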
Experiment evaluation
Comparative results on \(S_1\) data set
Method | \(E_{max}\) | \(E_{sum}\) |
---|---|---|
k-Means | 28.4856 | 96.6753 |
Global k-means | 25.3388 | 93.7457 |
MinMax k-means (\(\beta =0.3\)) | 25.3388 | 93.7457 |
MinMax k-means (\(\beta =0.1\)) | 25.3388 | 93.7457 |
MinMax k-means (\(\beta =0\)) | 25.3388 | 93.7457 |
Global Minmax k-means (\(\beta =0.3\)) | 25.3388 | 93.7457 |
Global Minmax k-means (\(\beta =0.1\)) | 25.3388 | 93.7457 |
Global Minmax k-means (\(\beta =0\)) | 25.3388 | 93.7457 |
Comparative results on \(S_2\) data set
Method | \(E_{max}\) | \(E_{sum}\) |
---|---|---|
k-Means | 52.0518 | 197.4535 |
Global k-means | 52.0518 | 197.4535 |
MinMax k-means (\(\beta =0.3\)) | 52.0518 | 197.4535 |
MinMax k-means (\(\beta =0.1\)) | 52.0518 | 197.4535 |
MinMax k-means (\(\beta =0\)) | 52.0518 | 197.4535 |
Global Minmax k-means (\(\beta =0.3\)) | 52.0518 | 197.4535 |
Global Minmax k-means (\(\beta =0.1\)) | 52.0518 | 197.4535 |
Global Minmax k-means (\(\beta =0\)) | 52.0518 | 197.4535 |
Synthetic data sets
Real-world data sets
Coil-20 (Nene et al. 1996) is a data set that contains 72 images, taken from different angles, of each of its 20 objects. As in Tzortzis and Likas (2014), we used the images of three objects (objects 15, 18 and 19), i.e. the subsets Coil15, Coil18 and Coil19. The resulting data set, referred to as Coil2 in the tables, includes 216 instances, each with 1000 features.
Iris (UCI) (Frank and Asuncion 2010) is a well-known data set created by R.A. Fisher. It contains 150 instances, 50 in each of three classes, and each instance has four predictive attributes.
Seeds (UCI) (Frank and Asuncion 2010) is composed of 210 records of grains from three different varieties of wheat. The varieties are equally represented, and each grain is described by seven features.
Yeast (UCI) (Frank and Asuncion 2010) includes 1484 instances describing the cellular localization sites of proteins, with eight attributes. The proteins belong to ten categories. Five of the classes are extremely under-represented and are not considered in our evaluation. The data set is unbalanced.
Pendigits (UCI) (Frank and Asuncion 2010) includes 10,992 instances of handwritten digits (0–9) from the UCI repository (Eslamnezhad and Varjani 2014), with 16 attributes. The data set is almost balanced.
User Knowledge Modeling (UCI) (Frank and Asuncion 2010) describes students’ knowledge status on the subject of Electrical DC Machines. It includes 403 instances in a 6-dimensional space, and the students are assessed at four levels. The data set is unbalanced.
In the experiments, the Iris, Seeds and Pendigits data sets are first normalized using the z-score method, and the algorithms are then run on the normalized data.
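The z-score normalization mentioned above can be sketched as follows. The paper does not say whether the population or the sample standard deviation is used; this sketch assumes the population form, and the function name and the toy rows are hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Normalize every attribute (column) to zero mean and unit
    population standard deviation. Constant columns map to 0.0."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [tuple((v - m) / s if s > 0 else 0.0
                  for v, m, s in zip(row, means, stds))
            for row in rows]


# Toy rows (illustrative only): two attributes on very different scales.
rows = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)]
normalized = zscore_columns(rows)
print(normalized)
```

After normalization both attributes contribute on the same scale to the squared Euclidean distances used by the clustering algorithms.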
A summary of the data sets is provided in Table 4.
Performance analysis
Comparative results on \(S_3\) data set
Method | \(E_{max}\) | \(E_{sum}\) |
---|---|---|
k-Means | 90.8431 | 329.4181 |
Global k-means | 90.8431 | 329.4133 |
MinMax k-means (\(p=0.5,\beta =0.3\)) | 87.1170 | 329.6677 |
MinMax k-means (\(p=0.5,\beta =0.1\)) | 87.1170 | 329.6677 |
MinMax k-means (\(p=0.5,\beta =0\)) | 87.1170 | 329.6352 |
MinMax k-means (\(p=0.3,\beta =0.3\)) | 88.4824 | 329.4766 |
MinMax k-means (\(p=0.3,\beta =0.1\)) | 88.4824 | 329.4766 |
MinMax k-means (\(p=0.3,\beta =0\)) | 88.5052 | 329.4761 |
MinMax k-means (\(p=0.1,\beta =0.3\)) | 89.6205 | 329.4349 |
MinMax k-means (\(p=0.1,\beta =0.1\)) | 89.5976 | 329.4351 |
MinMax k-means (\(p=0.1,\beta =0\)) | 89.6203 | 329.4346 |
MinMax k-means (\(p=0,\beta =0\)) | 90.8430 | 329.4181 |
Global Minmax k-means (\(p=0.5,\beta =0.3\)) | 87.1170 | 329.6677 |
Global Minmax k-means (\(p=0.5,\beta =0.1\)) | 87.1170 | 329.6677 |
Global Minmax k-means (\(p=0.5,\beta =0\)) | 87.1170 | 329.6352 |
Global Minmax k-means (\(p=0.3,\beta =0.3\)) | 87.1170 | 329.5055 |
Global Minmax k-means (\(p=0.3,\beta =0.1\)) | 87.1170 | 329.5055 |
Global Minmax k-means (\(p=0.3,\beta =0\)) | 87.1170 | 329.5055 |
Global Minmax k-means (\(p=0.1,\beta =0.3\)) | 88.5673 | 329.4616 |
Global Minmax k-means (\(p=0.1,\beta =0.1\)) | 88.5673 | 329.4616 |
Global Minmax k-means (\(p=0.1,\beta =0\)) | 88.5673 | 329.4616 |
Global Minmax k-means (\(p=0,\beta =0\)) | 90.8431 | 329.4133 |
Comparative results on \(S_4\) data set
Method | \(E_{max}\) | \(E_{sum}\) |
---|---|---|
k-Means | 68.0815 | 110.6536 |
Global k-means | 62.5878 | 105.5999 |
MinMax k-means (\(p=0.5,\beta =0.3\)) | 54.0427 | 109.0927 |
MinMax k-means (\(p=0.5,\beta =0.1\)) | 54.0427 | 109.0927 |
MinMax k-means (\(p=0.5,\beta =0\)) | 54.0464 | 109.1226 |
MinMax k-means (\(p=0.3,\beta =0.3\)) | 57.3660 | 106.6937 |
MinMax k-means (\(p=0.3,\beta =0.1\)) | 57.3660 | 106.6937 |
MinMax k-means (\(p=0.3,\beta =0\)) | 57.3660 | 106.6937 |
MinMax k-means (\(p=0.1,\beta =0.3\)) | 61.0903 | 105.6490 |
MinMax k-means (\(p=0.1,\beta =0.1\)) | 61.0903 | 105.6490 |
MinMax k-means (\(p=0.1,\beta =0\)) | 61.0903 | 105.6490 |
MinMax k-means (\(p=0,\beta =0\)) | 68.0815 | 110.6536 |
Global Minmax k-means (\(p=0.5,\beta =0.3\)) | 54.0427 | 109.0927 |
Global Minmax k-means (\(p=0.5,\beta =0.1\)) | 54.0464 | 109.1226 |
Global Minmax k-means (\(p=0.5,\beta =0\)) | 54.0464 | 109.1226 |
Global Minmax k-means (\(p=0.3,\beta =0.3\)) | 57.3660 | 106.6937 |
Global Minmax k-means (\(p=0.3,\beta =0.1\)) | 57.3660 | 106.6937 |
Global Minmax k-means (\(p=0.3,\beta =0\)) | 57.3660 | 106.6937 |
Global Minmax k-means (\(p=0.1,\beta =0.3\)) | 61.0903 | 105.6490 |
Global Minmax k-means (\(p=0.1,\beta =0.1\)) | 61.0903 | 105.6490 |
Global Minmax k-means (\(p=0.1,\beta =0\)) | 61.0903 | 105.6490 |
Global Minmax k-means (\(p=0,\beta =0\)) | 62.5878 | 105.5999 |
Table 4 Brief description of the real data sets
Data set | Instances | Attributes | Classes | Balanced |
---|---|---|---|---|
Coil2 | 216 | 1000 | 3 | Yes |
Iris | 150 | 4 | 3 | Yes |
Seeds | 210 | 7 | 3 | Yes |
Yeast | 1350 | 8 | 5 | No |
Pendigits | 10,992 | 16 | 10 | Almost |
User knowledge modeling | 403 | 6 | 4 | No |
Comparative results on the Coil2 data set
Method | \(E_{max}\) | \(E_{sum}\) |
---|---|---|
k-Means | 79.0141 | 155.6635 |
Global k-means | 105.2087 | 154.8112 |
MinMax k-means (\(p=0.5,\beta =0.3\)) | 58.7115 | 154.6850 |
MinMax k-means (\(p=0.5,\beta =0.1\)) | 57.1880 | 155.1839 |
MinMax k-means (\(p=0.5,\beta =0\)) | 58.7317 | 154.5164 |
MinMax k-means (\(p=0.4,\beta =0.3\)) | 58.8274 | 154.5812 |
MinMax k-means (\(p=0.4,\beta =0.1\)) | 58.8519 | 154.5189 |
MinMax k-means (\(p=0.4,\beta =0\)) | 58.8205 | 154.4097 |
MinMax k-means (\(p=0.3,\beta =0.3\)) | 58.9824 | 154.5769 |
MinMax k-means (\(p=0.3,\beta =0.1\)) | 58.9544 | 154.5170 |
MinMax k-means (\(p=0.3,\beta =0\)) | 58.9147 | 154.4083 |
MinMax k-means (\(p=0.2,\beta =0\)) | 59.1028 | 154.4047 |
MinMax k-means (\(p=0.1,\beta =0\)) | 68.6188 | 154.6814 |
Global Minmax k-means (\(p=0.5,\beta =0.3\)) | 56.9899 | 157.7988 |
Global Minmax k-means (\(p=0.5,\beta =0.1\)) | 56.9899 | 157.7988 |
Global Minmax k-means (\(p=0.5,\beta =0\)) | 57.7296 | 157.4811 |
Global Minmax k-means (\(p=0.3,\beta =0.3\)) | 60.5913 | 157.1706 |
Global Minmax k-means (\(p=0.3,\beta =0.1\)) | 60.8388 | 157.3204 |
Global Minmax k-means (\(p=0.3,\beta =0\)) | 60.8388 | 157.3204 |
Global Minmax k-means (\(p=0.05,\beta =0.3\)) | 102.5301 | 154.7850 |
Global Minmax k-means (\(p=0.05,\beta =0.1\)) | 102.5301 | 154.7850 |
Global Minmax k-means (\(p=0.05,\beta =0\)) | 102.5301 | 154.7850 |
Global Minmax k-means (\(p=0.02,\beta =0.3\)) | 103.4904 | 154.7737 |
Global Minmax k-means (\(p=0.02,\beta =0\)) | 103.4904 | 154.7737 |
Comparative results on the Iris data set
Method | \(E_{max}\) | \(E_{sum}\) |
---|---|---|
k-Means | 67.3007 | 147.2335 |
Global k-means | 57.1672 | 139.9622 |
MinMax k-means (\(\beta =0.3\)) | 47.4502 | 138.8884 |
MinMax k-means (\(\beta =0.1\)) | 47.4502 | 138.8884 |
MinMax k-means (\(\beta =0\)) | 47.4502 | 138.8884 |
Global Minmax k-means (\(\beta =0.3\)) | 47.4502 | 138.8884 |
Global Minmax k-means (\(\beta =0.1\)) | 47.4502 | 138.8884 |
Global Minmax k-means (\(\beta =0\)) | 47.4502 | 138.8884 |
Comparative results on the Seeds data set
Method | \(E_{max}\) | \(E_{sum}\) |
---|---|---|
k-Means | 151.0572 | 428.7954 |
Global k-means | 144.5954 | 428.6082 |
MinMax k-means (\(p=0.5,\beta =0.3\)) | 144.5954 | 428.6082 |
MinMax k-means (\(p=0.5,\beta =0.1\)) | 144.6353 | 428.7769 |
MinMax k-means (\(p=0.5,\beta =0\)) | 144.6353 | 428.7769 |
MinMax k-means (\(p=0.4,\beta =0.3\)) | 145.3806 | 428.6408 |
MinMax k-means (\(p=0.4,\beta =0.1\)) | 145.3806 | 428.6408 |
MinMax k-means (\(p=0.4,\beta =0\)) | 145.3806 | 428.6408 |
MinMax k-means (\(p=0.3,\beta =0.3\)) | 145.3806 | 428.6408 |
MinMax k-means (\(p=0.3,\beta =0.1\)) | 145.3806 | 428.6408 |
MinMax k-means (\(p=0.3,\beta =0\)) | 145.3806 | 428.6408 |
Global Minmax k-means (\(p=0.5,\beta =0.3\)) | 144.5954 | 428.6082 |
Global Minmax k-means (\(p=0.5,\beta =0.1\)) | 144.6880 | 429.0006 |
Global Minmax k-means (\(p=0.5,\beta =0\)) | 144.6880 | 429.0006 |
Global Minmax k-means (\(p=0.4,\beta =0.3\)) | 146.4214 | 428.6840 |
Global Minmax k-means (\(p=0.4,\beta =0.1\)) | 146.4214 | 428.6840 |
Global Minmax k-means (\(p=0.4,\beta =0\)) | 146.4214 | 428.6840 |
Global Minmax k-means (\(p=0.3,\beta =0.3\)) | 146.4214 | 428.6840 |
Global Minmax k-means (\(p=0.3,\beta =0.1\)) | 146.4214 | 428.6840 |
Global Minmax k-means (\(p=0.3,\beta =0\)) | 146.4214 | 428.6840 |
Comparative results on the Yeast data set
Method | \(E_{max}\) | \(E_{sum}\) |
---|---|---|
k-Means | 13.5325 | 51.4444 |
Global k-means | 13.4129 | 50.9959 |
MinMax k-means (\(p=0.5,\beta =0.3\)) | 14.2165 | 52.7943 |
MinMax k-means (\(p=0.5,\beta =0.1\)) | 22.6182 | 59.2278 |
MinMax k-means (\(p=0.5,\beta =0\)) | 12.6324 | 51.7455 |
MinMax k-means (\(p=0.4,\beta =0.3\)) | 11.1771 | 51.4789 |
MinMax k-means (\(p=0.4,\beta =0.1\)) | 17.5689 | 54.6692 |
MinMax k-means (\(p=0.4,\beta =0\)) | 12.6495 | 51.7366 |
MinMax k-means (\(p=0.3,\beta =0.3\)) | 11.3333 | 51.3884 |
MinMax k-means (\(p=0.3,\beta =0.1\)) | 11.6825 | 51.4354 |
MinMax k-means (\(p=0.3,\beta =0\)) | 12.5912 | 51.7159 |
MinMax k-means (\(p=0.1,\beta =0.3\)) | 12.6833 | 51.4565 |
MinMax k-means (\(p=0.1,\beta =0.1\)) | 12.6655 | 51.4575 |
MinMax k-means (\(p=0.1,\beta =0\)) | 12.6351 | 51.4379 |
Global Minmax k-means (\(p=0.5,\beta =0.3\)) | 11.1427 | 51.3872 |
Global Minmax k-means (\(p=0.5,\beta =0.1\)) | 21.2196 | 64.6526 |
Global Minmax k-means (\(p=0.5,\beta =0\)) | 17.1350 | 53.5700 |
Global Minmax k-means (\(p=0.4,\beta =0.3\)) | 11.3387 | 51.3334 |
Global Minmax k-means (\(p=0.4,\beta =0.1\)) | 10.9260 | 51.3190 |
Global Minmax k-means (\(p=0.4,\beta =0\)) | 22.5238 | 53.2086 |
Global Minmax k-means (\(p=0.3,\beta =0.3\)) | 11.8178 | 51.2643 |
Global Minmax k-means (\(p=0.3,\beta =0.1\)) | 11.8837 | 51.2450 |
Global Minmax k-means (\(p=0.3,\beta =0\)) | 22.5238 | 53.2086 |
Global Minmax k-means (\(p=0.2,\beta =0.3\)) | 12.2198 | 51.1261 |
Global Minmax k-means (\(p=0.2,\beta =0.1\)) | 12.2198 | 51.1261 |
Global Minmax k-means (\(p=0.2,\beta =0\)) | 12.1166 | 51.1379 |
Global Minmax k-means (\(p=0.1,\beta =0.3\)) | 16.0342 | 53.6899 |
Global Minmax k-means (\(p=0.1,\beta =0.1\)) | 16.0342 | 53.6899 |
Global Minmax k-means (\(p=0.1,\beta =0\)) | 16.0179 | 53.6955 |
Comparative results on the Pendigits data set
Method | \(E_{max}\) | \(E_{sum}\) |
---|---|---|
k-Means | 11,540 | 60,963 |
Global k-means | 12,549 | 59,643 |
MinMax k-means (\(p=0.5,\beta =0.3\)) | 8510 | 62,094 |
MinMax k-means (\(p=0.5,\beta =0.1\)) | 16,826 | 71,546 |
MinMax k-means (\(p=0.5,\beta =0\)) | 7744 | 61,116 |
MinMax k-means (\(p=0.4,\beta =0.3\)) | 7609 | 61,184 |
MinMax k-means (\(p=0.4,\beta =0.1\)) | 10,394 | 63,285 |
MinMax k-means (\(p=0.4,\beta =0\)) | 7740 | 61,100 |
MinMax k-means (\(p=0.3,\beta =0.3\)) | 7948 | 60,993 |
MinMax k-means (\(p=0.3,\beta =0.1\)) | 7918 | 60,993 |
MinMax k-means (\(p=0.3,\beta =0\)) | 7924 | 60,994 |
MinMax k-means (\(p=0.2,\beta =0.3\)) | 8854 | 60,825 |
MinMax k-means (\(p=0.2,\beta =0.1\)) | 8824 | 60,823 |
MinMax k-means (\(p=0.2,\beta =0\)) | 8854 | 60,825 |
MinMax k-means (\(p=0.1,\beta =0.3\)) | 9630 | 60,753 |
MinMax k-means (\(p=0.1,\beta =0.1\)) | 9611 | 60,759 |
MinMax k-means (\(p=0.1,\beta =0\)) | 9630 | 60,753 |
MinMax k-means (\(p=0.02,\beta =0.3\)) | 10,920 | 60,805 |
MinMax k-means (\(p=0.02,\beta =0.1\)) | 10,919 | 60,805 |
MinMax k-means (\(p=0.02,\beta =0\)) | 10,915 | 60,805 |
MinMax k-means (\(p=0,\beta =0\)) | 11,539 | 60,962 |
Global Minmax k-means (\(p=0.5,\beta =0.3\)) | 6685 | 60,394 |
Global Minmax k-means (\(p=0.5,\beta =0.1\)) | 19,143 | 70,402 |
Global Minmax k-means (\(p=0.5,\beta =0\)) | 6891 | 60,234 |
Global Minmax k-means (\(p=0.4,\beta =0.3\)) | 6853 | 60,305 |
Global Minmax k-means (\(p=0.4,\beta =0.1\)) | 6828 | 60,300 |
Global Minmax k-means (\(p=0.4,\beta =0\)) | 6891 | 60,234 |
Global Minmax k-means (\(p=0.3,\beta =0.3\)) | 6994 | 60,181 |
Global Minmax k-means (\(p=0.3,\beta =0.1\)) | 6994 | 60,181 |
Global Minmax k-means (\(p=0.3,\beta =0\)) | 6994 | 60,179 |
Global Minmax k-means (\(p=0.2,\beta =0.3\)) | 10,860 | 59,918 |
Global Minmax k-means (\(p=0.2,\beta =0.1\)) | 10,860 | 59,918 |
Global Minmax k-means (\(p=0.2,\beta =0\)) | 10,860 | 59,918 |
Global Minmax k-means (\(p=0.1,\beta =0\)) | 11,601 | 59,710 |
Global Minmax k-means (\(p=0.02,\beta =0\)) | 12,330 | 59,645 |
Global Minmax k-means (\(p=0,\beta =0\)) | 12,523 | 59,643 |
Comparative results on the user knowledge modeling data set
Method | \(E_{max}\) | \(E_{sum}\) |
---|---|---|
k-Means | 13.9469 | 41.6798 |
Global k-means | 16.7506 | 41.2257 |
MinMax k-means (\(p=0.5,\beta =0.3\)) | 11.1298 | 41.5906 |
MinMax k-means (\(p=0.5,\beta =0.1\)) | 12.2885 | 42.2599 |
MinMax k-means (\(p=0.5,\beta =0\)) | 11.3447 | 41.6220 |
MinMax k-means (\(p=0.4,\beta =0.3\)) | 11.4587 | 41.5912 |
MinMax k-means (\(p=0.4,\beta =0.1\)) | 11.4362 | 41.5951 |
MinMax k-means (\(p=0.4,\beta =0\)) | 11.4776 | 41.5757 |
MinMax k-means (\(p=0.3,\beta =0.3\)) | 11.8978 | 41.5361 |
MinMax k-means (\(p=0.3,\beta =0.1\)) | 11.8994 | 41.5463 |
MinMax k-means (\(p=0.3,\beta =0\)) | 11.9395 | 41.5356 |
MinMax k-means (\(p=0.2,\beta =0.3\)) | 12.5516 | 41.5503 |
MinMax k-means (\(p=0.2,\beta =0.1\)) | 12.5544 | 41.5626 |
MinMax k-means (\(p=0.2,\beta =0\)) | 12.5672 | 41.5508 |
Global Minmax k-means (\(p=0.5,\beta =0.3\)) | 10.9221 | 41.2507 |
Global Minmax k-means (\(p=0.5,\beta =0.1\)) | 10.9221 | 41.2507 |
Global Minmax k-means (\(p=0.5,\beta =0\)) | 10.9221 | 41.2507 |
Global Minmax k-means (\(p=0.4,\beta =0.3\)) | 11.0574 | 41.1979 |
Global Minmax k-means (\(p=0.4,\beta =0.1\)) | 11.0574 | 41.1979 |
Global Minmax k-means (\(p=0.4,\beta =0\)) | 11.0574 | 41.1979 |
Global Minmax k-means (\(p=0.3,\beta =0.3\)) | 11.6460 | 41.0866 |
Global Minmax k-means (\(p=0.3,\beta =0.1\)) | 11.6460 | 41.0866 |
Global Minmax k-means (\(p=0.3,\beta =0\)) | 11.6460 | 41.0866 |
Global Minmax k-means (\(p=0.2,\beta =0.3\)) | 11.8169 | 41.0594 |
Global Minmax k-means (\(p=0.2,\beta =0.1\)) | 11.8169 | 41.0594 |
Global Minmax k-means (\(p=0.2,\beta =0\)) | 11.8169 | 41.0594 |
Global Minmax k-means (\(p=0.1,\beta =0\)) | 11.8169 | 41.0594 |
Global Minmax k-means (\(p=0,\beta =0\)) | 14.9083 | 41.4720 |
In the experiments, we find that the memory parameter \(\beta \) and the exponent parameter p affect the results of both the MinMax k-means and the global Minmax k-means algorithms, and that the variation follows no apparent rule. A practical framework that extends MinMax k-means to adapt the exponent to the data set automatically was proposed in Tzortzis and Likas (2014). They suggested that, once \(p_{max}\) has been set, the program can reach the lowest \(E_{max}\) for some \(p\in [p_{init},p_{max}]\). However, our experiments show that this does not always hold. In Tables 10 and 11, the results for \(p_{max}=0.3\) are better than those for \(p_{max}=0.5\). The experiments also show that \(E_{max}\) and \(E_{sum}\) generally do not attain their lowest values at the same time.
Conclusions
We modified the global k-means algorithm to circumvent singleton clusters. We have also presented the global Minmax k-means algorithm, which constitutes a deterministic clustering method in terms of the MinMax k-means clustering error, i.e. it minimizes the maximum intra-cluster error. The method is independent of any starting conditions and compares favorably to the k-means algorithm and the MinMax k-means algorithm with multiple random restarts. We also compared our method with the global k-means algorithm. The experimental results show that it combines the advantages of the global k-means and the MinMax k-means algorithms: it is deterministic, needs no restarts, and performed well on the data sets tested.
As for future work, we plan to study adaptive methods for determining the exponent parameter p and the memory parameter \(\beta \) such that \(E_{max}\) or \(E_{sum}\) attains its lowest value. Ideally, the two parameters would be tuned jointly.
Declarations
Authors' contributions
XW and YB proposed and designed the research; XW performed the simulations, analyzed the simulation results and wrote the paper. Both authors read and approved the final manuscript.
Acknowledgements
The authors are thankful for the support of the National Natural Science Foundation of China (61275120, 61203228, 61573016).
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
- Arthur D, Vassilvitskii S (2007) k-means++: the advantages of careful seeding. In: ACM-SIAM symposium on discrete algorithm (SODA), pp 1027–1035
- Bagirov AM (2008) Modified global k-means algorithm for minimum sum-of-squares clustering problems. Pattern Recognit 41:3192–3199
- Bagirov AM, Ugon J, Webb D (2011) Fast modified global k-means algorithm for incremental cluster construction. Pattern Recognit 44:866–876
- Banerjee A, Ghosh J (2004) Frequency-sensitive competitive learning for scalable balanced clustering on high-dimensional hyperspheres. IEEE Trans Neural Netw 15(3):702–719
- Berkhin P (2006) A survey of clustering data mining techniques. In: Kogan J, Nicholas C, Teboulle M (eds) Grouping multidimensional data: recent advances in clustering. Springer, Berlin, pp 25–71
- Celebi ME, Kingravi H (2012) Deterministic initialization of the K-means algorithm using hierarchical clustering. Int J Pattern Recognit Artif Intell 26(7):1250018
- Celebi ME, Kingravi H (2014) Linear, deterministic, and order-invariant initialization methods for the K-means clustering algorithm. In: Celebi ME (ed) Partitional clustering algorithms. Springer, Berlin, pp 79–98
- Celebi ME, Kingravi HA, Vela PA (2013) A comparative study of efficient initialization methods for the k-means clustering algorithm. Expert Syst Appl 40:200–210
- Eslamnezhad M, Varjani AY (2014) Intrusion detection based on MinMax K-means clustering. In: 2014 7th International symposium on telecommunications (IST’2014), pp 804–808
- Fang C, Jin W, Ma J (2013) \(k^{{\prime }}\)-Means algorithms for clustering analysis with frequency sensitive discrepancy metrics. Pattern Recognit Lett 34:580–586
- Frank A, Asuncion A (2010) UCI machine learning repository. http://archive.ics.uci.edu/ml
- Jain AK (2010) Data clustering: 50 years beyond K-means. Pattern Recognit Lett 31:651–666
- Kalogeratos A, Likas A (2012) Dip-means: an incremental clustering method for estimating the number of clusters. In: Advances in neural information processing systems (NIPS), pp 2402–2410
- Likas A, Vlassis N, Verbeek JJ (2003) The global k-means clustering algorithm. Pattern Recognit 36:451–461
- Murty MN, Jain AK, Flynn PJ (1999) Data clustering: a review. ACM Comput Surv 31(3):264–323
- Nene SA, Nayar SK, Murase H (1996) Columbia Object Image Library (COIL-20). Technical Report CUCS 005-96
- Peña JM, Lozano JA, Larrañaga P (1999) An empirical comparison of four initialization methods for the K-means algorithm. Pattern Recognit Lett 20:1027–1040
- Tzortzis GF, Likas AC (2009) The global kernel k-means algorithm for clustering in feature space. IEEE Trans Neural Netw 20(7):1181–1194
- Tzortzis G, Likas A (2014) The MinMax k-Means clustering algorithm. Pattern Recognit 47:2505–2516
- Tzortzis G, Likas A (2008) The global kernel k-Means algorithm. In: International joint conference on neural networks (IJCNN), pp 1977–1984
- Xu R, Wunsch DC (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678
- Zang X, Vista FP IV, Chong KT (2014) Fast global kernel fuzzy c-means clustering algorithm for consonant/vowel segmentation of speech signal. J Zhejiang Univ Sci C (Comput Electron) 15(7):551–563