Title: | k-Means Algorithm with a Majorization-Minimization Method |
---|---|
Description: | A hybrid of the k-means algorithm and a Majorization-Minimization method that provides robust clustering. The reference paper is: Julien Mairal (2015) <doi:10.1137/140957639>. The two most important functions in package 'MajMinKmeans' are clusters_km() and clusters_MajKm(). clusters_km() clusters data without Majorization-Minimization, and clusters_MajKm() clusters data with the Majorization-Minimization method. Both functions calculate the sum of squares (SS) of the clustering. Another useful function is MajMinOptim(), which helps to find the optimum values of the Majorization-Minimization estimator. |
Authors: | Sheikhi Ayyub [aut, cre], Yaghoobi Mohammad Ali [aut] |
Maintainer: | Sheikhi Ayyub <[email protected]> |
License: | GPL-3 |
Version: | 0.1.0 |
Built: | 2025-02-12 04:40:14 UTC |
Source: | https://github.com/cran/MajMinKmeans |
Clusters data into two clusters. This function uses the kmeans function to cluster the data and returns the clustering results as well as the sum of squares (SS) of the clustering, computed with the Euclidean distance.
clusters_km(x, k = 2)
x |
matrix of data (dim 1: samples, dim 2: attributes) |
k |
number of clusters (this version considers 2 clusters) |
Sum of squares (SS) of the clustering.
{
  X <- rbind(matrix(rnorm(1000*2, 4, 0.1), 1000, 2), matrix(rnorm(1000*2, 3, 0.2), 1000, 2))
  M <- X[sample(nrow(X), 2), ]
  clusters_km(X, 2)
}
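The sum of squares reported for a clustering can be illustrated with base R alone. The sketch below assumes that SS means the total within-cluster sum of squared Euclidean distances to the cluster centroids; the package may define it differently. The helper name within_ss is hypothetical, and stats::kmeans is called with its namespace because this package also defines a function named kmeans.

```r
set.seed(1)
# Two Gaussian clusters, generated as in the package example
X <- rbind(matrix(rnorm(1000 * 2, 4, 0.1), 1000, 2),
           matrix(rnorm(1000 * 2, 3, 0.2), 1000, 2))

# Base R k-means (not the package's kmeans), with a few random restarts
fit <- stats::kmeans(X, centers = 2, nstart = 5)

# Hypothetical helper: squared Euclidean distance of every row to the
# centroid of the cluster it was assigned to, summed over all rows
within_ss <- function(x, labels, centers) {
  sum((x - centers[labels, , drop = FALSE])^2)
}

ss <- within_ss(X, fit$cluster, fit$centers)
```

Under that assumption, ss should agree with fit$tot.withinss up to floating-point error.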
Clusters data into two clusters with a majorization k-means. This function uses a hybrid of k-means and the majorization-minimization method to cluster the data and returns the clustering results as well as the sum of squares (SS) of the clustering.
clusters_MajKm(X, k = 2, La)
X |
matrix of data (dim 1: samples, dim 2: attributes) |
k |
number of clusters (this version considers 2 clusters) |
La |
the tuning parameter |
Sum of squares (SS) of the clustering and the 'delta' (the difference between two successive values of the majorization function).
{
  X <- rbind(matrix(rnorm(1000*2, 4, 0.1), 1000, 2), matrix(rnorm(1000*2, 3, 0.2), 1000, 2))
  M <- X[sample(nrow(X), 2), ]
  clusters_MajKm(X, 2, 0.5)
}
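To give a feel for what a majorization-minimization loop with a 'delta' stopping rule can look like, here is a minimal base R sketch. It is not the estimator used by clusters_MajKm(): it swaps in a classic textbook MM scheme, the Weiszfeld iteration for the geometric median, where each step minimizes a weighted quadratic that majorizes the sum of distances. The function name geo_median_mm and its arguments are hypothetical, and delta here is the decrease of the objective between two successive iterates, which is only an assumption about the kind of quantity the package reports.

```r
# Hypothetical illustration of an MM loop (Weiszfeld iterations for the
# geometric median); NOT the package's algorithm.
geo_median_mm <- function(x, eps = 1e-4, max_iter = 100) {
  m <- colMeans(x)                                  # start at the ordinary mean
  obj <- function(m) sum(sqrt(rowSums(sweep(x, 2, m)^2)))
  prev <- obj(m)
  delta <- Inf
  for (it in seq_len(max_iter)) {
    d <- sqrt(rowSums(sweep(x, 2, m)^2))            # distances to current center
    w <- 1 / pmax(d, 1e-12)                         # weights of the quadratic majorizer
    m <- colSums(x * w) / sum(w)                    # minimizer of the majorizer
    cur <- obj(m)
    delta <- prev - cur                             # decrease between successive iterates
    prev <- cur
    if (delta < eps) break                          # stop once the decrease is negligible
  }
  list(center = m, objective = cur, delta = delta, iterations = it)
}

set.seed(1)
x <- rbind(matrix(rnorm(200 * 2, 4, 0.1), 200, 2),
           matrix(50, 5, 2))                        # five gross outliers at (50, 50)
res <- geo_median_mm(x)
```

In this toy data the ordinary mean is pulled toward the outliers, while res$center stays close to the bulk of the points near (4, 4), which is the sense in which such a center is called robust.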
Calculates the Euclidean distance between points. This function can be used in the kmeans function to perform the clustering procedure with the Euclidean distance.
Euclid(x, mu)
x |
matrix of data (dim 1: samples, dim 2: attributes) |
mu |
initial selected centroids (chosen randomly or by another method) |
Euclidean distance between two points.
{
  X <- rbind(matrix(rnorm(1000*2, 4, 0.1), 1000, 2), matrix(rnorm(1000*2, 3, 0.2), 1000, 2))
  M <- X[sample(nrow(X), 2), ]
  Euclid(X, M)
}
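A distance computation of this kind can be sketched in base R as follows. The stand-in euclid_sketch is hypothetical and is not the package source; it assumes the result is an n by k matrix with one row per sample and one column per centroid, which is how the output of Euclid() is consumed in the MajMinOptim() example (apply(..., 1, which.min)).

```r
# Hypothetical stand-in for Euclid(x, mu): n x k matrix of distances
euclid_sketch <- function(x, mu) {
  d <- matrix(0, nrow = nrow(x), ncol = nrow(mu))
  for (j in seq_len(nrow(mu))) {
    # Euclidean distance of every row of x to centroid j
    d[, j] <- sqrt(rowSums(sweep(x, 2, mu[j, ])^2))
  }
  d
}

x  <- rbind(c(0, 0), c(3, 4), c(6, 8))
mu <- rbind(c(0, 0), c(6, 8))
d  <- euclid_sketch(x, mu)

# Index of the nearest centroid for each sample (ties go to the first)
nearest <- apply(d, 1, which.min)
```

For these three points the distance matrix has rows (0, 10), (5, 5) and (10, 0), so the middle point is equidistant from both centroids and which.min assigns it to the first.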
k-means algorithm for clustering. This function returns the clustering results based on one replication of the k-means method.
kmeans(x, centers, nItter = 4)
x |
matrix of data (dim 1: samples, dim 2: attributes) |
centers |
initial selected centroids (chosen randomly or by another method) |
nItter |
number of iterations |
Clustering results based on the k-means method.
{
  X <- rbind(matrix(rnorm(1000*2, 4, 0.1), 1000, 2), matrix(rnorm(1000*2, 3, 0.2), 1000, 2))
  M <- X[sample(nrow(X), 2), ]
  kmeans(X, M, 4)
}
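One replication of k-means with a fixed number of iterations can be sketched as the standard Lloyd procedure: assign each sample to its nearest centroid, then move each centroid to the mean of its samples. The function lloyd_sketch below is a hypothetical base R illustration under that assumption, not the package's kmeans() implementation, and its return structure is chosen for the example only.

```r
# Hypothetical fixed-iteration k-means (Lloyd's algorithm); not the package source
lloyd_sketch <- function(x, centers, n_iter = 4) {
  for (it in seq_len(n_iter)) {
    # Assignment step: squared distance from every row to every centroid
    d2 <- sapply(seq_len(nrow(centers)),
                 function(j) rowSums(sweep(x, 2, centers[j, ])^2))
    labels <- apply(d2, 1, which.min)
    # Update step: each centroid becomes the mean of its assigned rows;
    # a centroid with no assigned rows is left where it is
    for (j in seq_len(nrow(centers))) {
      if (any(labels == j)) {
        centers[j, ] <- colMeans(x[labels == j, , drop = FALSE])
      }
    }
  }
  # labels come from the last assignment step
  list(cluster = labels, centers = centers)
}

x <- rbind(c(0, 0), c(0, 1), c(10, 10), c(10, 11))
init <- rbind(c(0, 0), c(10, 10))
fit <- lloyd_sketch(x, init, n_iter = 4)
```

With these four points the first two rows end up in cluster 1 and the last two in cluster 2, with centroids (0, 0.5) and (10, 10.5). Because the result depends on the initial centroids, a single replication started from a poor random draw may need more iterations or a restart.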
Finds the optimized majorization-minimization centers.
MajMinOptim(X, Z, M, eps, lambda)
X |
matrix of data (dim 1: samples, dim 2: attributes) |
Z |
an n by k matrix where, for all i and j, z[i, j] is a binary variable equal to 1 if case i is assigned to cluster j and 0 otherwise (dim 1: samples (must be equal to dim 1 of X), dim 2: clusters) |
M |
initial selected centroids (chosen randomly or by another method) |
eps |
a threshold value, assumed to be 0.0001 |
lambda |
a threshold value, assumed to be 0.5 |
The optimized majorization-minimization centers.
{
  X <- rbind(matrix(rnorm(1000*2, 4, 0.1), 1000, 2), matrix(rnorm(1000*2, 3, 0.2), 1000, 2))
  M <- X[sample(nrow(X), 2), ]
  distsToCenters <- Euclid(X, M)
  clusters <- apply(distsToCenters, 1, which.min)
  Z <- matrix(0, nrow = NROW(X), ncol = 1)
  for (i in 1:NROW(X)) if (clusters[[i]] == 1) Z[i, ] <- clusters[[i]]
  Z <- cbind(Z, 1 - Z)
  MajMinOptim(X, Z, M, eps = 1e-4, lambda = 0.5)
}
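The binary membership matrix Z used in the example can also be built without a loop. The sketch below is a base R illustration; the helper name one_hot is hypothetical and not part of the package. It also shows how ordinary cluster means follow from Z, which is plain k-means arithmetic and not the majorization-minimization estimator itself.

```r
# Hypothetical helper: n x k one-hot membership matrix from integer labels
one_hot <- function(labels, k) {
  Z <- matrix(0, nrow = length(labels), ncol = k)
  Z[cbind(seq_along(labels), labels)] <- 1   # set z[i, labels[i]] to 1
  Z
}

X <- rbind(c(0, 0), c(2, 2), c(4, 4), c(2, 0))
labels <- c(1, 2, 2, 1)
Z <- one_hot(labels, k = 2)

# Ordinary cluster means: t(Z) %*% X gives per-cluster column sums,
# and dividing by colSums(Z) divides row j by the size of cluster j
centers <- crossprod(Z, X) / colSums(Z)
```

Each row of Z sums to 1, and with these toy points the cluster means are (1, 0) and (3, 3). Indexing with a two-column matrix generalizes to any k, whereas the cbind(Z, 1 - Z) construction in the example is specific to two clusters.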