Cluster is a group of objects that are similar amongst themselves but dissimilar to the objects in other clusters. Identifying meaningful clusters and thereby a structure in a large un-labelled dataset is an important unsupervised data mining task. Technological progress leads to enlarging volumes of data that require clustering. Clustering large datasets is a challenging resource-intensive task and the key to scalability and performance beneﬁts is to use parallel or concurrent clustering algorithms. Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. Machine learning techniques are widely used in the medical field for the diagnosis of various diseases.
Keywords : Cluster; Data Mining; Machine Learning; Supervised; Unsupervised; Hard C Means; Fuzzy C Means; Naïve Bayes; Classification; Dataset; Cluster Quality; Time Consumed; True Positive Rate; False Positive Rate; Precision; Recall; Accuracy.