A survey of clustering algorithm for very large datasets | International Journal of Innovative Science and Research Technology

A Survey of Clustering Algorithm for Very Large Datasets

Authors : Tmty. P. Aruna Devi, Dr.(Tmty.) M. Chamundeeswari

Volume/Issue : Volume 2 - 2017, Issue 11 - November

Google Scholar : https://goo.gl/DF9R4u

Thomson Reuters ResearcherID : https://goo.gl/3bkzwv

Abstract : Clustering in data mining is viewed as an unsupervised method of data analysis. Clustering allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. It helps to discover groups and identify interesting distributions in the underlying data. It is one of the most useful techniques in exploratory data analysis. It is also used in grouping, decision-making, and machine-learning situations, including data mining, document retrieval, image segmentation, classification and image processing. Traditional clustering algorithms favour clusters with spherical shapes and similar sizes, and are very fragile in the presence of outliers. Clustering plays a major role in the analysis of very large data sets, where it is useful for discovering correlations among attributes in clusters of both spherical and non-spherical shape while remaining robust to outliers. This survey focuses on clustering algorithms that are used on very large data sets and that help to find the characteristics of the data. We survey three of the leading clustering algorithms for this setting: BIRCH, BFR and CURE.

Keywords : Hierarchical, Centroid, CF Tree, Incremental Algorithm, Representation Points, Classes of Points.

