Collating SQL Databases, No-SQL Databases and Machine Learning Algorithms for Data Analysis


Authors : Dylan Coelho; Cliff Machado; Leon Correia; Shree Jaswal; Neil Fernando

Volume/Issue : Volume 7 - 2022, Issue 4 - April

Google Scholar : https://bit.ly/3IIfn9N

Scribd : https://bit.ly/3Nsllyl

DOI : https://doi.org/10.5281/zenodo.6562532

Big Data tools and machine learning algorithms are frequently applied to data analytics and prediction. This paper evaluates and illustrates the differences between SQL and NoSQL databases for the storage and processing of Big Data, and compares various algorithms used for analysis and prediction. The paper presents our basic understanding of Hadoop and Spark and compares the two platforms on parameters such as the time taken to input data, the time taken to output data, and the total memory used by the databases. The system implements the databases in Hadoop and Spark. In Hadoop, the Hive database is used for the SQL part and Cassandra for NoSQL. In Spark, the SQL part is implemented using PostgreSQL and the NoSQL part uses MongoDB. The end results, obtained by comparing parameters such as input time, output time, and total memory used, are represented graphically so that a user is in a position to choose the appropriate database according to their requirements. Additionally, we study and compare various machine learning algorithms by implementing them on the selected dataset. To compare the algorithms, we consider the parameters of Accuracy, Root Mean Square Error and Mean Absolute Value. Choosing the right machine learning algorithm can be difficult, but doing so is essential to answering the given question quickly and accurately. For the user to obtain the required insights, algorithms must be carefully analysed and studied with parameters like these in mind. The final research results are illustrated with graphs on a UI, which helps to better understand the results obtained on the dataset selected for this paper.

Keywords : Hadoop, NoSQL, Spark, SQL.
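
The comparison parameters named in the abstract can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the authors' implementation: it assumes that "Mean Absolute Value" refers to the mean absolute error, that Accuracy is the fraction of exactly matching labels, and the function names and sample values are hypothetical, chosen only for illustration.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predicted labels that exactly match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared difference."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error (assumed meaning of 'Mean Absolute Value')."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    return absolute / len(y_true)


if __name__ == "__main__":
    # Hypothetical values for illustration only; not results from the paper.
    labels_true = [1, 0, 1, 1]
    labels_pred = [1, 0, 0, 1]
    values_true = [3.0, 5.0, 2.0, 7.0]
    values_pred = [2.0, 5.0, 4.0, 8.0]

    print("Accuracy:", accuracy(labels_true, labels_pred))
    print("RMSE:", rmse(values_true, values_pred))
    print("MAE:", mae(values_true, values_pred))
```

Accuracy applies to classification outputs, while RMSE and the mean absolute error apply to numeric predictions. RMSE penalises large errors more heavily than the mean absolute error, so reporting both gives a fuller picture when comparing algorithms on the same dataset.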
