Exploratory data analysis (EDA), which
combines descriptive and inferential analysis, plays a
crucial role in uncovering the information hidden in
data. Topics in the text corpus are
identified using data mining methods. The datasets
from Yelp, which contain information about businesses,
users, ratings, and signups, have been analyzed in this
study. In addition to the timing of check-ins at business locations,
our study examines business performance, regional
distribution, reviewer ratings, and other factors. We
found that Yelp check-ins, tips, and elite users have
all declined over time. Additionally, our analysis showed
that Canadians have more reliable star ratings and
sentiment ratings than Americans. To build on this
work, we propose a follow-up project that comprises gathering
a dataset, cleaning the data by removing null values,
training a machine learning model with AdaBoost,
and reporting the accuracy score obtained with a multilayer perceptron (MLP).
The proposed technique for EDA and data mining on
Yelp restaurant reviews has several potential limitations.
Because the data were selected according to the
needs of the research, they may not be representative of all
restaurants on Yelp, which could skew the findings.
Pre-processing steps such as data cleaning and
sampling may remove important information or introduce noise
into the dataset. Hold-out and cross-validation procedures
may not adequately assess the model's performance and
generalizability.
Keywords:
Exploratory Data Analysis (EDA), Descriptive Analysis, Inferential Analysis, Data Mining, Yelp, Datasets, User Information, Ratings, Performance, Regional Distribution, Star Ratings, Sentiment Ratings, Machine Learning Algorithm, AdaBoost, MLP, Accuracy Score, Data Cleaning
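
The cleaning, boosting, and evaluation steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the data are synthetic, the fields (`stars`, `n_words`, `label`) are hypothetical stand-ins for Yelp review attributes, and one-feature decision stumps are assumed as the weak learners because the abstract does not name them. The MLP step is not shown.

```python
import math
import random

def drop_null_rows(rows):
    """Cleaning step: keep only rows in which no field is None."""
    return [r for r in rows if all(v is not None for v in r)]

def stump_predict(stump, x):
    """Apply a one-feature threshold rule; returns +1 or -1."""
    _, j, t, sign = stump
    return sign if x[j] >= t else -sign

def fit_stump(X, y, w):
    """Search all features/thresholds for the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for sign in (1, -1):
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if (sign if xi[j] >= t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def fit_adaboost(X, y, rounds=10):
    """Discrete AdaBoost: reweight samples so later stumps focus on mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in model)
    return 1 if score >= 0 else -1

def accuracy(model, X, y):
    return sum(predict(model, x) == yi for x, yi in zip(X, y)) / len(y)

def k_fold_accuracy(X, y, k=5, rounds=10):
    """Mean accuracy over k folds; fold i holds out every k-th row from i."""
    scores = []
    for i in range(k):
        held = set(range(i, len(X), k))
        X_tr = [x for j, x in enumerate(X) if j not in held]
        y_tr = [v for j, v in enumerate(y) if j not in held]
        X_te = [x for j, x in enumerate(X) if j in held]
        y_te = [v for j, v in enumerate(y) if j in held]
        scores.append(accuracy(fit_adaboost(X_tr, y_tr, rounds), X_te, y_te))
    return sum(scores) / k

# Synthetic stand-in data (assumption): label is +1 for a "positive" review.
rng = random.Random(0)
rows = []
for i in range(200):
    stars = rng.randint(1, 5)
    n_words = rng.randint(5, 300)
    label = 1 if stars + rng.gauss(0, 0.8) >= 3.5 else -1
    # Every 20th row gets a missing star rating to exercise the cleaning step.
    rows.append((None if i % 20 == 0 else stars, n_words, label))

cleaned = drop_null_rows(rows)
rng.shuffle(cleaned)
X = [(s, n) for s, n, _ in cleaned]
y = [lab for _, _, lab in cleaned]

split = int(0.8 * len(X))  # 80/20 hold-out split
model = fit_adaboost(X[:split], y[:split])
holdout_acc = accuracy(model, X[split:], y[split:])
cv_acc = k_fold_accuracy(X, y)
print(f"rows kept: {len(cleaned)}, hold-out: {holdout_acc:.2f}, 5-fold: {cv_acc:.2f}")
```

Reporting the hold-out and cross-validated scores side by side gives a rough sense of how much a single split can mislead, which is the concern the abstract raises about evaluation. On real Yelp data, a library implementation of AdaBoost and an MLP would normally replace this hand-written version.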