A Transformer on Tabular Data Comparative Analysis with Linear and Tree Base Machine Learning Algorithm on Diabetic Dataset


Authors : Kamin Gorettie Precody; Komiwe Faith Phiri; Dr. Ashish Kumar Chakraverti

Volume/Issue : Volume 8 - 2023, Issue 5 - May

Google Scholar : https://bit.ly/3TmGbDi

Scribd : https://tinyurl.com/y9t89w6t

DOI : https://doi.org/10.5281/zenodo.7994977

Abstract : Lifestyle diseases rank among the top causes of death, with a rating of 80%. They claim over 41 million lives, more than 70% of all deaths worldwide, and roughly 15 million of those deaths occur among people aged 30 to 69. Lifestyle diseases originate primarily in an individual's day-to-day habits. Habits that reduce physical activity and push people toward a sedentary routine can cause numerous health problems, which may lead to harmful, nearly life-threatening diseases. Two common complex lifestyle diseases are heart disease and diabetes. Researchers have found diabetes to be a silent but deadly disease, and many use machine learning methods to help medical professionals diagnose lifestyle diseases. This paper reviews the literature on predicting and diagnosing lifestyle diseases with transformers and machine learning techniques, and applies these techniques to diabetes patient data. It highlights the importance of transformers and machine learning in analyzing large patient datasets to predict all types of diabetes, and discusses how they can be treated and prevented. We use a transformer for tabular data (TabPFN), Random Forest, Decision Tree, Support Vector Machine, K-Nearest Neighbors, Gradient Boosting, Histogram Gradient Boosting, and Adaptive Boosting to predict how likely a person is to have diabetes. A stratified holdout validation method was used to randomly split the dataset into 90% training and 10% test sets. The results were compared with existing approaches and indicate that the transformer on tabular data (TabPFN) outperforms the existing state-of-the-art approach. TabPFN was the best-performing of the models evaluated by F1-score; the reported scores are 98.46%, 98.0694%, 91.736%, and 91.541%, respectively.

Keywords : Transformer, Lifestyle Diseases, Machine Learning Techniques, Prediction.
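
The evaluation protocol described in the abstract (a stratified 90%/10% holdout split, with models ranked by F1-score) can be illustrated with a minimal sketch. This is not the authors' code: the function names `stratified_holdout_split` and `f1_score`, the seed, and the toy labels are assumptions made for illustration, and the sketch uses only the Python standard library. In practice a study like this would more likely rely on an established library for the split, the metric, and the classifiers themselves.

```python
import random
from collections import defaultdict


def stratified_holdout_split(labels, test_fraction=0.10, seed=0):
    """Return (train_idx, test_idx) so each class keeps roughly the same
    proportion in both sets. Hypothetical helper, not from the paper."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Hold out the same fraction of every class.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (assumed): 70 non-diabetic (0) and 30 diabetic (1) patients.
labels = [0] * 70 + [1] * 30
train_idx, test_idx = stratified_holdout_split(labels, test_fraction=0.10)
print(len(train_idx), len(test_idx))
```

Each candidate model would be fitted on the rows in `train_idx` and scored with `f1_score` on the rows in `test_idx`. Stratifying matters for medical data because the positive class is often the minority, and an unstratified 10% test set could under-represent it.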
