Authors :
Dr. A. Srinivasa Rao; Challa Manikanta; Akuri Azad; Kandagatla Rushikesh; Polavarapu Rohitha
Volume/Issue :
RISEM–2025
Google Scholar :
https://tinyurl.com/mv478wh5
Scribd :
https://tinyurl.com/4m3t6nnj
DOI :
https://doi.org/10.38124/ijisrt/25jun173
Abstract :
Accurate and timely prediction is essential for improving patient outcomes because lung cancer remains one of
the deadliest diseases in the world, largely owing to late-stage detection and a lack of effective early detection
techniques. Conventional diagnostic methods frequently involve invasive procedures, and although medical imaging methods
such as CT scans and X-rays offer useful information, they must be interpreted by professionals and can occasionally
result in incorrect diagnoses or delayed treatment. Furthermore, lung cancer risk is influenced by several kinds of
variables, including demographic factors (such as gender and age), daily behaviors (such as smoking) and signs of disease
(such as persistent cough and other lung symptoms). To overcome these obstacles and improve predictive accuracy, this
study uses machine learning to build a robust lung cancer prediction model based on a large dataset that includes
clinical, lifestyle and demographic characteristics. To ensure data quality and reliability prior to model training, the
dataset is subjected to comprehensive exploratory data analysis (EDA), preprocessing and feature scaling. Several machine
learning models, including Deep Neural Networks (DNN), Decision Trees and Random Forests, are implemented and assessed
using key performance metrics: accuracy, mean squared error (MSE), mean absolute error (MAE) and mean absolute percentage
error (MAPE). According to preliminary results, the Decision Tree and Random Forest models perform noticeably better,
with accuracies of 90.32% and 93.54% respectively, whereas the DNN model performs suboptimally, with an accuracy of
12.62%. The Random Forest model, which has the best accuracy, is further optimized with a Bagging Classifier to improve
performance and stability.
Keywords :
CNN, Data Augmentation, Disease Detection, Plant Health, Real-Time Detection and Streamlit.
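
The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy labels and the 0/1 label encoding are assumptions made for illustration, and the choice to skip zero-valued true labels in MAPE is one possible convention, since the abstract does not say how that case was handled.

```python
from typing import Sequence


def accuracy(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of predictions that exactly match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumption: samples whose true value is 0 are skipped, because the
    percentage error is undefined there. Other conventions exist.
    """
    terms = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred) if t != 0]
    if not terms:
        raise ValueError("MAPE is undefined when every true value is 0")
    return 100.0 * sum(terms) / len(terms)


# Hypothetical toy labels: 1 = lung cancer, 0 = no lung cancer.
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

print("accuracy:", accuracy(y_true, y_pred))
print("MSE:", mse(y_true, y_pred))
print("MAE:", mae(y_true, y_pred))
print("MAPE (%):", mape(y_true, y_pred))
```

With labels encoded as 0 and 1, every error is either 0 or 1, so MSE and MAE coincide and both equal one minus the accuracy. MAPE depends on how zero-valued labels are treated, so it is worth stating the convention whenever it is reported for a classifier.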