A Framework for Detection of Malicious Code by Exploiting Machine Learning Techniques on Portable Executables | International Journal of Innovative Science and Research Technology

A Framework for Detection of Malicious Code by Exploiting Machine Learning Techniques on Portable Executables

Authors : Yash Gajjar; Vaishnavi Sharma; Sanskruti Bhatt; Dr. Maitri Jhaveri

Volume/Issue : Volume 9 - 2024, Issue 3 - March

Google Scholar : https://tinyurl.com/bdzeeszp

Scribd : https://tinyurl.com/mr29sda8

DOI : https://doi.org/10.38124/ijisrt/IJISRT24MAR2188


Abstract : Executable files downloaded from the internet expose computer systems to many potential hazards and vulnerabilities in the form of malware. Executables can take the form of raw binaries, mnemonics, libraries, and function calls/APIs, and they can mislead many conventional malware detection techniques. This paper explores the potential of machine learning-based methods for malware detection. The scope of the work is currently limited to static analysis of executable files. Various feature selection techniques are applied to reduce the size of the training data. Machine learning algorithms such as K-Nearest Neighbors and the Random Forest Classifier were trained on the curated feature sets. The Random Forest Classifier produced the best experimental result, with an accuracy of 99.5%. We have developed a framework as a two-step module: in the first step, a list of features is extracted from a given executable file; in the second step, the trained algorithm, integrated into the framework, classifies the file as malicious or not. The framework is demonstrated as a web app developed in Python. Furthermore, the framework is evaluated on a small dataset of 35 portable executable (.exe) files, and it is observed to retain the accuracy of the trained algorithm.

Keywords : Portable Executables (PE), Malicious Code, Machine Learning (ML).
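
The two-step module described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the features used, so the header fields and byte entropy below are assumed stand-ins for static features, and the function names extract_static_features and knn_predict are hypothetical. Step one reads a few fields from the PE headers without executing the file; step two is a plain K-Nearest Neighbors majority vote, standing in for whichever trained algorithm is integrated into the framework.

```python
import math
import struct
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def extract_static_features(data: bytes) -> dict:
    """Step 1 (assumed feature set): read COFF header fields from raw PE bytes."""
    # A PE file starts with a DOS header whose first two bytes are "MZ".
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # The 4-byte little-endian value at offset 0x3C points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file: missing PE signature")
    if len(data) < pe_offset + 24:
        raise ValueError("truncated COFF file header")
    # The 20-byte COFF file header follows the signature.
    (machine, n_sections, timestamp, _sym_ptr, _n_syms,
     opt_header_size, characteristics) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    return {
        "machine": machine,
        "number_of_sections": n_sections,
        "time_date_stamp": timestamp,
        "size_of_optional_header": opt_header_size,
        "characteristics": characteristics,
        "file_size": len(data),
        "entropy": byte_entropy(data),
    }


def knn_predict(train, query, k=3):
    """Step 2 (illustrative): majority label among the k nearest training vectors.

    train is a list of (feature_vector, label) pairs; distances are Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

In use, the dictionary returned by extract_static_features would be turned into a numeric vector in a fixed field order and passed to knn_predict together with labeled vectors from the training set. K-Nearest Neighbors is sensitive to feature scale, so fields such as file_size would normally be normalized first; a Random Forest, which the abstract reports as the better performer, does not need that step.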

