Authors :
Yash Gajjar; Vaishnavi Sharma; Sanskruti Bhatt; Dr. Maitri Jhaveri
Volume/Issue :
Volume 9 - 2024, Issue 3 - March
Google Scholar :
https://tinyurl.com/bdzeeszp
Scribd :
https://tinyurl.com/mr29sda8
DOI :
https://doi.org/10.38124/ijisrt/IJISRT24MAR2188
Abstract :
Executable files downloaded from the internet expose computer systems to many potential hazards and vulnerabilities in the form of malware. Executables can take the form of raw binaries, mnemonics, libraries, and function calls/APIs, and they can mislead many conventional malware detection techniques. This paper explores the potential of Machine Learning-based methods for malware detection. The scope of the work is currently limited to static analysis of executable files. Various feature selection techniques are applied to reduce the size of the training data. Machine learning algorithms such as K-Nearest Neighbors and the Random Forest Classifier were trained on the curated feature sets. The best experimental result was achieved by the Random Forest Classifier, with an accuracy of 99.5%. We have developed a framework as a two-step module: in the first step, a list of features is extracted from a given executable file; in the second step, the trained algorithm is integrated into the framework to classify whether the given executable file is malicious or not. The framework is demonstrated in the form of a web app developed in Python. Furthermore, the framework is evaluated on a small dataset of 35 portable executable (.exe) files, and it is observed to retain the accuracy of the trained algorithm.
Keywords :
Portable Executables (PE), Malicious Code, Machine Learning (ML).
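The two-step module described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, the `extract_features` and `classify` helpers, and the `predict` callable are all assumptions made for illustration, and the abstract does not state which features the paper actually uses. Step one reads a few static fields from the PE file header using only the Python standard library; step two passes the resulting feature vector to any trained classifier supplied as a function.

```python
import math
import struct
from collections import Counter

# Hypothetical feature order; the paper's real feature set is not given here.
FEATURE_NAMES = ["machine", "number_of_sections", "characteristics", "entropy"]


def extract_features(data: bytes) -> dict:
    """Step 1: derive a few static features from the raw bytes of a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # The DOS header stores the offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if pe_offset + 24 > len(data) or data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file: missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature.
    machine, n_sections, _, _, _, _, characteristics = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    # Shannon entropy of the whole file, in bits per byte (0.0 to 8.0).
    counts = Counter(data)
    entropy = -sum(c / len(data) * math.log2(c / len(data)) for c in counts.values())
    return {
        "machine": machine,
        "number_of_sections": n_sections,
        "characteristics": characteristics,
        "entropy": entropy,
    }


def classify(data: bytes, predict) -> str:
    """Step 2: feed the feature vector to a trained model.

    `predict` is a stand-in for the trained classifier: any callable that
    takes a list of feature values and returns 1 (malicious) or 0 (benign).
    """
    features = extract_features(data)
    vector = [features[name] for name in FEATURE_NAMES]
    return "malicious" if predict(vector) == 1 else "benign"
```

In a setup like the one the abstract describes, `predict` would wrap the trained Random Forest Classifier, and `classify` would be called by the web app on each uploaded `.exe` file. Keeping the model behind a plain callable is a design choice of this sketch; it lets the extraction step be tested without a trained model.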