Authors :
Seid Mehammed, Sebahadin Nasir
Volume/Issue :
Volume 8 - 2023, Issue 1 - January
Google Scholar :
https://bit.ly/3IIfn9N
Scribd :
https://bit.ly/3wFCq1d
DOI :
https://doi.org/10.5281/zenodo.7587990
Abstract :
Source code holds a software system's core data and
business logic. Building a semantically linked,
well-structured system for managing code data is
therefore a major challenge in software engineering.
This paper investigates a domain ontology-based
method for automatically creating a knowledge graph
from C# source code. The semantic web, open-source
development, knowledge management, expert systems, and
online communities are among the areas in which
software engineers can now understand and analyze code
semantically. Candidate terms for concepts or entities
were extracted by layering a conditional random field
(CRF) on top of a trained Bi-LSTM network. The models
were trained automatically on a labeled data corpus
while also being manually defined. Bi-LSTM and CRF
were integrated to improve the classification of terms
in a given piece of source code. In addition to the
basic CRF features, further features to be extracted
from the source code were defined, which helped the
model learn the categorization constraints. The
Bi-LSTM model was then used to extract taxonomic and
non-taxonomic relations. Max pooling was used to
integrate the links between concepts at the word and
code levels.
Keywords :
Knowledge graph, ontology, knowledge base.