Measuring Source Code Similarity by Finding Similar Subgraph with an Incremental Genetic Algorithm

Cited 5 times in Web of Science; cited 7 times in Scopus
Authors

Kim, Jinhyun; Choi, HyukGeun; Yun, Hansang; Moon, Byung-Ro

Issue Date
2016-07
Publisher
ACM/IEEE
Citation
GECCO '16: Proceedings of the Genetic and Evolutionary Computation Conference 2016, pp. 925-932
Keywords
Interdisciplinary studies; code similarity; subgraph isomorphism problem; incremental genetic algorithm; program dependence graph
Abstract
Measuring the similarity between pieces of source code has many applications, such as code plagiarism detection, code clone detection, and malware detection. A variety of measurement methods have been developed, and methods based on program dependence graphs are known to work well against disguise techniques. However, these methods usually rely on solving NP-hard problems, which causes a scalability issue. In this paper, we propose a genetic algorithm that measures the similarity between two pieces of source code by solving an error-correcting subgraph isomorphism problem on dependence graphs. We propose a new cost function for this problem that reflects the characteristics of source code. An incremental genetic algorithm is used to solve the problem: the size of the graph to be searched gradually increases during the evolutionary process. We developed new operators for the algorithm and tested the overall system on real-world data. Experimental results showed that the system works successfully for code plagiarism detection and malware detection, and that the similarity it computes properly reflects the similarity between the pieces of code.
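The idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the graph encoding (a list of node labels plus a set of directed edges), the unit-cost function `mapping_cost`, the `crossover` and `mutate` operators, the growth schedule in `incremental_ga`, and the normalization in `similarity` are all hypothetical stand-ins. The paper's own cost function and operators are not described in this record.

```python
import random

def mapping_cost(pattern, target, mapping):
    """Toy edit cost of mapping pattern nodes 0..k-1 onto distinct target nodes."""
    p_labels, p_edges = pattern
    t_labels, t_edges = target
    k = len(mapping)
    # One unit for every mapped node whose label differs from its image.
    cost = sum(p_labels[i] != t_labels[mapping[i]] for i in range(k))
    # One unit for every pattern edge among the mapped nodes that the target lacks.
    cost += sum(1 for (u, v) in p_edges
                if u < k and v < k and (mapping[u], mapping[v]) not in t_edges)
    return cost

def crossover(a, b, n_target, rng):
    """One-point crossover that repairs duplicates so the child stays injective."""
    cut = rng.randrange(len(a) + 1)
    child = list(a[:cut])
    used = set(child)
    for gene in b[cut:]:
        if gene in used:
            gene = rng.choice([t for t in range(n_target) if t not in used])
        child.append(gene)
        used.add(gene)
    return child

def mutate(mapping, n_target, rng):
    """Either remap one node to an unused target node or swap two images."""
    m = list(mapping)
    i = rng.randrange(len(m))
    unused = [t for t in range(n_target) if t not in m]
    if unused and rng.random() < 0.5:
        m[i] = rng.choice(unused)
    else:
        j = rng.randrange(len(m))
        m[i], m[j] = m[j], m[i]
    return m

def incremental_ga(pattern, target, pop_size=30, gens_per_stage=40, seed=0):
    """Grow the searched part of the pattern graph one node per stage."""
    rng = random.Random(seed)
    n_p, n_t = len(pattern[0]), len(target[0])
    assert n_p <= n_t, "pattern must not be larger than target"
    cost = lambda m: mapping_cost(pattern, target, m)
    pop = [[rng.randrange(n_t)] for _ in range(pop_size)]
    for k in range(1, n_p + 1):
        if k > 1:
            # Extend every individual by one more pattern node.
            pop = [m + [rng.choice([t for t in range(n_t) if t not in m])]
                   for m in pop]
        for _ in range(gens_per_stage):
            pop.sort(key=cost)
            parents = pop[:pop_size // 2]  # elitist truncation selection
            children = [mutate(crossover(rng.choice(parents),
                                         rng.choice(parents), n_t, rng), n_t, rng)
                        for _ in range(pop_size - len(parents))]
            pop = parents + children
    best = min(pop, key=cost)
    return best, cost(best)

def similarity(pattern, target, **kwargs):
    """Map the best cost found into [0, 1]; 1.0 means an exact embedding."""
    _, best_cost = incremental_ga(pattern, target, **kwargs)
    worst = len(pattern[0]) + len(pattern[1])
    return 1.0 - best_cost / worst

if __name__ == "__main__":
    pattern = (["a", "b", "c"], {(0, 1), (1, 2)})
    target = (["x", "a", "b", "c"], {(0, 1), (1, 2), (2, 3)})
    print(incremental_ga(pattern, target), similarity(pattern, target))
```

Growing the mapping one node at a time keeps the early search space small, which is the intuition behind the gradually increasing graph size; a realistic system would build the graphs from program dependence analysis rather than from hand-written labels.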
Language
English
URI
https://hdl.handle.net/10371/116915
DOI
https://doi.org/10.1145/2908812.2908870
Files in This Item:
There are no files associated with this item.