Using Opcode Sequences in Single-Class Learning to Detect Unknown Malware

Abstract

Malware is any type of malicious code that has the potential to harm a computer or network. The volume of malware is growing at a faster rate every year and poses a serious global security threat. Although signature-based detection is the most widespread method used in commercial antivirus programs, it consistently fails to detect new malware. Supervised machine-learning models have been used to address this issue. However, supervised learning is limited in practice because it requires a large amount of malicious code and benign software to be labelled first. In this study, the authors propose a new method that uses single-class learning to detect unknown malware families. The method uses the frequencies with which opcode sequences appear in executables to build a machine-learning classifier from a single set of labelled instances belonging to one class, either malware or legitimate software. The authors performed an empirical study showing that this method can reduce the effort of labelling software while maintaining high accuracy.
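
The general idea of single-class learning over opcode-sequence frequencies can be illustrated with a minimal sketch. It is not the authors' classifier: the abstract does not name the learning algorithm, so a simple centroid-plus-threshold model stands in for it here. The function and class names (opcode_ngram_frequencies, CentroidOneClass), the sequence length n=2, the cosine similarity measure, and the toy opcode lists are all assumptions made for illustration.

```python
from collections import Counter
import math


def opcode_ngram_frequencies(opcodes, n=2):
    """Relative frequency of each length-n opcode sequence in one program."""
    grams = [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    return {gram: count / len(grams) for gram, count in counts.items()}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse frequency vectors (dicts)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class CentroidOneClass:
    """Hypothetical one-class model trained on a single labelled class.

    Assumption: a sample belongs to the class if it is at least as similar
    to the class centroid as the least similar training sample.
    """

    def fit(self, samples, n=2):
        self.n = n
        vectors = [opcode_ngram_frequencies(s, n) for s in samples]
        centroid = Counter()
        for vector in vectors:
            for gram, freq in vector.items():
                centroid[gram] += freq / len(vectors)
        self.centroid = dict(centroid)
        self.threshold = min(
            cosine_similarity(v, self.centroid) for v in vectors
        )
        return self

    def is_in_class(self, opcodes):
        vector = opcode_ngram_frequencies(opcodes, self.n)
        return cosine_similarity(vector, self.centroid) >= self.threshold


# Toy data: only one class (here, legitimate software) is labelled.
benign_samples = [
    ["push", "mov", "call", "pop", "ret"],
    ["push", "mov", "call", "add", "pop", "ret"],
]
model = CentroidOneClass().fit(benign_samples)
unknown_sample = ["xor", "xor", "jmp", "xor", "jmp"]
print(model.is_in_class(benign_samples[0]))
print(model.is_in_class(unknown_sample))
```

Only one labelled class is needed to fit the model, which is the property the abstract highlights. Anything falling outside the learned region is treated as belonging to the other class. A real system would extract opcodes by disassembling executables and would likely use a stronger one-class learner than this centroid rule.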

Publication
IET Information Security
