Collective Classification for Packed Executable Identification

Abstract

Malware writers employ packing techniques (i.e., encrypting the real payload) to hide the actual code of their creations. Generic unpacking techniques execute the binary within an isolated environment (a ‘sandbox’) to recover the real code of the packed executable. However, this approach can be very time-consuming. A common remedy is to apply a filtering step first, so that non-packed binaries are not executed. To this end, supervised machine learning models trained on static features extracted from the executables have been proposed. However, these methods require a large number of executables to be identified and labelled as packed or non-packed. In this paper, we propose a new method for packed executable detection that adopts collective learning approaches (a kind of semi-supervised learning) to reduce the labelling requirements of fully supervised approaches. We performed an empirical validation demonstrating that the system maintains a high accuracy rate even when fewer labelled instances are available in the dataset.

Publication
International Journal of Computer Systems Science & Engineering

Related