|
For Full-Text PDF, please login, if you are a member of IEICE,
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
|
An Accurate Packer Identification Method Using Support Vector Machine
Ryoichi ISAWA Tao BAN Shanqing GUO Daisuke INOUE Koji NAKAO
Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences
Vol.E97-A
No.1
pp.253-263 Publication Date: 2014/01/01 Online ISSN: 1745-1337
DOI: 10.1587/transfun.E97.A.253 Print ISSN: 0916-8508 Type of Manuscript: Special Section PAPER (Special Section on Cryptography and Information Security) Category: Foundations Keyword: malware analysis, pack, unpack, machine learning, SVM,
Full Text: PDF(1.4MB)>>
Summary:
PEiD is a packer identification tool widely used for malware analysis but its accuracy is becoming lower and lower recently. There exist two major reasons for that. The first is that PEiD does not provide a way to create signatures, though it adopts a signature-based approach. We need to create signatures manually, and it is difficult to catch up with packers created or upgraded rapidly. The second is that PEiD utilizes exact matching. If a signature contains any error, PEiD cannot identify the packer that corresponds to the signature. In this paper, we propose a new automated packer identification method to overcome the limitations of PEiD and report the results of our numerical study. Our method applies string-kernel-based support vector machine (SVM): it can measure the similarity between packed programs without our operations such as manually creating signature and it provides some error tolerant mechanism that can significantly reduce detection failure caused by minor signature violations. In addition, we use the byte sequence starting from the entry point of a packed program as a packer's feature given to SVM. That is, our method combines the advantages from signature-based approach and machine learning (ML) based approach. The numerical results on 3902 samples with 26 packer classes and 3 unpacked (not-packed) classes shows that our method achieves a high accuracy of 99.46% outperforming PEiD and an existing ML-based method that Sun et al. have proposed.
|
open access publishing via
|
|
|
|
|
|
|
|