ICANN RECAP – NATION-STATE APT ATTRIBUTION USING END-TO-END DEEP NEURAL NETWORKS
We recently took part in the International Conference on Artificial Neural Networks (ICANN), one of the longest-running neural network conferences in the world, to present our deep learning classification results, which will appear in the paper “DeepAPT: Nation-State APT Attribution Using End-to-End Deep Neural Networks”, by Ishai Rosenberg, Guillaume Sicard, and Eli David.
At Deep Instinct, we wanted to show what deep learning can do beyond what we already use it for: detecting malware, including malware never seen before. Specifically, a deep learning classifier can tell us which nation-state (allegedly) developed a specific piece of malware.
Nation-state APTs pose unique challenges that make them harder to attribute:
1. Each nation-state (allegedly) has different cyber units, implementing APTs independently. Moreover, each such unit employs different developers, each with a unique development style.
2. Most nation-state APTs are involved in more than a single campaign, using different C&C servers, different static signatures, and so on, to evade detection and attribution.
3. Nation-state APTs tend to implement many evasion techniques to avoid being classified as malicious: different packing and encryption methods, different malicious payloads, and unique zero-day exploits discovered by the nation-state itself.
4. Some nation-state APTs are even “nastier”: they embed “false features” to “frame” other nation-states for their APT if caught. For instance, they may embed fonts in different languages.
5. For machine-learning-based classifiers, the fact that very few nation-state APT samples are available makes it more difficult to train a model that generalizes well to unseen samples.
Our Methodology:
As input, we used dynamic analysis reports generated by Cuckoo Sandbox. We took the 20,000 most common words in the Cuckoo JSON reports and used them as Boolean features (whether each word appears in a report or not). The reports were not parsed first, and no further pre-processing or feature engineering was applied.
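The feature extraction described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the paper: the whitespace tokenization, the choice to rank words by total occurrence count, and the function names `tokenize`, `build_vocabulary` and `to_boolean_vector` are all hypothetical.

```python
from collections import Counter


def tokenize(report_text):
    # Assumption: a "word" is any whitespace-separated token of the raw
    # report text. The JSON is deliberately not parsed.
    return report_text.split()


def build_vocabulary(reports, top_n=20000):
    """Return the top_n most common words across all raw reports.

    Assumption: "most common" means highest total occurrence count.
    """
    counts = Counter()
    for text in reports:
        counts.update(tokenize(text))
    return [word for word, _ in counts.most_common(top_n)]


def to_boolean_vector(report_text, vocabulary):
    """Mark each vocabulary word as present (1) or absent (0) in a report."""
    present = set(tokenize(report_text))
    return [1 if word in present else 0 for word in vocabulary]
```

In such a sketch the vocabulary would be built from the training reports only, and every report (training or test) would then be mapped to a fixed-length 0/1 vector over that vocabulary.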
Due to the small quantity of available samples, we used only two classes: Russia and China. Our training set included 1,600 files from each class (training set size of 3,200 samples). The test set contained an additional 500 files from each class (test set size of 1,000 files).
In order to verify that our classifier generalizes correctly without “memorizing dynamic signatures”, the test set was composed of different campaigns, from different malware families that did not appear in the training set: Net-Traveler, Winnti/PlugX, Cosmic Duke and Sofacy/APT28.
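A family-disjoint split of this kind could look like the sketch below. The `(sample_id, family)` tuple layout, the function name `split_by_family`, and the training family names in the usage are hypothetical; only the four test family names come from the text above.

```python
def split_by_family(samples, test_families):
    """Split samples so that the reserved families appear only in the test set.

    samples: list of (sample_id, family) tuples (hypothetical layout).
    test_families: set of family names held out for testing.
    """
    train, test = [], []
    for sample in samples:
        _, family = sample
        if family in test_families:
            test.append(sample)
        else:
            train.append(sample)
    return train, test


# Usage with made-up sample IDs and placeholder training families.
held_out = {"Net-Traveler", "Winnti/PlugX", "Cosmic Duke", "Sofacy/APT28"}
samples = [
    ("s1", "family_a"),
    ("s2", "Sofacy/APT28"),
    ("s3", "family_b"),
    ("s4", "Net-Traveler"),
]
train, test = split_by_family(samples, held_out)
```

Splitting on the family rather than on individual files is what keeps near-duplicate samples from the same campaign from landing on both sides of the split.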
We trained a deep neural network comprising ten layers and combined several regularization methods. Our classifier achieved 94.6% accuracy, a very good result considering all the above-mentioned challenges and the fact that, in contrast to our production models, we spent no time fine-tuning it to improve its performance. Also note that, unlike this research, our production model for malware detection relies on static analysis only. We use these dynamic-analysis-based classification modules only after the malware has been detected and prevented by the main brain, which runs on every endpoint and performs static detection in a few milliseconds.
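To make the shape of such a classifier concrete, here is a toy forward pass for a fully connected network written in plain Python. Everything specific in it is an assumption for illustration: the layer sizes, the ReLU activations, the softmax output, the counting of "ten layers" as input plus eight hidden plus output, and the use of dropout as the regularizer (the text above does not name the regularization methods). Training is omitted.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    # One row of n_in weights per output unit, with small random values.
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(sizes, seed=0):
    # sizes lists the width of every layer, input and output included.
    rng = random.Random(seed)
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def dense(x, layer):
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def dropout(x, rate, rng):
    # Inverted dropout: zero each unit with probability `rate`, rescale the rest.
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def softmax(x):
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, layers, rate=0.5, training=False, rng=None):
    # Hidden layers use ReLU (plus dropout while training); the last layer
    # produces one probability per class.
    for layer in layers[:-1]:
        x = relu(dense(x, layer))
        if training:
            x = dropout(x, rate, rng)
    return softmax(dense(x, layers[-1]))


# Toy sizes: 6 Boolean inputs, 8 hidden layers of 5 units, 2 output classes.
network = build_network([6, 5, 5, 5, 5, 5, 5, 5, 5, 2])
probabilities = forward([1, 0, 1, 1, 0, 0], network)
```

With real data the input width would match the number of Boolean features, and the two outputs would correspond to the two classes used in the experiment.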
For more information, you can download the paper here.