Update README.
Brandon Amos committed Sep 26, 2014
1 parent 70d8423 commit 3124358
Showing 3 changed files with 60 additions and 44 deletions.
104 changes: 60 additions & 44 deletions README.md
@@ -1,15 +1,31 @@
# Deprecated
__This project has been deprecated and remains online
for historical archiving__

# Android Antimalware

## About
This is a set of shell scripts and a sample feature vector collection
application to help automate the training and testing of dynamic machine
learning malware classifiers. Arch Linux is the main testing platform.
An important concern on the growing Android platform is malware detection.
Malware detection techniques on the Android platform are similar to
techniques used on any platform. Detection is fundamentally broken into static
analysis, by analyzing a compiled file; dynamic analysis, by analyzing the
runtime behavior, such as battery, memory, and network utilization of the
device; or hybrid analysis, by combining static and dynamic techniques.
Static analysis is advantageous on memory-limited Android devices because the
malware is not executed, only analyzed. However, dynamic analysis provides
additional protection, particularly against polymorphic malware that changes
form during execution.
**This project provides a framework to profile applications to obtain
feature vectors for dynamic analysis.**

This work was presented at the
[International Wireless Communications and Mobile Computing Conference (IWCMC) 2013][iwcmc-2013], and the paper is available [here][doi].
The set of feature vectors and classifiers are available for further
analysis in the `Results/IWCMC-2013` directory.

As projects mature, their design decisions are tested. The decision
to use shell scripts as a framework does not deliver
a reliable control mechanism for error-prone emulators on a distributed system.
**Therefore, this project has been deprecated and remains online
for historical archiving.**
We are actively designing a new framework in Scala.

## Usage
# Usage
1. Populate `TestSuite/Training` and/or `TestSuite/Testing` with APK files
with the naming format `<M/B><Number>-<Name>.apk`, where
+ `<M/B>` represents the classification of the application (malicious
@@ -24,44 +40,44 @@ and collecting feature vectors.
4. Feature vectors will be saved to `arff/` and the machine learning
classifiers will be accordingly trained and tested with `arff/weka.sh`
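
The naming format from step 1 can be checked before a run. The following is a minimal sketch in Python, not part of the project's shell scripts: `parse_apk_name` and `ApkSample` are hypothetical names, and reading `B` as "benign" is an assumption, since the README text shown here only spells out "malicious".

```python
import re
from typing import NamedTuple, Optional

# Assumed reading of the format <M/B><Number>-<Name>.apk:
# "M" marks a malicious sample; "B" is assumed to mean benign.
APK_NAME = re.compile(r"^(?P<label>[MB])(?P<number>\d+)-(?P<name>.+)\.apk$")


class ApkSample(NamedTuple):
    """Fields recovered from one APK file name (hypothetical helper type)."""
    label: str
    number: int
    name: str


def parse_apk_name(filename: str) -> Optional[ApkSample]:
    """Return the parsed fields, or None if the name does not fit the format."""
    match = APK_NAME.match(filename)
    if match is None:
        return None
    label = "malicious" if match["label"] == "M" else "benign"
    return ApkSample(label, int(match["number"]), match["name"])
```

A file such as `M12-FakePlayer.apk` would parse as a malicious sample numbered 12, while a file that does not follow the format yields `None` and can be reported before the emulators start.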

## Feature Vector Collection Application
The feature vector collection application is called
Antimalware and is an Eclipse project. See below for a short section
on increasing Eclipse's memory if you are trying to load it in Eclipse.
The collected data is stored on an sdcard on the device.
# Experiment: Malware Classifier Performance
STREAM resides on the Android Tactical Application Assessment & Knowledge
(ATAACK) Cloud, a hardware platform designed to provide a testbed
for cloud-based analysis of mobile applications. The ATAACK cloud currently
uses a 34-node cluster in which each machine is a Dell PowerEdge
M610 blade running CentOS 6.3. Each node has 2 Intel Xeon 564 processors
with 12 cores each, along with 36GB of DDR3 ECC memory.

We used STREAM to send 10,000 input events to each application in the data set
and collect a feature vector every 5 seconds. We collected the following
set of features.

![](https://raw.githubusercontent.com/VT-Magnum-Research/antimalware/master/images/feature-vectors.png)

Feature vectors collected from the
training set of applications were used to create classifiers, and feature
vectors from the testing set were then used to evaluate those
classifiers. Classification rates from the testing set are based on the 47
testing applications used. Future work includes increasing the testing set size
to increase confidence in these results.

## Directory Structure
    .
    ├── Antimalware - The data collection application
    │   ├── libs - The modified Weka library
    │   └── src
    ├── arff - Collected feature vectors and classifiers
    ├── Results
    └── TestSuite
        ├── AVDs
        ├── Device-Images
        ├── logs
        ├── Testing - Applications
        └── Training - Applications
The following table shows descriptions of the metrics used to evaluate
classifiers.

## Increasing Eclipse's Default Memory
Importing the Antimalware Android project into Eclipse is simple. However,
Eclipse's memory needs to be increased to load the Weka library used.
First, find eclipse.ini in your system.
![](https://raw.githubusercontent.com/VT-Magnum-Research/antimalware/master/images/definitions.png)

    $ sudo find / -name 'eclipse.ini'
    /usr/share/eclipse/eclipse.ini
The overall results of training and testing six machine learning algorithms with
STREAM are shown in the following table.

Then edit it and increase the memory settings:
![](https://raw.githubusercontent.com/VT-Magnum-Research/antimalware/master/images/classifier-results.png)

    $ vim /usr/share/eclipse/eclipse.ini
There is a clear difference in the percentage of correct
classifications between the cross validation set (made up of applications
used in training) and the testing set (made up of applications never used in
training). Feature vectors from the training set are classified quite well,
typically over 85% correct, whereas new feature vectors from the testing set
are often classified only 70% correctly. Classifier performance cannot be judged
solely on cross validation, which is prone to inflated accuracy results.
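
The distinction between the two sets can be made concrete with a small sketch. It is an illustration in Python under stated assumptions, not the project's `arff/weka.sh` pipeline: `split_by_application` and `percent_correct` are hypothetical helpers, and each feature vector is assumed to be tagged with the name of the application it came from.

```python
from typing import List, Sequence, Set, Tuple

# One feature vector tagged with its source application (assumed layout).
TaggedVector = Tuple[str, List[float]]


def split_by_application(
    vectors: Sequence[TaggedVector], test_apps: Set[str]
) -> Tuple[List[TaggedVector], List[TaggedVector]]:
    """Split so that no application contributes vectors to both sets."""
    train: List[TaggedVector] = []
    test: List[TaggedVector] = []
    for app, features in vectors:
        (test if app in test_apps else train).append((app, features))
    return train, test


def percent_correct(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Percentage of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty set")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)
```

Scoring a classifier with `percent_correct` on the held-out applications, rather than on folds drawn from the training applications, corresponds to the testing-set figures discussed above.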

    [...]
    --launcher.XXMaxPermSize
    2048m
    [...]
    --launcher.defaultAction
    [...]
    -Xms1024m
    -Xmx2028m
    [...]
[iwcmc-2013]: http://iwcmc.org/2013/
[doi]: http://dx.doi.org/10.1109/IWCMC.2013.6583806
Binary file added images/classifier-results.png
Binary file added images/definitions.png