Stats and Results

APKiD Stats

Dataset                      dx    dexmerge  dexlib 1.X/2.X  Total
Drebin (minus Malgenome)     52%   --        48%             4326
Malgenome                    84%   --        16%             1234
GPlay                        61%   34%       5%              1882
Piggybacking (original)      61%   22%       17%             1355
Piggybacking (piggybacked)   22%   6%        71%             1399

Features

Static Features

Category     Feature
basic        Min. SDK version
             Max. SDK version
             Total # of activities
             Total # of services
             Total # of broadcast receivers
             Total # of content providers
permission   Total requested permissions
             Android permissions / Total permissions
             Custom permissions / Total permissions
             Dangerous permissions / Total permissions
API          Counts of calls to sensitive API packages/modules. See full list
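
The permission ratios listed above can be illustrated with a short Python sketch. It is not the extraction code used in the experiments. It assumes that an "Android permission" is any permission whose name starts with android.permission., that every other permission counts as custom, and that duplicates are counted once. The DANGEROUS set is a small illustrative subset and the function name permission_features is hypothetical.

```python
from typing import Dict, Iterable

# Illustrative subset only (assumption); the platform's real list of
# dangerous permissions is longer.
DANGEROUS = {
    "android.permission.READ_SMS",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.ACCESS_FINE_LOCATION",
}


def permission_features(requested: Iterable[str]) -> Dict[str, float]:
    """Compute the four permission features for one app."""
    perms = set(requested)  # assumption: each permission is counted once
    total = len(perms)
    if total == 0:
        # Avoid division by zero for apps that request no permissions.
        return {"total": 0, "android_ratio": 0.0,
                "custom_ratio": 0.0, "dangerous_ratio": 0.0}
    android = sum(1 for p in perms if p.startswith("android.permission."))
    dangerous = sum(1 for p in perms if p in DANGEROUS)
    return {
        "total": total,
        "android_ratio": android / total,
        "custom_ratio": (total - android) / total,
        "dangerous_ratio": dangerous / total,
    }


example = permission_features([
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "com.example.app.permission.PUSH",  # hypothetical custom permission
])
print(example)
```

For this hypothetical app, three of the four requested permissions are Android permissions, one is custom, and two are in the illustrative dangerous set.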

Dynamic Features

Dynamic features were extracted from the API call traces generated by droidmon. Each feature is the number of calls to one API category; the categories, and the methods that the tool hooks for each of them, are listed in a file called hooks.json. Click here to download the list of categories and hooked methods in JSON format.
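
As a rough illustration of how per-category counts could be derived from a trace, consider the Python sketch below. The JSON layout (a mapping from category name to a list of hooked methods), the category names, and the trace format (a flat list of called method names) are all assumptions made for the example; the actual hooks.json and droidmon logs may be organized differently.

```python
import json

# Hypothetical hooks.json content: category -> list of hooked methods.
HOOKS_JSON = """
{
  "network": ["java.net.URL.openConnection"],
  "sms": ["android.telephony.SmsManager.sendTextMessage"],
  "crypto": ["javax.crypto.Cipher.doFinal"]
}
"""


def category_counts(trace, hooks):
    """Count how many calls in the trace fall into each hooked category."""
    # Invert the mapping so each method points at its category.
    method_to_category = {
        method: category
        for category, methods in hooks.items()
        for method in methods
    }
    # Start every category at zero so all feature vectors share one shape.
    counts = {category: 0 for category in hooks}
    for method in trace:
        category = method_to_category.get(method)
        if category is not None:  # ignore calls that are not hooked
            counts[category] += 1
    return counts


hooks = json.loads(HOOKS_JSON)
trace = [  # hypothetical API call trace of one app
    "java.net.URL.openConnection",
    "android.telephony.SmsManager.sendTextMessage",
    "java.net.URL.openConnection",
    "java.lang.String.length",
]
features = category_counts(trace, hooks)
print(features)
```

Keeping categories with zero calls in the result gives every app a feature vector of the same length, which most classifiers require.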


Datasets

Malgenome

Total number of apps: 1234

Click here to download the hashes.

GPlay

Total number of apps: 1882

Click here to download the names of the apps.

Piggybacking

Total number of unique original apps: 1355

Total number of unique piggybacked apps: 1399

Click here to download the hashes of both sets of apps, organized as (original_app, piggybacked_app) pairs.

Please refer to Piggybacking's website for more information about the dataset.

Results

Here you can find a summary of the scores achieved by every classifier in every run.

Static Experiments

For the static experiments, we include the scores achieved by every classifier on every category of static features in each of the 25 runs. Each set of scores is stored as a Python dictionary that can be loaded with the command eval(open("file_name.data").read()). Because eval executes whatever the file contains, use it only on files you trust. The available dictionaries are:

  • Malgenome+GPlay Static Results (25 runs + All features)    Download
  • Malgenome+GPlay Static Results (25 runs + Basic features)    Download
  • Malgenome+GPlay Static Results (25 runs + Permission features)    Download
  • Malgenome+GPlay Static Results (25 runs + API features)    Download
  • Piggybacking Static Results (25 runs + All features)    Download
  • Piggybacking Static Results (25 runs + Basic features)    Download
  • Piggybacking Static Results (25 runs + Permission features)    Download
  • Piggybacking Static Results (25 runs + API features)    Download
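
A more defensive way to load one of these dictionaries is ast.literal_eval, which parses Python literals without executing code. The sketch below assumes the .data files contain only plain literals (dictionaries, lists, strings, numbers); if they contain anything else, literal_eval raises an error and the eval command above is the fallback. The file name and the dictionary layout used in the demo are hypothetical.

```python
import ast
import os
import tempfile


def load_results(path):
    """Parse a results file as a Python literal without executing it."""
    with open(path, "r") as handle:
        return ast.literal_eval(handle.read())


# Demo on a throwaway file; keys and values are made up for illustration
# and do not reflect the real structure of the result dictionaries.
sample = {"KNN10": {"accuracy": [0.91, 0.93]}}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample_results.data")
    with open(path, "w") as handle:
        handle.write(repr(sample))
    loaded = load_results(path)

print(loaded)
```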

Dynamic Experiments

We stored all the results of dynamic experiments in a SQLite database that has the following schema:


CREATE TABLE learner( 
    learnerID		INTEGER PRIMARY KEY AUTOINCREMENT, 
    learnerName 	TEXT
);

CREATE TABLE run( 
    runID       	INTEGER, 
    runDataset  	TEXT,
    runStart  		TEXT,
    runEnd		TEXT,
    runIterations	INTEGER,
    PRIMARY KEY (runID, runDataset)
);

CREATE TABLE app( 
    appID       	INTEGER PRIMARY KEY AUTOINCREMENT, 
    appName    		TEXT, 
    appType 		TEXT,
    appRunID  		INTEGER,
    appRuns		INTEGER,
    FOREIGN KEY (appRunID) REFERENCES parent(runID)
);

CREATE TABLE datapoint ( 
    dpID        	INTEGER PRIMARY KEY AUTOINCREMENT, 
    dpLearner		INTEGER,
    dpIteration		INTEGER,
    dpRun		INTEGER,
    dpTimestamp 	TEXT,
    dpFeature           TEXT,
    dpType          	TEXT,
    dpAccuracy		REAL,
    dpRecall		REAL,
    dpSpecificity	REAL,
    dpPrecision		REAL,
    dpFscore		REAL,
    FOREIGN KEY (dpLearner) REFERENCES parent(learnerID),
    FOREIGN KEY (dpRun) REFERENCES parent(runID)
);
INSERT INTO learner (learnerName) VALUES ('KNN10');
INSERT INTO learner (learnerName) VALUES ('KNN25');
INSERT INTO learner (learnerName) VALUES ('KNN50');
INSERT INTO learner (learnerName) VALUES ('KNN100');
INSERT INTO learner (learnerName) VALUES ('KNN250');
INSERT INTO learner (learnerName) VALUES ('KNN500');
INSERT INTO learner (learnerName) VALUES ('Trees10');
INSERT INTO learner (learnerName) VALUES ('Trees25');
INSERT INTO learner (learnerName) VALUES ('Trees50');
INSERT INTO learner (learnerName) VALUES ('Trees75');
INSERT INTO learner (learnerName) VALUES ('Trees100');
INSERT INTO learner (learnerName) VALUES ('SVM');
INSERT INTO learner (learnerName) VALUES ('Ensemble');
                            

You can download the database here.
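
To show how the schema can be queried, the following Python sketch averages dpAccuracy and dpFscore per learner. It builds a small in-memory database with the learner and datapoint tables and a few made-up rows; with the real database, you would instead pass the path of the downloaded file to sqlite3.connect. The foreign keys in the schema name a table called parent, which the schema does not define, so the sketch does not rely on foreign-key enforcement and joins datapoint.dpLearner to learner.learnerID explicitly.

```python
import sqlite3

# In-memory stand-in for the downloaded database (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE learner(
    learnerID   INTEGER PRIMARY KEY AUTOINCREMENT,
    learnerName TEXT
);
CREATE TABLE datapoint(
    dpID          INTEGER PRIMARY KEY AUTOINCREMENT,
    dpLearner     INTEGER,
    dpIteration   INTEGER,
    dpRun         INTEGER,
    dpTimestamp   TEXT,
    dpFeature     TEXT,
    dpType        TEXT,
    dpAccuracy    REAL,
    dpRecall      REAL,
    dpSpecificity REAL,
    dpPrecision   REAL,
    dpFscore      REAL
);
INSERT INTO learner (learnerName) VALUES ('KNN10');
INSERT INTO learner (learnerName) VALUES ('SVM');
""")

# Made-up scores: (dpLearner, dpIteration, dpRun, dpAccuracy, dpFscore).
rows = [
    (1, 1, 1, 0.5, 0.5),
    (1, 2, 1, 1.0, 0.5),
    (2, 1, 1, 0.75, 0.25),
]
conn.executemany(
    "INSERT INTO datapoint "
    "(dpLearner, dpIteration, dpRun, dpAccuracy, dpFscore) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

# Average accuracy and F-score per learner, joined on the learner ID.
QUERY = """
SELECT l.learnerName, AVG(d.dpAccuracy), AVG(d.dpFscore)
FROM datapoint AS d
JOIN learner AS l ON l.learnerID = d.dpLearner
GROUP BY l.learnerName
ORDER BY l.learnerName
"""
summary = conn.execute(QUERY).fetchall()
conn.close()

for name, accuracy, fscore in summary:
    print(name, accuracy, fscore)
```

The same query can be narrowed with a WHERE clause on dpRun, dpFeature, or dpType to compare learners within a single run or feature set.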