HYChou0515/twovar-v2
====================
This is the experiment code for the paper "Dual Coordinate-Descent Methods
for Linear One-Class SVM and SVDD". It generates the experimental result
figures in the paper. For the figures of O(n) operations, the results should
be reproduced exactly. For the timing experiments, the results may differ
slightly depending on the CPU frequency and the workload of your machine.
This code has been tested under a 64-bit Linux environment.

To run this code, there are several requirements; see Section System
Requirement. Information about the experiment data can be found in Section
Experiment Data. To reproduce the results of the main paper, follow the
step-by-step instructions in Section Run the Experiment. To run the
experiment with different settings, see how to configure it in Section
Experiment Configuration. Since the experiments are independent of each
other, Section Run Experiment Parallelly shows how to run them in parallel.

Table of Contents
=================

- System Requirement
- Experiment Data
- Run the Experiment
- Experiment Configuration
- Run Experiment Parallelly

System Requirement
==================

This experiment should be run under a 64-bit Linux environment. The
following commands or tools are suggested:

- UNIX utilities (cp, mkdir)
- gcc 4.4.3 or a newer version
- Python 3.6.3
- matplotlib 1.5.3
- make

Experiment Data
===============

The experiment data used in this paper are available at
http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/

Run the Experiment
==================

I. Preparation
--------------

For steps 1-3, go into the directory `lrocsvm/code'.

1. Put your data sets in the directory `../data/'.
2. Create the directories `./log/', `./resume/' and `./figure/'.
3. Run `make'.

II. Generate Logs
-----------------

For steps 4-8, go into the directory `lrocsvm/code/script'.

4. The experiment configuration is in `./expconf.py'; use the default one
   or edit it as you like. The default configuration runs the experiments
   of the main paper. Other experiments are labeled `./expconf.py.fig*';
   to run them, do `mv ./expconf.py.fig* ./expconf.py'. See Section
   Experiment Configuration for how to edit expconf.py.
5. Run `./runexp.py time'. This command generates a bash script
   `out_run.bash'.
6. Run the script by `bash out_run.bash'. With the default setting, each
   line of the script should take up to 60 seconds. To run the experiments
   in parallel, see Section Run Experiment Parallelly.
7. The logs are written to `../log'.

III. From Logs Generate Figures
-------------------------------

8. Run `./plot.py --logpath ../log --relobj 0 --suffix d0 --scaled 1 nrnop'
   to generate figures in `../figure/'. For `./expconf.py.fig8-9', run
   `./plot.py --logpath ../log --relobj 0 --suffix d0 --scaled 1 time'
   instead.

The commands of steps 1-8 are collected in the sketch below.
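For convenience, the following is a minimal sketch of steps 1-8 under the
default configuration, assuming the data sets have already been placed in
`../data/':

> cd lrocsvm/code
> mkdir -p log resume figure
> make
> cd script
> ./runexp.py time
> bash out_run.bash
> ./plot.py --logpath ../log --relobj 0 --suffix d0 --scaled 1 nrnop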
Experiment Configuration
========================

I. Terminology
--------------

The terminology used in the paper and the terminology used in this code are
different. In this section we connect the two. A term in the code is
structured like

    {ONECLASS or SVDD}_L1_{SETTING }_{1000 or SH}
    {----1st part----}----{2nd part}-{-3rd part-}

1. The first part indicates whether it is one-class SVM or SVDD.
2. The second part is the main setting. The following table shows its
   relationship to the paper.

   ------------------------------------------------------
   | SETTING                | Paper                     |
   |------------------------|---------------------------|
   | CY                     | cyclic-2cd                |
   | FIRST                  | greedy-2cd                |
   | SEMIGD_CY_FIRST        | cyclic-|B|cd-greedy       |
   | SEMIGD_FIRST_CY        | greedy-R-cyclic           |
   | SEMIGD_BATCH_FIRST_CY  | cyclic-R̄-greedy-R-cyclic  |
   ------------------------------------------------------

3. In the third part, SH means the setting uses the shrinking technique and
   1000 means it does not. In the latter case there is no stopping
   condition and the solver runs 1000 iterations by default; the maximum
   number of iterations can be set by the -m option.

Specifically,

- ONECLASS/SVDD_L1_CY_1000 is Algorithm 2,
- ONECLASS/SVDD_L1_SEMIGD_CY_FIRST_1000 is Algorithm 4,
- ONECLASS/SVDD_L1_SEMIGD_FIRST_CY_1000 is Algorithm 5,
- ONECLASS/SVDD_L1_SEMIGD_FIRST_CY_SH is Algorithm 7, and
- ONECLASS/SVDD_L1_SEMIGD_BATCH_FIRST_CY_1000 is Algorithm 8.

II. Configuration
-----------------

Configuration can be made in `lrocsvm/code/script/expconf.py'.

1. runtype in line 4: the settings to run.
2. nlist in line 17: the nu values to run. For SVDD, each nu is converted
   to the corresponding C for each data set by C = 1/(nu*l), where l is
   the number of instances in the data set.
3. the two rlists in lines 24 and 25: if r is in (0, 1], it corresponds to
   the R value in the paper, meaning r*l inner CD steps are run; if r > 1,
   it corresponds to the r value in the paper, meaning r inner CD steps
   are run. random_greedy_rlist is the rlist for Algorithm 4, and
   greedy_random_rlist is the rlist for Algorithm 5. (These conversions
   are illustrated in the sketch after Section Run Experiment Parallelly.)
4. dataset in line 31: the data sets to run.

Run Experiment Parallelly
=========================

There are two ways to run the experiments in parallel: with `GNU Parallel'
or with `grid'.

I. GNU Parallel
---------------

GNU Parallel (https://www.gnu.org/software/parallel/) is a shell tool for
executing jobs in parallel using one or more computers.

1. After `out_run.bash' is generated, run each line of the script in
   parallel by

   > parallel --jobs $NR_JOBS < out_run.bash

   where $NR_JOBS is the number of jobs you want to run in parallel.

II. grid
--------

`grid' uses shared memory, so that experiments on the same data set running
in parallel do not waste memory by storing duplicate copies of the data
set. It requires OpenMP.

1. Comment out lines 3 and 9, and uncomment lines 4 and 10, of `Makefile'.
   Then run `make'.
2. Modify line 43 (PROCESS_MAX) of `./script/expconf.py' to the number of
   cores you want to use in parallel.
3. Instead of `./runexp.py time', run `./runexp.py grid'. This generates
   `out_run.bash' and several `grid_*.job' files.
4. Run `bash out_run.bash'.
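The following is a minimal, illustrative Python sketch of the two
conversions described in Section Experiment Configuration: the nu-to-C
mapping used for SVDD and the interpretation of an rlist entry. It is not
part of the experiment code, and the helper names are hypothetical.

    # Illustrative only: these helpers are not part of the experiment code.

    def nu_to_C(nu, l):
        # For SVDD, nu is converted to C by C = 1/(nu*l),
        # where l is the number of instances in the data set.
        return 1.0 / (nu * l)

    def inner_cd_steps(r, l):
        # r in (0, 1] is the R value in the paper: run r*l inner CD steps.
        # r > 1 is the r value in the paper: run r inner CD steps.
        if 0 < r <= 1:
            return int(r * l)
        return int(r)

    # Example with a data set of l = 10000 instances:
    print(nu_to_C(0.1, 10000))          # C = 0.001
    print(inner_cd_steps(0.01, 10000))  # 100 inner CD steps
    print(inner_cd_steps(16, 10000))    # 16 inner CD steps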
About
=====

An experimental project for adding linear one-class SVM in LIBLINEAR.