Add apidocs and wiki files.

viveksinha · Mar 22, 2017 · 80a7df4
1 parent f49ee52
commit 80a7df4
Showing 25 changed files with 4,012 additions and 1,983 deletions.
diff --git a/doc/apidocs.zip b/doc/apidocs.zip
diff --git a/doc/wiki_en/AlgorithmList.md b/doc/wiki_en/AlgorithmList.md
diff --git a/doc/wiki_en/CLIWalkthrough.md b/doc/wiki_en/CLIWalkthrough.md
@@ -0,0 +1,72 @@
+# Command-line walkthrough
+This section shows how to use the command line to load and save data and models, and how to generate recommendations from arguments passed on the command line.
+
+## Usage
+
+```
+Usage: librec <command> [options]...
+commands:
+  rec                       run recommender
+  data                      load data
+
+global options:
+  --help                    display this help text
+  --exec                    run Recommender
+  --version                 show Librec version info
+
+job options:
+  -conf <file>              path to config file
+  -D, -jobconf <prop>       set configuration items (key=value)
+  -libjars                  add entend jar files to classpath
+```
+
+## Training
+
+```
+./librec rec -exec -D rec.recommender.class=globalaverage -conf ../core/src/main/resources/rec/baseline/globalaverage-test.properties -libjars ../lib/log4j-1.2.17.jar
+```
+rec | data: the command to run; rec runs a recommender and data loads data.
+
+-exec: execute the recommendation algorithm; reserved in version 2.0.
+
+
+-D | -jobconf [key=value]: set configuration items. For the available settings, see [ConfigurationList](./ConfigurationList) and [AlgorithmList](./AlgorithmList).
+
+-conf [path/to/properties]: load a configuration file.
+
+-libjars: add jar libraries from other paths to the classpath. Jar files in the lib directory are loaded automatically. The command above shows an example.
+
+## Configuration files
+LibRec stores its configuration in core/librec.properties and in per-algorithm configuration files. librec.properties holds the settings for data input, splitting, and evaluation.
+An example configuration follows.
+
+```
+# set data directory
+dfs.data.dir=../data
+
+# set result directory
+# recommender results are written to this folder
+dfs.result.dir=../result
+
+# set log directory, as an alternative to printing logs to the console
+# (not implemented in this version)
+dfs.log.dir=../log
+
+data.input.path=filmtrust
+data.column.format=UIR
+data.model.splitter=ratio
+data.model.format=text
+data.splitter.ratio=rating
+
+# splitter settings; see basics.md
+data.splitter.trainset.ratio=0.8
+data.splitter.cv.number=5
+rec.parallel.support=true
+
+# evaluation settings; see basics.md
+rec.eval.enable=true
+rec.random.seed=1
+```
+
+Because each algorithm has its own settings, an algorithm's configuration is stored in that algorithm's directory. Users can pass arguments on the command line or edit the configuration file before running the algorithm.
+For the detailed configuration of each algorithm, see [AlgorithmList](./AlgorithmList).
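
One way to picture how command-line settings and a configuration file combine is sketched below in Java. It is only an illustration under stated assumptions: `ConfigOverrideSketch` and `merge` are hypothetical names, `java.util.Properties` stands in for LibRec's own configuration class, and the rule that `-D key=value` items take precedence over values from the file is assumed here rather than taken from the LibRec source.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of LibRec.
class ConfigOverrideSketch {

    /** Parses properties-file text, then applies key=value overrides in order. */
    static Properties merge(String fileText, List<String> overrides) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String override : overrides) {
            int eq = override.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected key=value, got: " + override);
            }
            // Assumption: a command-line item replaces the value read from the file.
            props.setProperty(override.substring(0, eq).trim(),
                              override.substring(eq + 1).trim());
        }
        return props;
    }

    public static void main(String[] args) {
        String fileText = "data.input.path=filmtrust\nrec.random.seed=1\n";
        Properties merged = merge(fileText,
                Arrays.asList("rec.recommender.class=globalaverage", "rec.random.seed=42"));
        System.out.println(merged.getProperty("rec.random.seed"));
        System.out.println(merged.getProperty("data.input.path"));
    }
}
```

Keys that appear only in the file keep their file values, so a single properties file can serve as a baseline while individual runs vary one or two settings.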
diff --git a/doc/wiki_en/ConfigurationList.md b/doc/wiki_en/ConfigurationList.md
@@ -0,0 +1,89 @@
+# Configuration list
+
+```
+# set data directory
+dfs.data.dir=../data
+# set result directory
+# recommender results are written to this folder
+dfs.result.dir=../result
+# set log directory
+dfs.log.dir=../log
+
+# convertor
+# loads the data and splits it
+# into two (or three) sets
+# dataset name
+data.input.path=filmtrust
+# dataset column format (UIR or UIRT)
+data.column.format=UIR
+# method used to split the data
+# value can be ratio, loocv, given, KCV
+data.model.splitter=ratio
+#data.splitter.cv.number=5
+# split the dataset by rating
+data.splitter.ratio=rating
+# the filmtrust dataset is stored as text
+# accepted values: text, arff
+data.model.format=text
+# the ratio of the training set
+# this value should be in (0,1)
+data.splitter.trainset.ratio=0.8
+
+# Detailed configuration of loocv, given, and KCV
+# is described in the User Guide
+
+# set the random seed for reproducing the results (data splitting, parameter initialization, and other methods that use randomness)
+# the default is 1L
+# if not set, System.currentTimeMillis() is used as the seed and the results cannot be reproduced
+rec.random.seed=1
+
+# binarize threshold, mainly used in ranking
+# -1.0 - maxRate, binarize rate into -1.0 and 1.0
+# binThold = -1.0: do nothing
+# binThold = value: ratings > value are changed to 1.0, all others to 0.0
+# for PGM, 0.0 may be a better choice
+data.convert.binarize.threshold=-1.0
+
+# whether to evaluate the results
+rec.eval.enable=true
+
+# specifies the evaluators as a comma-separated list
+# if rec.eval.classes is blank,
+# every evaluator is calculated
+# rec.eval.classes=auc,precision,recall
+
+# evaluator values are listed in the User Guide
+# whether this algorithm is a ranking algorithm (true or false)
+rec.recommender.isranking=false
+
+# similarities to use: user, item, social (default: user); at most all three: user,item,social
+#rec.recommender.similarities=user
+```
+## Others
+
+### Random
+To make results reproducible, the random number generator is seeded through the 'rec.random.seed' configuration. An example follows.
+
+```
+rec.random.seed=1
+```
+
+The equivalent Java code is shown below.
+
+```java
+conf.set("rec.random.seed","1");
+```
+
+### Verbose
+Some algorithms can print their status at each iteration. The corresponding configuration is 'rec.recommender.verbose'. An example follows.
+
+```
+rec.recommender.verbose=true
+```
+
+The equivalent Java code is shown below.
+
+```java
+conf.set("rec.recommender.verbose","true");
+```
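
Why a fixed `rec.random.seed` makes a run repeatable can be illustrated with plain `java.util.Random`. The sketch below is not LibRec's splitter: `SeededSplitSketch` and `trainIndices` are hypothetical names, and shuffling row indices before taking the first 80% is just one simple way to do a ratio split.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration; LibRec's real splitters are not shown here.
class SeededSplitSketch {

    /** Returns the row indices chosen for the training set. */
    static List<Integer> trainIndices(int rowCount, double trainRatio, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            indices.add(i);
        }
        // The same seed always produces the same shuffle order.
        Collections.shuffle(indices, new Random(seed));
        int trainSize = (int) Math.round(rowCount * trainRatio);
        return new ArrayList<>(indices.subList(0, trainSize));
    }

    public static void main(String[] args) {
        List<Integer> first = trainIndices(10, 0.8, 1L);
        List<Integer> second = trainIndices(10, 0.8, 1L);
        System.out.println(first.equals(second)); // prints true
    }
}
```

Seeding from the clock instead, as happens when no seed is configured, would generally give a different shuffle on every run, which is why results could not be reproduced.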
diff --git a/doc/wiki_en/Context.md b/doc/wiki_en/Context.md
@@ -0,0 +1,47 @@
+# LibRec
+LibRec (http://www.librec.net) is an open source Java library for recommender systems that implements around 70 recommendation algorithms for rating and ranking problems. Recommender systems are a typical application of machine learning and big data, used to provide personalized recommendations. LibRec 2.0.0-RC substantially improves modularization, implementation, and availability, and further improves recommendation performance.
+
+## [[Introduction]]
+- Overview
+- Features
+- Getting started
+    - Clone source code from github
+    - Run a recommender in console
+    - Run a recommender in IDE
+    - What happened
+- Need Help?
+
+## [[CLI walkthrough]]
+  + Usage
+  + Running
+  + Configuration file
+
+## Built-in dataset
+  + [[Data file format]]
+     - Text
+     - Arff
+  + [[FilmTrust]]
+
+
+## Modules
++ [[DataModel]]
+  - Convertor
+  - Splitter
+  - Appender
++ [[Recommender]]
+    - Similarity
+    - Algorithms
+        - Probabilistic Graphical Recommender
+        - Matrix Factorization Recommender
+        - Factorization Machine Recommender
+        - Social Recommender
+        - Tensor Recommender
+    - Implement your own algorithm
++ [[Evaluator]]
++ [[Filter]]
+
+
+## Appendix
++ [[Algorithm list]]
++ [[Configuration list]]
+
diff --git a/doc/wiki_en/DataFileFormat.md b/doc/wiki_en/DataFileFormat.md
@@ -0,0 +1,58 @@
+# DataFileFormat
+
+---
+
+## Text
+LibRec can read text data directly. The data is stored in three or four columns: each row is a user-item-rating triple or a user-item-rating-date quadruple. Columns are separated by spaces or a comma. Examples follow.
+
+User-Item-Rating
+
+```
+1050 215 3
+1050 250 2
+1050 251 2.5
+```
+
+User-Item-Rating-Date
+
+```
+1 1 2	97
+1 1 3	75
+1 1 4	76
+1 4 3	87
+1 5 4	96
+```
+User-Item-Rating is abbreviated as UIR, and User-Item-Rating-Date as UIRT. To use the text format as data input, set the following configuration.
+
+```
+data.model.format=text
+# or UIRT
+data.column.format=UIR
+```
+
+## Arff
+When the data has more than four columns, the Arff format is recommended. The first line of an Arff file defines the name of the data set. Each following declaration line gives a column name and its data type. An example is shown below.
+
+```
+@RELATION user-movie
+
+@ATTRIBUTE user NUMERIC
+@ATTRIBUTE item NUMERIC
+@ATTRIBUTE time NUMERIC
+@ATTRIBUTE rating NUMERIC
+
+@DATA
+1,1,97,2
+1,1,75,3
+1,1,76,4
+1,4,87,3
+1,5,96,4
+1,6,78,3.5
+1,7,1,3.5
+```
+
+In the Arff format, comments start with %, and declarations are not case-sensitive. For the full specification, see [Attribute-Relation File Format](http://www.cs.waikato.ac.nz/ml/weka/arff.html).
+
+To use the Arff format in LibRec, set the following configuration.
+
+```
+data.model.format=arff
+```
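
The Arff rules described in DataFileFormat.md (a relation name, one declaration per column, `%` comments, case-insensitive declarations, comma-separated data rows) can be sketched as a small reader. This is a simplified illustration, not LibRec's Arff loader: `ArffSketch` is a hypothetical class, it assumes every attribute is NUMERIC, and it does not handle quoted names or sparse rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical minimal reader for numeric-only Arff text.
class ArffSketch {
    String relation = "";
    final List<String> attributes = new ArrayList<>();
    final List<double[]> rows = new ArrayList<>();

    static ArffSketch parse(String text) {
        ArffSketch result = new ArffSketch();
        boolean inData = false;
        for (String raw : text.split("\\r?\\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("%")) {
                continue; // skip blank lines and % comments
            }
            // Declarations are matched case-insensitively.
            String lower = line.toLowerCase(Locale.ROOT);
            if (inData) {
                String[] fields = line.split(",");
                double[] row = new double[fields.length];
                for (int i = 0; i < fields.length; i++) {
                    row[i] = Double.parseDouble(fields[i].trim());
                }
                result.rows.add(row);
            } else if (lower.startsWith("@relation")) {
                result.relation = line.substring("@relation".length()).trim();
            } else if (lower.startsWith("@attribute")) {
                String[] tokens = line.split("\\s+");
                if (tokens.length < 2) {
                    throw new IllegalArgumentException("attribute without a name: " + line);
                }
                result.attributes.add(tokens[1]); // column name; the type is ignored here
            } else if (lower.equals("@data")) {
                inData = true;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "% sample\n@RELATION user-movie\n"
                + "@ATTRIBUTE user NUMERIC\n@ATTRIBUTE item NUMERIC\n"
                + "@ATTRIBUTE time NUMERIC\n@ATTRIBUTE rating NUMERIC\n"
                + "@DATA\n1,1,97,2\n1,6,78,3.5\n";
        ArffSketch parsed = parse(text);
        System.out.println(parsed.relation + " " + parsed.attributes + " rows=" + parsed.rows.size());
    }
}
```

Because the column names come from the declarations, a reader like this can locate the user, item, and rating columns by name instead of relying on a fixed column order.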