Add apidocs and wiki files.
wangyufengkevin committed Mar 22, 2017
1 parent f49ee52 commit 80a7df4
Showing 25 changed files with 4,012 additions and 1,983 deletions.
Binary file modified doc/apidocs.zip
Binary file not shown.
852 changes: 852 additions & 0 deletions doc/wiki_en/AlgorithmList.md

Large diffs are not rendered by default.

72 changes: 72 additions & 0 deletions doc/wiki_en/CLIWalkthrough.md
@@ -0,0 +1,72 @@
# CommandLine walkthrough
This section demonstrates how to use the command line to load and save data and models, and how to make recommendations with arguments passed on the command line.

## Usage

```
Usage: librec <command> [options]...
commands:
rec run recommender
data load data
global options:
--help display this help text
--exec run Recommender
--version show Librec version info
job options:
-conf <file> path to config file
-D, -jobconf <prop> set configuration items (key=value)
-libjars add entend jar files to classpath
```

## Training

```
./librec rec -exec -D rec.recommender.class=globalaverage -conf ../core/src/main/resources/rec/baseline/globalaverage-test.properties -libjars ../lib/log4j-1.2.17.jar
```
rec | data: the command to run, either the recommender (rec) or data loading (data)

-exec: execute the recommendation algorithm; reserved in version 2.0

-D | -jobconf [options]: set the corresponding configuration items. For detailed configurations, please refer to [ConfigurationList](./ConfigurationList) and [AlgorithmList](./AlgorithmList).

-conf [path/to/properties]: load a configuration file

-libjars: add jar libraries from other paths to the classpath. The jar files in the lib directory are loaded automatically. The command above shows an example.

## Configuration files
In LibRec, program configuration is stored in core/librec.properties and in the configuration file of each algorithm. In particular, librec.properties stores the settings for data input, splitting, and evaluation.
A configuration example is shown below.

```
# set data directory
dfs.data.dir=../data
# set result directory
# recommender result will output in this folder
dfs.result.dir=../result
# not implemented in this version
# (logs would be written here instead of being printed to the console)
dfs.log.dir=../log
data.input.path=filmtrust
data.column.format=UIR
data.model.splitter=ratio
data.model.format=text
data.splitter.ratio=rating
# splitter, refer to basics.md
data.splitter.trainset.ratio=0.8
data.splitter.cv.number=5
rec.parallel.support=true
# evaluation settings, refer to basics.md
rec.eval.enable=true
rec.random.seed=1
```

Because each algorithm has its own configuration items, the configuration of an algorithm is stored in that algorithm's directory. When executing an algorithm, users can either pass arguments on the command line or use a modified configuration file.
For the detailed configuration of each algorithm, please refer to [AlgorithmList](./AlgorithmList).
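
The relationship between a configuration file and `-D key=value` arguments can be sketched in plain Java. This is only an illustration built on `java.util.Properties`; the class `ConfigOverrideSketch` and its `merge` method are hypothetical names and are not part of LibRec, and the assumption that command-line values take precedence over file values is ours, not something the text above confirms.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper (not a LibRec class): layers key=value overrides on file defaults. */
class ConfigOverrideSketch {

    /** Parses properties text, then applies each "key=value" override on top of it. */
    static Properties merge(String fileContents, String... overrides) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String override : overrides) {
            int eq = override.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected key=value, got: " + override);
            }
            // Assumption: a command-line value replaces the value read from the file.
            props.setProperty(override.substring(0, eq).trim(),
                              override.substring(eq + 1).trim());
        }
        return props;
    }

    public static void main(String[] args) {
        String file = "data.input.path=filmtrust\nrec.random.seed=1\n";
        Properties merged = merge(file,
                "rec.recommender.class=globalaverage", "rec.random.seed=42");
        System.out.println(merged.getProperty("rec.random.seed"));       // prints 42
        System.out.println(merged.getProperty("rec.recommender.class")); // prints globalaverage
        System.out.println(merged.getProperty("data.input.path"));       // prints filmtrust
    }
}
```

Keys that appear only in the file keep their file values, so a command line only needs to carry the items that differ from the stored configuration.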
89 changes: 89 additions & 0 deletions doc/wiki_en/ConfigurationList.md
@@ -0,0 +1,89 @@
# Configuration list

```python
# set data directory
dfs.data.dir=../data
# set result directory
# recommender result will output in this folder
dfs.result.dir=../result
# set log directory
dfs.log.dir=../log

# convertor
# load the data and split it
# into two (or three) sets
# set the dataset name
data.input.path=filmtrust
# set the dataset format (UIR or UIRT)
data.column.format=UIR
# set the method used to split the data
# the value can be ratio, loocv, given, or KCV
data.model.splitter=ratio
#data.splitter.cv.number=5
# split the dataset by rating
data.splitter.ratio=rating
# the filmtrust dataset is stored as text
# text and arff are accepted
data.model.format=text
# the ratio of the training set
# this value should be in (0,1)
data.splitter.trainset.ratio=0.8

# Detailed configuration of loocv, given, and KCV
# is described in the User Guide

# set the random seed to reproduce the results (data splitting, parameter initialization, and other methods that use randomness)
# the default is set to 1L
# if it is not set, System.currentTimeMillis() is used as the seed and the results cannot be reproduced.
rec.random.seed=1

# binarize threshold mainly used in ranking
# -1.0 - maxRate, binarize rate into -1.0 and 1.0
# binThold = -1.0, do nothing
# binThold = value: ratings > value are changed to 1.0 and all others to 0.0; mainly used in ranking
# for PGM, 0.0 may be a better choice
data.convert.binarize.threshold=-1.0

# whether to evaluate the results
rec.eval.enable=true

# specify the evaluators
# rec.eval.classes=auc,precision,recall...
# if rec.eval.classes is blank,
# every evaluator will be calculated
# rec.eval.classes=auc,precision,recall

# the set of evaluator values is described in the User Guide
# whether this algorithm is a ranking algorithm: true or false
rec.recommender.isranking=false

# user, item, and social similarities can be used; the default value is user, and the full set of values is user,item,social
#rec.recommender.similarities=user
```
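
The binarize threshold described in the comments above can be read as a small function. The following is a minimal sketch of that reading, not LibRec's own code: the class and method names are hypothetical, and treating every negative threshold (not only -1.0) as "do nothing" is an assumption made for the sketch.

```java
/** Hypothetical illustration (not a LibRec class) of data.convert.binarize.threshold. */
class BinarizeSketch {

    /**
     * Returns the rating unchanged when the threshold is negative (assumed to mean
     * "disabled", as with -1.0); otherwise maps rating > threshold to 1.0 and
     * everything else to 0.0.
     */
    static double binarize(double rating, double threshold) {
        if (threshold < 0.0) {
            return rating;
        }
        return rating > threshold ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        System.out.println(binarize(3.5, -1.0)); // prints 3.5
        System.out.println(binarize(3.5, 3.0));  // prints 1.0
        System.out.println(binarize(3.0, 3.0));  // prints 0.0
    }
}
```

Note that the comparison is strict, so a rating equal to the threshold is mapped to 0.0.
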
## others

### random
To make the generated results reproducible, the random number generator is seeded through the 'rec.random.seed' configuration item. An example is shown below.

```
rec.random.seed=1
```

The equivalent Java code is shown below.

```java
conf.set("rec.random.seed","1");
```
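
Why a fixed seed makes a run reproducible can be seen with a small, self-contained sketch. It uses only `java.util.Random` and `Collections.shuffle`; the class `SeededSplitSketch` and its method are hypothetical and do not reproduce LibRec's splitter, they only show that the same seed yields the same training subset.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration (not a LibRec class) of a seeded ratio split. */
class SeededSplitSketch {

    /**
     * Shuffles the indices 0..n-1 with a Random created from the given seed and
     * returns the first round(n * trainRatio) of them as the training set.
     */
    static List<Integer> trainIndices(int n, double trainRatio, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));
        int trainSize = (int) Math.round(n * trainRatio);
        return new ArrayList<>(indices.subList(0, trainSize));
    }

    public static void main(String[] args) {
        List<Integer> first = trainIndices(10, 0.8, 1L);
        List<Integer> second = trainIndices(10, 0.8, 1L);
        System.out.println(first.equals(second)); // prints true: same seed, same split
        System.out.println(first.size());         // prints 8
    }
}
```

With a seed taken from the current time instead, two runs would generally shuffle differently and produce different training sets.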

### verbose
Some algorithms can print their status in each iteration. The corresponding configuration item is 'rec.recommender.verbose'. An example is shown below.

```
rec.recommender.verbose=true
```

The equivalent Java code is shown below.

```java
conf.set("rec.recommender.verbose","true");
```
47 changes: 47 additions & 0 deletions doc/wiki_en/Context.md
@@ -0,0 +1,47 @@
# Librec
LibRec (http://www.librec.net) is an open source Java library for recommender systems with around 70 recommendation algorithms, which address the rating and ranking problems. Recommender systems are a typical application of machine learning and big data, used to provide personalized recommendations. As an implementation of recommendation algorithms, LibRec 2.0.0-RC brings substantial improvements in modularization, implementation, and availability. In addition, the recommendation performance is further enhanced.

## [[Introduction]]
+ Overview
- Features
- Getting started
- Clone source code from github
- Run a recommender in console
- Run a recommender in IDE
- What happened
- Need Help?

## [[CLI walkthrough]]
+ Usage
+ Running
+ Configuration file

## Built-in dataset
+ [[Data file format]]
- Text
- Arff
+ [[FilmTrust]]


## Modules
+ [[DataModel]]
- Convertor
- Splitter
- Appender
+ [[Recommender]]
- Similarity
- Algorithms
- Probabilistic Graphical Recommender
- Matrix Factorization Recommender
- Factorization Machine Recommender
- Social Recommender
- Tensor Recommender
- Implement your own algorithm
+ [[Evaluator]]
+ [[Filter]]


## Appendix
+ [[Algorithm list]]
+ [[Configuration list]]

58 changes: 58 additions & 0 deletions doc/wiki_en/DataFileFormat.md
@@ -0,0 +1,58 @@
# DataFileFormat

---

## Text
LibRec can read text data directly. The data is stored in three or four columns: each row is a user-item-rating triple or a user-item-rating-date quadruple. Columns are separated by spaces or commas. Examples are shown below.

User-Item-Rating

```
1050 215 3
1050 250 2
1050 251 2.5
```

User-Item-Rating-Date

```
1 1 2 97
1 1 3 75
1 1 4 76
1 4 3 87
1 5 4 96
```
User-Item-Rating is abbreviated as UIR, and User-Item-Rating-Date as UIRT. To use the text format as data input, set the following configuration items.

```
data.model.format=text
# UIR or UIRT
data.column.format=UIR
```
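
As an illustration of the row layout described above, the sketch below parses one UIR or UIRT line in plain Java. It is not LibRec's convertor: the class `TextRatingParserSketch`, its `Rating` type, and the choice to keep user and item identifiers as strings are assumptions made for the example.

```java
/** Hypothetical illustration (not a LibRec class) of parsing UIR / UIRT text rows. */
class TextRatingParserSketch {

    /** One parsed row; time is null when the column format is UIR. */
    static final class Rating {
        final String user;
        final String item;
        final double rating;
        final Long time;

        Rating(String user, String item, double rating, Long time) {
            this.user = user;
            this.item = item;
            this.rating = rating;
            this.time = time;
        }
    }

    /** Splits a line on spaces and/or commas and reads 3 (UIR) or 4 (UIRT) columns. */
    static Rating parseLine(String line, String columnFormat) {
        int expected;
        if ("UIR".equals(columnFormat)) {
            expected = 3;
        } else if ("UIRT".equals(columnFormat)) {
            expected = 4;
        } else {
            throw new IllegalArgumentException("unknown column format: " + columnFormat);
        }
        String[] cols = line.trim().split("[\\s,]+");
        if (cols.length < expected) {
            throw new IllegalArgumentException("expected " + expected + " columns: " + line);
        }
        Long time = expected == 4 ? Long.valueOf(cols[3]) : null;
        return new Rating(cols[0], cols[1], Double.parseDouble(cols[2]), time);
    }

    public static void main(String[] args) {
        Rating uir = parseLine("1050 251 2.5", "UIR");
        System.out.println(uir.user + " " + uir.item + " " + uir.rating); // prints 1050 251 2.5
        Rating uirt = parseLine("1 1 2 97", "UIRT");
        System.out.println(uirt.rating + " " + uirt.time);                // prints 2.0 97
    }
}
```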

## Arff
When the data has more than four columns, the Arff data format is recommended. The first line of an Arff file defines the name of the data set. Each following line declares a column name and its data type, and the rows after @DATA hold the data. An example is shown below.

```
@RELATION user-movie
@ATTRIBUTE user NUMERIC
@ATTRIBUTE item NUMERIC
@ATTRIBUTE time NUMERIC
@ATTRIBUTE rating NUMERIC
@DATA
1,1,97,2
1,1,75,3
1,1,76,4
1,4,87,3
1,5,96,4
1,6,78,3.5
1,7,1,3.5
```

In the Arff data format, comments are initiated with %, and declarations are not case-sensitive. For the detailed Arff data format, please refer to [Attribute-Relation File Format](http://www.cs.waikato.ac.nz/ml/weka/arff.html).

To use the Arff data format in LibRec, set the following configuration item.

```
data.model.format=arff
```
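
The structure of the Arff example above (a relation name, attribute declarations, then data rows) can be sketched with a very small reader. This is a simplified illustration, not LibRec's Arff support: the class `ArffSketch` is hypothetical, and it handles only unquoted names, `%` comment lines, and comma-separated data rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical, simplified Arff reader (not a LibRec class). */
class ArffSketch {
    final String relation;
    final List<String> attributes;
    final List<String[]> rows;

    ArffSketch(String relation, List<String> attributes, List<String[]> rows) {
        this.relation = relation;
        this.attributes = attributes;
        this.rows = rows;
    }

    /** Reads the header declarations case-insensitively, then the rows after @DATA. */
    static ArffSketch parse(List<String> lines) {
        String relation = null;
        List<String> attributes = new ArrayList<>();
        List<String[]> rows = new ArrayList<>();
        boolean inData = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("%")) {
                continue; // skip blank lines and comments
            }
            String upper = line.toUpperCase(Locale.ROOT);
            if (inData) {
                rows.add(line.split("\\s*,\\s*"));
            } else if (upper.startsWith("@RELATION")) {
                relation = line.substring("@RELATION".length()).trim();
            } else if (upper.startsWith("@ATTRIBUTE")) {
                String[] parts = line.split("\\s+");
                if (parts.length < 3) {
                    throw new IllegalArgumentException("bad attribute line: " + line);
                }
                attributes.add(parts[1]); // parts[2] is the data type, e.g. NUMERIC
            } else if (upper.startsWith("@DATA")) {
                inData = true;
            }
        }
        return new ArffSketch(relation, attributes, rows);
    }

    public static void main(String[] args) {
        ArffSketch arff = parse(Arrays.asList(
                "@RELATION user-movie",
                "@ATTRIBUTE user NUMERIC",
                "@ATTRIBUTE item NUMERIC",
                "@ATTRIBUTE time NUMERIC",
                "@ATTRIBUTE rating NUMERIC",
                "@DATA",
                "1,1,97,2",
                "1,6,78,3.5"));
        System.out.println(arff.relation);    // prints user-movie
        System.out.println(arff.attributes);  // prints [user, item, time, rating]
        System.out.println(arff.rows.size()); // prints 2
    }
}
```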