This is code for cross-domain recommendation. It supports using multiple source domains to improve performance on one target domain. The code is based on the spectral collaborative filtering method.
If you use the code, please cite our paper:
@inproceedings{liu2019jscn,
title={JSCN: Joint spectral convolutional network for cross domain recommendation},
author={Liu, Zhiwei and Zheng, Lei and Zhang, Jiawei and Han, Jiayu and Philip, S Yu},
booktitle={2019 IEEE International Conference on Big Data (Big Data)},
pages={850--859},
year={2019},
organization={IEEE}
}
Joint spectral convolutional network for cross domain recommendation with
We show how to run JSCN-beta with a single source domain and with multiple source domains in JSCN_beta_s1.py and JSCN_beta_s2.py, respectively.
The data is from two domains:
The target domain: Amazon Instant Video
The source domain: Apps for Android
Tensorflow = 1.4.1
Python = 3.6
$ python run.py
- The first run may take a few minutes to compute the eigenvectors. They are then saved locally and are not recomputed on later runs.
- After 200 epochs, the model is evaluated on the test data using MAP and Recall.
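The compute-once, load-later behaviour of the eigenvectors can be sketched as follows. This is a minimal illustration and not the repository's own code: the function names, the `.npz` cache format, and the choice of the symmetric normalized Laplacian are all assumptions made for the example, and the Laplacian actually used by the model may differ.

```python
import os
import tempfile

import numpy as np


def build_bipartite_adjacency(ratings):
    """Stack a (users x items) rating matrix into a symmetric user-item graph."""
    n_users, n_items = ratings.shape
    adjacency = np.zeros((n_users + n_items, n_users + n_items))
    adjacency[:n_users, n_users:] = ratings
    adjacency[n_users:, :n_users] = ratings.T
    return adjacency


def load_or_compute_eigen(adjacency, cache_path):
    """Return (eigenvalues, eigenvectors), reusing cache_path if it exists.

    cache_path is a hypothetical file name and should end in ".npz".
    """
    if os.path.exists(cache_path):
        # Later runs: load the saved decomposition instead of recomputing.
        with np.load(cache_path) as cached:
            return cached["values"], cached["vectors"]

    # Assumed for illustration: L = I - D^{-1/2} A D^{-1/2}
    degree = adjacency.sum(axis=1)
    inv_sqrt = np.zeros_like(degree, dtype=float)
    nonzero = degree > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degree[nonzero])
    normalized = inv_sqrt[:, None] * adjacency * inv_sqrt[None, :]
    laplacian = np.eye(adjacency.shape[0]) - normalized

    # eigh is the solver for symmetric matrices; this is the slow step.
    values, vectors = np.linalg.eigh(laplacian)
    np.savez(cache_path, values=values, vectors=vectors)
    return values, vectors


if __name__ == "__main__":
    toy_ratings = np.array([[1.0, 0.0, 1.0],
                            [0.0, 1.0, 1.0]])  # 2 users, 3 items
    graph = build_bipartite_adjacency(toy_ratings)
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "eigen_cache.npz")
        load_or_compute_eigen(graph, path)                   # computes and saves
        values, vectors = load_or_compute_eigen(graph, path)  # loads from disk
        print(values.shape, vectors.shape)
```

If the rating data changes, a cache like this has to be deleted by hand, since the sketch only checks whether the file exists.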
There are several important parts you may need to change:
params.py: metaName-1 is the target domain file name; metaName-2, 3, 4, ... are the source domain file names. The format of the data is described in the data section. commonUserFileName denotes the alignment of users between the target domain and a source domain; for example, commonUserFileName_12 is the alignment between metaName-1 and metaName-2.
commonuser_file.pickle: generate the common user alignment pickle list.
We use the Amazon_rating_data_set, which can be downloaded here.
The preprocessing script is ./data/amazon/preprocess.py, and usage examples are in ./data/amazon/dataPreprocessing.ipynb.
To run on new datasets, you need to create cross-domain training data.
- The rating files: SourcedomainRating.txt, targetdomainRating.txt. The names should be changed to match those in the params.py file. Each row is user_id, basket_id, basket_id, ...
- The commonuser files: commonuser.pickle. It is a Python list of tuples, [(uid_in_target_1, uid_in_source_1), (uid_in_target_2, uid_in_source_2), ...], which tells the model how users in the target domain are aligned with users in the source domains.
You should generate these files first if you want to run the code on new datasets.
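As a rough sketch of how these files could be produced for a new dataset, the helpers below write a rating file and a common-user pickle in the shapes described above. The function names and the toy user keys are hypothetical, and the comma delimiter is an assumption; check ./data/amazon/preprocess.py for the format the code actually reads.

```python
import os
import pickle
import tempfile


def write_rating_file(path, baskets):
    """Write one row per user: user_id followed by that user's item ids.

    The comma delimiter is an assumption made for this sketch.
    """
    with open(path, "w") as f:
        for user_id, item_ids in sorted(baskets.items()):
            row = [user_id] + list(item_ids)
            f.write(",".join(str(value) for value in row) + "\n")


def build_common_users(target_index, source_index):
    """Pair the integer ids of users that appear in both domains.

    Each argument maps a raw user key (for example a reviewer id) to the
    integer user_id used in that domain's rating file.
    """
    shared = sorted(set(target_index) & set(source_index))
    return [(target_index[key], source_index[key]) for key in shared]


def save_common_users(path, pairs):
    """Pickle the [(uid_in_target, uid_in_source), ...] list."""
    with open(path, "wb") as f:
        pickle.dump(pairs, f)


if __name__ == "__main__":
    # Toy data: raw user key -> integer id in each domain.
    target_index = {"alice": 0, "bob": 1, "carol": 2}
    source_index = {"bob": 0, "dave": 1, "alice": 2}
    pairs = build_common_users(target_index, source_index)

    with tempfile.TemporaryDirectory() as folder:
        write_rating_file(os.path.join(folder, "targetdomainRating.txt"),
                          {0: [3, 4, 5], 1: [7, 9], 2: [4]})
        save_common_users(os.path.join(folder, "commonuser.pickle"), pairs)
        print(pairs)
```

One such pickle is needed per source domain, each pairing target-domain user ids with the ids of the same users in that source domain.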
The maximum number of source domains is currently 2, using JSCN_beta_s2.py.