Given a set of users, a set of items, and the history of each user's interactions with each item, this solution predicts which users are likely to interact with which items.
Basically, your sales website should be:
- pushing customer purchase/click logs onto Alibaba cloud OSS
- pulling the recommendation results from MaxCompute Data Service for rendering.
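
As a minimal sketch of the first step, the website could serialize its purchase/click events to CSV before pushing them to OSS. The column names in `LOG_COLUMNS`, the helper `events_to_csv`, and the object key are assumptions made for illustration, not part of the original solution; the upload is left as a comment because it needs real credentials.

```python
import csv
import io

# Hypothetical column order for the log file; adjust to your own schema.
LOG_COLUMNS = ["user_id", "item_id", "event_type", "event_time"]


def events_to_csv(events):
    """Serialize a list of event dicts into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=LOG_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for event in events:
        writer.writerow(event)
    return buffer.getvalue()


# Illustrative events only; real values come from your website.
events = [
    {"user_id": "17850", "item_id": "85123A", "event_type": "purchase",
     "event_time": "2010-12-01 08:26:00"},
    {"user_id": "17850", "item_id": "71053", "event_type": "click",
     "event_time": "2010-12-01 08:27:10"},
]
payload = events_to_csv(events)

# Upload step (not executed here). With the oss2 SDK it would look roughly like:
#   bucket = oss2.Bucket(oss2.Auth(key_id, key_secret), endpoint, bucket_name)
#   bucket.put_object("logs/2010-12-01.csv", payload)
```

Batching events into one file per period keeps the number of OSS objects manageable, but the right granularity depends on how often the recommendation job runs.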
I created this solution by following this official guide on the PAI platform.
Unfortunately, no data is provided for reproducing the steps, so I downloaded a similar dataset from this link on UCI. You can download the sample yourself or get it from this repository at online_retail_200_rows.csv. The larger original dataset is in the same folder.
- [MaxCompute + DataWorks](https://www.alibabacloud.com/product/ide) -- An IDE for big data that works on top of Hadoop (EMR), OSS, databases, etc.
- [PAI](https://www.alibabacloud.com/product/machine-learning?spm=a2c63.p38356.1389108.dnavproductai3.6fba3679qCWqJ1) -- A collection of AI algorithms with a drag-and-drop design interface.
- [OSS](https://www.alibabacloud.com/product/oss?spm=a3c0i.7911826.1389108.133.9fbf14b3D5kdU4) -- A file storage service.
You need to set up your own credentials to run the samples.
This solution requires three inputs:
- User List
- Item List
- Interaction history
This solution produces a table with three columns:
- User_id
- Item_id
- Similar_item_id
The result means that, for each user_id, you can recommend the products listed under "similar_item_id".
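
To make the output table concrete, here is a small sketch of how a website might turn its rows into per-user recommendation lists. It assumes that `item_id` is an item the user has already interacted with, so those items are filtered out of the recommendations; the function name and the sample rows are made up for illustration.

```python
from collections import defaultdict


def recommendations_by_user(rows):
    """Group (user_id, item_id, similar_item_id) rows into per-user lists.

    Items the user already interacted with (assumed to be the item_id
    column) are skipped, and duplicates are dropped while keeping order.
    """
    seen = defaultdict(set)
    for user_id, item_id, _ in rows:
        seen[user_id].add(item_id)

    recs = defaultdict(list)
    for user_id, _, similar_item_id in rows:
        already_known = similar_item_id in seen[user_id]
        already_listed = similar_item_id in recs[user_id]
        if not already_known and not already_listed:
            recs[user_id].append(similar_item_id)
    return dict(recs)


# Made-up rows in the three-column output layout.
rows = [
    ("u1", "A", "B"),
    ("u1", "A", "C"),
    ("u1", "B", "C"),
    ("u2", "C", "A"),
]
result = recommendations_by_user(rows)
print(result)
```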
Follow the DataWorks guide.
This sample SQL transforms the original transaction records into the PAI Collaborative Filtering input format.
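
For readers who want to check the idea locally before running anything in DataWorks, a transformation of this kind can be sketched in plain Python. This is not the repository's SQL. It assumes the UCI Online Retail column names (`CustomerID`, `StockCode`, `Quantity`, `InvoiceDate`) and a hypothetical target layout of `user_id`, `item_id`, `active_type`, `active_date`; verify the target columns against the PAI guide before relying on them.

```python
import csv
import io

# Assumed code for a purchase event in the target format.
PURCHASE = 1


def to_cf_rows(csv_text):
    """Convert retail transaction CSV text into interaction rows.

    Rows without a customer or item, and rows with a non-positive
    quantity (for example returns), are skipped.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        user_id = (record.get("CustomerID") or "").strip()
        item_id = (record.get("StockCode") or "").strip()
        if not user_id or not item_id:
            continue
        try:
            quantity = int(record["Quantity"])
        except (KeyError, TypeError, ValueError):
            continue
        if quantity <= 0:
            continue
        rows.append({
            "user_id": user_id,
            "item_id": item_id,
            "active_type": PURCHASE,
            "active_date": record.get("InvoiceDate", ""),
        })
    return rows


# Illustrative rows in the dataset's layout.
sample = (
    "InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country\n"
    "536365,85123A,HOLDER,6,12/1/2010 8:26,2.55,17850,United Kingdom\n"
    "536414,22139,,56,12/1/2010 11:52,0,,United Kingdom\n"
    "C536379,D,Discount,-1,12/1/2010 9:41,27.5,14527,United Kingdom\n"
)
cf_rows = to_cf_rows(sample)
print(cf_rows)
```

Only the first sample row survives: the second has no customer ID and the third has a negative quantity.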
The SQL in DataWorks looks like:
git clone https://code.aliyun.com/best-practice/140.git
Create a project by following the official PAI tutorial. ![recommendation_tutorial](./doc/pai_tutorial_20200401182152.jpg)
The PAI project should look like:
Please follow this DataService Studio document to create a data service API. Then, in your portal, you can call this API to retrieve information about a single user.
If you want to set up a more advanced recommendation, check out this tutorial as well. The sample data is hosted here:
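
As a rough sketch of the portal side, the request for a single user could be built with the Python standard library as follows. The URL, the `user_id` query parameter name, and the helper `build_user_request` are placeholders that depend on how you define the API; authentication headers are passed in by the caller because they depend on how the API is published.

```python
import urllib.parse
import urllib.request


def build_user_request(api_url, user_id, headers=None):
    """Build a GET request that asks the data service API for one user.

    api_url and the "user_id" parameter name are hypothetical; use the
    values shown for your own API after it is created.
    """
    query = urllib.parse.urlencode({"user_id": user_id})
    return urllib.request.Request(
        f"{api_url}?{query}",
        headers=headers or {},
        method="GET",
    )


request = build_user_request("https://example.com/recommend", "17850")
print(request.full_url)

# Sending the request (not executed here, since the URL is a placeholder):
#   with urllib.request.urlopen(request, timeout=10) as response:
#       body = response.read().decode("utf-8")
```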
git clone https://code.aliyun.com/best-practice/140.git
BSD Licensed.
I am using a UCI dataset of online retail sales. If you know of a better dataset for content recommendation, let me know, and we will make a more appropriate version for advanced content.
You can find the original dataset at this link on UCI. I acknowledge and thank the provider with this reference:
Daqing Chen, Sai Liang Sain, and Kun Guo, Data mining for the online retail industry: A case study of RFM model-based customer segmentation using data mining, Journal of Database Marketing and Customer Strategy Management, Vol. 19, No. 3, pp. 197–208, 2012 (Published online before print: 27 August 2012. doi: 10.1057/dbm.2012.17).