This repository contains the Python implementation (TensorFlow 1.4.0) of the following paper, accepted at ICWS 2020:
Guosheng Kang, Jianxun Liu, Buqing Cao, Manliang Cao. "NAFM: Neural and Attentional Factorization Machine for Web API Recommendation". IEEE International Conference on Web Services. 2020, pp. 330-337.
Item | Values |
---|---|
Number of Mashups | 6206 |
Number of APIs | 12919 |
Number of invocations | 13107 |
Average number of invocations per Mashup | 2.11 |
Number of called APIs | 940 |
Called proportion of APIs | 7.28% |
Number of interactions | 9297 |
Number of labeled Mashup tags | 18601 |
Number of Mashup tags | 403 |
Number of labeled API tags | 44891 |
Number of API tags | 473 |
Average length of Mashup description | 27.63 |
Average length of API description | 68.98 |
Number of Mashups with included APIs | 5341 |
Sparsity of Mashup-API matrix | 99.81% |
Number of API categories | 383 |
Number of Mashup categories | 312 |
Number of API/Mashup categories | 427 |
Run dataset_characteristics.py to generate the statistics above.
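Several rows of the table are derived from the raw counts, and the sketch below reproduces them as a sanity check. It is not taken from dataset_characteristics.py; the variable names are illustrative, and the sparsity denominator (Mashups with included APIs times called APIs) is an assumption chosen because it matches the reported 99.81%.

```python
# Raw counts copied from the statistics table above.
num_mashups = 6206
num_apis = 12919
num_invocations = 13107
num_called_apis = 940
num_interactions = 9297
num_mashups_with_apis = 5341

# Average number of invocations per Mashup.
avg_invocations = num_invocations / num_mashups

# Share of APIs that are called by at least one Mashup, in percent.
called_proportion = 100 * num_called_apis / num_apis

# Sparsity of the Mashup-API matrix, in percent.
# Assumption: the matrix has one row per Mashup with included APIs
# and one column per called API.
matrix_cells = num_mashups_with_apis * num_called_apis
sparsity = 100 * (1 - num_interactions / matrix_cells)

print(round(avg_invocations, 2))    # 2.11
print(round(called_proportion, 2))  # 7.28
print(round(sparsity, 2))           # 99.81
```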
- basicFM
  - B. Cao, B. Li, J. Liu, M. Tang, and Y. Liu, “Web APIs Recommendation for Mashup Development based on Hierarchical Dirichlet Process and Factorization Machines,” International Conference on Collaborative Computing: Networking, Applications and Worksharing, 2016, pp. 3-15.
  - M. M. Rahman, X. Liu, and B. Cao, “Web API Recommendation for Mashup Development Using Matrix Factorization on Integrated Content and Network-based Service Clustering,” 2017 IEEE International Conference on Services Computing (SCC), 2017, pp. 225-232.
  - B. Cao, B. Li, J. Liu, M. Tang, Y. Liu, and Y. Li, “Mobile Service Recommendation via Combining Enhanced Hierarchical Dirichlet Process and Factorization Machines,” Mobile Information Systems, vol. 2019, 2019.
- DeepFM
  - X. Zhang, J. Liu, B. Cao, Q. Xiao, and Y. Wen, “Web Service Recommendation via Combining Doc2Vec-based Functionality Clustering and DeepFM-Based Score Prediction,” 2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications, 2018, pp. 509-516.
- AFM
  - Y. Cao, J. Liu, M. Shi, B. Cao, T. Chen, and Y. Wen, “Service Recommendation Based on Attentional Factorization Machine,” International Conference on Services Computing, Year, pp. 189-196.
State-of-the-art models for Web API recommendation either weight all factorized interactions equally or neglect the non-linear, complex inherent structure of real-world data. In real-world applications, predictor variables usually differ in predictive power, and not all features carry useful signal for estimating the target. Moreover, real-world data usually has a complex, non-linear underlying structure. We therefore propose NAFM, a hybrid factorization machine model with a novel neural network architecture. It integrates a neural network (the neural component) to capture non-linear feature interactions and an attention network (the attention component) to capture the differing importance of feature interactions. We implemented NAFM with TensorFlow 1.4.0 on top of the deepctr package.

NAFM code:
- Step1: Train the Doc2Vec model on all Mashup and API descriptions, and select the 100 most popular APIs and their 1993 associated Mashups as the experimental data
  - Code: get_samples.py
  - Input: the raw dataset
  - Output: doc2vec.model, samples.pickle
- Step2: Process the experimental data and transform it into input data for the models
  - Code: get_input_data.py
  - Input: doc2vec.model, samples.pickle
  - Output: input_data.csv
- Step3: Split the input data into 80% training data and 20% testing data
  - Code: process_input_data.py
  - Input: input_data.csv
  - Output: input_data.pickle
- Step4: Parameter optimization
  - Code: parameter_optimization.py
  - Input: input_data.pickle
  - Output: prints the default best parameter values
- Performance comparison
  - Code: run_input_data.py
  - Input: input_data.csv
  - Output: evaluation_results.csv
- Impact of parameters
  - Code: impact_of_parameters.py
  - Input: input_data.pickle
  - Output: Logloss_impact_of_parameters.csv, AUC_impact_of_parameters.csv
- Plot figures for impact of parameters
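
The evaluation scripts above report Logloss and AUC. As a reminder of what these two metrics measure, here is a minimal, dependency-free sketch. It is not the code used in run_input_data.py or impact_of_parameters.py, and the function names `log_loss` and `auc` are chosen here for illustration only.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 means the Mashup invokes the API, 0 means it does not.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.7]
print(round(log_loss(y_true, y_prob), 4))  # 0.5108
print(auc(y_true, y_prob))                 # 0.75
```

Lower Logloss and higher AUC indicate better predictions. The pairwise AUC above is quadratic in the number of samples, which is fine for illustration but slow on large test sets.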
Models | Architectures |
---|---|
basicFM | ![]() |
DeepFM | ![]() |
AFM | ![]() |
NAFM | ![]() |
Model | Order-1 Features | Order-2 Features | High-order Feature Interactions | Discriminate the Importance of Order-2 Feature Interactions | High-order Input |
---|---|---|---|---|---|
basicFM | √ | √ | × | × | None |
DeepFM | √ | √ | √ | × | Embedding Vectors |
AFM | √ | √ | × | √ | None |
NAFM | √ | √ | √ | √ | Order-2 Feature Interactions |
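
The last row of the table can be read as follows: NAFM weights the order-2 feature interactions with an attention network and then feeds the result to a neural network. The sketch below shows one plausible forward pass under that reading, in plain Python. It is not the repository's deepctr-based implementation; the layer sizes, the parameter names, and the choice of attention-weighted sum pooling followed by a single hidden layer are all assumptions made for illustration.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def relu(v):
    return [max(0.0, a) for a in v]

def affine(W, b, v):
    # W is a list of rows; returns W @ v + b.
    return [dot(row, v) + bi for row, bi in zip(W, b)]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def init_params(n_features, k=4, att_dim=3, hidden_dim=5, seed=0):
    # Hypothetical sizes; small random weights for demonstration only.
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "w0": 0.0, "w": mat(1, n_features)[0],            # order-1 weights
        "V": mat(n_features, k),                          # feature embeddings
        "W_att": mat(att_dim, k), "b_att": [0.0] * att_dim,
        "h": mat(1, att_dim)[0],                          # attention network
        "W1": mat(hidden_dim, k), "b1": [0.0] * hidden_dim,
        "w_out": mat(1, hidden_dim)[0],                   # neural component
    }

def nafm_forward(x, p):
    k = len(p["V"][0])
    # Order-1 features: plain linear term.
    linear = p["w0"] + dot(p["w"], x)
    # Order-2 features: element-wise product of each pair of embedded features.
    pairs = [
        [p["V"][i][d] * x[i] * p["V"][j][d] * x[j] for d in range(k)]
        for i in range(len(x)) for j in range(i + 1, len(x))
    ]
    # Attention component: score each interaction, normalize with softmax.
    scores = [dot(p["h"], relu(affine(p["W_att"], p["b_att"], pair))) for pair in pairs]
    alphas = softmax(scores)
    # Attention-weighted sum pooling gives one k-dimensional vector.
    pooled = [sum(a * pair[d] for a, pair in zip(alphas, pairs)) for d in range(k)]
    # Neural component: hidden layer over the pooled order-2 interactions.
    hidden = relu(affine(p["W1"], p["b1"], pooled))
    logit = linear + dot(p["w_out"], hidden)
    return 1.0 / (1.0 + math.exp(-logit)), alphas

params = init_params(n_features=4)
prob, alphas = nafm_forward([1.0, 0.0, 1.0, 0.5], params)
print(prob, alphas)
```

Dropping the hidden layer leaves an AFM-style model, and replacing the attention weights with uniform ones removes the ability to discriminate between interactions, which mirrors the × entries in the table.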