This directory contains the small version of our data (10,000 records for each pair of campaign and behavior policy), which can be used to run our quickstart guide and examples. The full version of our data is available at https://research.zozo.com/data.html.
This dataset is released along with the paper:
Yuta Saito, Shunsuke Aihara, Megumi Matsutani, Yusuke Narita.
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation
https://arxiv.org/abs/2008.07146
When using this dataset, please cite the paper with the following BibTeX entry:
@article{saito2020open,
title={Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation},
author={Saito, Yuta and Aihara, Shunsuke and Matsutani, Megumi and Narita, Yusuke},
journal={arXiv preprint arXiv:2008.07146},
year={2020}
}
Open Bandit Dataset was constructed from an A/B test of two multi-armed bandit policies on a large-scale fashion e-commerce platform, ZOZOTOWN. It currently consists of about 26M rows in total, each representing a user impression with feature values, the selected item as the action, the true propensity score, and a click indicator as the outcome. This makes it especially suitable for evaluating off-policy evaluation (OPE) methods, which attempt to estimate the counterfactual performance of hypothetical algorithms using data generated by a different algorithm.
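To make the OPE setting concrete, here is a minimal sketch of inverse probability weighting (IPW), one standard OPE estimator. It is an illustration only and not necessarily the estimator used in the paper or in our pipeline. The function name `ipw_estimate`, the `eval_prob` input (the probability that a hypothetical evaluation policy assigns to each logged action), and the toy numbers are all assumptions made for this example.

```python
def ipw_estimate(click, action_prob, eval_prob):
    """Estimate the click rate of an evaluation policy from logged data.

    click:       observed click indicators (0 or 1) from the behavior policy
    action_prob: logged propensity of the behavior policy for each logged action
    eval_prob:   (assumed input) probability the evaluation policy would have
                 chosen the same logged action
    """
    if not (len(click) == len(action_prob) == len(eval_prob)):
        raise ValueError("all inputs must have the same length")
    if len(click) == 0:
        raise ValueError("at least one logged record is required")
    if any(p <= 0 for p in action_prob):
        raise ValueError("logged propensities must be strictly positive")

    # Reweight each observed click by the ratio of evaluation to behavior probability.
    weighted = [c * (e / b) for c, b, e in zip(click, action_prob, eval_prob)]
    return sum(weighted) / len(weighted)


# Toy, made-up log of four impressions (not taken from the dataset).
click = [1, 0, 0, 1]
action_prob = [0.5, 0.25, 0.5, 0.25]
eval_prob = [0.25, 0.25, 0.25, 0.25]

estimate = ipw_estimate(click, action_prob, eval_prob)
print(estimate)
```

The estimate is only as reliable as the logged propensities, which is why the true propensity score recorded with each impression matters for OPE.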
Here is a detailed description of the fields (they are comma-separated in the CSV files):
{behavior_policy}/{campaign}.csv (behavior_policy in (bts, random), campaign in (all, men, women))
- timestamp: timestamps of impressions.
- item_id: index of items as arms (index ranges from 0-79 in the "All" campaign, 0-33 in the "Men" campaign, and 0-45 in the "Women" campaign).
- position: the position of an item being recommended (1, 2, and 3 correspond to the left, center, and right positions of the ZOZOTOWN recommendation interface, respectively).
- click: target variable that indicates if an item was clicked (1) or not (0).
- action_prob: the probability of an item being recommended at the given position.
- user_features: user-related feature values.
- user_item_affinity: user-item affinity scores induced by the number of past clicks observed between each user-item pair.
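The fields above can be read with Python's standard `csv` module. The sketch below parses a few made-up rows from an in-memory string and computes the click rate per position. It assumes that the CSV header names match the field names listed above; the sample rows and the helper names `load_logs` and `click_rate_by_position` are hypothetical.

```python
import csv
import io

# Made-up sample rows for illustration; real files follow {behavior_policy}/{campaign}.csv.
# Header names are assumed to match the field names in this README.
SAMPLE_CSV = """timestamp,item_id,position,click,action_prob
2019-11-24 00:00:01,12,1,0,0.0125
2019-11-24 00:00:05,3,2,1,0.0125
2019-11-24 00:00:09,12,3,0,0.0125
2019-11-24 00:00:12,40,1,1,0.0125
"""


def load_logs(file_obj):
    """Parse log rows and convert the numeric fields to int/float."""
    rows = []
    for raw in csv.DictReader(file_obj):
        rows.append({
            "timestamp": raw["timestamp"],
            "item_id": int(raw["item_id"]),
            "position": int(raw["position"]),
            "click": int(raw["click"]),
            "action_prob": float(raw["action_prob"]),
        })
    return rows


def click_rate_by_position(rows):
    """Return {position: clicks / impressions} for the given rows."""
    totals, clicks = {}, {}
    for row in rows:
        pos = row["position"]
        totals[pos] = totals.get(pos, 0) + 1
        clicks[pos] = clicks.get(pos, 0) + row["click"]
    return {pos: clicks[pos] / totals[pos] for pos in sorted(totals)}


rows = load_logs(io.StringIO(SAMPLE_CSV))
ctr = click_rate_by_position(rows)
print(ctr)
```

To read an actual file instead, pass an open file handle, for example `open("random/all.csv", newline="")`, and check the header first in case the column names differ from those assumed here.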
Figure: Structure of Open Bandit Dataset
item_context.csv
- item_id: index of items as arms (index ranges from 0-79 in the "All" campaign, 0-33 in the "Men" campaign, and 0-45 in the "Women" campaign).
- item feature 0-3: item-related feature values.
Note that user and item features are now anonymized using a hash function.
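Because item_id appears in both the log files and item_context.csv, item features can be attached to each logged impression by a lookup on item_id. The following is a minimal sketch of that join using only the standard library. The column names `item_feature_0` and `item_feature_1`, the hashed-looking values, and the helper names are hypothetical and chosen only for illustration.

```python
import csv
import io

# Made-up item context rows; column names and values are assumptions.
ITEM_CONTEXT_CSV = """item_id,item_feature_0,item_feature_1
3,a1f,9c2
12,b7e,0d4
"""

# Made-up log rows, reduced to the fields needed for the join.
logs = [
    {"item_id": 12, "click": 0},
    {"item_id": 3, "click": 1},
    {"item_id": 99, "click": 0},  # no matching item context row
]


def index_item_context(file_obj):
    """Build {item_id: {feature_name: value}} from an item context CSV."""
    index = {}
    for row in csv.DictReader(file_obj):
        item_id = int(row["item_id"])
        index[item_id] = {k: v for k, v in row.items() if k != "item_id"}
    return index


def attach_item_features(log_rows, item_index):
    """Return copies of the log rows with item features merged in (left join)."""
    enriched = []
    for row in log_rows:
        features = item_index.get(row["item_id"], {})
        enriched.append({**row, **features})
    return enriched


item_index = index_item_context(io.StringIO(ITEM_CONTEXT_CSV))
enriched = attach_item_features(logs, item_index)
print(enriched[0])
```

Rows whose item_id has no entry in the item context are kept without feature columns, which mirrors a left join. Since the features are hashed, they are treated here as opaque strings.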
For any questions, feel free to contact:
- The authors of the paper: [email protected]
- ZOZO Research: [email protected]