This repository contains the code used to study estimation errors in network A/B testing caused by sample variance and model misspecification, and how clustering affects those errors. A simple new clustering model is also proposed.
All of the researchers are from the Computer Science Department at Universidade Federal de Minas Gerais (UFMG), Brazil:
- Francisco Galuppo Azevedo
- Bruno Demattos Nogueira
- Fabricio Murai
- Ana Paula Couto Silva
To learn more about our research, read our papers:
Companies that offer services on the Web often rely on randomized experiments known as A/B tests for assessing the impact of development and business decisions. During an experiment, each user is randomly redirected to one of two versions of the website, called treatments. Several response models were proposed to describe the behavior of a user in a social network website as a function of the treatment assigned to her and to her neighbors. However, there is no consensus as to which model should be applied to a given dataset. In this work, we propose a new response model, derive theoretical limits for the estimation error of several models, and obtain empirical results for cases where the response model was misspecified.

Figure: Mean squared error of the Probit model estimates for 3 different networks.

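
The abstract above describes response models in which a user's behavior depends on her own treatment and on her neighbors' treatments. As a minimal sketch of that idea, the snippet below computes the fraction of treated neighbors for each user, a quantity such models commonly take as input. The function name, the adjacency-dictionary format, and the toy network are assumptions made for illustration; they are not taken from this repository.

```python
def treated_neighbor_fraction(adjacency, treatment):
    """Return, for each user, the fraction of her neighbors assigned to treatment 1.

    adjacency: dict mapping each user to a list of neighbors (hypothetical format).
    treatment: dict mapping each user to 0 (control) or 1 (treatment).
    """
    fractions = {}
    for user, neighbors in adjacency.items():
        if not neighbors:
            # Isolated user: no neighbor exposure by convention (an assumption).
            fractions[user] = 0.0
            continue
        treated = sum(treatment[n] for n in neighbors)
        fractions[user] = treated / len(neighbors)
    return fractions


# Hypothetical 4-user network forming a path: 0 - 1 - 2 - 3.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
treatment = {0: 1, 1: 0, 2: 1, 3: 0}
exposure = treated_neighbor_fraction(adjacency, treatment)
```

A response model would then combine `treatment[user]` and `exposure[user]` to predict each user's response; the exact functional form depends on the model chosen.
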
Análise de Algoritmos de Clusterização para Experimentos Randomizados em Redes Sociais de Larga Escala (Analysis of Clustering Algorithms for Randomized Experiments in Large-Scale Social Networks)
Large companies conduct A/B tests to estimate the effect of changes to their websites. In these tests, users are randomly redirected to one of two versions of the site. However, in social networks, linked users who access different versions can influence each other, which makes estimation more difficult. To minimize this interference, graph partitioning algorithms (e.g., ε-net and FENNEL) have been proposed to find clusters of well-connected users. All users within a cluster are redirected to the same version. In this work, we propose a parallel variant of ε-net and a new algorithm dubbed NoMAS, inspired by FENNEL. We present a theoretical analysis of the proposed algorithms' scalability, complemented by empirical results on estimation accuracy.

Figure: Example for the parallel ε-net algorithm.
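
The cluster-level randomization described in the abstract above (all users within a cluster see the same version) can be sketched as follows. This is only an illustration under stated assumptions: the clusters are given as a precomputed user-to-cluster mapping, each cluster is assigned independently with probability 1/2, and the function name and toy data are hypothetical. It does not implement ε-net, FENNEL, or NoMAS.

```python
import random


def assign_by_cluster(cluster_of, seed=None):
    """Assign every user the treatment drawn for her cluster.

    cluster_of: dict mapping each user to a cluster id (hypothetical format).
    Returns a dict mapping each user to 0 (version A) or 1 (version B).
    """
    rng = random.Random(seed)
    # Draw one coin flip per cluster, in a fixed order for reproducibility.
    clusters = sorted(set(cluster_of.values()))
    cluster_treatment = {c: rng.randint(0, 1) for c in clusters}
    # Every user inherits the treatment of her cluster.
    return {user: cluster_treatment[c] for user, c in cluster_of.items()}


# Hypothetical partition of six users into three clusters.
cluster_of = {"u1": 0, "u2": 0, "u3": 1, "u4": 1, "u5": 2, "u6": 2}
assignment = assign_by_cluster(cluster_of, seed=42)
```

Because the coin is flipped once per cluster rather than once per user, linked users in the same cluster never end up in different versions, which is the property the partitioning algorithms aim to exploit.
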