This repository implements private linear regression models and compares Differential Privacy (DP), PAC (Probably Approximately Correct) Privacy, and a non-private baseline. Differential privacy is a widely accepted privacy notion; PAC Privacy is a more recent framework.
The source code is available on GitHub at https://github.com/hillaryyang/priv_lr
We use Python to implement:
- DP stochastic gradient descent using Opacus
- PAC Privacy framework
as well as non-private baselines.
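To make the DP-SGD idea concrete, here is a minimal pure-Python sketch of a single update for a one-feature linear model: each example's gradient is clipped to a maximum norm, the clipped gradients are summed, and Gaussian noise scaled to that norm is added before averaging. This is not the repository's code and does not use Opacus; the function name `dp_sgd_step`, its parameters, and the toy data are assumptions made for illustration. It also omits privacy accounting (no epsilon is computed) and uses fixed-size batches rather than Poisson sampling.

```python
import math
import random


def dp_sgd_step(w, b, batch, lr, max_grad_norm, noise_multiplier, rng):
    """One DP-SGD update for y ~ w*x + b with squared-error loss (illustrative)."""
    sum_gw, sum_gb = 0.0, 0.0
    for x, y in batch:
        err = (w * x + b) - y
        gw, gb = 2.0 * err * x, 2.0 * err  # per-example gradient
        norm = math.sqrt(gw * gw + gb * gb)
        # Scale the gradient down so its L2 norm is at most max_grad_norm.
        scale = min(1.0, max_grad_norm / (norm + 1e-12))
        sum_gw += gw * scale
        sum_gb += gb * scale
    # Gaussian noise with standard deviation noise_multiplier * max_grad_norm.
    sigma = noise_multiplier * max_grad_norm
    sum_gw += rng.gauss(0.0, sigma)
    sum_gb += rng.gauss(0.0, sigma)
    n = len(batch)
    return w - lr * sum_gw / n, b - lr * sum_gb / n


# Toy data from y = 3x + 1 (hypothetical, not one of the repository's datasets).
rng = random.Random(0)
data = [(i / 10.0, 3.0 * (i / 10.0) + 1.0) for i in range(20)]
w, b = 0.0, 0.0
for _ in range(200):
    batch = rng.sample(data, 8)
    w, b = dp_sgd_step(w, b, batch, lr=0.1, max_grad_norm=1.0,
                       noise_multiplier=0.5, rng=rng)
print(f"noisy estimate: w={w:.2f}, b={b:.2f}")
```

Clipping bounds how much any single example can move the parameters, which is what lets the added noise mask an individual's contribution. A larger `noise_multiplier` gives stronger privacy at the cost of a noisier fit.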
datasets/
: Contains the 3 datasets (Lenses, Concrete, Automobiles) used for training and testing
np/
: Code for non-private baseline
dp/
: Code for DP linear regression
dp/private.py
: Updates training objects using Opacus' privacy engine
dp/grid_search/
: Contains scripts for grid search for DP SGD optimal hyperparameters
pac/
: Code for PAC linear regression
pac/private.py
: Privatization functions such as noise estimation for membership privacy
pac/alpha_search/
: Scripts for tuning hyperparameter alpha for regularized linear regression
data_loader.py
: Utility script for loading/preprocessing datasets
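As an illustration of what tuning the regularization strength alpha can look like, the sketch below fits a one-feature ridge regression in closed form for several candidate alphas and keeps the one with the lowest validation error. It is a simplified stand-in, not the code in `pac/alpha_search/`: the helper names (`fit_ridge_1d`, `mse`, `search_alpha`), the no-intercept model, the candidate values, and the toy data are all assumptions, and the repository's actual selection criterion may differ.

```python
def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge slope for y ~ w*x (no intercept): w = sum(xy) / (sum(x^2) + alpha)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)


def mse(w, xs, ys):
    """Mean squared error of the prediction w*x against y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def search_alpha(train, val, alphas):
    """Return (best_alpha, best_validation_mse) over the candidate alphas."""
    best = None
    for alpha in alphas:
        w = fit_ridge_1d(train[0], train[1], alpha)
        score = mse(w, val[0], val[1])
        if best is None or score < best[1]:
            best = (alpha, score)
    return best


# Hypothetical toy split, not one of the repository's datasets.
train = ([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
val = ([5.0, 6.0], [10.0, 12.1])
best_alpha, best_mse = search_alpha(train, val, [0.0, 0.1, 1.0, 10.0])
print(f"best alpha: {best_alpha}, validation MSE: {best_mse:.4f}")
```

The same loop structure extends to a grid over several hyperparameters by iterating over their Cartesian product instead of a single list.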
- Clone the repository:
git clone https://github.com/hillaryyang/priv_lr.git
cd priv_lr
- On macOS, create and activate a virtual environment:
python3 -m venv ~/env
source ~/env/bin/activate
- Install dependencies:
pip install -r requirements.txt
After setting the desired hyperparameters in the Python files, run the script for the DP, PAC, or non-private model, replacing <dataset> with the dataset name:
- DP:
python3 dp/<dataset>_dp.py
- PAC:
python3 pac/<dataset>_pac.py
- Non-private:
python3 np/<dataset>_np.py
Results are printed to the command line.