GitHub - danamira/data-challenge2

JBG050 Data Challenge 2 | Group 12

💡 Project Description

This repository contains the Python source code and the Jupyter notebooks (.ipynb) with the analysis by Group 12 for the course JBG050.

Given a large set of police datasets, the project aims to enhance the trust and confidence of London's citizens in the Metropolitan Police.

💻 Code Structure

Most of the code is in Jupyter notebooks, so you need a Jupyter client installed. To install Jupyter, follow the Jupyter installation instructions.

👨🏽‍💻 Installation and Setup

We encourage you to use a virtual environment to run this project locally. This isolates the codebase and its dependencies from the host environment.

To achieve this you can choose to either use Python venv or a Conda environment.

The source code is compatible with Python 3.12.0.
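Before installing dependencies, it can help to confirm that the active interpreter matches the stated version and that an isolated environment is in use. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and the Conda check relies on the CONDA_PREFIX variable that Conda sets on activation (it is also set for Conda's base environment, so treat the result as a hint only).

```python
import os
import sys

# Assumption: "compatible with Python 3.12.0" is read as "major.minor == 3.12".
REQUIRED = (3, 12)


def version_matches(required=REQUIRED, actual=None):
    """Return True when the major.minor version equals the required one."""
    actual = sys.version_info if actual is None else actual
    return tuple(actual[:2]) == tuple(required)


def in_isolated_env():
    """Return True inside a venv or an activated Conda environment."""
    # In a venv, sys.prefix points at the environment, not the base install.
    in_venv = sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    in_conda = bool(os.environ.get("CONDA_PREFIX"))
    return in_venv or in_conda


if __name__ == "__main__":
    print("Python version OK:", version_matches())
    print("Isolated environment detected:", in_isolated_env())
```

Run it with the same interpreter you intend to use for the notebooks, since a Jupyter kernel may point at a different Python installation than your shell.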

1. Install Python pip to manage dependencies.
2. Run pip install -r requirements.txt in your terminal (from the project root) to install the dependencies.
3. Place the data correctly. To ensure that stakeholder data is not shared with the public or unauthorized parties, the datasets used in the analysis are not included in this repository. You must place the data in the corresponding sub-folders of the data directory.
4. Run the pre-processing.
   1. Run merge_sas.py to merge all the Stop-and-Search datasets: run python merge_sas.py from your terminal.
   2. Within each notebook, perform the pre-processing required before running the other cells.
5. Fetch the additional datasets by running src/data_utils/download_metro_police_data.py.
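Because the datasets are not shipped with the repository, a quick check that every sub-folder of the data directory actually contains files can catch a misplaced download before the pre-processing fails. This is a sketch under assumptions: the helper name is hypothetical, the names of the sub-folders are not known here, and hidden files such as .DS_Store or .gitkeep are deliberately not counted as data.

```python
from pathlib import Path


def empty_subfolders(data_dir):
    """Return the names of sub-folders of data_dir that hold no data files.

    Files are searched recursively; hidden files (names starting with a dot)
    are ignored.
    """
    data_dir = Path(data_dir)
    missing = []
    for sub in sorted(p for p in data_dir.iterdir() if p.is_dir()):
        has_file = any(
            f.is_file() and not f.name.startswith(".") for f in sub.rglob("*")
        )
        if not has_file:
            missing.append(sub.name)
    return missing


if __name__ == "__main__":
    data_root = Path("data")  # the data directory in the project root
    if data_root.is_dir():
        print("Sub-folders without data:", empty_subfolders(data_root))
    else:
        print("No data directory found; run this from the project root.")
```

An empty list means every sub-folder has at least one file; it does not verify that the files are the right ones.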

📊 Running the analysis

Multiple models are trained and evaluated in this project, most of which are explained thoroughly in the technical report. To make the results reproducible, the notebooks are named to indicate what each one is responsible for. Locate the notebook whose results you are interested in and run its cells.
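If you prefer to execute a notebook from the terminal instead of running its cells interactively, the jupyter nbconvert command can do so. The sketch below is one possible wrapper, not something the repository provides: the function names are hypothetical, and it assumes nbconvert is installed alongside Jupyter and that the data and pre-processing steps are already done.

```python
import subprocess
import sys


def nbconvert_command(notebook):
    """Build the command that executes a notebook and saves outputs in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        str(notebook),
    ]


def run_notebook(notebook):
    """Execute the notebook; raises CalledProcessError if any cell fails."""
    subprocess.run(nbconvert_command(notebook), check=True)


if __name__ == "__main__":
    # Usage: python run_notebooks.py SAS.ipynb RandomForest.ipynb
    for name in sys.argv[1:]:
        run_notebook(name)
```

Note that --inplace overwrites the stored outputs of the notebook, so keep a copy if you want to compare against the committed results.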

