This is a Code for Boston project that is trying to predict health-based drinking water violations using the Environmental Protection Agency's Safe Drinking Water Information System.
This project will analyze data from the EPA’s Safe Drinking Water Information System and then integrate other datasets to try to predict health-based drinking water violations in the United States. The Conservation Law Foundation has expressed interest in this project and may become a long-term partner.
We are starting with a single dataset: the Safe Drinking Water Information System.
Next, we will explore other datasets from the EPA, including, but not limited to:
- the Toxic Release Inventory database,
- the Superfund Enterprise Management System,
- the Environmental Radiation Monitoring database,
- the Enforcement and Compliance History Online database.
(Feel free to add links to other interesting datasets here.)
- Find us on our Slack channel #water.
- Join our Trello board. We use Trello as a project management tool to track tasks on a Kanban board.
- We also have a Google Drive folder with various documentation.
We are using Python with pandas, although some people are using R as well.
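As a rough illustration of the Python and pandas workflow, the sketch below counts health-based violations per water system. The sample rows, the column names (`PWSID`, `VIOLATION_CODE`, `IS_HEALTH_BASED_IND`) and the helper `count_health_based` are assumptions made for this example; they are not taken from the project's code and may not match the actual SDWIS export you download.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for a SDWIS violations export.
# Column names and values are assumptions for illustration only.
SAMPLE_CSV = """PWSID,VIOLATION_CODE,IS_HEALTH_BASED_IND
MA0000001,21,Y
MA0000001,02,Y
MA0000001,75,N
MA0000002,02,Y
MA0000003,75,N
"""


def count_health_based(violations: pd.DataFrame) -> pd.Series:
    """Return the number of health-based violations per water system."""
    # Keep only the rows flagged as health-based.
    health = violations[violations["IS_HEALTH_BASED_IND"] == "Y"]
    # Count the remaining rows for each water system identifier.
    return health.groupby("PWSID").size()


# Read every column as text so codes such as "02" keep their leading zero.
violations = pd.read_csv(io.StringIO(SAMPLE_CSV), dtype=str)
counts = count_health_based(violations)
print(counts)
```

With a real export, you would replace `io.StringIO(SAMPLE_CSV)` with the path to the downloaded file and adjust the column names to match it.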
- The pinned items in Slack are a good way to get started. Scroll through them for more context about what we are doing.
- Take a look at the docs/ folder for the next steps and technical setup.
This documentation is a living thing. Fork the repository and propose your updates 🙏