This repository lets you download data for demonstrating DYLAN's capabilities in practice.
The sample aims to be representative by sourcing data from (primarily) Dutch data providers in various sectors: Healthcare, Finance, Science, Education, Traffic, Legislative and Infrastructure.
Providers:
You can download the datasets by running:
python data_sources.py
Installation for Python 3 requires two steps:
- Install the required Python packages.
  - Option 1: create a conda environment
    conda env create -f env.yml
  - Option 2: install the requirements listed in env.yml via pip
    pip install -U [requirement]
- Install Chrome/Chromium and ChromeDriver
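Before running the download script, it can help to confirm that the browser and driver are discoverable. The following is a minimal sketch, not part of this repository: the helper `find_first` and the executable names it searches for are assumptions, and the actual names depend on your platform and how Chrome/Chromium was installed.

```python
import shutil

def find_first(candidates):
    """Return the full path of the first executable found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None

# Assumed executable names; adjust them for your operating system.
BROWSER_NAMES = ["google-chrome", "chromium", "chromium-browser", "chrome"]
DRIVER_NAMES = ["chromedriver"]

browser_path = find_first(BROWSER_NAMES)
driver_path = find_first(DRIVER_NAMES)

print("Browser:", browser_path or "not found on PATH")
print("ChromeDriver:", driver_path or "not found on PATH")
```

If either line reports "not found on PATH", the executable may still be installed in a location that is not on PATH, so treat the result as a hint rather than a definitive check.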
The type_analysis directory contains Jupyter notebooks with analyses of the data types of each dataset. These are also an excellent reference implementation for loading the datasets.
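To give a rough idea of what a data type analysis involves, here is a small standard-library sketch. It does not reproduce the notebooks' approach: the functions `infer_type` and `column_types` and the inline `SAMPLE` data are hypothetical and only illustrate the idea of collecting the cell types observed per column.

```python
import csv
import io

def infer_type(value: str) -> str:
    """Classify a single CSV cell as 'empty', 'int', 'float', or 'str'."""
    text = value.strip()
    if text == "":
        return "empty"
    try:
        int(text)
        return "int"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        return "str"

def column_types(csv_file) -> dict:
    """Map each column name to the set of cell types observed in it."""
    reader = csv.DictReader(csv_file)
    observed = {name: set() for name in (reader.fieldnames or [])}
    for row in reader:
        for name in observed:
            # Short rows yield None for missing cells; treat them as empty.
            observed[name].add(infer_type(row[name] or ""))
    return observed

# Hypothetical sample standing in for a downloaded dataset.
SAMPLE = "id,price,city\n1,9.5,Utrecht\n2,,Delft\n"
types = column_types(io.StringIO(SAMPLE))
print(types)
```

A column that reports more than one type (for example both 'float' and 'empty') is a candidate for closer inspection, since it mixes values with missing cells.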
The profile.py script generates exploratory data analysis reports for the datasets.
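As an illustration of the kind of summary an exploratory report typically contains, the sketch below computes basic statistics for one numeric column using only the standard library. It is not how profile.py works: the function `summarize_numeric` and the inline `SAMPLE` data are assumptions made for this example.

```python
import csv
import io
import statistics

def summarize_numeric(csv_file, column: str) -> dict:
    """Return count, missing count, min, max, and mean for one numeric column.

    Raises ValueError if a non-empty cell cannot be parsed as a number.
    """
    values, missing = [], 0
    for row in csv.DictReader(csv_file):
        cell = (row.get(column) or "").strip()
        if cell == "":
            missing += 1
        else:
            values.append(float(cell))
    return {
        "count": len(values),
        "missing": missing,
        "min": min(values) if values else None,
        "max": max(values) if values else None,
        "mean": statistics.fmean(values) if values else None,
    }

# Hypothetical sample standing in for a downloaded dataset.
SAMPLE = "id,price\n1,9.5\n2,\n3,10.5\n"
summary = summarize_numeric(io.StringIO(SAMPLE), "price")
print(summary)
```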