In this demo, you will do the following in phases.
- Phase 1 - FHIR Setup (~90 minutes)
  - download a non-trivial healthcare data set; this demo uses the MITRE Coherent Data Set (~20 minutes).
  - start a dockerized HAPI FHIR server (~1 minute).
  - load patient data into the server (~45 minutes).
  - bulk export the same data to split it into files by resource type (~15 minutes).
- Phase 2 - OMOPCDM Setup (~90 minutes)
  - create an empty OMOPCDM database (~1 minute).
  - download terminology data from Athena (~1 hour).
  - load terminology into the empty database (~20 minutes).
- Phase 3 - Translate FHIR to OMOP
  - translate the bulk exported FHIR data to OMOP.
- Phase 4 - Load OMOPCDM
  - load the converted data into the OMOPCDM database.
This demo assumes you have a few tools at your disposal already.
This demo takes a few hours to complete, depending on your OS and hardware. Getting Docker set up may be a challenge if you don't have it already, and a fair amount of time is spent waiting for downloads to complete and for data to load into and out of the FHIR server.
You will need at least 70GB of available disk space, but it will be easier for you if you have 100GB. An SSD is highly recommended.
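A quick way to check how much space you actually have is shown below; this is just standard `df`, nothing demo-specific, so run it from whatever drive you plan to work on.

```sh
# Show free space for the filesystem backing the current directory.
df -h .
```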
Pre-demo checklist - you should have (or install):
- Docker
- a terminal for running scripts
- bash
- jq
- sqlite
- zstd
You will also need to download a few pieces of data.
- the MITRE coherent data set
- terminology data from Athena
If you are using `apt`, you can `sudo apt install jq sqlite3 zstd`; macOS Homebrew users can `brew install jq sqlite3 zstd`.
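Before going further, you may want to confirm that each required tool is actually on your `PATH`. The loop below is just a generic shell check, not part of the demo itself.

```sh
# Print OK or MISSING for each tool this demo relies on.
for tool in bash jq sqlite3 zstd docker; do
  command -v "$tool" >/dev/null && echo "OK: $tool" || echo "MISSING: $tool"
done
```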
For this demo, I recommend cloning into a new directory (I prefer working in `~/code`).
```sh
# Make yourself a directory to check out the demo.
mkdir -p ~/code
cd ~/code

# Clone the required repos.
gh repo clone barabo/fhir-jq
gh repo clone barabo/fhir-to-omop-demo

# Get ready to start!
cd fhir-to-omop-demo
```
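If you don't use the `gh` CLI, plain `git` works just as well; the URLs below are simply the standard GitHub HTTPS form of the same two repos.

```sh
# Equivalent clones without the gh CLI.
git clone https://github.com/barabo/fhir-jq.git
git clone https://github.com/barabo/fhir-to-omop-demo.git
```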
To test your Docker installation, run `docker run hello-world` in a terminal.
If it worked, you should see output like this:
```
$ docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
478afc919002: Pull complete
Digest: sha256:266b191e926f65542fa8daaec01a192c4d292bff79426f47300a046e1bc576fd
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (arm64v8)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/

$
```
The instructions are outlined in the following READMEs:
Once this is complete, you can stop the HAPI server. This should cause it to shrink its H2 database (which is saved in `data/hapi/h2`) and free up some disk space.
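If you want to confirm the database actually shrank, checking its on-disk size before and after stopping the server is enough; this uses the `data/hapi/h2` path mentioned above.

```sh
# Report the on-disk size of the H2 database directory.
du -sh data/hapi/h2
```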
You can also stop Docker at this point. Docker does consume some disk space while it runs, so if you are already low on available disk space, this is an option.
You should now have a compressed tar file with the full bulk export contents saved in `data/bulk-export/full`!
You won't need the HAPI server running anymore, so it is recommended to stop it to free up disk space. If you know you won't need it again, you can delete the H2 database by running the `demo/hapi/nuke.sh` script.
The instructions are outlined in the following README:
- `athena/README.md`
Once this is complete, you can delete the unzipped terminology files to save space. This is recommended as long as you retain the compressed download file.
You should have an OMOPCDM database with terminology data loaded into it in `data/cdm.db` (which is about 9 GB), and `data/empty.db`, which is an empty OMOPCDM database with no terminology loaded into it.
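As a quick sanity check on the terminology load, you can count rows in the OMOP `concept` table; the exact number depends on which Athena vocabularies you selected, but a non-zero count is the main thing to look for. This assumes the database uses the standard OMOP CDM table names.

```sh
# A non-zero count suggests the vocabulary load succeeded.
sqlite3 data/cdm.db 'SELECT COUNT(*) FROM concept;'
```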
The instructions are outlined in the following README:
- `translate/README.md`
You can probably delete the uncompressed NDJSON files from `data/bulk-export`, but you will need them if you want to modify any of the translation logic.
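If you'd rather keep the option of re-running the translation without re-exporting, you can compress the NDJSON instead of deleting it. The glob below assumes the exported files sit directly under `data/bulk-export` with a `.ndjson` extension; adjust it to match your layout.

```sh
# Compress in place, removing each original once its .zst copy is written.
zstd -T0 --rm data/bulk-export/*.ndjson

# Later, restore the files with:
# zstd -d data/bulk-export/*.ndjson.zst
```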
The instructions are outlined in the following README:
- `load/README.md`
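Before wiring up any downstream tools, it's worth confirming that patient-level data actually landed in the database. The check below assumes the standard OMOP `person` table name and that the load targeted the `data/cdm.db` file created earlier; adjust the path if you loaded into a different database.

```sh
# Count the people loaded from the translated FHIR data.
sqlite3 data/cdm.db 'SELECT COUNT(*) FROM person;'
```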
You should be able to connect your database to any of the many existing visualization tools that work with OMOPCDM databases.