Sometimes you want to ship some static data with a Haskell executable, such as:
- numerical lookup tables
- whitelists
- example data
This is especially relevant if you are building static executables for deployments that are as self-contained as possible, such as for Amazon Lambda, or to make life easy for your users.
This example code shows how you can use SQLite's append VFS, which allows an SQLite database to be appended onto the end of some other file, such as an executable.
An SQLite database can be conveniently added to your Haskell executable using the sqlite3 command line tool with the --append flag.
You can then open it from Haskell in read-only mode.
After opening it as shown in app/Main.hs, you can use it like any other SQLite DB, such as with the sqlite-simple package.
Build the example exe using stack build.
Use sqlite3 --append to append a DB with some sample static data to it:
sqlite3 --append $(stack path --dist-dir)/build/exe/exe \
"CREATE TABLE testtable (field1 TEXT);
INSERT INTO testtable (field1) VALUES ('hello'), ('world');"
Run exe to see that it successfully reads the data:
$ $(stack path --dist-dir)/build/exe/exe
Database contents:
"hello"
"world"
- #include appendvfs.c
- Calling sqlite3_appendvfs_init(0,0,0)
- Documenting the --append FILE switch
- Passing apndvfs as the VFS name (the zVfs argument) to sqlite3_open_v2()
  - You can also use the older sqlite3_open() with ?vfs=apndvfs if SQLITE_USE_URI is enabled (it is enabled by default; see the docs)
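To make the append mechanism a little more concrete, here is a minimal sketch that checks whether a file carries an appended database. It assumes the trailer layout described in the comments of appendvfs.c: a 25-byte mark at the very end of the file, made of the ASCII prefix Start-Of-SQLite3- followed by an 8-byte big-endian offset to the first byte of the database. The names markPrefix, parseAppendMark and appendedDbOffset are made up for this illustration and are not part of this repository or of SQLite; only base and bytestring are used.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import Data.Bits (shiftL, (.|.))
import Data.Word (Word64)
import System.IO
  ( IOMode (ReadMode), SeekMode (SeekFromEnd)
  , hFileSize, hSeek, withBinaryFile )

-- | ASCII prefix of the append mark (assumed from appendvfs.c).
markPrefix :: BS.ByteString
markPrefix = BC.pack "Start-Of-SQLite3-"

-- | Total mark size: 17 prefix bytes + 8 offset bytes = 25.
markSize :: Int
markSize = BS.length markPrefix + 8

-- | Decode the mark from exactly the last 'markSize' bytes of a file.
-- Returns the big-endian offset at which the appended database starts,
-- or Nothing if the bytes do not look like an append mark.
parseAppendMark :: BS.ByteString -> Maybe Word64
parseAppendMark tailBytes
  | BS.length tailBytes /= markSize = Nothing
  | not (markPrefix `BS.isPrefixOf` tailBytes) = Nothing
  | otherwise =
      Just (BS.foldl' step 0 (BS.drop (BS.length markPrefix) tailBytes))
  where
    -- Shift the accumulator left by one byte and add the next byte.
    step acc byte = (acc `shiftL` 8) .|. fromIntegral byte

-- | Read the trailing bytes of a file and try to decode the append mark.
appendedDbOffset :: FilePath -> IO (Maybe Word64)
appendedDbOffset path =
  withBinaryFile path ReadMode $ \h -> do
    size <- hFileSize h
    if size < fromIntegral markSize
      then pure Nothing
      else do
        hSeek h SeekFromEnd (negate (fromIntegral markSize))
        parseAppendMark <$> BS.hGet h markSize
```

If the assumption about the trailer holds, calling appendedDbOffset on the built executable should give Nothing before the sqlite3 --append step and a Just value afterwards. This is only a diagnostic aid; actually reading the data still goes through the apndvfs VFS as listed above.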
There are other ways to include static data in your executables:
- TemplateHaskell like file-embed
  - splices large ByteString "literals" into source code
  - can be slow at compile-time because the compiler has to parse it
  - can be not-so-fast at run-time because data structures (such as Maps) have to be re-parsed/deserialised from the ByteStrings (at startup)
  - changing the data requires recompilation
  - Use of TemplateHaskell can trigger The TH recompilation problem
    - alternatively, you can avoid recompilation if the amount of bytes to embed is constant, using dummySpace and injectFile
- At link time using a custom assembly script
  - Shown in Sylvain Henry's blog post "Fast file embedding with GHC!"
  - fast at compile-time
  - can be not-so-fast at run-time because data structures (such as Maps) have to be re-parsed/deserialised from the ByteStrings (at startup)
  - changing the data requires re-linking
- Storing serialised compact regions using the above methods
  - Using Data.Compact.Serialize from compact
  - fast at run-time because data structures (such as Maps) do not have to be re-parsed/deserialised
  - cannot store some data (e.g. crashes on ByteStrings because they are pinned memory)
  - very new and untested, recent bugs have been found (in 2019/2020)
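The run-time cost mentioned for the first two approaches can be illustrated with a small sketch. Here embeddedTable is a hypothetical stand-in for bytes that would come from file-embed or a linker script, and parseTable is an invented helper, not something from this repository. The point is that the whole blob has to be walked and turned into a Map before the first lookup can happen.

```haskell
import qualified Data.ByteString.Char8 as BC
import qualified Data.Map.Strict as Map

-- | Hypothetical stand-in for an embedded blob of "key,value" lines.
embeddedTable :: BC.ByteString
embeddedTable = BC.pack "apple,1\nbanana,2\ncherry,3\n"

-- | Parse every line into a Map entry. This work is repeated on each
-- program start and grows with the size of the embedded data.
-- Lines whose value is not an integer are skipped.
parseTable :: BC.ByteString -> Map.Map BC.ByteString Int
parseTable bytes =
  Map.fromList
    [ (key, n)
    | line <- BC.lines bytes
    , let (key, rest) = BC.break (== ',') line
    , Just (n, _) <- [BC.readInt (BC.drop 1 rest)]
    ]
```

A lookup such as Map.lookup (BC.pack "banana") (parseTable embeddedTable) only becomes possible after the full parse, which is the startup overhead these bullet points refer to.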
The approach shown here:
- SQLite's vfs=apndvfs
  - fast at compile-time
  - fast at run-time because data structures (such as Maps) do not have to be re-parsed/deserialised; instead, SQL queries can be made directly
    - performance is that of SQLite
  - sqlite3 can be used to inspect/manipulate the data (works on all platforms)
  - changing the data requires no recompilation or re-linking
  - requires linking in appendvfs.c (see .cabal file)