Pilosa is a bitmap index database.
Pilosa requires Go 1.7 or greater.
You can download the source by running go get:
$ go get github.com/pilosa/pilosa
Now you can install the pilosa binary:
$ go install github.com/pilosa/pilosa/cmd/...
Now run a single pilosa node with the default configuration:
pilosa server
You can specify a configuration file by setting the --config flag when running pilosa:
pilosa server --config custom-config-file.cfg
The config file uses the TOML configuration file format, and should look like:
data-dir = "/tmp/pil0"
host = "127.0.0.1:15000"
[cluster]
replicas = 2
[[cluster.node]]
host = "127.0.0.1:15000"
[[cluster.node]]
host = "127.0.0.1:15001"
You can generate a template config file with default values with:
pilosa config
The first two configuration options will be unique to each node in the cluster:
data-dir: directory in which data is stored on disk
host: IP address and port of the pilosa node
The remaining configuration options should be the same on every node in the cluster:
replicas: the number of replicas within the cluster
[[cluster.node]]: specifies each node within the cluster
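The split between per-node and cluster-wide options can be sketched with a small generator. This is only an illustration: the render_config helper and the /tmp/pil0, /tmp/pil1 directory naming are hypothetical, and the output simply mirrors the sample config shown above.

```python
def render_config(data_dir, host, replicas, cluster_hosts):
    """Render a config in the sample layout above for a single node."""
    lines = [
        f'data-dir = "{data_dir}"',  # unique to this node
        f'host = "{host}"',          # unique to this node
        "",
        "[cluster]",                 # everything below is shared by all nodes
        f"replicas = {replicas}",
    ]
    for node_host in cluster_hosts:
        lines += ["", "[[cluster.node]]", f'host = "{node_host}"']
    return "\n".join(lines) + "\n"


cluster_hosts = ["127.0.0.1:15000", "127.0.0.1:15001"]

# One config per node: only data-dir and host differ between them.
configs = {
    host: render_config(f"/tmp/pil{i}", host, 2, cluster_hosts)
    for i, host in enumerate(cluster_hosts)
}

print(configs["127.0.0.1:15001"])
```

Each rendered string could be written to its own file and passed to pilosa server with --config.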
You can build a Pilosa Docker image using make docker or, equivalently:
docker build -t pilosa:latest .
You can run a temporary container using:
docker run -it --rm --name pilosa -p 15000:15000 pilosa:latest
When you press Ctrl+C to stop the container, the container and the data in it will be erased. You can leave out the --rm flag to keep the data in the container. See the Docker documentation for other options.
You can interact with Pilosa via HTTP requests to the host:port on which you have Pilosa running.
The following examples illustrate how to do this using curl with a Pilosa cluster running on 127.0.0.1, port 15000.
Return the version of Pilosa:
$ curl "http://127.0.0.1:15000/version"
Return a list of all databases and frames in the index:
$ curl "http://127.0.0.1:15000/schema"
Before running a query, the corresponding database and frame must be created. Note that database and frame names can contain only lowercase letters, numbers, dash (-), underscore (_) and dot (.).
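A client can check names locally before sending a create request. The sketch below encodes only the character rule stated above; the is_valid_name helper is hypothetical, and the server may enforce further constraints (such as length limits) that are not covered here.

```python
import re

# Assumption: only the documented character set is checked
# (lowercase letters, digits, dash, underscore, dot).
NAME_PATTERN = re.compile(r"[a-z0-9._-]+")


def is_valid_name(name):
    """Return True if name uses only the characters allowed for databases and frames."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["sample-db", "collaboration", "Sample-DB", "my db"]:
    print(candidate, is_valid_name(candidate))
```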
You can create the database sample-db using:
$ curl -XPOST "http://127.0.0.1:15000/db" \
-d '{"db": "sample-db"}'
Optionally, you can specify the column label on database creation:
$ curl -XPOST "http://127.0.0.1:15000/db" \
-d '{"db": "sample-db", "columnLabel": "user"}'
The frame collaboration may be created using the following call:
$ curl -XPOST "http://127.0.0.1:15000/frame" \
-d '{"db": "sample-db", "frame": "collaboration"}'
It is possible to specify the frame row label on frame creation:
$ curl -XPOST "http://127.0.0.1:15000/frame" \
-d '{"db": "sample-db", "frame": "collaboration", "options": {"rowLabel": "project"}}'
Queries to Pilosa are sent as POST requests, with the query itself as the POST data.
You specify the database on which to perform the query with the URL argument db=database-name.
In this section, we assume that both the database sample-db with column label user and the frame collaboration with row label project have been created.
A query sent to database sample-db will have the following format:
$ curl -X POST "http://127.0.0.1:15000/query?db=sample-db" -d 'Query()'
The Query() object referenced above should be made up of one or more of the query types listed below.
For example, a SetBit() query would look like this:
$ curl -X POST "http://127.0.0.1:15000/query?db=sample-db" -d 'SetBit(project=10, frame="collaboration", user=1)'
Query results have the format {"results":[]}, where results is a list of results for each Query(). This means that you can provide multiple Query() objects with each HTTP request, and results will contain the results of all of the queries.
$ curl -X POST "http://127.0.0.1:15000/query?db=sample-db" -d 'Query() Query() Query()'
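The same request can be assembled from a script. The following is a minimal sketch using only the Python standard library; build_query_request and run_queries are hypothetical helper names, and the address is the one assumed in the curl examples. Only run_queries needs a running server.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:15000"  # address assumed in the curl examples


def build_query_request(db, queries):
    """Build a POST request carrying one or more queries for the given database."""
    url = f"{BASE_URL}/query?" + urllib.parse.urlencode({"db": db})
    body = " ".join(queries).encode("utf-8")  # queries are separated by spaces
    return urllib.request.Request(url, data=body, method="POST")


def run_queries(db, queries):
    """Send the queries and return the "results" list (requires a running node)."""
    with urllib.request.urlopen(build_query_request(db, queries)) as resp:
        return json.load(resp)["results"]


request = build_query_request("sample-db", [
    'SetBit(project=10, frame="collaboration", user=1)',
    'Bitmap(project=10, frame="collaboration")',
])
print(request.full_url)
print(request.data.decode("utf-8"))
```

With a node running, run_queries("sample-db", [...]) would be expected to return one entry per query, in the order the queries were sent.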
SetBit(project=10, frame="collaboration", user=1)
A return value of {"results":[true]} indicates that the bit was toggled from 0 to 1.
A return value of {"results":[false]} indicates that the bit was already set to 1 and therefore nothing changed.
SetBit accepts an optional timestamp field:
SetBit(project=10, frame="collaboration", user=2, timestamp="2016-12-11T10:09:07")
ClearBit(project=10, frame="collaboration", user=1)
A return value of {"results":[true]} indicates that the bit was toggled from 1 to 0.
A return value of {"results":[false]} indicates that the bit was already set to 0 and therefore nothing changed.
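The return values of SetBit() and ClearBit() can be pictured with a toy in-memory model. This is not how Pilosa stores data; ToyFrame is a hypothetical stand-in that only reproduces the "did the bit change?" semantics described above.

```python
class ToyFrame:
    """In-memory stand-in for a frame: maps a row ID to a set of column IDs."""

    def __init__(self):
        self.rows = {}

    def set_bit(self, row, col):
        """Set a bit; return True only if it was previously 0."""
        bits = self.rows.setdefault(row, set())
        changed = col not in bits
        bits.add(col)
        return changed

    def clear_bit(self, row, col):
        """Clear a bit; return True only if it was previously 1."""
        bits = self.rows.get(row, set())
        changed = col in bits
        bits.discard(col)
        return changed


frame = ToyFrame()
results = [
    frame.set_bit(10, 1),    # bit goes from 0 to 1
    frame.set_bit(10, 1),    # already 1, nothing changes
    frame.clear_bit(10, 1),  # bit goes from 1 to 0
    frame.clear_bit(10, 1),  # already 0, nothing changes
]
print(results)  # [True, False, True, False]
```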
SetBitmapAttrs(project=10, frame="collaboration", stars=123, url="http://projects.pilosa.com/10", active=true)
Returns {"results":[null]}
SetProfileAttrs(user=10, friends=123, username="mrpi", active=true)
Returns {"results":[null]}
Bitmap(project=10, frame="collaboration")
Returns {"results":[{"attrs":{"stars":123, "url":"http://projects.pilosa.com/10", "active":true},"bits":[1,2]}]}, where attrs are the attributes set using SetBitmapAttrs() and bits are the bits set using SetBit().
In order to return profile attributes attached to the profiles of a bitmap, add &profiles=true to the query string. Sample response:
{"results":[{"attrs":{},"bits":[10]}],"profiles":[{"user":10,"attrs":{"friends":123, "username":"mrpi", "active":true}}]}
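A client might pair each returned bit with its profile attributes as sketched below. The response text is the sample above; the "user" key is assumed to follow the column label chosen when the database was created.

```python
import json

# Sample response copied from the documentation above.
response_text = (
    '{"results":[{"attrs":{},"bits":[10]}],'
    '"profiles":[{"user":10,"attrs":{"friends":123, "username":"mrpi", "active":true}}]}'
)

response = json.loads(response_text)

# Index profile attributes by column ID ("user" is the assumed column label).
profile_attrs = {p["user"]: p["attrs"] for p in response["profiles"]}

for bit in response["results"][0]["bits"]:
    print(bit, profile_attrs.get(bit, {}))
```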
Union(Bitmap(project=10, frame="collaboration"), Bitmap(project=20, frame="collaboration"))
Returns a result set similar to that of a Bitmap() query, except that the attrs dictionary will be empty: {"results":[{"attrs":{},"bits":[1,2]}]}.
Note that a Union() query can be nested within other queries anywhere that you would otherwise provide a Bitmap().
Intersect(Bitmap(project=10, frame="collaboration"), Bitmap(project=20, frame="collaboration"))
Returns a result set similar to that of a Bitmap() query, except that the attrs dictionary will be empty: {"results":[{"attrs":{},"bits":[1]}]}.
Note that an Intersect() query can be nested within other queries anywhere that you would otherwise provide a Bitmap().
Difference(Bitmap(project=10, frame="collaboration"), Bitmap(project=20, frame="collaboration"))
Difference() represents all of the bits that are set in the first Bitmap() but are not set in the second Bitmap(). It returns a result set similar to that of a Bitmap() query, except that the attrs dictionary will be empty: {"results":[{"attrs":{},"bits":[2]}]}.
Note that a Difference() query can be nested within other queries anywhere that you would otherwise provide a Bitmap().
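The three set operations behave like ordinary set algebra on the column IDs of each bitmap. The sketch below uses Python sets as a stand-in; the contents of project 20 are an assumption, chosen as the only set consistent with the sample results shown above.

```python
project_10 = {1, 2}  # bits from the Bitmap(project=10) sample response
project_20 = {1}     # assumption: inferred from the sample Union/Intersect/Difference results

union = sorted(project_10 | project_20)
intersect = sorted(project_10 & project_20)
difference = sorted(project_10 - project_20)

print("Union:     ", union)       # [1, 2]
print("Intersect: ", intersect)   # [1]
print("Difference:", difference)  # [2]
```

Nesting works the same way: the result of one operation is itself a set of bits that can be fed into another.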
Count(Bitmap(project=10, frame="collaboration"))
Returns the number of bits set in Bitmap(): {"results":[28]}
Range(project=10, frame="collaboration", start="1970-01-01T00:00", end="2000-01-02T03:04")
TopN(frame="geo")
Returns all Bitmaps in the cache from frame geo, sorted by the count of bits.
TopN(frame="geo", n=20)
Returns the top 20 Bitmaps from frame geo.
TopN(Bitmap(project=10, frame="collaboration"), frame="geo", n=20)
Returns the top 20 Bitmaps from geo, sorted by the count of bits in the intersection with Bitmap(project=10).
TopN(Bitmap(project=10, frame="collaboration"), frame="geo", n=20, field="category", [81,82])
Returns the top 20 Bitmaps from geo whose category attribute has the value 81 or 82, sorted by the count of bits in the intersection with Bitmap(project=10).
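The TopN() variants above can be approximated with a toy ranking function. This is only a model of the described behaviour, not Pilosa's implementation: top_n, the sample geo data, and the tie-break by row ID are all assumptions made for illustration.

```python
def top_n(frame_rows, n=None, src=None, attrs=None, field=None, values=None):
    """Rank rows by bit count, or by overlap with src, optionally filtered by attribute."""
    allowed = set(values or [])
    ranked = []
    for row_id, bits in frame_rows.items():
        # Optional attribute filter: keep rows whose field value is in the allowed list.
        if field is not None and (attrs or {}).get(row_id, {}).get(field) not in allowed:
            continue
        count = len(bits & src) if src is not None else len(bits)
        ranked.append((row_id, count))
    # Highest count first; ties broken by row ID (an assumption for determinism).
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked[:n]


# Hypothetical contents of frame "geo": row ID -> set of column IDs.
geo_rows = {1: {1, 2, 3}, 2: {2}, 3: {1, 2, 4, 5}}
geo_attrs = {1: {"category": 81}, 2: {"category": 90}, 3: {"category": 82}}
project_10 = {1, 2}  # bits of Bitmap(project=10) from the earlier sample

all_rows = top_n(geo_rows)
top_overlap = top_n(geo_rows, n=2, src=project_10)
top_filtered = top_n(geo_rows, n=20, src=project_10,
                     attrs=geo_attrs, field="category", values=[81, 82])

print(all_rows)      # [(3, 4), (1, 3), (2, 1)]
print(top_overlap)   # [(1, 2), (3, 2)]
print(top_filtered)  # [(1, 2), (3, 2)]
```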
To update dependencies, you'll need to install Glide.
Then add the new dependencies in your project:
$ glide get github.com/foo/bar
If you update the protobuf definitions (pilosa/internal/internal.proto), you need to run go generate:
$ go generate
In order to set the version number, compile Pilosa with the following argument:
$ go install --ldflags="-X main.Version=1.0.0"