Commit: Benchmark 02/2018
weinberger authored and mchacki committed Feb 14, 2018
1 parent a1f1738 commit 19cb07b
Showing 34 changed files with 2,572 additions and 484 deletions.
103 changes: 49 additions & 54 deletions README.md
@@ -1,85 +1,80 @@
 # NoSQL Performance Tests
 
-This repository contains the performance tests described in my [blog](https://www.arangodb.com/2015/06/multi-model-benchmark/). Please feel free to improve the various database test drivers. If you see any optimization I have missed, please issue a pull request.
+This repository contains the performance tests described in my [blog](https://www.arangodb.com/2018/02/nosql-performance-benchmark-2018-mongodb-postgresql-orientdb-neo4j-arangodb/). Please feel free to improve the various database test drivers. If you see any optimization I have missed, please issue a pull request.
 
 The files are structured as follows:
 
-`benchmark.js` contains the test driver and all the test cases. Currently, the following tests are implemented: `shortest`, `neighbors`, `neighbors2`, `singleRead`, `singleWrite`, and `aggregation`. Use `all` to run all tests including warmup.
+`benchmark.js` contains the test driver and all the test cases. Currently, the following tests are implemented: `shortest`, `hardPath`, `neighbors`, `neighbors2`, `neighbors2data`, `singleRead`, `singleWrite` and `aggregation`. Use `all` to run all tests including warmup.
 
-`arangodb`, `neo4j`, and `mongodb` are directories containing a single file `description.js`. This description file implements the database-specific parts of the tests.
+`arangodb`, `arangodb_mmfiles`, `neo4j`, `mongodb`, `orientdb`, `postgresql_jsonb` and `postgresql_tabular` are directories containing the files `description.js`, `setup.sh` and `import.sh`. The description file implements the database-specific parts of the tests. The setup and import files are used to set up the database and import the needed dataset for the test.
 
 `data` contains the test data used for the read and write tests and the start and end vertices for the shortest path.
 
 ## Installation
 
-```
-git clone https://github.com/weinberger/nosql-tests.git
-npm install .
-npm run data
-```
+### Client
 
-The last step will uncompress the test data file.
+We need to install additional services:
 
-## Example
+    $ curl -sL https://deb.nodesource.com/setup_8.x | sudo -E bash -
+    $ sudo apt-get install -y make build-essential nodejs
 
-```
-node benchmark arangodb -a 1.2.3.4 -t all
-```
+Clone the test repo and uncompress the test data files.
 
-runs all tests against an ArangoDB server running on host 1.2.3.4.
+    $ git clone https://github.com/weinberger/nosql-tests.git
+    $ cd nosql-tests
+    $ npm install
+    $ npm run data
 
-## Usage
+### Server
 
-```
-node benchmark -h
-Usage: benchmark <command> [options]
+The server also needs the nosql-tests repo checked out. The folders on client and server are required to have the same path!
 
-Commands:
-  arangodb  ArangoDB benchmark
-  mongodb   MongoDB benchmark
-  neo4j     neo4j benchmark
+    $ git clone https://github.com/weinberger/nosql-tests.git
 
-Options:
-  -t, --tests      tests to run separated by comma: shortest, neighbors,
-                   neighbors2, singleRead, singleWrite, aggregation
-                                                      [string] [default: "all"]
-  -s, --restrict   restrict to that many elements (0=no restriction)
-                                                                   [default: 0]
-  -l, --neighbors  look at that many neighbors                   [default: 500]
-  -a, --address    server host                  [string] [default: "127.0.0.1"]
-  -h               Show help                                          [boolean]
-```
+For the complete setup with all databases we need several additional services:
 
-## Start Parameters
+    $ sudo apt-get install -y unzip default-jre binutils numactl collectd nodejs
 
-We have used the following parameters to start the databases.
+To install all databases and import the test dataset:
 
-**ArangoDB**
+    $ ./setupAll.sh
 
-```
-./bin/arangod /mnt/data/arangodb/data-2.7 --server.threads 16 --scheduler.threads 8 --wal.sync-interval 1000 --config etc/relative/arangod.conf --javascript.v8-contexts 17
-```
+## Run single test
 
-Admin interface: http://107.178.210.238:8529/
+To run a single test against one database, we execute `benchmark.js` via node.
 
-**MongoDB**
+    $ node benchmark.js -h
+    Usage: benchmark.js <command> [options]
+
+    Commands:
+      arangodb            ArangoDB benchmark
+      arangodb-mmfiles    ArangoDB benchmark
+      mongodb             MongoDB benchmark
+      neo4j               neo4j benchmark
+      orientdb            orientdb benchmark
+      postgresql          postgresql JSON benchmark
+      postgresql_tabular  postgresql tabular benchmark
+
+    Options:
+      --version               Show version number                     [boolean]
+      -t, --tests             tests to run separated by comma: shortest,
+                              neighbors, neighbors2, neighbors2data, singleRead,
+                              singleWrite, aggregation, hardPath, singleWriteSync
+                                                     [string] [default: "all"]
+      -s, --restrict          restrict to that many elements (0=no restriction)
+                                                                  [default: 0]
+      -l, --neighbors         look at that many neighbors      [default: 1000]
+      --ld, --neighbors2data  look at that many neighbors2 with profiles
+                                                                [default: 100]
+      -a, --address           server host      [string] [default: "127.0.0.1"]
+      -h                      Show help                             [boolean]
+
+    copyright 2018 Claudius Weinberger
 
-```
-./bin/mongod --storageEngine wiredTiger --syncdelay 1 --dbpath /mnt/data/mongodb/wired2/
-```
+## Run complete test setup
 
-**OrientDB**
+To run the complete test against every database, we simply execute `runAll.sh`.
 
-```
-./bin/server.sh -Xmx28G -Dstorage.wal.maxSize=28000
-```
+    ./runAll.sh <server-ip> <num-runs>
 
-**Neo4J**
-
-```
-./bin/neo4j start
-```
-
-Admin interface: http://107.178.210.238:7474/
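
All database drivers referenced in the README share one callback-based interface. Below is a minimal sketch of the shape a `description.js` is expected to export, inferred from the `arangodb/description.js` changes that follow; the exported names mirror that file, while every body here is an illustrative stand-in rather than a real implementation.

```
'use strict';

// Hypothetical skeleton of a description.js driver. The exported names mirror
// arangodb/description.js; the bodies are illustrative stand-ins only.
module.exports = {
  name: 'MyDatabase',

  // Open a connection to the given host and hand the handle to the test driver.
  startup: function (host, cb) {
    var db = {host: host}; // stand-in for a real connection object
    cb(db);
  },

  // Look up a collection handle; cb(err, coll).
  getCollection: function (db, name, cb) {
    cb(undefined, {name: name});
  },

  // Fetch a single document by id (used by the singleRead test).
  getDocument: function (db, coll, id, cb) {
    cb(null, {_key: id});
  },

  // Report the number of distinct out-neighbors of a vertex (neighbors test).
  neighbors: function (db, collP, collR, id, i, cb) {
    cb(null, 0);
  }
};
```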
72 changes: 40 additions & 32 deletions arangodb/description.js
100644 → 100755
@@ -1,6 +1,6 @@
 'use strict';
 
-var Database = require('arangojs');
+var arangojs = require('arangojs');
 var opts = {
   maxSockets: 25,
   keepAlive: true,
@@ -12,12 +12,10 @@ module.exports = {
   name: 'ArangoDB',
 
   startup: function (host, cb) {
-    var db = new Database({
+    var db = new arangojs.Database({
       url: 'http://' + host + ':8529',
       agent: new Agent(opts),
-      fullDocument: false,
-      promisify: false,
-      promise: false
+      fullDocument: false
     });
 
     cb(db);
@@ -30,38 +28,50 @@ module.exports = {
       module.exports.aggregate(db, coll, function (err, result) {
         if (err) return cb(err);
 
-        console.log('INFO step 1/2 done');
+        console.log('INFO step 1/3 done');
 
-        module.exports.getCollection(db, 'relations', function (err, coll) {
+        module.exports.getCollection(db, 'relations', function (err, coll2) {
           if (err) return cb(err);
 
-          module.exports.aggregate2(db, coll, function (err, result) {
+          db.route('_api/collection/relations/loadIndexesIntoMemory').put(function (err, result) {
             if (err) return cb(err);
 
-            console.log('INFO step 2/2 done');
-            console.log('INFO warmup done');
-
-            return cb(null);
+            console.log('INFO step 2/3 done');
+
+            var warmupIds = require('../data/warmup1000');
+            var goal = 1000;
+            var total = 0;
+            for (var i = 0; i < goal; i++) {
+              module.exports.getDocument(db, coll, warmupIds[i], function (err, result) {
+                if (err) return cb(err);
+
+                ++total;
+                if (total === goal) {
+                  console.log('INFO step 3/3 done');
+                  console.log('INFO warmup done');
+                  return cb(null);
+                }
+              });
+            }
           });
         });
       });
     });
   },
 
   getCollection: function (db, name, cb) {
-    db.collection(name, cb);
+    cb(undefined, db.collection(name));
   },
 
   dropCollection: function (db, name, cb) {
-    db.dropCollection(name, cb);
+    db.collection(name).drop(cb);
   },
 
   createCollection: function (db, name, cb) {
-    db.createCollection(name, cb);
+    db.collection(name).create(cb);
   },
 
   createCollectionSync: function (db, name, cb) {
-    db.createCollection({name: name, waitForSync: true}, cb);
+    db.collection(name).create({waitForSync: true}, cb);
   },
 
   getDocument: function (db, coll, id, cb) {
@@ -80,66 +90,64 @@ module.exports = {
     db.query('FOR x IN ' + coll.name + ' COLLECT age = x.AGE WITH COUNT INTO counter RETURN {age: age, amount: counter}', cb);
   },
 
-  aggregate2: function (db, coll, cb) {
-    db.query('FOR x IN ' + coll.name + ' FILTER x._from > "" COLLECT a=1 WITH COUNT INTO counter RETURN {amount: counter}', cb);
+  aggregate2: function (db, coll, coll2, cb) {
+    db.query('LET tmp = (FOR y IN ' + coll.name + ' FOR x IN ' + coll2.name + ' FILTER x._from == CONCAT("' + coll.name + '", y._key) OR x._to == CONCAT("' + coll.name + '", y._key) COLLECT a=1 WITH COUNT INTO counter RETURN {amount: counter}) RETURN LENGTH(tmp)', cb);
   },
 
   neighbors: function (db, collP, collR, id, i, cb) {
-    db.query('RETURN NEIGHBORS(' + collP.name
-      + ', ' + collR.name + ', @key, "outbound", [], {includeData:false})', {key: collP.name + '/' + id},
+    db.query('FOR v IN OUTBOUND @key ' + collR.name + ' OPTIONS {bfs: true, uniqueVertices: "global"} RETURN v._id',
+      {key: collP.name + '/' + id},
       function (err, result) {
         if (err) return cb(err);
 
         result.all(function (err, v) {
           if (err) return cb(err);
 
-          cb(null, v[0].length);
+          cb(null, v.length);
         });
       }
     );
   },
 
   neighbors2: function (db, collP, collR, id, i, cb) {
-    db.query('RETURN NEIGHBORS(' + collP.name
-      + ', ' + collR.name + ', @key, "outbound", [], {minDepth:0 , maxDepth: 2, includeData: false})', {key: collP.name + '/' + id},
+    db.query('FOR v IN 1..2 OUTBOUND @key ' + collR.name + ' OPTIONS {bfs: true, uniqueVertices: "global"} RETURN v._id',
+      {key: collP.name + '/' + id},
      function (err, result) {
         if (err) return cb(err);
 
         result.all(function (err, v) {
           if (err) return cb(err);
 
-          cb(null, v[0].length);
+          cb(null, v.length);
         });
       }
     );
   },
 
   neighbors2data: function (db, collP, collR, id, i, cb) {
-    db.query('RETURN NEIGHBORS(' + collP.name + ', ' + collR.name + ', @key, "outbound", [], {minDepth:0 , maxDepth: 2, includeData: true})',
+    db.query('FOR v IN 1..2 OUTBOUND @key ' + collR.name + ' OPTIONS {bfs: true, uniqueVertices: "global"} RETURN v',
       {key: collP.name + '/' + id},
       function (err, result) {
         if (err) return cb(err);
 
         result.all(function (err, v) {
           if (err) return cb(err);
 
-          cb(null, v[0].length);
+          cb(null, v.length);
         });
       }
     );
   },
 
   shortestPath: function (db, collP, collR, path, i, cb) {
-    db.query('RETURN SHORTEST_PATH(' + collP.name + ', ' + collR.name
-      + ', @from, @to, "outbound", {includeData: false})',
-      {from: 'profiles/' + path.from, to: 'profiles/' + path.to}, function (err, result) {
+    db.query('FOR v IN OUTBOUND SHORTEST_PATH @from TO @to ' + collR.name + ' RETURN v._id',
+      {from: 'profiles/' + path.from, to: 'profiles/' + path.to}, function (err, result) {
         if (err) return cb(err);
 
         result.all(function (err, v) {
           if (err) return cb(err);
 
-          var p = v[0];
-          cb(null, (p === null) ? 0 : (p.vertices.length - 1));
+          cb(null, (v.length === 0) ? 0 : (v.length - 1));
         });
       }
     );
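
The reworked traversal queries can also be tried outside the benchmark harness. The following is a minimal standalone sketch that issues the same one-step neighbors query with the callback-style arangojs calls used above; the host and the start vertex `profiles/P100` are assumptions for illustration, and the collection names match the import script.

```
'use strict';

var arangojs = require('arangojs');

// Assumed local server with the imported Pokec dataset.
var db = new arangojs.Database({url: 'http://127.0.0.1:8529'});

// Same AQL as the new neighbors() above: breadth-first, globally unique
// vertices, returning ids only.
db.query(
  'FOR v IN OUTBOUND @key relations OPTIONS {bfs: true, uniqueVertices: "global"} RETURN v._id',
  {key: 'profiles/P100'}, // hypothetical start vertex
  function (err, result) {
    if (err) throw err;

    result.all(function (err, v) {
      if (err) throw err;
      console.log('distinct out-neighbors: ' + v.length);
    });
  }
);
```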
78 changes: 37 additions & 41 deletions arangodb/import.sh
@@ -1,60 +1,56 @@
 #!/bin/bash
-set -e
 
-ARANGODB=${1-.}
+# Pass system or path to the source directory as first argument. If no argument
+# is given the current directory will be assumed to be the source directory!
+# The build directory MUST be as "build" in the source directory
 
-echo "ARANGODB DIRECTORY: $ARANGODB"
+echo "Usage [pokec.db] [path-to-arangodb] [path-to-benchmark]"
+ARANGODB=${2-databases/arangodb}
+DB=$ARANGODB/data/databases/${1-pokec}
+BENCHMARK=${3-`pwd`}
+TMP=/tmp/nosqlbenchmark
+DOWNLOADS=$TMP/downloads
 
 # import: POKEC Dataset from Stanford Snap
 # https://snap.stanford.edu/data/soc-pokec-readme.txt
+PROFILES_IN=$DOWNLOADS/soc-pokec-profiles.txt.gz
+PROFILES_OUT=$DOWNLOADS/soc-pokec-profiles-arangodb.txt.gz
 
-if [ ! -f soc-pokec-profiles.txt.gz ]; then
-  echo "Downloading PROFILES"
-  curl -OL https://snap.stanford.edu/data/soc-pokec-profiles.txt.gz
-fi
+RELATIONS_IN=$DOWNLOADS/soc-pokec-relationships.txt.gz
+RELATIONS_OUT=$DOWNLOADS/soc-pokec-relationships-arangodb.txt.gz
 
-if [ ! -f soc-pokec-relationships.txt.gz ]; then
-  echo "Downloading RELATIONS"
-  curl -OL https://snap.stanford.edu/data/soc-pokec-relationships.txt.gz
-fi
+echo "DATABASE: $DB"
+echo "ARANGODB DIRECTORY: $ARANGODB"
+echo "BENCHMARK DIRECTORY: $BENCHMARK"
+echo "DOWNLOAD DIRECTORY: $DOWNLOADS"
 
-if [ ! -f soc-pokec-profiles-arangodb.txt ]; then
+$BENCHMARK/downloadData.sh
+
+set -e
+
+if [ ! -f $PROFILES_OUT ]; then
   echo "Converting PROFILES"
-  echo '_key public completion_percentage gender region last_login registration AGE body I_am_working_in_field spoken_languages hobbies I_most_enjoy_good_food pets body_type my_eyesight eye_color hair_color hair_type completed_level_of_education favourite_color relation_to_smoking relation_to_alcohol sign_in_zodiac on_pokec_i_am_looking_for love_is_for_me relation_to_casual_sex my_partner_should_be marital_status children relation_to_children I_like_movies I_like_watching_movie I_like_music I_mostly_like_listening_to_music the_idea_of_good_evening I_like_specialties_from_kitchen fun I_am_going_to_concerts my_active_sports my_passive_sports profession I_like_books life_style music cars politics relationships art_culture hobbies_interests science_technologies computers_internet education sport movies travelling health companies_brands more' > soc-pokec-profiles-arangodb.txt
-  gunzip < soc-pokec-profiles.txt.gz | sed -e 's/null//g' -e 's~^~P~' -e 's~ $~~' >> soc-pokec-profiles-arangodb.txt
+  echo '_key public completion_percentage gender region last_login registration AGE body I_am_working_in_field spoken_languages hobbies I_most_enjoy_good_food pets body_type my_eyesight eye_color hair_color hair_type completed_level_of_education favourite_color relation_to_smoking relation_to_alcohol sign_in_zodiac on_pokec_i_am_looking_for love_is_for_me relation_to_casual_sex my_partner_should_be marital_status children relation_to_children I_like_movies I_like_watching_movie I_like_music I_mostly_like_listening_to_music the_idea_of_good_evening I_like_specialties_from_kitchen fun I_am_going_to_concerts my_active_sports my_passive_sports profession I_like_books life_style music cars politics relationships art_culture hobbies_interests science_technologies computers_internet education sport movies travelling health companies_brands more' > $PROFILES_OUT
+  gunzip < $PROFILES_IN | sed -e 's/null//g' -e 's~^~P~' >> $PROFILES_OUT
 fi
 
-if [ ! -f soc-pokec-relationships-arangodb.txt ]; then
+if [ ! -f $RELATIONS_OUT ]; then
   echo "Converting RELATIONS"
-  echo '_from _to' > soc-pokec-relationships-arangodb.txt
-  gzip -dc soc-pokec-relationships.txt.gz | awk -F"\t" '{print "profiles/P" $1 "\tprofiles/P" $2}' >> soc-pokec-relationships-arangodb.txt
+  echo '_from _to' > $RELATIONS_OUT
+  gzip -dc $RELATIONS_IN | awk -F"\t" '{print "profiles/P" $1 "\tprofiles/P" $2}' >> $RELATIONS_OUT
 fi
 
-INPUT_PROFILES=`pwd`/soc-pokec-profiles-arangodb.txt
-INPUT_RELATIONS=`pwd`/soc-pokec-relationships-arangodb.txt
-
-if [ "$ARANGODB" == "system" ]; then
-  ARANGOSH=/usr/bin/arangosh
-  ARANGOSH_CONF=/etc/arangodb/arangosh.conf
-  ARANGOIMP=/usr/bin/arangoimp
-  ARANGOIMP_CONF=/etc/arangodb/arangoimp.conf
-  APATH=.
-else
-  ARANGOSH=./bin/arangosh
-  ARANGOSH_CONF=./etc/relative/arangosh.conf
-  ARANGOIMP=./bin/arangoimp
-  ARANGOIMP_CONF=./etc/relative/arangoimp.conf
-  APATH=$ARANGODB
-fi
+ARANGOSH=$ARANGODB/usr/bin/arangosh
+ARANGOSH_CONF=$ARANGODB/etc/arangodb3/arangosh.conf
+ARANGOIMP=$ARANGODB/usr/bin/arangoimp
+ARANGOIMP_CONF=$ARANGODB/etc/arangodb3/arangoimp.conf
+APATH="$ARANGODB"
 
 (
-cd $APATH
-
+cd "$APATH" || { echo "failed to change into ${APATH}" ; exit 1; }
 $ARANGOSH -c $ARANGOSH_CONF << 'EOF'
-var db = require("org/arangodb").db;
+var db = require("@arangodb").db;
 db._create("profiles");
-db._createEdgeCollection("relations");
+db._createEdgeCollection("relations", {keyOptions: { type: "autoincrement", offset: 0 } })
 EOF
-$ARANGOIMP -c $ARANGOIMP_CONF --type tsv --collection profiles --file $INPUT_PROFILES
-$ARANGOIMP -c $ARANGOIMP_CONF --type tsv --collection relations --file $INPUT_RELATIONS
+$ARANGOIMP -c $ARANGOIMP_CONF --server.authentication false --type tsv --collection profiles --file $PROFILES_OUT --threads 8
+$ARANGOIMP -c $ARANGOIMP_CONF --server.authentication false --type tsv --collection relations --file $RELATIONS_OUT --threads 8
 )
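
After the import it is worth sanity-checking the collection counts from arangosh; the Pokec dataset has roughly 1.6 million profiles and 30.6 million relations. A small sketch, assuming arangosh is connected to the database the script just filled:

```
// Run inside arangosh; module and collection names come from the script above.
var db = require("@arangodb").db;

print("profiles:  " + db.profiles.count());   // expect roughly 1.6M documents
print("relations: " + db.relations.count());  // expect roughly 30.6M edges
```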