- cmake 2.8.7 or later
- libgearman(-devel)
- log4cxx(-devel)
- libcurl(-devel)
- boost(-devel) 1.48 or later
- gcc 4.8 or later
- prometheus-cpp 0.9.0 or later
$ cmake .
$ make
$ sudo make install
driveshaft takes several arguments:
$ driveshaft
Allowed options:
  --help                    produce help message
  --user arg                username to run as (OPTIONAL)
  --pid_file arg            file to write process ID out to (OPTIONAL)
  --daemonize               Daemon, detach and run in the background (OPTIONAL)
  --jobsconfig arg          jobs config file path
  --logconfig arg           log config file path
  --max_running_time arg    how long can a job run before it is considered
                            failed (in seconds)
  --loop_timeout arg        how long to wait for a response from gearmand
                            before restarting event-loop (in seconds)
  --exporter_addr arg (=0.0.0.0:8888)
                            the address:port on which to launch a prometheus
                            exporter to publish metrics
A simple jobsconfig file looks like this:
{
"gearman_servers_list":
[
"localhost"
],
"pools_list":
{
"ShopStats":
{
"job_processing_uri": "http://localhost/job.php",
"worker_count": 0,
"jobs_list":
[
"ShopStats"
]
},
"Newsfeed":
{
"job_processing_uri": "http://localhost/job.php",
"worker_count": 0,
"jobs_list":
[
"Newsfeed"
]
},
"Regular":
{
"job_processing_uri": "http://localhost/job.php",
"worker_count": 0,
"jobs_list":
[
"Sum3",
"Sum",
"Sum2"
]
}
}
}
- gearman_servers_list: addresses of gearmand servers
- pools_list: a list of named pools and the corresponding configuration for every pool:
  - worker_count: number of workers to reserve for jobs in this pool
  - jobs_list: names of jobs that should be run on the workers in this pool
  - job_processing_uri: the URI to send the job payload to for execution
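As a rough illustration of the fields described above, the following Python sketch loads a jobsconfig document and reports pools that lack one of the three documented keys. The function name check_jobsconfig and the checks it performs are assumptions made for this example; Driveshaft's own validation may well differ.

```python
import json

# Keys this README documents for every pool.
REQUIRED_POOL_KEYS = ("job_processing_uri", "worker_count", "jobs_list")

def check_jobsconfig(text):
    """Return a list of problems found in a jobsconfig JSON document.

    Hypothetical helper: it only checks the shape described in this README.
    """
    problems = []
    config = json.loads(text)
    if not isinstance(config.get("gearman_servers_list"), list):
        problems.append("gearman_servers_list must be a list")
    pools = config.get("pools_list")
    if not isinstance(pools, dict):
        problems.append("pools_list must be an object keyed by pool name")
        return problems
    for name, pool in pools.items():
        if not isinstance(pool, dict):
            problems.append(f"pool {name!r} must be an object")
            continue
        for key in REQUIRED_POOL_KEYS:
            if key not in pool:
                problems.append(f"pool {name!r} is missing {key}")
    return problems
```

An empty list means the document has the expected shape; it says nothing about whether the servers or URIs are reachable.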
An example log config is included in the repository. For more information, see the log4cxx documentation.
The loop_timeout option, expressed in seconds, is how long to wait for a job from gearmand before
restarting the event loop. It is passed to gearman_worker_set_timeout. It also influences the
shutdown wait durations (hard shutdown is 2x this value, and graceful is 4x).
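The relationship between the timeout and the shutdown waits is simple arithmetic; a tiny Python sketch makes it concrete. The function name shutdown_waits is hypothetical and only restates the multipliers given above.

```python
def shutdown_waits(loop_timeout):
    """Return (hard, graceful) shutdown wait durations in seconds.

    Uses the multipliers stated in this README: hard is 2x, graceful is 4x.
    """
    return 2 * loop_timeout, 4 * loop_timeout
```

With a loop timeout of 5 seconds, for instance, this gives a hard shutdown wait of 10 seconds and a graceful wait of 20 seconds.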
The server runs a Prometheus exporter over HTTP at the address and port specified by the exporter_addr
command line option. The following metrics are exposed:
- histogram driveshaft_job_duration: labelled by pool and function. Aggregates the duration of successful jobs.
- counter driveshaft_http_errors: labelled by pool, function and http_status
- counter driveshaft_timeouts: labelled by pool and function
- counter driveshaft_errors: labelled by pool and function, includes errors also counted in driveshaft_http_errors and driveshaft_timeouts as well as any other errors.
- counter driveshaft_threads: labelled by status={idle, busy}, pool and function. Idle threads do not include the function label.
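To show how these labels might be consumed, here is a minimal Python sketch that sums one metric per pool from text in the Prometheus exposition format. The helper name total_by_pool and the sample text are made up for illustration and are not real Driveshaft output; the exact metric names served by the exporter (suffixes, for example) should be checked against a live scrape.

```python
import re
from collections import defaultdict

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_LINE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')

def total_by_pool(exposition_text, metric):
    """Sum the values of `metric` per `pool` label (hypothetical helper)."""
    totals = defaultdict(float)
    for line in exposition_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None or match.group(1) != metric:
            continue
        labels = dict(LABEL.findall(match.group(2) or ""))
        totals[labels.get("pool", "")] += float(match.group(3))
    return dict(totals)

# Made-up sample for illustration only, not captured from a real exporter.
SAMPLE = """# TYPE driveshaft_errors counter
driveshaft_errors{pool="Regular",function="Sum"} 3
driveshaft_errors{pool="Regular",function="Sum2"} 1
driveshaft_errors{pool="Newsfeed",function="Newsfeed"} 2
driveshaft_timeouts{pool="Regular",function="Sum"} 1
"""

print(total_by_pool(SAMPLE, "driveshaft_errors"))
```

In practice a Prometheus server would do this aggregation with a query; the sketch only illustrates how the pool and function labels partition each metric.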
- Jobs are grouped into pools, and every pool has a worker_count setting that defines the maximum concurrency.
- For every pool, Driveshaft registers the jobs in jobs_list with gearmand and maintains worker_count threads with persistent connections to fetch jobs and submit back results.
- When the config changes, Driveshaft signals the appropriate pool threads to die. Any currently running job on such a thread has a window of max_running_time seconds to finish; otherwise the job is considered failed, gearmand is updated and the thread is closed. Note that the job may keep on running on the HTTP endpoint. New pool threads are created as needed to match the configuration.
- Jobs are run via the HTTP endpoint job_processing_uri defined in the config. The endpoint will receive the class name and all the args and will have to do the right thing and return SUCCESS/FAILURE along with any response text. The thread that is processing the job blocks waiting for a response.
By reusing connections and not re-registering with gearmand on every job completion, Driveshaft saves gearmand a lot of work that impacts enqueue latency.
And by using an HTTP endpoint to actually do the heavy lifting, we get the benefits of a clean-sandbox and Opcache (and can even use HHVM!).
Driveshaft will send the job name and arguments via a POST request. Example:
{
"function_name": "Sum",
"job_handle": "H:localhost:6",
"unique": "57a7b604-659a-11e5-9442-04013e647701",
"workload": "[1,2]"
}
The endpoint should respond with a JSON payload in the body of the document.
Success is indicated by returning zero for gearman_ret. The response_string value
must be a string; non-strings will not be accepted. Example:
{
"gearman_ret": 0,
"response_string": "3"
}
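The request and response shapes above can be tied together with a small Python sketch of the job logic an endpoint might run for the Sum example. It assumes the four request fields have already been decoded into a dict, because the wire encoding of the POST body is not spelled out here. The name handle_job and the failure code 1 are placeholders: the text only states that zero means success.

```python
import json

GEARMAN_SUCCESS = 0  # zero signals success, as described above
GEARMAN_FAILURE = 1  # placeholder non-zero value; the exact code is an assumption

def handle_job(fields):
    """Run one job from already-decoded request fields; return the JSON body.

    Hypothetical handler that only knows the "Sum" function from the example.
    """
    try:
        if fields["function_name"] != "Sum":
            raise ValueError("unknown function")
        numbers = json.loads(fields["workload"])  # workload arrives as a string
        result = sum(numbers)
    except (KeyError, ValueError, TypeError) as exc:
        return json.dumps({"gearman_ret": GEARMAN_FAILURE,
                           "response_string": str(exc)})
    # response_string must be a string, so the number is converted explicitly.
    return json.dumps({"gearman_ret": GEARMAN_SUCCESS,
                       "response_string": str(result)})
```

Converting the result with str() matters: returning the bare number 3 instead of "3" would violate the requirement that the response string be a string.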
See the Contributing Guide