Tornado Webhook Collector (executable)

The Webhook Collector is a standalone HTTP server that listens for REST calls from a generic webhook, generates Tornado Events from the webhook JSON body, and sends them to the Tornado Engine.

How It Works

The webhook collector executable is an HTTP server built on actix-web.

On startup, it creates a dedicated REST endpoint for each configured webhook. Calls received by an endpoint are processed by the embedded jmespath collector that uses them to produce Tornado Events. In the final step, the Events are forwarded to the Tornado Engine through the configured connection type.

For each webhook, you must provide three values in order to successfully create an endpoint:

  • id: The webhook identifier. This will determine the path of the endpoint; it must be unique per webhook.
  • token: A security token that the webhook issuer has to include in the URL as part of the query string (see the example at the bottom of this page for details). If the token provided by the issuer is missing or does not match the one owned by the collector, then the call will be rejected and an HTTP 401 code (UNAUTHORIZED) will be returned.
  • collector_config: The transformation logic that converts a webhook JSON object into a Tornado Event. It consists of a JMESPath collector configuration as described in its specific documentation.
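The token check described above can be sketched as follows. This is an illustrative Python model, not the collector's actual actix-web code; the function name `check_token` and the use of a constant-time comparison are assumptions made for the example.

```python
import hmac
from http import HTTPStatus
from urllib.parse import parse_qs, urlsplit


def check_token(request_url: str, expected_token: str) -> HTTPStatus:
    """Return 200 if the `token` query parameter matches, otherwise 401."""
    query = parse_qs(urlsplit(request_url).query)
    # parse_qs maps each parameter to a list of values; a missing token gives [].
    supplied = query.get("token", [])
    if len(supplied) == 1 and hmac.compare_digest(
        supplied[0].encode("utf-8"), expected_token.encode("utf-8")
    ):
        return HTTPStatus.OK
    # Missing or mismatching token: the call is rejected.
    return HTTPStatus.UNAUTHORIZED
```

A call such as `check_token("https://host/event/github_repository", "secret_token")` returns `HTTPStatus.UNAUTHORIZED` because no token is present in the query string.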

Configuration

The executable configuration is based partially on configuration files, and partially on command line parameters.

The available startup parameters are:

  • config-dir: The filesystem folder from which the collector configuration is read. The default path is /etc/tornado_webhook_collector/.
  • webhooks-dir: The folder where the Webhook configurations are saved in JSON format; this folder is relative to config-dir. The default value is /webhooks/.

In addition to these parameters, the following configuration entries are available in the file 'config-dir'/webhook_collector.toml:

  • logger:
    • level: The Logger level; valid values are trace, debug, info, warn, and error.
    • stdout: Determines whether the Logger should print to standard output. Valid values are true and false.
    • file_output_path: A file path in the file system; if provided, the Logger will append any output to it.
  • webhook_collector:
    • tornado_event_socket_ip: The IP address where outgoing events will be written. This should be the address where the Tornado Engine listens for incoming events. If present, this value overrides what is specified by the tornado_connection_channel entry. This entry is deprecated and will be removed in the next release of Tornado. Please use tornado_connection_channel instead.
    • tornado_event_socket_port: The port where outgoing events will be written. This should be the port where the Tornado Engine listens for incoming events. This entry is mandatory if tornado_connection_channel is set to TCP. If present, this value overrides what is specified by the tornado_connection_channel entry. This entry is deprecated and will be removed in the next release of Tornado. Please use tornado_connection_channel instead.
    • message_queue_size: The in-memory buffer size for Events. It makes the application resilient to errors or temporary unavailability of the Tornado connection channel. When the connection on the channel is restored, all messages in the buffer are sent. When the buffer is full, the collector discards the oldest messages first.
    • server_bind_address: The IP to bind the HTTP server to.
    • server_port: The port to be used by the HTTP Server.
    • tornado_connection_channel: The channel used to send events to Tornado. It contains the set of entries required to configure a NATS or a TCP connection. Note that this entry is taken into account only if tornado_event_socket_ip and tornado_event_socket_port are not provided.
      • In case of a connection using NATS, these entries are available; they are mandatory unless marked otherwise:
        • nats.client.addresses: The addresses of the NATS server.
        • nats.client.auth.type: The type of authentication used to authenticate to NATS (Optional. Valid values are None and Tls. Defaults to None if not provided).
        • nats.client.auth.certificate_path: The path to the client certificate (in .pem format) that will be used for authenticating to NATS. (Mandatory if nats.client.auth.type is set to Tls).
        • nats.client.auth.private_key_path: The path to the client certificate private key (in .pem format) that will be used for authenticating to NATS.
        • nats.client.auth.path_to_root_certificate: The path to a root certificate (in .pem format) to trust in addition to the system's trust root. May be useful if the NATS server is not trusted by the system by default. (Optional, valid if nats.client.auth.type is set to Tls).
        • nats.subject: The NATS Subject where tornado will subscribe and listen for incoming events.
      • In case of connection using TCP, these entries are mandatory:
        • tcp_socket_ip: The IP address where outgoing events will be written. This should be the address where the Tornado Engine listens for incoming events.
        • tcp_socket_port: The port where outgoing events will be written. This should be the port where the Tornado Engine listens for incoming events.
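The behaviour of message_queue_size (buffer events while the channel is down, drop the oldest when full, send everything once the connection is back) can be modelled with a bounded queue. The sketch below is only a conceptual model in Python; the class `EventBuffer` and its methods are hypothetical names and do not reflect the collector's internal implementation.

```python
from collections import deque


class EventBuffer:
    """Bounded in-memory buffer that discards the oldest event when full."""

    def __init__(self, message_queue_size: int) -> None:
        # A deque with maxlen silently drops items from the left (the oldest)
        # when a new item is appended to a full queue.
        self._events = deque(maxlen=message_queue_size)

    def push(self, event: dict) -> None:
        self._events.append(event)

    def drain(self) -> list:
        """Return all buffered events in arrival order and empty the buffer."""
        drained = list(self._events)
        self._events.clear()
        return drained


buffer = EventBuffer(message_queue_size=2)
for number in (1, 2, 3):
    buffer.push({"type": "example", "number": number})
# Only the two most recent events survive; event 1 was discarded.
pending = buffer.drain()
```

With a size of 2 and three pushed events, `pending` holds the events numbered 2 and 3, which matches the "discard oldest first" rule.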

More information about the logger configuration is available here.

The default config-dir value can be customized at build time by specifying the environment variable TORNADO_WEBHOOK_COLLECTOR_CONFIG_DIR_DEFAULT. For example, this will build an executable that uses /my/custom/path as the default value:

TORNADO_WEBHOOK_COLLECTOR_CONFIG_DIR_DEFAULT=/my/custom/path cargo build 

An example of a full startup command is:

./tornado_webhook_collector \
      --config-dir=/tornado-webhook-collector/config

In this example the Webhook Collector starts up and then reads the configuration from the /tornado-webhook-collector/config directory.

Webhooks Configuration

As described before, the two startup parameters config-dir and webhooks-dir determine the path to the Webhook configurations, and each webhook is configured by providing id, token and collector_config.

As an example, consider how to configure a webhook for a repository hosted on GitHub.

If we start the application using the command line provided in the previous section, the webhook configuration files should be located in the /tornado-webhook-collector/config/webhooks directory. Each configuration is saved in a separate file in that directory in JSON format (the order shown in the directory is not necessarily the order in which the hooks are processed):

/tornado-webhook-collector/config/webhooks
                 |- github.json
                 |- bitbucket_first_repository.json
                 |- bitbucket_second_repository.json
                 |- ...

An example of valid content for a Webhook configuration JSON file is:

{
  "id": "github_repository",
  "token": "secret_token",
  "collector_config": {
    "event_type": "${commits[0].committer.name}",
    "payload": {
      "source": "github",
      "ref": "${ref}",
      "repository_name": "${repository.name}"
    }
  }
}
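To make the layout concrete, the following Python sketch reads every JSON file in the webhooks folder and checks that the three required entries are present. It is an illustration only: the function name `load_webhooks` and the choice to raise an error on a missing key or a duplicate id are assumptions, not the collector's documented behaviour.

```python
import json
from pathlib import Path

REQUIRED_KEYS = ("id", "token", "collector_config")


def load_webhooks(config_dir: str, webhooks_dir: str = "/webhooks/") -> dict:
    """Load all webhook definitions found in <config_dir>/<webhooks_dir>."""
    # webhooks_dir is relative to config_dir, so surrounding slashes are
    # stripped before joining (an absolute part would replace config_dir).
    folder = Path(config_dir) / webhooks_dir.strip("/")
    webhooks = {}
    for path in sorted(folder.glob("*.json")):
        config = json.loads(path.read_text(encoding="utf-8"))
        missing = [key for key in REQUIRED_KEYS if key not in config]
        if missing:
            raise ValueError(f"{path.name}: missing entries {missing}")
        if config["id"] in webhooks:
            # The id determines the endpoint path, so it must be unique.
            raise ValueError(f"{path.name}: duplicate webhook id {config['id']!r}")
        webhooks[config["id"]] = config
    return webhooks
```

Calling `load_webhooks("/tornado-webhook-collector/config")` on the directory shown earlier would return a dictionary keyed by webhook id, one entry per file.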

With this configuration, the collector creates the following endpoint on startup:

http(s)://collector_ip:collector_port/event/github_repository

However, the GitHub webhook issuer must pass the token at each call. Consequently, the actual URL to be called will have this structure:

http(s)://collector_ip:collector_port/event/github_repository?token=secret_token

Security warning: Since the security token is present in the query string, it is extremely important that the webhook collector is always deployed with HTTPS in production. Otherwise, the token will be sent unencrypted along with the entire URL.

For example, if the public IP of the collector is 35.35.35.35 and the server port is 1234, the webhook settings page in GitHub should look like this:

[Image: GitHub webhook settings page]
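The payload URL to enter in the webhook settings can be assembled programmatically, which also takes care of escaping tokens that contain reserved characters. The helper below is a hypothetical convenience function written for this example; it is not part of Tornado.

```python
from urllib.parse import urlencode, urlunsplit


def webhook_url(host: str, port: int, webhook_id: str, token: str,
                scheme: str = "https") -> str:
    """Build the URL a webhook issuer must call for a given webhook id."""
    # urlencode percent-escapes characters that are not safe in a query string.
    query = urlencode({"token": token})
    return urlunsplit((scheme, f"{host}:{port}", f"/event/{webhook_id}", query, ""))


url = webhook_url("35.35.35.35", 1234, "github_repository", "secret_token")
print(url)  # https://35.35.35.35:1234/event/github_repository?token=secret_token
```

The scheme defaults to https here, in line with the security warning about keeping the token encrypted in transit.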

Finally, the collector_config configuration entry determines the content of the Tornado Event associated with each webhook input.

For example, if GitHub sends this JSON (only the relevant parts are shown here):

{
  "ref": "refs/heads/master",
  ...
  "commits": [
    {
      "id": "33ad3a6df86748011ee8d5cef13d206322abc68e",
      ...
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "username": "web-flow"
      }
    }
  ],
  ...
  "repository": {
    "id": 123456789,
    "name": "webhook-test",
    ...
  }
}

then the resulting Event will be:

{
  "type": "GitHub",
  "created_ms": 1554130814854,
  "payload": {
    "source": "github",
    "ref": "refs/heads/master",
    "repository_name": "webhook-test"
  }
}
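The mapping from the webhook body to the Event shown above can be approximated with a few lines of Python. This is a deliberately simplified stand-in: it only understands dotted keys and `[index]` steps inside `${...}`, whereas the real collector evaluates full JMESPath expressions. The names `resolve`, `render` and `build_event` are hypothetical.

```python
import re
import time

# Matches either a plain key (e.g. "committer") or an index step (e.g. "[0]").
_STEP = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def resolve(path: str, document):
    """Follow a path such as commits[0].committer.name through nested JSON."""
    current = document
    for key, index in _STEP.findall(path):
        current = current[int(index)] if index else current[key]
    return current


def render(template, document):
    """Replace every "${path}" string in the template with the resolved value."""
    if isinstance(template, dict):
        return {name: render(value, document) for name, value in template.items()}
    if isinstance(template, list):
        return [render(value, document) for value in template]
    if isinstance(template, str) and template.startswith("${") and template.endswith("}"):
        return resolve(template[2:-1], document)
    return template  # literal values such as "github" are copied unchanged


def build_event(collector_config: dict, webhook_body: dict) -> dict:
    """Produce an Event-shaped dictionary from a collector_config and a body."""
    return {
        "type": render(collector_config["event_type"], webhook_body),
        "created_ms": int(time.time() * 1000),  # creation time in milliseconds
        "payload": render(collector_config["payload"], webhook_body),
    }
```

Applied to the collector_config and the GitHub body from this section, `build_event` yields the type "GitHub" and the payload shown above; created_ms depends on the time of the call.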

The Event creation logic is handled internally by the JMESPath collector, a detailed description of which is available in its specific documentation.