merge changes with buffering update
alxdm committed May 12, 2016
2 parents 33ac3e0 + 7ee09ca commit 2d93b6c
Showing 5 changed files with 353 additions and 207 deletions.
55 changes: 36 additions & 19 deletions README.md
@@ -1,6 +1,8 @@
# InfluxDB Relay

This project adds a basic high availability layer to InfluxDB. With the right architecture and disaster recovery processes, this achieves a highly available setup. It will be available with the 0.12.0 release of InfluxDB in April 2016.
This project adds a basic high availability layer to InfluxDB. With the right architecture and disaster recovery processes, this achieves a highly available setup.

*NOTE:* `influxdb-relay` must be built with Go 1.5+

## Usage

@@ -98,9 +100,32 @@ The setup should look like this:
```


The relay will listen for HTTP or UDP writes and write the data to both servers via their HTTP write endpoint. If the write is sent via HTTP, the relay will return a success response as soon as one of the two InfluxDB servers returns a success. If either InfluxDB server returns a 400 response, that will be returned to the client immediately. If both servers return a 500, a 500 will be returned to the client.
The relay will listen for HTTP or UDP writes and write the data to both servers via their HTTP write endpoint. If the write is sent via HTTP, the relay will return a success response as soon as any of the InfluxDB servers returns a success. If one of the InfluxDB servers returns a 4xx response, that will be returned to the client immediately. If all servers return a 5xx response, the first one received by the relay will be returned to the client, unless buffering is enabled.

With this setup a failure of one Relay or one InfluxDB can be sustained while still taking writes and serving queries. However, the recovery process might require operator intervention.

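A minimal, hypothetical Go sketch of the response-selection rule just described (it is not the relay's actual implementation and ignores buffering): the write is fanned out to every backend, the first 2xx or 4xx seen is reported to the client immediately, and a 5xx is only reported if every backend returns one. The backend URLs and the sample line-protocol point are placeholders.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// relayWrite posts one write body to every backend and picks the status code
// to report, following the rules described above (buffering not shown).
func relayWrite(backends []string, body []byte) int {
	results := make(chan int, len(backends))
	for _, url := range backends {
		go func(url string) {
			resp, err := http.Post(url+"/write", "application/octet-stream", bytes.NewReader(body))
			if err != nil {
				results <- http.StatusServiceUnavailable
				return
			}
			resp.Body.Close()
			results <- resp.StatusCode
		}(url)
	}

	first5xx := 0
	for range backends {
		code := <-results
		switch {
		case code/100 == 2 || code/100 == 4:
			return code // success or client error: report it immediately
		case first5xx == 0:
			first5xx = code // remember the first server error
		}
	}
	return first5xx // every backend returned a 5xx
}

func main() {
	status := relayWrite(
		[]string{"http://127.0.0.1:8086", "http://127.0.0.1:7086"},
		[]byte("cpu value=0.64"))
	fmt.Println("relay would respond with status", status)
}
```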
## Buffering

The relay can be configured to buffer failed requests for HTTP backends.
The intent of this logic is to reduce the number of failures during short outages or periodic network issues.

> This retry logic is **NOT** sufficient for long periods of downtime, as all data is buffered in RAM.

Buffering has the following configuration options (configured per HTTP backend); a configuration sketch follows the list:

* `buffer-size-mb` -- An upper limit on how much point data to keep in memory (in MB)
* `max-batch-kb` -- A maximum size for the aggregated batches that will be submitted (in KB)
* `max-delay-interval` -- The maximum delay between retry attempts per backend.
  The initial retry delay is 500ms and is doubled after every failure.

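As a concrete illustration, here is a hedged sketch of how these options might look in the relay's TOML configuration for one HTTP listener with two backends. The `buffer-size-mb` and `max-batch-kb` keys mirror the struct tags in `relay/config.go`; the surrounding layout (`[[http]]`, `name`, `bind-addr`, `output`, `location`) and all values shown are illustrative, so consult the sample configuration in the repository for the authoritative format.

```toml
[[http]]
name = "example-http"
bind-addr = "127.0.0.1:9096"
output = [
    # Keep up to 100 MB of failed point data per backend, retry it in batches
    # of at most 50 KB, and wait no more than 5s between retry attempts.
    { name="local1", location="http://127.0.0.1:8086/write", buffer-size-mb=100, max-batch-kb=50, max-delay-interval="5s" },
    { name="local2", location="http://127.0.0.1:7086/write", buffer-size-mb=100, max-batch-kb=50, max-delay-interval="5s" },
]
```

Leaving `buffer-size-mb` at its default of 0 keeps retry buffering disabled for that backend.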
If the buffer is full, new requests are dropped and an error is logged.
If a request makes it into the buffer, it is retried until it succeeds.

Retries are serialized to a single backend. In addition, writes will be aggregated and batched as long as the body of the request stays below `max-batch-kb`.
If buffered requests succeed then there is no delay between subsequent attempts.

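The retry schedule can be illustrated with a small, hypothetical Go sketch (again, not the relay's actual code): the first retry of a buffered batch waits 500ms, the delay doubles after each failure, it never exceeds the configured `max-delay-interval`, and once a retry succeeds the next buffered batch is attempted without delay.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// retryBuffered resubmits one buffered batch until it succeeds, doubling the
// delay after every failure and capping it at maxDelay. It returns the number
// of attempts that were needed.
func retryBuffered(submit func() error, maxDelay time.Duration) int {
	delay := 500 * time.Millisecond
	attempts := 0
	for {
		attempts++
		if err := submit(); err == nil {
			return attempts // success: the next batch starts with no delay
		}
		time.Sleep(delay)
		if delay *= 2; delay > maxDelay {
			delay = maxDelay
		}
	}
}

func main() {
	// Simulate a backend that recovers on the fourth attempt.
	failures := 3
	attempts := retryBuffered(func() error {
		if failures > 0 {
			failures--
			return errors.New("backend still down")
		}
		return nil
	}, 10*time.Second)
	fmt.Println("batch delivered after", attempts, "attempts")
}
```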
If the relay stays alive for the entire duration of a backend outage without filling that backend's allocated buffer, and it remains online until the buffer has been completely flushed, no operator intervention is required to "recover" the data: the buffered writes are simply batched together and written to the recovered server in the order they were received.

With this setup a failure of one Relay or one InfluxDB can be sustained while still taking writes and serving queries. However, the recovery process will require operator intervention.
*NOTE*: The buffering limits are not hard limits on the application's memory usage; there is additional overhead that is much harder to account for. The limits apply only to the amount of point line protocol held in the buffer (including any added timestamps, if applicable). Small incoming batches and a small maximum batch size both increase the per-request overhead in the buffer, and the application itself has a baseline memory footprint. In practice, a machine with 2GB of memory should not have buffers that sum to _almost_ 2GB.

## Recovery

@@ -121,23 +146,15 @@ During this entire process the Relays should be sending writes to both servers f

It's possible to add another layer on top of this kind of setup to shard data. Depending on your needs you could shard on the measurement name or a specific tag like `customer_id`. The sharding layer would have to service both queries and writes.

## Buffering

The relay can be configured to buffer failed requests.
The intent of this logic is to reduce the number of failures during short outages or periodic network issues.

> This retry logic is **NOT** sufficient for long periods of downtime, as all data is buffered in RAM.

Buffering has two configuration options:

* BufferSize -- the maximum number of requests to buffer per backend.
* MaxDelayInterval -- the max delay between retry attempts per backend.
The initial retry interval is 500ms and is doubled after every failure.
As this relay does not handle queries, it will not implement any sharding logic. Any sharding would have to be done externally to the relay.

If the buffer is full then requests are dropped and an error is logged.
If a request makes it into the buffer it is retried until success.

Retries are serialized to a single backend.
Meaning that buffered requests are attempted one at a time.
If buffered requests succeed then there is no delay between subsequent attempts.
## Caveats

While `influxdb-relay` does provide some level of high availability, there are a few scenarios that need to be accounted for:

- `influxdb-relay` will not relay the `/query` endpoint, and this includes schema modification (create database, `DROP`s, etc). This means that databases must be created before points are written to the backends.
- Continuous queries will still only write their results locally. If a server goes down, the continuous query will have to be backfilled after the data has been recovered for that instance.
- Overwriting points is potentially unpredictable. For example, given servers A and B, if B is down, and point X is written (we'll call the value X1) just before B comes back online, that write is queued behind every other write that occurred while B was offline. Once B is back online, the first buffered write succeeds, and all new writes are now allowed to pass-through. At this point (before X1 is written to B), X is written again (with value X2 this time) to both A and B. When the relay reaches the end of B's buffered writes, it will write X (with value X1) to B... At this point A now has X2, but B has X1.
- It is probably best to avoid re-writing points (if possible). Otherwise, please be aware that overwriting the same field for a given point can lead to data differences.
- This could potentially be mitigated by waiting for the buffer to flush before allowing new writes to pass through.
7 changes: 5 additions & 2 deletions relay/config.go
@@ -33,8 +33,11 @@ type HTTPOutputConfig struct {
// The format used is the same seen in time.ParseDuration
Timeout string `toml:"timeout"`

// Buffer failed writes up to maximum count.
BufferSize int `toml:"buffer-size"`
// Buffer failed writes up to this many megabytes of point data. (Default 0, retry/buffering disabled)
BufferSizeMB int `toml:"buffer-size-mb"`

// Maximum batch size in KB (Default 512)
MaxBatchKB int `toml:"max-batch-kb"`

// Maximum delay between retry attempts.
// The format used is the same seen in time.ParseDuration (Default 10s)
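For readers skimming the diff, a minimal, self-contained sketch of how these options might be normalized into runtime values. Only `buffer-size-mb`, `max-batch-kb`, and the defaults quoted in the comments come from the code above; the trimmed struct, the `MaxDelayInterval` field name, and the `normalize` helper are assumptions for illustration and are not part of this commit.

```go
package main

import (
	"fmt"
	"time"
)

// Trimmed copy of the config fields above, for illustration only.
type httpOutputConfig struct {
	BufferSizeMB     int    `toml:"buffer-size-mb"`
	MaxBatchKB       int    `toml:"max-batch-kb"`
	MaxDelayInterval string `toml:"max-delay-interval"`
}

// normalize turns the TOML-facing values into bytes and a time.Duration,
// applying the documented defaults (512 KB batches, 10s max delay).
func normalize(cfg httpOutputConfig) (bufBytes, batchBytes int, maxDelay time.Duration, err error) {
	bufBytes = cfg.BufferSizeMB << 20 // 0 leaves buffering disabled
	batchBytes = cfg.MaxBatchKB << 10
	if batchBytes == 0 {
		batchBytes = 512 << 10
	}
	maxDelay = 10 * time.Second
	if cfg.MaxDelayInterval != "" {
		maxDelay, err = time.ParseDuration(cfg.MaxDelayInterval)
	}
	return bufBytes, batchBytes, maxDelay, err
}

func main() {
	b, k, d, _ := normalize(httpOutputConfig{BufferSizeMB: 100, MaxBatchKB: 50, MaxDelayInterval: "5s"})
	fmt.Printf("buffer=%d bytes, batch=%d bytes, max delay=%s\n", b, k, d)
}
```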
