[Metrics generator] Initial implementation (grafana#1282)
* Add metrics-generator component
* Add some initial logic to make the generator module optional
* Rename the tempo_distributor_metrics_generator_clients metric
* Periodical check-in: add remote-write setup
* jsonnet: rename generator -> metrics_generator
* Periodical check-in: add instance, copy servicegraphprocessor over, add scrape endpoint
* Fix make test
* Support spanmetrics via OTel collector. The generator is now able to run an embedded OTel collector, which allows running any number of processors, including spanmetrics. This collector uses a custom metrics exporter and a remote write client. Metrics generated by the spanmetrics processor are converted to Prometheus format and appended to the remote write client's WAL, which exports the metric data after commit.
* Implement internal spanmetrics processor
* Add latency metrics
* Only loop once for latency metrics
* Improve how metrics are collected, plus some bug fixes (see the first sketch after this list):
  - Remote write was constantly failing with "out of order samples". Fix: the timestamp has to be in milliseconds, not seconds. There is not a lot of documentation on the Prometheus interfaces, but the Cortex proto has a timestampMs field.
  - After this, remote write was failing with label name "le" is not unique: invalid sample. Fix: there was a bug in the append logic of the labels array; because the labels variable was reused in a for-loop, the le label was added multiple times.
  - Initially it looked like pushing metrics on every push request was causing the out-of-order issues, so the processor was refactored to have both a PushSpans and a CollectMetrics method. CollectMetrics is called every 15s with a storage.Appender, so a single appender can also be shared by multiple processors. This wasn't necessary to get things working, but it is the logic we want to end up with (i.e. don't send samples too often, to keep DPM low).
* make vendor-check
* Add spanmetrics_test.go
* jsonnet: add metrics-generator service
* Add distributor.enable_metrics_generator_ring; some bug fixes
* Remove unused Registerer and scrape endpoint code; sprinkle some more TODOs around
* Add a crude mechanism to trim active series in the span metrics processor
* Span metrics: add delete_after_last_update
* Admit config parameters in the spanmetrics processor: histogram buckets and additional dimensions can be configured
* Initial implementation of the service graphs processor
* Hook up the service graph processor:
  - Store: make sure tests pass; fix a copy-paste error in shouldEvictHead
  - Service graphs: ensure metric names align with the agent; server latencies were overwriting client latencies; expose operational metrics
* Rename metrics from generator_processor_... to metrics_generator_processor_...
* Set up BasicLifecycler, load balance spans across metrics-generator instances
* Replace processor.CollectMetrics with a custom prometheus.Registerer (see the second sketch after this list):
  - Service graphs processor: replaced the manual metric-building code with CounterVec and HistogramVec; aligned metric names with the agent; commented out the code around collectCh, which was leaking memory
  - Span metrics processor: replaced the manual metric-building code with CounterVec and HistogramVec
  - Introduced Registry to bridge prometheus.Registerer and storage.Appender
  - Added support for configuring external labels
  - Fixed generator_test.go
* make vendor-check
* Refactor remote-write structures, add tests
* Add add_instance_id_label config
* Split write requests up to a max message size: builds write requests up to a maximum size and sends them sequentially (see the third sketch after this list)
* Service graphs: collect edges when completed. When an edge is completed, it is sent to a number of goroutines that call collectEdge. Edges can also be collected when they expire.
* Store interface: give parameters names
* Service graphs: close closech
* Simplify evict method name
* Move remote_write to its own method
* Add metrics_generator_processors to overrides, dynamically create/remove processors
* Add tests for split requests in the remote write appender
* Refactor distributor code a bit
* Make the collection interval configurable
* Add a concurrency test for instance
* Minor tweaks
* Add metrics to track failed active-processor updates
* Tweak remote write metrics
* Add exemplars support
* Fix latency measurement in the spanmetrics processor
* Fix typo
* Change the metrics generator enabled config var
* Return ready when the generator is registered in the ring
* Add read-only mode during generator shutdown
* Configure docker-compose distributed to work with the metrics generator
* Enable pushing bytes to the metrics-generator
* Remove Cortex from the distributed example
* Replace cortexpb -> prompb, move all remote-write code into the remotewrite package
* Set a context timeout when collecting metrics, remove hola
* Protect readOnly from concurrent access
* Update vendor
* Rename the metrics prefix span_metrics to spanmetrics, matching the naming of the OTel spanmetrics processor
* Always add the le="+Inf" bucket when it is not set already
* Fix lint errors
* Fix the compactor test: the test was writing tempodb.Trace to the WAL instead of tempodb.TraceBytes
* Move the expire goroutine to the collector workers
* Move metrics_generator_enabled to the top-level config
* Update the docker-compose distributed example
* Use snappy from klauspost/compress
* Fix the distributor test after moving the metrics_generator_enabled flag
* Reorganize how processors are updated, simplify use of locks
* Stub time in processor tests
* Fix how label slices are handled in the registry
* jsonnet: disable metrics-generator by default
* Regenerate kube manifests
* Pin time.Now in the test with registry
* Update CHANGELOG.md

Co-authored-by: Koenraad Verheyden <[email protected]>
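The two remote-write failures called out above ("out of order samples" and the duplicated le label) come down to using millisecond timestamps and not reusing a shared label slice across histogram buckets. The Go sketch below illustrates both points using the prompb wire types mentioned in the commit; the metric name, bucket values, and the histogramBucketSeries helper are illustrative assumptions, not code from this change.

```go
// Sketch: millisecond timestamps and per-bucket "le" labels, assuming the
// prompb wire types. Metric name, buckets, and helper are illustrative only.
package main

import (
	"fmt"
	"math"
	"strconv"
	"time"

	"github.com/prometheus/prometheus/prompb"
)

// histogramBucketSeries (hypothetical helper) turns one histogram's buckets
// into remote-write series, one per "le" value.
func histogramBucketSeries(baseLabels []prompb.Label, buckets map[float64]uint64) []prompb.TimeSeries {
	// Remote write expects millisecond timestamps; second-precision values
	// were what produced the "out of order samples" errors described above.
	nowMs := time.Now().UnixNano() / int64(time.Millisecond)

	series := make([]prompb.TimeSeries, 0, len(buckets))
	for le, count := range buckets {
		// Copy the base labels for every bucket. Appending "le" to the shared
		// baseLabels slice inside the loop is the bug that produced
		// `label name "le" is not unique: invalid sample`.
		lbls := make([]prompb.Label, len(baseLabels), len(baseLabels)+1)
		copy(lbls, baseLabels)
		lbls = append(lbls, prompb.Label{Name: "le", Value: strconv.FormatFloat(le, 'f', -1, 64)})

		series = append(series, prompb.TimeSeries{
			Labels:  lbls,
			Samples: []prompb.Sample{{Value: float64(count), Timestamp: nowMs}},
		})
	}
	return series
}

func main() {
	base := []prompb.Label{{Name: "__name__", Value: "traces_spanmetrics_latency_bucket"}}
	// Include an explicit +Inf bucket, since the change also ensures a
	// le="+Inf" bucket is always present.
	buckets := map[float64]uint64{0.1: 3, 1: 5, math.Inf(1): 9}
	for _, ts := range histogramBucketSeries(base, buckets) {
		fmt.Println(ts.Labels, ts.Samples)
	}
}
```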
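The Registry that bridges prometheus.Registerer and storage.Appender is only described at a high level above. As a rough illustration of the idea, and not Tempo's actual implementation, the sketch below registers a CounterVec on a prometheus.Registry, gathers it, and converts the result into remote-write series; gatherToSeries and the metric name are assumptions for the example.

```go
// Sketch of the Registry idea: processors register ordinary client_golang
// metrics, which are periodically gathered and converted to series.
package main

import (
	"fmt"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/prometheus/prompb"
)

// gatherToSeries (hypothetical helper) collects everything registered on reg
// and converts counter families into series stamped with one ms timestamp.
func gatherToSeries(reg *prometheus.Registry, tsMs int64) ([]prompb.TimeSeries, error) {
	families, err := reg.Gather()
	if err != nil {
		return nil, err
	}

	var series []prompb.TimeSeries
	for _, mf := range families {
		for _, m := range mf.GetMetric() {
			if m.GetCounter() == nil {
				continue // keep the sketch small: only counters are converted
			}
			lbls := []prompb.Label{{Name: "__name__", Value: mf.GetName()}}
			for _, lp := range m.GetLabel() {
				lbls = append(lbls, prompb.Label{Name: lp.GetName(), Value: lp.GetValue()})
			}
			series = append(series, prompb.TimeSeries{
				Labels:  lbls,
				Samples: []prompb.Sample{{Value: m.GetCounter().GetValue(), Timestamp: tsMs}},
			})
		}
	}
	return series, nil
}

func main() {
	reg := prometheus.NewRegistry()
	requests := prometheus.NewCounterVec(
		prometheus.CounterOpts{Name: "traces_service_graph_request_total", Help: "Requests between two nodes"},
		[]string{"client", "server"},
	)
	reg.MustRegister(requests)
	requests.WithLabelValues("app", "db").Inc()

	series, err := gatherToSeries(reg, time.Now().UnixNano()/int64(time.Millisecond))
	fmt.Println(series, err)
}
```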
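On the transport side, the commit swaps cortexpb for prompb and takes snappy from klauspost/compress. Below is a minimal sketch of sending one request over the standard Prometheus remote-write protocol with those packages; sendWriteRequest, the metric name, and the endpoint URL are illustrative, not the remotewrite package's real API.

```go
// Sketch of the sending side: prompb for the request proto, snappy from
// klauspost/compress for the payload compression.
package main

import (
	"bytes"
	"context"
	"fmt"
	"net/http"

	"github.com/klauspost/compress/snappy"
	"github.com/prometheus/prometheus/prompb"
)

func sendWriteRequest(ctx context.Context, endpoint string, series []prompb.TimeSeries) error {
	req := &prompb.WriteRequest{Timeseries: series}

	// prompb types are gogo-generated, so Marshal is available directly.
	raw, err := req.Marshal()
	if err != nil {
		return fmt.Errorf("marshal write request: %w", err)
	}

	// Remote-write bodies are snappy block-compressed protobufs.
	httpReq, err := http.NewRequestWithContext(ctx, http.MethodPost, endpoint, bytes.NewReader(snappy.Encode(nil, raw)))
	if err != nil {
		return err
	}
	httpReq.Header.Set("Content-Encoding", "snappy")
	httpReq.Header.Set("Content-Type", "application/x-protobuf")
	httpReq.Header.Set("X-Prometheus-Remote-Write-Version", "0.1.0")

	resp, err := http.DefaultClient.Do(httpReq)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode/100 != 2 {
		return fmt.Errorf("remote write returned status %d", resp.StatusCode)
	}
	return nil
}

func main() {
	series := []prompb.TimeSeries{{
		Labels:  []prompb.Label{{Name: "__name__", Value: "traces_spanmetrics_calls_total"}},
		Samples: []prompb.Sample{{Value: 1, Timestamp: 1650000000000}},
	}}
	// Hypothetical endpoint; any remote-write receiver would do here.
	err := sendWriteRequest(context.Background(), "http://localhost:9090/api/v1/write", series)
	fmt.Println(err)
}
```

Splitting write requests below a maximum message size, as the commit's appender does, then amounts to chunking the series slice and calling a sender like this once per chunk, sequentially.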
Showing 200 changed files with 37,955 additions and 133 deletions.
The diff hunk below adds the metrics-generator as a Prometheus scrape target alongside the other Tempo components:

```diff
@@ -16,3 +16,4 @@ scrape_configs:
       - 'ingester-2:3200'
       - 'querier:3200'
       - 'query-frontend:3200'
+      - 'metrics-generator:3200'
```