Be explicit about scraping, streaming and push #41
I think the spec should be completely agnostic of the source method.
I don't think we get to be agnostic, but we should cover both push and pull.
I would like to have it decoupled if at all possible, but we will find out either way. The result needs to support both, and ideally show a way to transform one into the other.
I strongly support pushing and pulling, and we should be explicit about that in the API spec.
cc @dashpole Standards around the actual transport, at least in the context of pull and streaming, would be nice: they would make it easy on the developer end to use a library and get compatibility with any pull- or streaming-compatible OpenMetrics storage/TSDB/etc. For instance, on and off over the past few months, @dashpole and I have been looking at replacing the bespoke metrics API that the Kubelet serves in Kubernetes (part of a broader effort to overhaul the internal Kubelet --> in-cluster storage --> Kubernetes controller metrics pipeline). Ideally, whatever we decide on would also be easily consumable by other monitoring pipelines, which makes OpenMetrics a good candidate :-). We want streaming support for metrics collection at rapid intervals, and wanted to see if anyone had experience/feedback/thoughts from the OpenMetrics side. Long-term, it would be awesome if whatever we proposed/adopted for Kubernetes ended up becoming (or converging with) a standard for streaming OpenMetrics, so it "just worked" with other solutions too.
@DirectXMan12 "Rapid" is relative; it would help to talk about specifics: which intervals, how many metrics per interval, etc.
We have had requests for intervals as low as 100ms. Given a 100 pods/node limit, 2 metrics/container, and assuming 2 containers/pod, we would ideally need to be able to push ~400 metrics per interval from each node. This is all in an ideal world, so take it with a grain of salt.
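For reference, the arithmetic behind the ~400 figure, taken directly from the limits above:

```latex
100 \,\tfrac{\text{pods}}{\text{node}}
  \times 2 \,\tfrac{\text{containers}}{\text{pod}}
  \times 2 \,\tfrac{\text{metrics}}{\text{container}}
  = 400 \,\tfrac{\text{metrics}}{\text{node} \cdot \text{interval}}
```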
We've done a lot of work to make the Go implementation low impact, especially when it comes to the cost of instrumentation (15 CPU nanoseconds to increment a counter, for example), but we haven't done as much optimization of the collection, as we mostly target 15s intervals. With the current Go implementation, I grabbed a typical app: it exposes ~4700 metrics in ~35ms, which yields ~7.8us per metric. So for your example, it would take ~3.1ms to gather 400 metrics. That seems reasonable to me for a 100ms interval. There has been some recent work to cut the scrape cost; I'm not sure if my example above includes that work, so I'll have to do some more digging. In an ideal world, each container/pod would have a separate metrics endpoint, exposing data directly like my example above does. That means an application with 100 metrics of its own can be scraped in under 1ms, more than good enough for 100ms intervals. Pollers (like Prometheus) can then parallelize the work of data collection.
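If you want to reproduce these two numbers yourself, here is a minimal sketch using the Go client (github.com/prometheus/client_golang); all names in it are made up for illustration, and your absolute figures will of course differ by hardware:

```go
package bench

import (
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
	"testing"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Measures the per-increment instrumentation cost (the ~15ns figure above).
func BenchmarkCounterInc(b *testing.B) {
	c := prometheus.NewCounter(prometheus.CounterOpts{
		Name: "demo_ops_total",
		Help: "Hypothetical counter used only for this benchmark.",
	})
	for i := 0; i < b.N; i++ {
		c.Inc()
	}
}

// Measures the cost of a full scrape (the us-per-metric figure above).
func BenchmarkScrape(b *testing.B) {
	reg := prometheus.NewRegistry()
	gauges := prometheus.NewGaugeVec(prometheus.GaugeOpts{
		Name: "demo_value",
		Help: "Hypothetical gauge.",
	}, []string{"id"})
	reg.MustRegister(gauges)
	// Register ~400 series to approximate the per-node load discussed above.
	for i := 0; i < 400; i++ {
		gauges.WithLabelValues(strconv.Itoa(i)).Set(float64(i))
	}

	srv := httptest.NewServer(promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	defer srv.Close()

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		resp, err := http.Get(srv.URL)
		if err != nil {
			b.Fatal(err)
		}
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
}
```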
Please note that push/streaming will probably look the same on the wire, but be called by a different name to avoid confusion. |
We also have in-cluster storage for these metrics, currently the metrics-server. It would be expected to ingest ~400 metrics / 100ms from up to 5000 nodes. In its current state (scraping a JSON HTTP endpoint every 30 seconds), nearly all of its CPU time is spent serializing/deserializing, and we are hoping to move to a more efficient format.
This is confusing: are you saying 4,000 samples/second, or 20M samples/second? Prometheus used to support JSON, but it was dropped back in 2013 due to CPU overhead. With the current format we can ingest ~200,000 samples/second/cpu.
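Spelling out the two readings of the numbers above (the rough core count at the end is my extrapolation, not a stated requirement):

```latex
\underbrace{400 / 0.1\,\text{s}}_{\text{per node}} = 4\,000 \,\tfrac{\text{samples}}{\text{s}},\qquad
4\,000 \times 5\,000\ \text{nodes} = 2\times10^{7} \,\tfrac{\text{samples}}{\text{s}}\ \text{cluster-wide},\qquad
\frac{2\times10^{7}}{2\times10^{5}\,\tfrac{\text{samples}}{\text{s}\cdot\text{core}}} \approx 100\ \text{cores}
```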
NB: I think you mean samples/(second*core) (hard to write without LaTeX).
That's not necessarily true for monitoring system-level metrics around the container runtimes -- it generally makes more sense there to have the container runtime do the monitoring and report those aspects together from a single endpoint.
Exactly our problem :-)
Right, I'm talking about the applications in the containers. It is not a good idea to have the container runtime deal with these, as you can have thousands of metrics per container. The applications themselves already have metrics endpoints declared, and adding this to the container runtime is also going to create a SPoF/bottleneck. This is exactly why Borgmon and Prometheus work the way they do. If you want to improve cluster monitoring efficiency, why not contribute directly to the Prometheus project?
Sure, agreed, for application-specific metrics. Those are exposed by the application, and are scraped directly by Prometheus or what-have-you. That wasn't what the example provided above was about. In Kubernetes, there are also system-level metrics determined by inspecting cgroups from outside the containers, which the apps don't know about. Those are computed by the container runtimes or the kubelet (depending on the given metric), and exposed via the kubelet or by the container runtime directly. The applications have no idea how to monitor those metrics, and they shouldn't have to. The example we were talking about was those metrics.
In #11 we mention scraping, but I can imagine that someone might want to use the exposition format for streaming (or pushing) metrics.
We should consider such use cases, their implications and whether the specification should allow for them.
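To make the decoupling concrete, here is a minimal sketch (using the Go client and expfmt from github.com/prometheus/common; the helper name `writeExposition` is made up for illustration) showing that the text exposition format is just bytes on an io.Writer, so the same encoding can back a scrape response (pull), a pushed request body, or a long-lived stream:

```go
package main

import (
	"io"
	"log"
	"net/http"
	"os"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/common/expfmt"
)

// writeExposition encodes everything in reg to w in the text exposition
// format. Nothing here cares whether w is an HTTP response (pull), an
// HTTP request body (push), or a long-lived connection (streaming).
func writeExposition(reg *prometheus.Registry, w io.Writer) error {
	mfs, err := reg.Gather()
	if err != nil {
		return err
	}
	enc := expfmt.NewEncoder(w, expfmt.FmtText)
	for _, mf := range mfs {
		if err := enc.Encode(mf); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	reg := prometheus.NewRegistry()
	c := prometheus.NewCounter(prometheus.CounterOpts{
		Name: "demo_requests_total",
		Help: "Hypothetical counter for this sketch.",
	})
	reg.MustRegister(c)
	c.Inc()

	// Pull: serve the encoding on a scrape endpoint.
	http.HandleFunc("/metrics", func(w http.ResponseWriter, _ *http.Request) {
		if err := writeExposition(reg, w); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	})

	// Push/stream: write the exact same encoding to any other io.Writer --
	// stdout here, but it could be a socket written on a fixed interval.
	if err := writeExposition(reg, os.Stdout); err != nil {
		log.Fatal(err)
	}

	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

In this framing the spec only needs to define the byte format and its semantics; who initiates the connection becomes a transport question layered on top.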