Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial introduction of AWS CloudWatch Metrics and Log sources #238

Merged
merged 20 commits into from
Dec 22, 2020

Conversation

cab105
Copy link
Contributor

@cab105 cab105 commented Dec 17, 2020

Closes #232

Provide two new knative event sources for AWS:

  • awscloudwatchsource to obtain AWS metrics data based on a user-defined query
  • awscloudwatchlogsource to obtain the AWS logs from a pre-existing logGroup and optional logStream

Both introduce the concept of a polling interval which is how frequently AWS should be polled to obtain updated data. Note that if the polling is frequently enough, then the user may get billed, which is why the defaults are configured to ensure the requests remain within the free tier.

AWSCloudWatchSource

Unlike the other AWS sources, an ARN cannot be used to denote the metrics being sought. Instead the AWS Region must be specified. The metrics query can denote a specific metric collection that is named or expression to apply between two named metrics.

In addition, two response types are created: com.amazon.metrics.metric for the metric values, and com.amazon.metrics.message if there's a message response from the query.

AWSCloudWatchLogSource

The ARN for retrieving logs must always have a log group associated with the entry. There is an optional log-stream attribute that can be included in the ARN resource to indicate a specific steam. If no stream is specified, or if it is *, then the source will iterate through all streams associated with the logGroup, and create events for each stream that has had data written to it within the pollingFrequency interval.

@cab105 cab105 requested a review from antoineco December 17, 2020 06:10
@cab105 cab105 self-assigned this Dec 17, 2020
Copy link
Contributor

@antoineco antoineco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a round of copy-paste police before I dig deeper :)

config/samples/awscloudwatchlog-containersource.yaml Outdated Show resolved Hide resolved
config/samples/awscloudwatch-containersource.yaml Outdated Show resolved Hide resolved
config/samples/awscloudwatch-sinkbinding.yaml Outdated Show resolved Hide resolved
config/samples/awscloudwatch-sinkbinding.yaml Outdated Show resolved Hide resolved
config/samples/awscloudwatchlog-sinkbinding.yaml Outdated Show resolved Hide resolved
pkg/apis/sources/register.go Outdated Show resolved Hide resolved
pkg/apis/sources/register.go Show resolved Hide resolved
pkg/apis/sources/v1alpha1/awscloudwatch_types.go Outdated Show resolved Hide resolved
pkg/apis/sources/v1alpha1/awscloudwatchlog_types.go Outdated Show resolved Hide resolved
Copy link
Contributor

@antoineco antoineco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few fundamental changes required regarding inconsistent CloudEvent attributes. I know it's a pain to ensure the controller and adapter are using the same values, but it's the key to having integrations that work together and are connected using an event registry.

If a source declares its types as my.type.1 and my.type.2 in the registry, I should have a 100% guarantee that those exact types will be emitted. Likewise, if a source declares itself as io.triggermesh.logs.chis.my-source in the registry, I should find that exact reference in every CloudEvent. That's what the GetEventTypes() and AsEventSource() helpers are for.

pkg/apis/sources/v1alpha1/awscloudwatch_lifecycle.go Outdated Show resolved Hide resolved
pkg/apis/sources/v1alpha1/awscloudwatch_lifecycle.go Outdated Show resolved Hide resolved
pkg/apis/sources/v1alpha1/awscloudwatch_lifecycle.go Outdated Show resolved Hide resolved
pkg/adapter/awscloudwatchlogsource/adapter.go Outdated Show resolved Hide resolved
pkg/adapter/awscloudwatchlogsource/adapter.go Outdated Show resolved Hide resolved
pkg/adapter/awscloudwatchlogsource/adapter.go Outdated Show resolved Hide resolved
pkg/adapter/awscloudwatchsource/adapter.go Outdated Show resolved Hide resolved
pkg/adapter/awscloudwatchsource/adapter.go Outdated Show resolved Hide resolved
event := cloudevents.NewEvent(cloudevents.VersionV1)
event.SetType(v1alpha1.AWSEventType(metricEventType, "message"))
event.SetSource(name + "-" + strconv.Itoa(i))
event.SetID(id.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't pin the ID ourselves. Each combination of type+source+id must be unique to statisfy the CE spec. If your intention was to indicate that things are connected to a page, we should use a CE extension (iotriggermeshlogpage?), but does the page matter?

pkg/adapter/awscloudwatchsource/adapter.go Outdated Show resolved Hide resolved
pkg/reconciler/awscloudwatchsource/adapter.go Outdated Show resolved Hide resolved
pkg/reconciler/awscloudwatchsource/adapter.go Outdated Show resolved Hide resolved
pkg/reconciler/awscloudwatchsource/adapter.go Outdated Show resolved Hide resolved
Comment on lines 49 to 50
// Context for logging
context context.Context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's just for logging, pass the logger directly. However, I'm not a big fan of having something that's not config in a struct called adapterConfig. For most cases you should be fine discarding serialization errors below, it is unlikely to happen because you're serializing something that's strongly typed.

If we need stronger validation, the proper way to report errors is by introducing a validation webhook IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in other words, if relying on the k8s to validate based on the OpenAPI spec, then trust it.

Copy link
Contributor

@antoineco antoineco Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even without validation at the API, when you reach this code, the K8S object's spec is a strongly typed Go struct that was already deserialized at some point (the client sent it as a YAML or JSON, eventually). Unless it contains functions, a struct can always be serialized to JSON without a custom serializer, so whatever the struct contains, this code will yield some JSON (you can try to break Marshal() in the Go playground by passing weird structs, I don't think it will be simple or even doable).

If an error occurs due to some really odd error, I think the right thing to do is to panic for now, and add proper error handling to this function signature later (but then we need to update all sources).

pkg/reconciler/awscloudwatchlogsource/adapter.go Outdated Show resolved Hide resolved
pkg/reconciler/awscloudwatchlogsource/adapter.go Outdated Show resolved Hide resolved
pkg/reconciler/awscloudwatchsource/adapter.go Outdated Show resolved Hide resolved
@antoineco
Copy link
Contributor

antoineco commented Dec 17, 2020

@cab105 a general observation that didn't fit in the rest of my comments: the service is called CloudWatch Logs (plural), but I see cloudwatchlog (singular) scattered across the code, including the public facing Kubernetes types. I know it's a lot of work, but fixing that is probably the right think to do.

@cab105
Copy link
Contributor Author

cab105 commented Dec 21, 2020

@cab105 a general observation that didn't fit in the rest of my comments: the service is called CloudWatch Logs (plural), but I see cloudwatchlog (singular) scattered across the code, including the public facing Kubernetes types. I know it's a lot of work, but fixing that is probably the right think to do.

This is the one area that I was weighing given the naming of awscloudwatchlogsource vs awscloudwatchlogssource. However I did make this update.

@antoineco
Copy link
Contributor

given the naming of awscloudwatchlogsource vs awscloudwatchlogssource

It definitely looks ugly :D I was in favor of not appending "Source" to types that are already in the "sources.triggermesh.io" API, but didn't win this battle.

Copy link
Contributor

@antoineco antoineco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit: 👏

@antoineco
Copy link
Contributor

The failing test doesn't seem to make sense. I ran both tests locally and they passed every time.

@cab105
Copy link
Contributor Author

cab105 commented Dec 21, 2020

The failing test doesn't seem to make sense. I ran both tests locally and they passed every time.

I'm not a fan of failing tests, even for transient cases. Even looking at the diff for CircleCI, the output matches, which makes me wonder if it is comparing pointers instead of the underlying objects.

@antoineco
Copy link
Contributor

I found the reason for the failure using delve. Seems like the timezone is set after serialization/deserialization to/from CloudEvent, whereas it is nil in the original data:

(dlv) p metricRecord.Timestamps
[]*time.Time len: 1, cap: 4, [
        *{
                wall: 0,
                ext: 62135596800,
                loc: *time.Location nil,},
]

(dlv) p (*metricOutput.MetricDataResults[0]).Timestamps
[]*time.Time len: 1, cap: 1, [
        *{
                wall: 0,
                ext: 62135596800,
                loc: *(*time.Location)(0x32f6bc0),},
]

@antoineco
Copy link
Contributor

antoineco commented Dec 21, 2020

@cab105 aha! Unix() uses the time.Local location (a.k.a /etc/localtime), but the serialization/deserialization defaults it to UTC if the host has no timezone configured at all (like CircleCI).

It's less error prone to pin the timezone in tests:

diff --git a/pkg/adapter/awscloudwatchsource/adapter_test.go b/pkg/adapter/awscloudwatchsource/adapter_test.go
index 53947e3..1a4e1e0 100644
--- a/pkg/adapter/awscloudwatchsource/adapter_test.go
+++ b/pkg/adapter/awscloudwatchsource/adapter_test.go
@@ -98,7 +98,7 @@ func TestCollectMetrics(t *testing.T) {
                val              = float64(37.566818845509246)
        )

-       ts := time.Unix(0, 0)
+       ts := time.Date(1970, 1, 1, 12, 0, 0, 0, time.UTC)

        const (
                dimensionName  = "FunctionName"
@@ -193,7 +193,7 @@ func TestSendMetricEvent(t *testing.T) {
                val              = float64(37.566818845509246) // must keep this cast to ensure proper [de]serialization
        )

-       ts := time.Unix(0, 0)
+       ts := time.Date(1970, 1, 1, 12, 0, 0, 0, time.UTC)

        metricOutput := cloudwatch.GetMetricDataOutput{
                Messages: nil,

To be clear, it isn't an issue with the serialization, but rather with the way DeepEqual() compares time.Time, i.e. t1.Equal(t2) was returning true but DeepEqual() is type agnostic and compares things literally field by field.

@cab105 cab105 merged commit c6859ea into master Dec 22, 2020
@cab105 cab105 deleted the aws-cloudwatch branch December 22, 2020 00:02
@antoineco antoineco linked an issue Dec 28, 2020 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create a source for CloudWatch
2 participants