Skip to content

Commit

Permalink
stats: find regex prefixes more aggressively, taking into account ign…
Browse files Browse the repository at this point in the history
…ored dot token (envoyproxy#2630)

This PR expands the scope of regexes where we can parse out a prefix. This has a dramatic effect on the annotated perf information available with envoyproxy#2615 patched in

Before:

Duration(us)  # Calls  per_call(ns)  Category / Description
     1209259   680147          1777  re-miss / envoy.grpc_bridge_method
     1188780   680147          1747  re-miss / envoy.grpc_bridge_service
     1188206   680147          1746  re-miss / cipher_suite
      683323   680127          1004  re-miss / envoy.response_code_class
      671907   680147           987  re-miss / envoy.response_code
      639791   680147           940  re-miss / envoy.dynamo_partition_id
      628738   680147           924  re-miss / envoy.dynamo_operation
      623747   680147           917  re-miss / envoy.mongo_callsite
      623687   680031           917  re-miss / envoy.http_conn_manager_prefix
      620776   680147           912  re-miss / envoy.http_user_agent
      619797   680147           911  re-miss / envoy.dyanmo_table
      614521   680147           903  re-miss / envoy.mongo_collection
      609383   680147           895  re-miss / envoy.mongo_cmd
      606698   680147           892  re-miss / envoy.ssl_cipher
      606616   680147           891  re-miss / envoy.fault_downstream_cluster
      605605   680147           890  re-miss / envoy.virtual_cluster
      445298   680000           654  re-match / envoy.cluster_name
          67      116           577  re-match / envoy.http_conn_manager_prefix
          20       20          1031  re-match / envoy.response_code_class
          11        9          1237  re-miss / envoy.listener_address
           4        5           886  re-match / envoy.listener_address
After:

Duration(us)  # Calls  per_call(ns)  Category / Description
     1200107   680000          1764  re-miss / envoy.grpc_bridge_method
     1183043   680000          1739  re-miss / cipher_suite
     1178935   680000          1733  re-miss / envoy.grpc_bridge_service
      678943   680127           998  re-miss / envoy.response_code_class
      668008   680147           982  re-miss / envoy.response_code
      617447   680031           907  re-miss / envoy.http_conn_manager_prefix
      446239   680000           656  re-match / envoy.cluster_name
         186      106          1760  re-miss / envoy.dynamo_partition_id
         177      106          1678  re-miss / envoy.dyanmo_table
         171      106          1621  re-miss / envoy.dynamo_operation
         166      106          1568  re-miss / envoy.http_user_agent
         163      106          1543  re-miss / envoy.fault_downstream_cluster
          68      116           591  re-match / envoy.http_conn_manager_prefix
          25       14          1849  re-miss / envoy.ssl_cipher
          20       20          1005  re-match / envoy.response_code_class
          11        9          1232  re-miss / envoy.listener_address
Note that many fewer regexes need to be evaluated, although the really expensive ones are still examined very often. They need to be evaluated less often or made to be cheaper, preferably both.

Risk Level: Low

Release Notes: N/A

Signed-off-by: Joshua Marantz <[email protected]>
  • Loading branch information
jmarantz authored and htuch committed Feb 28, 2018
1 parent 0888f3f commit 2a81a8f
Show file tree
Hide file tree
Showing 3 changed files with 10 additions and 1 deletion.
4 changes: 4 additions & 0 deletions STYLE.md
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,10 @@
annotations](https://github.com/abseil/abseil-cpp/blob/master/absl/base/thread_annotations.h),
such as `GUARDED_BY`, should be used for shared state guarded by
locks/mutexes.
* Functions intended to be local to a cc file should be declared in an anonymonus namespace,
rather than using the 'static' keyword. Note that the
[Google C++ style guide](https://google.github.io/styleguide/cppguide.html#Unnamed_Namespaces_and_Static_Variables)
allows either, but in Envoy we prefer annonymous namespaces.

# Error handling

Expand Down
6 changes: 5 additions & 1 deletion source/common/stats/stats_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,10 @@ size_t roundUpMultipleNaturalAlignment(size_t val) {
return (val + multiple - 1) & ~(multiple - 1);
}

bool regexStartsWithDot(absl::string_view regex) {
return absl::StartsWith(regex, "\\.") || absl::StartsWith(regex, "(?=\\.)");
}

} // namespace

size_t RawStatData::size() {
Expand Down Expand Up @@ -75,7 +79,7 @@ std::string TagExtractorImpl::extractRegexPrefix(absl::string_view regex) {
if (!absl::ascii_isalnum(regex[i]) && (regex[i] != '_')) {
if (i > 1) {
const bool last_char = i == regex.size() - 1;
if ((!last_char && (regex[i] == '\\') && (regex[i + 1] == '.')) ||
if ((!last_char && regexStartsWithDot(regex.substr(i))) ||
(last_char && (regex[i] == '$'))) {
prefix.append(regex.data() + 1, i - 1);
}
Expand Down
1 change: 1 addition & 0 deletions test/common/stats/stats_impl_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -396,6 +396,7 @@ TEST(TagExtractorTest, ExtractRegexPrefix) {

EXPECT_EQ("", extractRegexPrefix("^prefix(foo)."));
EXPECT_EQ("prefix", extractRegexPrefix("^prefix\\.foo"));
EXPECT_EQ("prefix_optional", extractRegexPrefix("^prefix_optional(?=\\.)"));
EXPECT_EQ("", extractRegexPrefix("^notACompleteToken")); //
EXPECT_EQ("onlyToken", extractRegexPrefix("^onlyToken$")); //
EXPECT_EQ("", extractRegexPrefix("(prefix)"));
Expand Down

0 comments on commit 2a81a8f

Please sign in to comment.