Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(query) Implement @modifier. #1678

Merged
merged 14 commits into from
Nov 17, 2023
Merged

feat(query) Implement @modifier. #1678

merged 14 commits into from
Nov 17, 2023

Conversation

yu-shipit
Copy link
Contributor

@yu-shipit yu-shipit commented Sep 29, 2023

  1. Override range param based on @modifier timestamp for PeriodicSamplesMapper.
  2. Add RepeatTransformer and RepeatValueVector to populate data for all timestamps.
  3. Add GRPC functions for RepeatValueVector.

Pull Request checklist

  • The commit(s) message(s) follows the contribution guidelines ?
  • Tests for the changes have been added (for bug fixes / features) ?
  • Docs have been added / updated (for bug fixes / features) ?

Current behavior : (link exiting issues here : https://help.github.com/articles/basic-writing-and-formatting-syntax/#referencing-issues-and-pull-requests)

New behavior :
Will support @modifier.
The problem is described in this doc.
The prometheus PR.
Filodb will support advance usage such as

rate(process_cpu_seconds_total[1m]) 
  and
topk(7, rate(process_cpu_seconds_total[1h] @ 1234))

Other information:

@yu-shipit
Copy link
Contributor Author

yu-shipit commented Sep 29, 2023

credits to @alextheimer PR.
#1281

Copy link
Contributor

@alextheimer alextheimer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @yu-shipit -- just a couple of questions. Also, please add some tests that show the ExecPlans for simple queries and typical @ use-cases.

@yu-shipit yu-shipit requested a review from alextheimer October 10, 2023 01:42
@yu-shipit yu-shipit force-pushed the modifier branch 3 times, most recently from d9786b5 to 1dfc95a Compare October 13, 2023 17:11
Yu Zhang added 4 commits October 17, 2023 16:55
1. Override range param based on @modifier timestamp for PeriodicSamplesMapper.
2. Add RepeatTransformer and RepeatValueVector to populate data for all timestamps.
3. Add GRPC functions for RepeatValueVector.
Comment on lines 274 to 275
atMs.forall(at => (at >= earliestRawTimeWithOffset && startMs >= earliestRawTimeWithOffset)
|| (at < earliestRawTimeWithOffset && endMs < earliestRawTimeWithOffset))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

startMs and endMs don't necessarily correlate with the range of data a query will read. Consider the instant query:

rate(foo{...}[14d])  // startMs=endMs

@@ -1837,47 +1837,67 @@ class PlannerHierarchySpec extends AnyFunSpec with Matchers with PlanValidationS
validatePlan(execPlan, expectedPlan)
}

it("Modifier should have PeriodicSamplesMapper wrapped by RepeatTransformer") {
it("Modifier should have PeriodicSamplesMapper wrapped by RepeatTransformer and all data in raw cluster") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test case for each possible scenario?

  • query routed entirely to raw (already exists)
  • query routed entirely to DS (already exists)
  • query routed to both (should reject)
  • query start/end are in raw, but lookback extends into DS (should reject)
  • query start/end are in raw, but offset pushes it entirely into DS (ok)
  • [any other important scenarios I'm forgetting]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all these tests 🙏

@@ -1837,47 +1837,67 @@ class PlannerHierarchySpec extends AnyFunSpec with Matchers with PlanValidationS
validatePlan(execPlan, expectedPlan)
}

it("Modifier should have PeriodicSamplesMapper wrapped by RepeatTransformer") {
it("Modifier should have PeriodicSamplesMapper wrapped by RepeatTransformer and all data in raw cluster") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all these tests 🙏

query/src/main/scala/filodb/query/LogicalPlan.scala Outdated Show resolved Hide resolved
@yu-shipit yu-shipit requested a review from alextheimer November 9, 2023 00:55
@yu-shipit yu-shipit force-pushed the modifier branch 2 times, most recently from 8510f58 to 48670b2 Compare November 9, 2023 18:45
Comment on lines +345 to +347
val realScanStartMs = sqww.atMs.getOrElse(sqww.startMs)
val realScanEndMs = sqww.atMs.getOrElse(sqww.endMs)
val realScanStep = sqww.atMs.map(_ => 0L).getOrElse(sqww.stepMs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these instead be something like this?:

val realScanStartMs = sqww.atMs.map(at => at - (sqww.endMs - sqww.startMs)).getOrElse(sqww.startMs)
val realScanEndMs = sqww.atMs.getOrElse(sqww.endMs)
// realScanStep remains unchanged

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow an instant query requires 0 to be step .

atModifierTimestampsWithOffset.isEmpty
}
require(isAtModifierValid, s"@modifier $atModifierTimestampsWithOffset is not supported. Because it queries" +
s"both down sampled and raw data. Please adjust the start and end time if you want to use @modifier.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this instead say "please adjust query parameters"? (An instant query might lookback too far.)

Comment on lines +123 to +124
case w: PeriodicSeriesWithWindowing => (w.startMs, w.stepMs, w.window, w.offsetMs.getOrElse(0L),
w.atMs.getOrElse(0L), w.endMs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double-checking: no need to update these start/step/end values like we do for subqueries?

Comment on lines -131 to +132
Some(LogicalPlanUtils.copyLogicalPlanWithUpdatedTimeRange(logicalPlan, TimeRange(newStartMs, boundParams._5)))
Some(LogicalPlanUtils.copyLogicalPlanWithUpdatedTimeRange(logicalPlan, TimeRange(newStartMs, boundParams._6)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outside the scope of this PR, but it's probably safe to say it's time for those boundParams to be given their own case class 😆

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. we have one more argument, which change the number to 6.

Comment on lines +587 to +589
val realScanStartMs = lp.atMs.getOrElse(logicalPlanWithoutBucket.startMs)
val realScanEndMs = lp.atMs.getOrElse(logicalPlanWithoutBucket.endMs)
val realScanStepMs: Long = lp.atMs.map(_ => 0L).getOrElse(lp.stepMs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as some other comments: should we keep the original duration between start/end?

override def rows(): RangeVectorCursor = {
import NoCloseCursor._
// If rowReader is empty, iterate nothing.
val it = Iterator.from(0, rowReader.map(_ => stepMs.toInt).getOrElse(0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be getOrElse(1) to prevent an infinite loop?

query/src/main/scala/filodb/query/LogicalPlan.scala Outdated Show resolved Hide resolved
@yu-shipit yu-shipit merged commit 0e49dc7 into filodb:develop Nov 17, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants