TEZ-4527: Add generic and pluggable hooks for DAGs and task attempts #324

Merged
abstractdog merged 8 commits into apache:master from the TEZ-4527-hook branch on Dec 22, 2024

Conversation

okumin
Contributor

@okumin okumin commented Dec 18, 2023

@okumin okumin changed the title TEZ-4527: Add generic and pluggable hooks for DAGs and task attempts [WIP] TEZ-4527: Add generic and pluggable hooks for DAGs and task attempts Dec 18, 2023
@okumin okumin changed the title [WIP] TEZ-4527: Add generic and pluggable hooks for DAGs and task attempts TEZ-4527: Add generic and pluggable hooks for DAGs and task attempts Dec 18, 2023
@okumin okumin marked this pull request as ready for review December 18, 2023 12:43
*/
@ConfigurationScope(Scope.DAG)
@ConfigurationProperty
public static final String TEZ_THREAD_DUMP_INTERVAL = "tez.thread.dump.interval";
- public static final String TEZ_THREAD_DUMP_INTERVAL_DEFAULT = "0ms";
+ public static final String TEZ_THREAD_DUMP_INTERVAL_DEFAULT = "100ms";
Contributor Author

If we introduce pluggable hooks, I think we can change the default value. We may remove NOOP_TEZ_THREAD_DUMP_HELPER, too.

Contributor

Makes sense.
I think, for clarity's sake, all the hook-related configs could go into a namespace that implies they're hooks:

tez.hook.thread.dump.interval

and likewise for the constant:

TEZ_HOOK_THREAD_DUMP_INTERVAL

Contributor

I think you can remove the NoopTezThreadDumpHelper

@tez-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 30s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ master Compile Tests _
+0 🆗 mvndep 6m 12s Maven dependency ordering for branch
+1 💚 mvninstall 12m 13s master passed
+1 💚 compile 2m 11s master passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04
+1 💚 compile 2m 2s master passed with JDK Private Build-1.8.0_392-8u392-ga-1~22.04-b08
+1 💚 checkstyle 2m 3s master passed
+1 💚 javadoc 1m 51s master passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04
+1 💚 javadoc 1m 38s master passed with JDK Private Build-1.8.0_392-8u392-ga-1~22.04-b08
+0 🆗 spotbugs 1m 19s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 4m 16s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 10s Maven dependency ordering for patch
+1 💚 mvninstall 1m 20s the patch passed
+1 💚 compile 1m 26s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04
+1 💚 javac 1m 26s the patch passed
+1 💚 compile 1m 15s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~22.04-b08
+1 💚 javac 1m 15s the patch passed
+1 💚 checkstyle 0m 51s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 javadoc 0m 52s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04
+1 💚 javadoc 0m 52s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~22.04-b08
+1 💚 findbugs 3m 36s the patch passed
_ Other Tests _
+1 💚 unit 2m 16s tez-api in the patch passed.
+1 💚 unit 0m 25s tez-common in the patch passed.
+1 💚 unit 0m 38s tez-runtime-internals in the patch passed.
+1 💚 unit 4m 55s tez-dag in the patch passed.
+1 💚 asflicense 0m 39s The patch does not generate ASF License warnings.
53m 56s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-324/3/artifact/out/Dockerfile
GITHUB PR #324
JIRA Issue TEZ-4527
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile
uname Linux c3ccbb36f8b9 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/tez.sh
git revision master / 0c5cf68
Default Java Private Build-1.8.0_392-8u392-ga-1~22.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~22.04-b08
Test Results https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-324/3/testReport/
Max. process+thread count 457 (vs. ulimit of 5500)
modules C: tez-api tez-common tez-runtime-internals tez-dag U: .
Console output https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-324/3/console
versions git=2.34.1 maven=3.6.3 findbugs=3.0.1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@@ -2207,7 +2210,9 @@ public Void run() throws Exception {
}

// Check if the thread dump service is up in any case, if yes attempt a shutdown
Contributor

Remove the thread-dump-helper-specific comment and replace it with a more generic one saying that we're about to stop any hooks that are still running.

Contributor Author

Done
943012a
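
For illustration, the generic shutdown step requested above could look roughly like the following. This is a minimal sketch, not the code from the patch: the `Hook` interface, the `stopAll` method and the class name are all hypothetical. It stops every registered hook and keeps going if one of them fails.

```java
import java.util.ArrayList;
import java.util.List;

public class HookShutdownSketch {
  /** Hypothetical hook contract, for illustration only (not the Tez interface). */
  interface Hook {
    void stop() throws Exception;
  }

  /** Stops all hooks and returns how many stopped without throwing. */
  static int stopAll(List<Hook> hooks) {
    int stopped = 0;
    for (Hook hook : hooks) {
      try {
        hook.stop();
        stopped++;
      } catch (Exception e) {
        // One failing hook should not prevent the others from stopping.
        System.err.println("Failed to stop hook: " + e);
      }
    }
    return stopped;
  }

  public static void main(String[] args) {
    List<Hook> hooks = new ArrayList<>();
    hooks.add(() -> System.out.println("thread dump hook stopped"));
    hooks.add(() -> { throw new IllegalStateException("boom"); });
    hooks.add(() -> System.out.println("another hook stopped"));
    System.out.println("stopped " + stopAll(hooks) + " of " + hooks.size());
  }
}
```

Catching the exception per hook is a design choice of this sketch; a real implementation might prefer to log through its own logger or rethrow after all hooks have been attempted.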

@abstractdog
Contributor

@okumin : thanks for the patch, nice refactor, only minor comments, other than that, it looks good to me!

@okumin
Contributor Author

okumin commented Dec 21, 2024

Thanks. I think all the points follow your suggestions. I rebased the branch as the original one was already obsolete

@tez-yetus

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 25m 59s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ master Compile Tests _
+0 🆗 mvndep 2m 53s Maven dependency ordering for branch
+1 💚 mvninstall 13m 33s master passed
+1 💚 compile 2m 53s master passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04
+1 💚 compile 2m 37s master passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu222.04-ga
+1 💚 checkstyle 2m 38s master passed
+1 💚 javadoc 2m 21s master passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04
+1 💚 javadoc 2m 6s master passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu222.04-ga
+0 🆗 spotbugs 0m 50s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 5m 11s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 10s Maven dependency ordering for patch
+1 💚 mvninstall 1m 43s the patch passed
+1 💚 compile 1m 51s the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04
+1 💚 javac 1m 51s the patch passed
+1 💚 compile 1m 34s the patch passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu222.04-ga
+1 💚 javac 1m 34s the patch passed
-0 ⚠️ checkstyle 0m 10s tez-runtime-internals: The patch generated 1 new + 7 unchanged - 0 fixed = 8 total (was 7)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 javadoc 1m 2s the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04
+1 💚 javadoc 1m 1s the patch passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu222.04-ga
+1 💚 findbugs 4m 9s the patch passed
_ Other Tests _
+1 💚 unit 2m 15s tez-api in the patch passed.
+1 💚 unit 0m 26s tez-common in the patch passed.
+1 💚 unit 0m 47s tez-runtime-internals in the patch passed.
+1 💚 unit 4m 49s tez-dag in the patch passed.
+1 💚 unit 40m 54s tez-tests in the patch passed.
+1 💚 asflicense 0m 55s The patch does not generate ASF License warnings.
125m 6s
Subsystem Report/Notes
Docker ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-324/4/artifact/out/Dockerfile
GITHUB PR #324
JIRA Issue TEZ-4527
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile
uname Linux cebf547baff1 5.15.0-125-generic #135-Ubuntu SMP Fri Sep 27 13:53:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/tez.sh
git revision master / ca15119
Default Java Private Build-1.8.0_432-8u432-gaus1-0ubuntu222.04-ga
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_432-8u432-gaus1-0ubuntu222.04-ga
checkstyle https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-324/4/artifact/out/diff-checkstyle-tez-runtime-internals.txt
Test Results https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-324/4/testReport/
Max. process+thread count 1203 (vs. ulimit of 5500)
modules C: tez-api tez-common tez-runtime-internals tez-dag tez-tests U: .
Console output https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-324/4/console
versions git=2.34.1 maven=3.6.3 findbugs=3.0.1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@@ -70,21 +71,17 @@ private TezThreadDumpHelper(long duration, Configuration conf) throws IOExceptio
"path: {}", duration, basePath);
}

public TezThreadDumpHelper() {
Contributor

Hm, I can't recall what the purpose of this constructor was. Does reflection work without it being explicitly defined?
I'm afraid that, since there is a private parameterized constructor, class.newInstance() would throw an InstantiationException, wouldn't it?
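
The behaviour asked about here can be checked with plain JDK reflection. The sketch below uses two stand-in classes (`OnlyParameterized` and `WithNoArg` are hypothetical, not Tez classes): when a class declares only a parameterized constructor, `getDeclaredConstructor()` fails with `NoSuchMethodException`, and the deprecated `Class.newInstance()` fails with `InstantiationException`.

```java
import java.lang.reflect.Constructor;

public class ReflectionConstructorSketch {
  /** Stand-in class: its only constructor is private and takes an argument. */
  static class OnlyParameterized {
    private OnlyParameterized(long duration) { }
  }

  /** Stand-in class with an explicit public zero-argument constructor. */
  static class WithNoArg {
    public WithNoArg() { }
  }

  /** Returns "ok", or the simple name of the reflective exception that was thrown. */
  static String tryInstantiate(Class<?> clazz) {
    try {
      Constructor<?> ctor = clazz.getDeclaredConstructor();
      ctor.newInstance();
      return "ok";
    } catch (ReflectiveOperationException e) {
      return e.getClass().getSimpleName();
    }
  }

  /** Same check through the legacy Class.newInstance() call mentioned in the comment. */
  @SuppressWarnings("deprecation")
  static String tryLegacyNewInstance(Class<?> clazz) {
    try {
      clazz.newInstance();
      return "ok";
    } catch (ReflectiveOperationException e) {
      return e.getClass().getSimpleName();
    }
  }

  public static void main(String[] args) {
    System.out.println(tryInstantiate(OnlyParameterized.class));       // NoSuchMethodException
    System.out.println(tryLegacyNewInstance(OnlyParameterized.class)); // InstantiationException
    System.out.println(tryInstantiate(WithNoArg.class));               // ok
    System.out.println(tryLegacyNewInstance(WithNoArg.class));         // ok
  }
}
```

So the concern only applies to classes that are actually instantiated by reflection; a class created through an ordinary constructor call or a static factory is unaffected.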

Contributor Author

It was originally needed to instantiate

    private static class NoopTezThreadDumpHelper extends TezThreadDumpHelper {
      @Override
      public TezThreadDumpHelper start(String name) {
        // Do Nothing
        return this;
      }

      @Override
      public void stop() {
        // Do Nothing
      }
    }

with zero arguments.

I think the class is not constructed reflectively, and it isn't designed to be used that way. I slightly updated the modifiers to make sure of it.
61d8249

Contributor

@abstractdog abstractdog Dec 22, 2024

right, I was wrong, the hooks are created by reflection, but the TezThreadDumpHelper is not

    helper = TezThreadDumpHelper.getInstance(conf).start(id.toString());
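
The distinction drawn here, hooks created by reflection versus a helper obtained through a static factory, can be sketched as follows. All names in this example (`Hook`, `SampleHook`, `Helper`, `loadHooks`) are hypothetical and only mirror the shape of the discussion; the real Tez hook interfaces and `TezThreadDumpHelper.getInstance(conf)` may differ.

```java
import java.util.ArrayList;
import java.util.List;

public class HookLoadingSketch {
  /** Hypothetical hook contract; not the actual Tez interface. */
  public interface Hook {
    String name();
  }

  /** Created reflectively, so it needs an accessible zero-argument constructor. */
  public static class SampleHook implements Hook {
    public SampleHook() { }

    @Override
    public String name() {
      return "sample";
    }
  }

  /** Created through a static factory, so its constructor can stay private. */
  static final class Helper {
    private final long intervalMs;

    private Helper(long intervalMs) {
      this.intervalMs = intervalMs;
    }

    static Helper getInstance(long intervalMs) {
      return new Helper(intervalMs);
    }

    long intervalMs() {
      return intervalMs;
    }
  }

  /** Instantiates each class named in a comma-separated list of class names. */
  static List<Hook> loadHooks(String classNames) throws ReflectiveOperationException {
    List<Hook> hooks = new ArrayList<>();
    for (String name : classNames.split(",")) {
      String trimmed = name.trim();
      if (trimmed.isEmpty()) {
        continue;
      }
      Class<? extends Hook> clazz = Class.forName(trimmed).asSubclass(Hook.class);
      hooks.add(clazz.getDeclaredConstructor().newInstance());
    }
    return hooks;
  }

  public static void main(String[] args) throws Exception {
    List<Hook> hooks = loadHooks(SampleHook.class.getName());
    System.out.println("loaded hook: " + hooks.get(0).name());
    System.out.println("helper interval: " + Helper.getInstance(100L).intervalMs());
  }
}
```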

*/
@ConfigurationScope(Scope.DAG)
@ConfigurationProperty
public static final String TEZ_THREAD_DUMP_INTERVAL = "tez.thread.dump.interval";
public static final String TEZ_THREAD_DUMP_INTERVAL_DEFAULT = "0ms";
public static final String TEZ_HOOK_THREAD_DUMP_INTERVAL = "tez.hook.thread.dump.interval";
Contributor

@abstractdog abstractdog Dec 22, 2024

@okumin : I'm terribly sorry, I just realized that changing this causes more problems than it solves, since config options would change from one release to the next. The class name doesn't have "hook" in it either, so it's fine to keep this as "tez.thread.dump.interval". Are you OK with changing it back? TEZ_THREAD_DUMP_INTERVAL was also fine from this point of view.

Contributor Author

np. I renamed them back
293ce63
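
To illustrate the compatibility concern behind this revert: a user who set the old key in an earlier release would silently fall back to the default if the code started reading a renamed key. The sketch below uses `java.util.Properties` as a stand-in for the Hadoop `Configuration` class, and the `100ms` default and `500ms` user value are illustrative assumptions only.

```java
import java.util.Properties;

public class ConfigRenameSketch {
  static final String OLD_KEY = "tez.thread.dump.interval";
  static final String NEW_KEY = "tez.hook.thread.dump.interval";
  // Illustrative default for this sketch only.
  static final String DEFAULT = "100ms";

  /** Reads the interval under the given key, falling back to the default. */
  static String readInterval(Properties conf, String key) {
    return conf.getProperty(key, DEFAULT);
  }

  public static void main(String[] args) {
    // A user configuration written against the previous release.
    Properties userConf = new Properties();
    userConf.setProperty(OLD_KEY, "500ms");

    // Code that still reads the old key honours the user's setting.
    System.out.println(readInterval(userConf, OLD_KEY)); // 500ms
    // Code that reads a renamed key silently ignores it and uses the default.
    System.out.println(readInterval(userConf, NEW_KEY)); // 100ms
  }
}
```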

@abstractdog
Contributor

> Thanks. I think all the points follow your suggestions. I rebased the branch as the original one was already obsolete

thanks @okumin , this is very close, just left 2 comments

@abstractdog abstractdog self-requested a review December 22, 2024 09:44
Contributor

@abstractdog abstractdog left a comment

LGTM +1

@tez-yetus

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 33s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ master Compile Tests _
+0 🆗 mvndep 2m 53s Maven dependency ordering for branch
+1 💚 mvninstall 13m 20s master passed
+1 💚 compile 2m 53s master passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04
+1 💚 compile 2m 37s master passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu222.04-ga
+1 💚 checkstyle 2m 38s master passed
+1 💚 javadoc 2m 24s master passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04
+1 💚 javadoc 2m 6s master passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu222.04-ga
+0 🆗 spotbugs 0m 47s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 5m 7s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 10s Maven dependency ordering for patch
+1 💚 mvninstall 1m 40s the patch passed
+1 💚 compile 1m 48s the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04
+1 💚 javac 1m 48s the patch passed
+1 💚 compile 1m 36s the patch passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu222.04-ga
+1 💚 javac 1m 36s the patch passed
-0 ⚠️ checkstyle 0m 10s tez-runtime-internals: The patch generated 1 new + 7 unchanged - 0 fixed = 8 total (was 7)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 javadoc 1m 3s the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04
+1 💚 javadoc 1m 2s the patch passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu222.04-ga
+1 💚 findbugs 4m 14s the patch passed
_ Other Tests _
+1 💚 unit 2m 14s tez-api in the patch passed.
+1 💚 unit 0m 25s tez-common in the patch passed.
+1 💚 unit 0m 48s tez-runtime-internals in the patch passed.
+1 💚 unit 4m 49s tez-dag in the patch passed.
+1 💚 unit 42m 22s tez-tests in the patch passed.
+1 💚 asflicense 0m 57s The patch does not generate ASF License warnings.
100m 52s
Subsystem Report/Notes
Docker ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-324/5/artifact/out/Dockerfile
GITHUB PR #324
JIRA Issue TEZ-4527
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile
uname Linux 972ccfe046fc 5.15.0-125-generic #135-Ubuntu SMP Fri Sep 27 13:53:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/tez.sh
git revision master / ca15119
Default Java Private Build-1.8.0_432-8u432-gaus1-0ubuntu222.04-ga
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu122.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_432-8u432-gaus1-0ubuntu222.04-ga
checkstyle https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-324/5/artifact/out/diff-checkstyle-tez-runtime-internals.txt
Test Results https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-324/5/testReport/
Max. process+thread count 1173 (vs. ulimit of 5500)
modules C: tez-api tez-common tez-runtime-internals tez-dag tez-tests U: .
Console output https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-324/5/console
versions git=2.34.1 maven=3.6.3 findbugs=3.0.1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@abstractdog abstractdog merged commit 1084699 into apache:master Dec 22, 2024
4 checks passed
@okumin okumin deleted the TEZ-4527-hook branch December 22, 2024 12:48
@okumin
Contributor Author

okumin commented Dec 22, 2024

Thank you. This change is so helpful for us
