Right now we have a simple scheduler that just picks one hardcoded cluster (context) and submits the job to it.
To scale up and compile many kernels at once, we need to support multiple clusters (contexts) in the config and distribute jobs between them.
As the issue is a bit urgent, I suggest doing it in 2 steps:
First step: implement it in the simplest way possible, e.g. just support multiple contexts in the config at runtime and pick one of them randomly.
The main issue is how to convert:
(We will have 5 clusters in total.)
So when we hit this runtime, it will distribute the load roughly evenly (on average) across the available clusters.
I suggest the following:
1) Support multiple contexts in the config.
2) Use a random function and pick one of the contexts based on the random value,
e.g.
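A minimal sketch of such a random pick, assuming the config exposes the contexts as a plain list. The `context` key, the cluster names, and the `pick_context` helper are hypothetical and only illustrate the idea; they are not taken from the actual config or scheduler code.

```python
import random

# Hypothetical config fragment: the key name and the cluster names are
# illustrative only (5 clusters in total, as mentioned above).
config = {
    "context": ["cluster-a", "cluster-b", "cluster-c", "cluster-d", "cluster-e"],
}


def pick_context(contexts, rng=random):
    """Return one context chosen uniformly at random."""
    # Accept a single string too, so a single-context config keeps working.
    if isinstance(contexts, str):
        return contexts
    if not contexts:
        raise ValueError("no contexts configured")
    return rng.choice(list(contexts))


context = pick_context(config["context"])
```

A uniform pick only balances the load on average and ignores cluster capacity, which is why it is proposed here as a first step only.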
Second step: once other teams can start working on other issues and we can spend more time on a more elegant design, we need to design and implement more sophisticated logic that picks the context based on kernel "size", cluster "size", and the load on the cluster.
Several considerations:
1) Some kernels require "big" clusters (RAM and CPU capacity), of which we have only 2 at the moment. This applies at least to the "allmodconfig" and "gki_defconfig" kernels.
2) In the future we might need to monitor the load on each cluster and pick the context based on it. For example, if a cluster has too many jobs in the pending state, it is better to pick another cluster.
3) We need some kind of "weight" for each cluster that gives us the approximate computing capacity of the cluster.
The only critical requirement is item 1, as an allmodconfig kernel will simply cause an OOM on a small cluster.
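The three considerations could be combined roughly as in the sketch below. Everything in it is an assumption made for illustration: the `Cluster` fields, the `max_pending` threshold, the cluster names, and the idea that the pending-job count is already available from some monitoring source. It also assumes every weight is positive.

```python
import random
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    big: bool          # has enough RAM/CPU for the heavy configs
    weight: float      # approximate relative computing capacity (> 0)
    pending: int = 0   # jobs currently pending (hypothetical monitoring input)


# Configs named in this issue as needing a "big" cluster.
BIG_CONFIGS = {"allmodconfig", "gki_defconfig"}


def pick_cluster(clusters, defconfig, max_pending=10, rng=random):
    """Pick a cluster for a build of the given defconfig."""
    # Consideration 1 (critical): heavy kernels go only to big clusters.
    if defconfig in BIG_CONFIGS:
        candidates = [c for c in clusters if c.big]
    else:
        candidates = list(clusters)
    if not candidates:
        raise ValueError(f"no suitable cluster for {defconfig}")
    # Consideration 2: skip clusters with too many pending jobs, but fall
    # back to all candidates if every one of them is busy.
    idle = [c for c in candidates if c.pending < max_pending]
    pool = idle or candidates
    # Consideration 3: weighted random pick, proportional to capacity.
    return rng.choices(pool, weights=[c.weight for c in pool], k=1)[0]


clusters = [
    Cluster("big-1", big=True, weight=4.0),
    Cluster("big-2", big=True, weight=4.0, pending=25),
    Cluster("small-1", big=False, weight=1.0),
]
# big-2 is over the pending threshold, so big-1 is the only candidate left.
print(pick_cluster(clusters, "allmodconfig").name)  # big-1
```

Filtering by size first keeps the critical requirement a hard constraint, while load and weight only influence the choice among clusters that can actually run the job.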