
Local Monorepo Build Orchestration #1121

Open
rydrman opened this issue Sep 4, 2024 · 9 comments

rydrman commented Sep 4, 2024

Is your feature request related to a problem? Please describe.

At ILM, we have a distinct workflow where many packages are maintained in a single repo (not currently spk). Our developer and CI workflows operate on "collections" that construct a simple dependency DAG and then build, in dependency order, any packages that don't already exist. Notably, the package recipes work for all possible versions of the package, so the version is provided by the collection, not the package spec.

Describe the solution you'd like

We would like to convert these packages into spk packages and then have a way to orchestrate builds in the same way using spk.

Describe alternatives you've considered

We could hook spk into our existing system, e.g. build the DAG ourselves and then invoke spk, but resolving the dependencies properly in such a DAG is hard because spk spec files are difficult to introspect from the outside, especially once a set of var options enters the picture.
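For illustration, the collection-level ordering described above amounts to a topological sort over the dependency DAG. A minimal sketch in Python with hypothetical package names (no spk APIs involved):

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical collection: package -> set of in-collection dependencies.
collection = {
    "imath": set(),
    "openexr": {"imath"},
    "dcc-plugin": {"openexr", "imath"},
}

# static_order() yields packages so that every dependency comes
# before its dependents; building in this order satisfies the DAG.
build_order = list(TopologicalSorter(collection).static_order())
print(build_order)
```

`graphlib` raises `CycleError` on a cyclic graph, which also gives a cheap sanity check on the collection definition.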

@rydrman rydrman added the agenda item Items to be brought up at the next dev meeting label Sep 4, 2024
@rydrman rydrman changed the title Local Monorepo Build Graph Local Monorepo Build Orchestration Sep 4, 2024

rydrman commented Sep 11, 2024

From the meeting today:

  • related to Add --tests flag to spk info #1120 in that we could possibly do this with more useful introspection of spk recipes
  • also related to other issues orchestrating variant builds in CI and properly determining what needs to be done
  • we kind of have this if we do build from source with published packages, though there is no great way to interpret the output of spk explain
    • this can't handle versions, so we could instead inject the version in order to create the source packages and then run spk
  • this is also related to the package specs and build process that we have in the spk repo itself. This is a good first use case for related functionality - having it be able to build the tree on its own without Make.


rydrman commented Nov 20, 2024

From the meeting today:

  • a "workspace" concept to allow spec discovery and some top-level configuration
  • extension of the existing platform spec to better support build-time vs run-time usage
    • at build time you either put a request or specify an exact package version (in case the workspace has templated recipes that you want to build)
    • at run time being able to leverage components to have different requests from the build ones (build is usually lowest-supportable, run is usually highest-compatible)
    • a more complete proposal from this discussion would be a good next step

@rydrman rydrman added needs proposal A good idea that is ready for a more concrete implementation plan and removed agenda item Items to be brought up at the next dev meeting labels Nov 20, 2024

rydrman commented Nov 21, 2024

Here's a WIP proposal for the YAML structure to keep the conversation going:

```yaml
api: v0/workspace

# a set of package recipes can be used to build new versions
# of packages as needed by the orchestration process
recipes:
  # either globs or individual paths can be used to collect recipes
  - packages/*/*.spk.yml
  # these can either be platform recipes or package recipes that
  # might be referenced by builds in the workspace
  - platforms/*/*.spk.yml
```

```yaml
api: v1/platform
platform: dcc-platform
base:
    # the ilm workflow requires inheriting from multiple bases
    # so we update to support a list in this position, order matters
    - company-platform
    - exr-platform

# the requirements of a platform are used to generate
# 'ifAlreadyPresent' requests at build- and run-time.
# another name for this concept might be 'constraints' or
# 'opinions' if we are concerned about the deviation of
# the structure from package requirements
requirements:
  # the entry structure takes inspiration from the designs in
  # https://github.com/spkenv/spk/pull/296
  - pkg: gcc/9.3.1

  # when building all packages from a platform, the value from the
  # primary pkg line is used to template package specs and so should
  # be a complete version number if that functionality is desired
  - pkg: gcc/9.3.1 # tries to template & build gcc/9.3.1

  # separate requests for use in the build and run components of this
  # platform are created from the same primary version but can be
  # overridden by any version range specifier as appropriate. To make
  # the above build process work, I think we would need to avoid
  # accepting range specifiers in the main pkg line.
  - pkg: gcc/9.3.1
    atBuild: '>=9.3.1'
    atRuntime: ~9.3

  # using false in either scenario stops the request from
  # being added to that component, and would also effectively
  # remove the request if it had come from the base
  - pkg: imath
    atBuild: false
```
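To make the intended semantics concrete, here is a hypothetical sketch of how build- and run-time requests might be derived from one of these requirement entries. The entry shape, defaults, and `platform_requests` helper are all assumptions drawn from the comments above, not actual spk behavior:

```python
def platform_requests(entry: dict) -> dict:
    """Derive build- and run-time 'ifAlreadyPresent' requests from a
    platform requirement entry (hypothetical sketch, not spk code).

    - the primary pkg version is the default for both components
    - atBuild/atRuntime override that default with a range specifier
    - a value of False suppresses the request for that component
    """
    name, _, version = entry["pkg"].partition("/")
    requests = {}
    for key, component in (("atBuild", "build"), ("atRuntime", "run")):
        value = entry.get(key, version or None)
        if value is False:
            requests[component] = None  # request suppressed
        elif value:
            requests[component] = f"{name}/{value}"
        else:
            requests[component] = name  # no version constraint at all
    return requests

print(platform_requests({"pkg": "gcc/9.3.1", "atBuild": ">=9.3.1", "atRuntime": "~9.3"}))
print(platform_requests({"pkg": "imath", "atBuild": False}))
```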


rydrman commented Dec 4, 2024

From the meeting today:

  • possibly split pkg: name/version into separate fields to make it clearer that it is the version to be built (what would we call it?)
  • could it make sense to have something about default or available targets within the workspace file? or does having it all in platforms still make sense?
  • we need better language to separate the current building of platforms from the new 'build all the packages for this platform'
  • how does this help with bootstrapping platforms? E.g. ensuring that new platform specs can be defined and used to build packages that also want to rely on that platform for their build environment


rydrman commented Dec 13, 2024

Okay, I'm rethinking the approach here a little bit - based on our discussion and just imagining the workflows:

```yaml
api: v0/workspace

recipes:
  # collect all of the recipes in the workspace
  - packages/**.spk.yaml

  # some recipes require additional information
  # which can be augmented even if they were already
  # collected above

  - path: packages/python/python2.spk.yaml
    # here, we define the specific versions that can
    # be built from a recipe
    versions: [2.7.18]

  - path: packages/python/python3.spk.yaml
    # we can use bash-style brace expansion to define
    # ranges of versions that are supported
    versions:
      - '3.7.{0..17}'
      - '3.8.{0..20}'
      - '3.9.{0..21}'
      - '3.10.{0..16}'
      - '3.11.{0..11}'
      - '3.12.{0..8}'
      - '3.13.{0..1}'
```

My goal is to get something going that will be able to build a couple of consistent sets of packages from the current repo as a start. In this setup, the platforms don't factor into this any more than simply being used as dependencies like they are already, but I'm not confident that this is the final form of this feature yet, either.
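The brace-expansion idea above can be sketched roughly as follows; `expand_versions` is a hypothetical helper that handles only a single `{A..B}` numeric range, not a real spk function:

```python
import re

def expand_versions(pattern: str) -> list:
    """Expand one bash-style {A..B} numeric range in a version pattern.

    A sketch of the expansion proposed above; real recipes might need
    multiple ranges or step values, which this ignores.
    """
    m = re.search(r"\{(\d+)\.\.(\d+)\}", pattern)
    if not m:
        return [pattern]  # plain version, nothing to expand
    lo, hi = int(m.group(1)), int(m.group(2))
    return [pattern[:m.start()] + str(n) + pattern[m.end():] for n in range(lo, hi + 1)]

print(expand_versions("3.13.{0..1}"))  # ['3.13.0', '3.13.1']
```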

This was referenced Dec 17, 2024

davvid commented Dec 28, 2024

Here are a few thoughts/questions that might influence the design or point out missing features.

  • Q: Is this design limited to being able to build packages on a single node?

  • A(?): An orchestrating process could build individual packages by issuing specific spk mkb commands from the workspace.

  • Q: If we were to attempt to build packages across multiple nodes, where each machine has its own local spfs repository but shares a central origin repository, do we expose enough metadata and package details to make it possible?

I don't know the answer to this one. In the abstract, it seems like an orchestration process would need to be able to query the following details:

  • The full set of packages that need to be built. How can we query this information?
    • Furthermore, differentiating between packages that we already have prebuilt and ones that are yet to be built seems like a key query.
  • The build dependency edges between these packages, for scheduling purposes. I.e. we need to know that we should build package A version AV with options AO before we build package B version BV with options BO.

An orchestration process could basically work by querying for the packages that have not yet been built, determining which ones have all of their dependencies already satisfied, and then running spk build commands for each of these packages on remote hosts. This would allow us to scale up to building packages in parallel across multiple nodes.
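That loop might look something like the following sketch, where `deps` maps each package to its build dependencies; all names are hypothetical and nothing here reflects an actual spk API:

```python
def schedule(unbuilt, built, deps):
    """Yield batches of packages whose build dependencies are all satisfied.

    unbuilt: packages still needing a build
    built:   packages already available (prebuilt, or built earlier)
    deps:    package -> set of build dependencies
    """
    remaining = set(unbuilt)
    done = set(built)
    while remaining:
        # a package is ready once every one of its dependencies is built
        ready = {p for p in remaining if deps[p] <= done}
        if not ready:
            raise RuntimeError("dependency cycle or unsatisfiable dependencies")
        yield sorted(ready)  # each batch could be dispatched in parallel
        done |= ready
        remaining -= ready

deps = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
print(list(schedule({"a", "b", "c", "d"}, set(), deps)))
# [['a'], ['b', 'c'], ['d']]
```

In a real orchestrator each batch would be dispatched to remote builders, and the "build error" case below would terminate the loop early.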

The builder node may not want to publish the package to origin, so one strategy may be to copy the *.spk exported layers back to the orchestration node. Or, would this be better done using an ephemeral spk remote for the build event?

If we are shuffling *.spk files around then the remote nodes could in theory start each step of their build process by importing the previously-built spk files into their local spfs repository.

The orchestration process would perform these steps in a loop until no packages remain to be built or until a build error occurs. At this point the build is either Complete or Failed.

This design implies that the query for the set of packages to build should be an incremental query that can be performed against the origin + local repository.

It doesn't seem like anything in the v0/workspace spec prevents this design, but are there any missing features or tooling that would be needed in order to make a fully distributed, multi-node build possible?


rydrman commented Dec 30, 2024

Let me lay out my initial thoughts and we can see if there's a gap between what you are hoping for. I'm picturing that this is going to require a new command, something like spk build-tree that would additionally need to be able to output the entire planned tree rather than just building it all. That output could be used to script an integration with any CI system to spread builds over multiple nodes. In terms of orchestrating the moving of packages, I think that is already solved by the spfs namespacing that @jrray is using for a similar multi-node pipeline workflow IIRC. The TL;DR is that by setting the same spfs namespace for all jobs, they can have a shared corner of the existing studio repo that is only visible to that set of jobs. Once the jobs are all complete the packages within that namespace could be published or left alone to be further tested, promoted etc.

Also, your initial answer is correct, the package specs would remain buildable individually just as they would have been before these changes.

@rydrman rydrman removed the needs proposal A good idea that is ready for a more concrete implementation plan label Dec 30, 2024

davvid commented Dec 31, 2024

Thanks for the details. That helped fill the gaps for me. spk build-tree queries and namespaces sounds like a great solution. The flexibility provided by namespaces unlocks a lot of options.

Here's how I'm understanding this so far. If the namespace feature allows for the use of multiple namespaces for reading/resolving, then builds can be set up to resolve packages from both the "default" namespace and a build-specific namespace. As new packages are built, they'll get published into the separate build-specific namespace of the local and remote repositories so that downstream builds can access them. That's a very nice solve for the moving-of-packages problem.

Another variation that namespaces unlock is when you want to rebuild everything completely from scratch without any chance of using packages from the "default" namespace. One way to do that would be to set up the builds so that only the build-specific namespace is used, preventing any packages from the "default" namespace from being used.

That approach might not be very useful in practice without also being able to activate the "default" namespace for just its source packages, though. Rebuilding everything from source isn't a very common scenario but it might highlight the utility of being able to enable just the source packages from a particular namespace for resolving separately from its non-source packages.

That seems like a more complex concept (for a less common use case) that can be layered over the core concept of namespaces later, though, so it's probably too soon to consider compared to the core idea of spk build-tree and namespaces. Thanks for the explanation.


rydrman commented Mar 5, 2025

From the meeting today:

  • still happy with the spec as defined above (and below)
  • SPI will need to consider migrating off of the default "all" component behavior to take advantage of this, but that needs to be done anyway
```yaml
api: v1/platform
platform: dcc-platform
base:
    # the ilm workflow requires inheriting from multiple bases
    # so we update to support a list in this position, order matters
    - company-platform
    - exr-platform

# the requirements of a platform are used to generate
# 'ifAlreadyPresent' requests at build- and run-time.
# another name for this concept might be 'constraints' or
# 'opinions' if we are concerned about the deviation of
# the structure from package requirements
requirements:
  # the entry structure takes inspiration from the designs in
  # https://github.com/spkenv/spk/pull/296
  - pkg: gcc/9.3.1

  # when building all packages from a platform, the value from the
  # primary pkg line is used to template package specs and so should
  # be a complete version number if that functionality is desired
  - pkg: gcc/9.3.1 # tries to template & build gcc/9.3.1

  # separate requests for use in the build and run components of this
  # platform are created from the same primary version but can be
  # overridden by any version range specifier as appropriate. To make
  # the above build process work, I think we would need to avoid
  # accepting range specifiers in the main pkg line.
  - pkg: gcc/9.3.1
    atBuild: '>=9.3.1'
    atRuntime: ~9.3

  # using false in either scenario stops the request from
  # being added to that component, and would also effectively
  # remove the request if it had come from the base
  - pkg: imath
    atBuild: false
```
