Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Official Air pre-commit hook #269

Open
DavisVaughan opened this issue Feb 27, 2025 · 9 comments
Open

Official Air pre-commit hook #269

DavisVaughan opened this issue Feb 27, 2025 · 9 comments

Comments

@DavisVaughan
Copy link
Collaborator

https://github.com/pre-commit/pre-commit

Pre-commit itself seems to be written in python, so requires python to install
https://pre-commit.com/#install

Ruff example: https://github.com/astral-sh/ruff-pre-commit

I think the vibe of how the Ruff one works is that they have this pre commit hook yaml:
https://github.com/astral-sh/ruff-pre-commit/blob/main/.pre-commit-hooks.yaml

Note the ruff format --force-exclude call.

And I think ruff itself gets installed from being listed as a requirement here:
https://github.com/astral-sh/ruff-pre-commit/blob/main/pyproject.toml

They do tag releases synced to each ruff version
https://github.com/astral-sh/ruff-pre-commit/tags

And this is how a user would reference that tag and call the hook:
https://github.com/astral-sh/ruff/blob/568cf88c6c5b5551a675ae2b13deedec0fe226cb/.pre-commit-config.yaml#L76-L79

@lorenzwalthert
Copy link

lorenzwalthert commented Feb 28, 2025

@DavisVaughan your observations are all correct IMO. Python must not necessarily be available, you can also install pre-commit with brew for example, then it will manage the python installation I think. Note that pre-commit always clones a repo using a tag or a hash from rev: field in the .pre-commit-config.yaml, which is then installed (with all its dependencies).

Unifying approach
Hence, you could also place the .pre-commit-config.yaml file in the posit-dev/air repo directly. Then, if you push new tags to this repo (which I guess you do anyways if you release a new version), pre-commit autoupdate will recognise them and pre-commit autoupdate will pick it up. Note pre-commit autoupdate --bleeding-edge or so uses always the latest commit hash from the default branch, but few people seem to use that.

Splitting approach
You could also follow ruff and create a repo like posit-dev/air-pre-commit. If you don't publish the crates (which I assume you don't currently according to #98), this approach is not available to you I think, since your air-pre-commit repo can't list air as a dependency that can be resolved and installed by a package manager for rust. Here's how you'd make hook updates available to your users:

  • bump the dependency of the posit-dev/air-pre-commit on air.
  • push a new tag corresponding to the new version to posit-dev/air-pre-commit to ensure that pre-commit autoupdate fetches the latest version of the hook.

There are many examples where the pre-commit hooks configs are directly in the root of the actual project. I'd say considerations pro and con of unifying vs splitting:

  • easier update process if unifying (as described above).
  • git clone a lightweight git repo and then installing air as a dependency from a package server like pypi where potentially binaries are available (splitting) is probably faster than cloning.
  • more flexibility with splitting (as repos can evolve separately).

@DavisVaughan
Copy link
Collaborator Author

Noting that what we really need is something like this pre-commit/pre-commit#1453 to download pre installed binaries using pre-commit

@DavisVaughan
Copy link
Collaborator Author

DavisVaughan commented Feb 28, 2025

IIUC with Unifying Approach we'd set language: rust and pre-commit would cargo install --bins on the fly for our users (even bootstrapping rustup as required). While this is nice, I think we should not do that because we've gone through all the work of making easy to install binaries - we should use them if at all possible.

So that likely leaves us with Splitting Approach, where we need to decide which language to use. A few ideas:

language: R

Create a pseudo-R package at posit-dev/air-pre-commit. The R package would be in charge of downloading the correct binary depending on the platform, and then running it. This would be pretty simple I think, we just need to map the OS and Architecture to the right GitHub Artifact URL

https://github.com/posit-dev/air/releases/download/<VERSION>/air-<ARCH>-<OS>.tar.gz

For VERSION, we'd put a constant in the package as VERSION <- 0.4.1 which would correspond to the tag that we release at, so when people set rev: 0.4.1 in their pre-commit-config.yml they get the right version.

I'm not exactly sure how caching of the binary between commits would work here. It sounds like maybe it is up to the R package, and we could use something like rappdirs for the location.

IIUC pre-commit would build the R package the first time around and then reuse the built package for subsequent calls until the tag is updated

language: Python

Be like ruff and compile binary wheels and upload an air python package to pypi that just makes air available. Don't love this idea as it is a lot of extra work on our end for little benefit.

language: system

https://pre-commit.com/#system

We could write a "system" hook that assumes that the user has already installed the binary. In this case:

  • We could manage it through the posit-dev/air repo (but there are still arguments for keeping it separate, lightweight clone times for example)
  • It would be very fast as we don't have to wait for R to start up and shut down
  • But it would go against the ethos of pre-commit a bit by requiring that the air binary already be installed
  • It does not let you easily pin a version, which seems like a footgun

I don't mind this that much, and if language: download ever gets implemented we could switch to that.

@lorenzwalthert
Copy link

lorenzwalthert commented Feb 28, 2025

So that likely leaves us with Splitting Approach, where we need to decide which language to use. A few ideas:

With both unifying and splitting, you will point to a git repo and specify a language property, that's not unique to splitting. If you choose language: rust , the git repo pre-commit will install from has to be installable with cargo (into some pre-commit managed directory in your system's cache, not the working directory). So if you have language: rust:

  • With the unifying approach, the git repo from posit-dev/air is cloned and installed from source. The package has to expose the CLI that you then put into entry:.
  • With the splitting approach, the rust package at posit-dev/air-pre-commit exists for the mere reason of having a dependency on the package you actually want to use and installing that dependency (with a package manager from a remote package repo) will expose a CLI program like ruff that you then put into entry: field (FWIW for language: r it's not a CLI program, but for many other languages it is). This in my understanding would not work currently, since you don't distribute air through standard rust package repositories (like CRAN for R).

To sum it up: whether or not you choose a splitting or unifying approach, you need to choose a pre-commit language. But if you choose unifying approach and language:r, posit-dev/air has to be a rust and R Package at the same time which brings additional complexity you want to avoid.

@lorenzwalthert
Copy link

lorenzwalthert commented Feb 28, 2025

On which pre-commit language to use: pre-commit supports standard package management mechanisms for R, rust, python and other languages. If airs usage is limited to the command line and positron / RStudio, you may not see a lot of benefit distributing it through standard package repos like CRAN, pypi etc. {styler} has been used in many different third-party tools that want to format code and I think it was beneficial to have it installable as an R package through standard mechanisms. But that's up to you. In that regard, the fact that {styler} is a pure R implementation is also a benefit. Also, a number of R users don't feel comfortable leaving the R console and an R API was certainly beneficial for {styler}.

If you go for language: r, you get the side benefit of R bindings for other use cases than just pre-commit. As per the docs, the repo needs to use {renv} to manage dependencies (none in your case). As you correctly point out, you would update the tag of the repo in sync with the pointer to the binary in https://github.com/posit-dev/air/releases/download/<VERSION>/air-<ARCH>-<OS>.tar.gz in the R package source code (or just depend on DESCRIPTION to read off the version value that you should keep in sync with the tag).

I'm not exactly sure how caching of the binary between commits would work here. It sounds like maybe it is up to the R package, and we could use something like rappdirs for the location.

There is one virtual environment per hook repo and rev for 2nd class languages and lower, so if you use language: r or language: rust, a new virtual environment is created if rev: changes as a consequence of pre-commit autoupdate. For R specifically, there is also a health check, which means if the R version changes (and R packages installed with previous R version are not available anymore), the repo will be re-installed.

IIUC pre-commit would build the R package the first time around and then reuse the built package for subsequent calls until the tag is updated

Correct.

language: system indeed seems easy but problematic, and language: r is quite slow, which kind of defeats the purpose of air 😀. I'd go for language: rust, I don't think the overhead with rust bootstrapping is that big or complicated for the user, in particular if they use other hooks already (like ruff). Plus it's much easier to manage on your side, in particular with the unified approach.
Also, you can even expose multiple hooks in the hook repo, one with rust and one with system. The entry: can be the same. The user can also override language from .pre-commit-hook.yaml in the hook repo in his .pre-commit-config.yaml but it's not really recommended.

@lorenzwalthert
Copy link

Sorry my comments get so long. Hope they are not too convoluted.

@SamEdwardes
Copy link

In lieu of an official pre-commit hook, you can use this if you already have air installed:

repos:
    # R
    - repo: local
      hooks:
          # https://github.com/posit-dev/air
          - id: air
            name: Format R code with air
            entry: air format .
            language: system
            files: \.R$

@lorenzwalthert
Copy link

@SamEdwardes without pass_filenames: False, I think you are formatting the working directory and the files staged for commit. I think you probably just want to format the staged files, so probably remove the . in air format .?

@SamEdwardes
Copy link

Thank you for the suggestion! I am new to pre-commit and appreciate the feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants