Commit Graph

9 Commits

Author SHA1 Message Date
Alex Crichton
1efee4abdf Update CI to use GitHub's Merge Queue (#5766)
GitHub recently made its merge queue feature available for use in public
repositories owned by organizations meaning that the Wasmtime repository
is a candidate for using this. GitHub's Merge Queue feature is a system
that's similar to Rust's bors integration where PRs are tested before
merging and only passing PRs are merged. This implements the "not rocket
science" rule where the `main` branch of Wasmtime, for example, is
always tested and passes CI. This is in contrast to our current
implementation of CI where PRs are merged when they pass their own CI,
but the code that was tested is not guaranteed to be the state of `main`
when the PR is merged, meaning that we're at risk now of a failing
`main` branch despite all merged PRs being green. This has happened
with Wasmtime, but it is not a common occurrence.

The main motivation, instead, to use GitHub's Merge Queue feature is
that it will enable Wasmtime to greatly reduce the amount of CI running
on PRs themselves. Currently the full test suite runs on every push to
every PR, meaning that our workers on GitHub Actions are frequently
clogged throughout weekdays and PRs can take quite some time to come
back with a successful run. Through the use of a Merge Queue, however,
we're able to configure only a small handful of checks to run on PRs
while deferring the main body of checks to happen on the
merge-via-the-queue itself. This is hoped to free up capacity on CI and
overall improve CI times for Wasmtime and Cranelift developers.

The implementation of all of this required quite a lot of plumbing and
retooling of our CI. I've been testing this in an [external
repository][testrepo] and I think everything is working now. The
changes made in this PR are:

* The `build.yml` workflow is merged back into the `main.yml` workflow
  as the original reason to split it out is no longer applicable (it'll
  run on all merges). This was also done to fit in the dependency graph
  of jobs of one workflow.

* Publication of the `gh-pages` branch, the `dev` tag artifacts, and
  release artifacts have been moved to a separate
  `publish-artifacts.yml` workflow. This workflow runs on all pushes to
  `main` and all tags. This workflow no longer actually performs any
  builds, however, and relies on a merge queue or similar being used for
  branches/tags where artifacts are downloaded from the workflow run to
  be uploaded. For pushes to `main` this works because a merge queue is
  run meaning that by the time the push happens all artifacts are ready.
  For release branches this is handled by..

* The `push-tag.yml` workflow is subsumed by the `main.yml` workflow. CI
  for a tag being pushed will upload artifacts to a release in GitHub,
  meaning that all builds must finish first for the commit. The
  `main.yml` workflow at the end now scans commits for the preexisting
  magical marker and pushes a tag if necessary.

* CI is currently a flat list of "run all these jobs" and this is now
  rearchitected to a "fan out" approach where some jobs run to determine
  the next jobs to run which then get "joined" into a finish step. The
  purpose for this is somewhat nuanced and this has implications for CI
  runtime as well. The Merge Queue feature requires branches to be
  protected with "these checks must pass" and then the same checks gate
  both entering the merge queue and passing the merge queue.
  The saving grace, however, is that a "skipped" check counts as
  passing, meaning checks can be skipped on PRs but run to completion on
  the merge queue. A problem with this though is the build matrix used
  for tests where PRs want to only run one element of the build matrix
  ideally but there's no means on GitHub Actions right now for the
  skipped entries to show up as skipped easily (or not that I know of).
  This means that the "join" step serves the purpose of being the single
  gate for both PR and merge queue CI and there's just more inputs to it
  for merge queue CI. The major consequence of this decision is that
  GitHub's actions scheduling doesn't work out well here. Jobs are
  scheduled in a FIFO order meaning that the job for "ok complete the CI
  run" is queued up after everything else has completed, possibly
  after lots of other CI requests in the middle for other PRs. The hope
  here is that by using a merge queue we can keep CI relatively under
  control and this won't affect merge times too much.

* All jobs in the `main.yml` workflow will now automatically cancel the
  entire run if they fail. Previously this fail-fast behavior was only
  part of the matrix runs (and just for that matrix), but this is
  required to make the merge queue expedient. The gate of the merge
  queue is the final "join" step which is only executed once all
  dependencies have finished. This means, for example, that if rustfmt
  fails quickly then the tests which take longer might run for quite
  a while before the join step reports failure, meaning that the PR sits
  in the queue for longer than needed being tested when we know it's
  already going to fail. By having all jobs cancel the run this means
  that failures immediately bail out and mark the whole job as
  cancelled.

* A new "determine" CI job was added to determine what CI actually needs
  to run. This is a "choke point" which is scheduled at the start of CI
  that quickly figures out what else needs to be run. This notably
  indicates whether large swaths of CI (the `run-full` flag) like the
  build matrix are executed. Additionally this dynamically calculates a
  matrix of tests to run based on a new `./ci/build-test-matrix.js`
  script. Various inputs are considered for this such as:

  1. All pushes, meaning merge queue branches or release-branch merges,
     will run full CI.
  2. PRs to release branches will run full CI.
  3. PRs to `main`, the most common, determine what to run based on
     what's modified and what's in the commit message.

  Some examples for (3) above are if modifications are made to
  `cranelift/codegen/src/isa/*` then that corresponding builder is
  executed on CI. If the `crates/c-api` directory is modified then the
  CMake-based tests are run on PRs but are otherwise skipped.
  Annotations in commit messages such as `prtest:*` can be used to
  explicitly request testing.
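
The three rules above might be sketched roughly as follows. This is an illustrative Python sketch only, not the actual `./ci/build-test-matrix.js` script: the function name `determine`, the event strings, the job names (`isa-*`, `cmake-tests`), and the `prtest:full` spelling of a "run everything" annotation are all assumptions made for the example.

```python
import re


def determine(event, base_branch, changed_files, commit_message):
    """Return (run_full, extra_jobs) for one CI run.

    Hypothetical sketch: event names, job names, and the `prtest:full`
    annotation are assumptions for illustration, not Wasmtime's real values.
    """
    # Rule 1: all pushes (merge queue branches, release-branch merges)
    # run full CI.
    if event == "push":
        return True, set()

    # Rule 2: PRs to release branches run full CI.
    if base_branch.startswith("release-"):
        return True, set()

    # Rule 3: PRs to `main` look at what is modified and at annotations
    # of the form `prtest:<name>` in the commit message.
    requested = set(re.findall(r"prtest:([\w-]+)", commit_message))
    if "full" in requested:
        return True, set()

    jobs = set(requested)
    for path in changed_files:
        # A change under cranelift/codegen/src/isa/<arch>/ selects that
        # architecture's builder.
        match = re.match(r"cranelift/codegen/src/isa/([^/]+)/", path)
        if match:
            jobs.add("isa-" + match.group(1))
        # A change under crates/c-api selects the CMake-based tests.
        if path.startswith("crates/c-api/"):
            jobs.add("cmake-tests")
    return False, jobs
```

For instance, under these assumptions a PR to `main` that only touches `crates/c-api/src/lib.rs` would yield `(False, {"cmake-tests"})`, while any push event yields `(True, set())`.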

Before this PR, merges to `main` would perform two full runs of CI: one
on the PR itself and one on the merge to `main`. Note that the one as a
merge to `main` was quite frequently cancelled due to a merge happening
later. Additionally before this PR there was always the risk of a bad
merge where what was merged ended up creating a `main` that failed CI due
to a non-code-related merge conflict.

After this PR, merges to `main` will perform one full run of CI, the one
as part of the merge queue. PRs themselves will perform one test job
most of the time otherwise. The `main` branch is additionally always
guaranteed to pass tests via the merge queue feature.

For release branches, before this PR, merges would perform two full
builds - one for the PR and one for the merge. A third build was then
required for the release tag itself. This is now cut down to two full
builds, one for the PR and one for the merge. The reason for this is
that the merge queue feature currently can't be used for our
wildcard-based `release-*` branch protections. It is now possible,
however, to turn on required CI checks for the `release-*` branch PRs so
we can at least have a "hit the button and forget" strategy for merging
PRs now.

Note that this change to CI is not without its risks. The Merge Queue
feature is still in beta and is quite new for GitHub. One bug that
Trevor and I uncovered is that if a PR is being tested in the merge
queue and a contributor pushes to their PR then the PR isn't removed
from the merge queue but is instead merged when CI is successful, losing
the changes that the contributor pushed (what's merged is what was
tested). We suspect that GitHub will fix this, however.

Additionally though there's the risk that this may increase merge time
for PRs to Wasmtime in practice. The Merge Queue feature has the ability
to "batch" PRs together for a merge but this is only done if concurrent
builds are allowed. This means that if 5 PRs are batched together then 5
separate merges would be created for the stack of 5 PRs. If the CI for
all 5 merged together passes then everything is merged, otherwise a PR
is kicked out. We can't easily do this, however, since a major purpose
for the merge queue for us would be to cut down on usage of CI builders
meaning the max concurrency would be set to 1 meaning that only one PR
at a time will be merged. This means PRs may sit in the queue for a while
since previously many `main`-based builds were cancelled due to
subsequent merges of other PRs, but now they must all run to 100%
completion.

[testrepo]: https://github.com/bytecodealliance/wasmtime-merge-queue-testing
2023-02-16 19:18:42 +00:00
Alex Crichton
7669a96179 Reduce warnings on CI from GitHub Actions (#5083)
* Upgrade our github actions to "node16"

Each github actions run has a lot of warnings about using node12 so this
upgrades our repository to using node16. I'm hoping no other changes are
needed and I suspect other actions we're using are on node12 and will
need further updates, but this should help pin down what's remaining.

* Update `actions/checkout` workflow to `v3`

* Update to `actions/cache@v3`

* Update to `actions/upload-artifact@v3`

* Drop usage of `actions-rs/toolchain`

* Update to `actions/setup-python@v4`

* Update mdbook version
2022-10-20 23:11:38 -05:00
Alex Crichton
63c9e5d46d Allow empty commits for the release (#4927)
The release process failed last night due to me filling out the dates in
the release notes early (rather than leaving "Unreleased") which meant
there were no changes for each commit. Switch to passing `--allow-empty`
when making a commit to prevent this.
2022-09-20 14:45:18 +00:00
Alex Crichton
839c4cce17 Remove the 'skip ci' annotation from the release process (#4476)
With branch protections enabled that would otherwise mean that the PR
cannot be landed since CI is now required to run. These date-update PRs
typically come at odd off-hours for Wasmtime anyway so it should be fine
to run CI.
2022-07-20 11:26:32 -05:00
Alex Crichton
99e9e1395d Update more workflows to only this repository (#4062)
* Update more workflows to only this repository

This adds `if: github.repository == 'bytecodealliance/wasmtime'` to a
few more workflows related to the release process which should only run
in this repository and no other (e.g. forks).

* Also only run verify-publish in the upstream repo

No need for local development to be burdened with ensuring everything is
actually publish-able, that's just a concern for the main repository.

* Gate workflows which need secrets on this repository
2022-04-21 11:45:48 -05:00
Alex Crichton
bea0433b54 Fix the release process's latest step (#4055)
* Fix the release process's latest step

The automated release of 0.36.0 was attempted last night but it failed
due to a [failure on CI][bad]. This failure comes about because it was
trying to change the release date of 0.35.0 which ended up not modifying
any files so `git` failed to commit as no files were changed.

The original bug though was that 0.35.0 was being changed instead of
0.36.0. The reason for this is that the script used
`--sort=-committerdate` to determine the latest branch. I forgot,
though, that with backports it's possible for 0.35.0 to have a more
recent commit date than 0.36.0 (as is currently the case). This commit
updates the script to perform a numerical sort outside of git to get the
latest release branch.
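
The difference between sorting by commit date and sorting numerically can be sketched as below. This is a Python illustration rather than the release script itself; the helper name `latest_release_branch` is hypothetical, and it assumes branch names of the form `release-X.Y.Z`.

```python
def latest_release_branch(branches):
    """Pick the highest `release-X.Y.Z` branch by numeric version.

    Hypothetical helper: it ignores commit dates entirely, so a backport
    landing on an older branch cannot make that branch look "latest".
    """
    prefix = "release-"

    def version_key(name):
        # "release-0.36.0" -> (0, 36, 0), compared component-wise as integers
        # so that 0.10.0 sorts after 0.9.0.
        return tuple(int(part) for part in name[len(prefix):].split("."))

    candidates = [b for b in branches if b.startswith(prefix)]
    return max(candidates, key=version_key)
```

With `["release-0.35.0", "release-0.36.0"]` as input this selects `release-0.36.0` regardless of which branch received the most recent commit.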

Additionally this adds in some `set -ex` commands for the shell which
should help print out commands as they're run and assist in future
debugging.

[bad]: https://github.com/bytecodealliance/wasmtime/runs/6087188708

* Replace sed with rust
2022-04-20 13:31:38 -05:00
Bailey Hayes
76f7cde673 Add m1 to build matrix and release (#3983)
* Add m1 to release process

This will create a pre-compiled binary for m1 macs and adds
a link to review embark studios CI for verification.

* remove test for macos arm

Tests will not succeed for macos arm until GitHub provides an m1 hosted runner.

Co-authored-by: Bailey Hayes <bhayes@singlestore.com>
2022-04-06 19:40:04 -05:00
Alex Crichton
35377bd33f Fixup release documentation (#3988)
* Fill out some missing comments on the workflow itself
* Fix some formatting in the book to properly render sub-bullets
2022-04-05 14:14:44 -05:00
Alex Crichton
c89dc55108 Add a two-week delay to Wasmtime's release process (#3955)
* Bump to 0.36.0

* Add a two-week delay to Wasmtime's release process

This commit is a proposal to update Wasmtime's release process with a
two-week delay from branching a release until it's actually officially
released. We've had two issues lately that came up which led to this proposal:

* In #3915 it was realized that changes just before the 0.35.0 release
  weren't enough for an embedding use case, but the PR didn't meet the
  expectations for a full patch release.

* At Fastly we were about to start rolling out a new version of Wasmtime
  when over the weekend the fuzz bug #3951 was found. This led to the
  desire internally to have a "must have been fuzzed for this long"
  period of time for Wasmtime changes which we felt were better
  reflected in the release process itself rather than something about
  Fastly's own integration with Wasmtime.

This commit updates the automation for releases to unconditionally
create a `release-X.Y.Z` branch on the 5th of every month. The actual
release from this branch is then performed on the 20th of every month,
roughly two weeks later. This should provide a period of time to ensure
that all changes in a release are fuzzed for at least two weeks and
avoid any further surprises. This should also help with any last-minute
changes made just before a release if they need tweaking since
backporting to a not-yet-released branch is much easier.
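
As a rough illustration of the schedule described above, the following Python sketch maps a calendar date to the automation step that would fire and computes the gap between branching and releasing. The names `scheduled_action` and `fuzz_window_days` are hypothetical, and the real automation runs as scheduled CI workflows rather than as code like this.

```python
from datetime import date

BRANCH_DAY = 5    # day of month the `release-X.Y.Z` branch is created
RELEASE_DAY = 20  # day of month the release is made from that branch


def scheduled_action(today):
    """Return which release step (if any) is scheduled for `today`."""
    if today.day == BRANCH_DAY:
        return "branch"
    if today.day == RELEASE_DAY:
        return "release"
    return None


def fuzz_window_days(year, month):
    """Days between branching and releasing within one month."""
    return (date(year, month, RELEASE_DAY) - date(year, month, BRANCH_DAY)).days
```

Under these assumptions the window between the two steps is 15 days in every month, which matches the "roughly two weeks" of fuzzing time the proposal aims for.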

Overall there are some new properties about Wasmtime with this proposal
as well:

* The `main` branch will always have a section in `RELEASES.md` which is
  listed as "Unreleased" for us to fill out.
* The `main` branch will always be a version ahead of the latest
  release. For example it will be bumped pre-emptively as part of the
  release process on the 5th where if `release-2.0.0` was created then
  the `main` branch will have 3.0.0 Wasmtime.
* Dates for major versions are automatically updated in the
  `RELEASES.md` notes.

The associated documentation for our release process is updated and the
various scripts should all be updated now as well with this commit.

* Add notes on a security patch

* Clarify security fixes shouldn't be previewed early on CI
2022-04-01 13:11:10 -05:00