Files
wasmtime/ci/build-tarballs.sh
Alex Crichton 1efee4abdf Update CI to use GitHub's Merge Queue (#5766)
GitHub recently made its merge queue feature available for use in public
repositories owned by organizations meaning that the Wasmtime repository
is a candidate for using this. GitHub's Merge Queue feature is a system
that's similar to Rust's bors integration where PRs are tested before
merging and only passing PRs are merged. This implements the "not rocket
science" rule where the `main` branch of Wasmtime, for example, is
always tested and passes CI. This is in contrast to our current
implementation of CI where PRs are merged when they pass their own CI,
but the code that was tested is not guaranteed to be the state of `main`
when the PR is merged, meaning that we're at risk now of a failing
`main` branch despite all merged PRs being green. While this has
happened with Wasmtime this is not a common occurrence, however.

The main motivation, instead, to use GitHub's Merge Queue feature is
that it will enable Wasmtime to greatly reduce the amount of CI running
on PRs themselves. Currently the full test suite runs on every push to
every PR, meaning that our workers on GitHub Actions are frequently
clogged throughout weekdays and PRs can take quite some time to come
back with a successful run. Through the use of a Merge Queue, however,
we're able to configure only a small handful of checks to run on PRs
while deferring the main body of checks to happening on the
merge-via-the-queue itself. This is hoped to free up capacity on CI and
overall improve CI times for Wasmtime and Cranelift developers.

The implementation of all of this required quite a lot of plumbing and
retooling of our CI. I've been testing this in an [external
repository][testrepo] and I think everything is working now. A list of
changes made in this PR are:

* The `build.yml` workflow is merged back into the `main.yml` workflow
  as the original reason to split it out is not longer applicable (it'll
  run on all merges). This was also done to fit in the dependency graph
  of jobs of one workflow.

* Publication of the `gh-pages` branch, the `dev` tag artifacts, and
  release artifacts have been moved to a separate
  `publish-artifacts.yml` workflow. This workflow runs on all pushes to
  `main` and all tags. This workflow no longer actually preforms any
  builds, however, and relies on a merge queue or similar being used for
  branches/tags where artifacts are downloaded from the workflow run to
  be uploaded. For pushes to `main` this works because a merge queue is
  run meaning that by the time the push happens all artifacts are ready.
  For release branches this is handled by..

* The `push-tag.yml` workflow is subsumed by the `main.yml` workflow. CI
  for a tag being pushed will upload artifacts to a release in GitHub,
  meaning that all builds must finish first for the commit. The
  `main.yml` workflow at the end now scans commits for the preexisting
  magical marker and pushes a tag if necessary.

* CI is currently a flat list of "run all these jobs" and this is now
  rearchitected to a "fan out" approach where some jobs run to determine
  the next jobs to run which then get "joined" into a finish step. The
  purpose for this is somewhat nuanced and this has implications for CI
  runtime as well. The Merge Queue feature requires branches to be
  protected with "these checks must pass" and then the same checks are
  gates both to enter the merge queue as well as pass the merge queue.
  The saving grace, however, is that a "skipped" check counts as
  passing, meaning checks can be skipped on PRs but run to completion on
  the merge queue. A problem with this though is the build matrix used
  for tests where PRs want to only run one element of the build matrix
  ideally but there's no means on GitHub Actions right now for the
  skipped entries to show up as skipped easily (or not that I know of).
  This means that the "join" step serves the purpose of being the single
  gate for both PR and merge queue CI and there's just more inputs to it
  for merge queue CI. The major consequence of this decision is that
  GitHub's actions scheduling doesn't work out well here. Jobs are
  scheduled in a FIFO order meaning that the job for "ok complete the CI
  run" is queued up after everything else has completed, possibly
  after lots of other CI requests in the middle for other PRs. The hope
  here is that by using a merge queue we can keep CI relatively under
  control and this won't affect merge times too much.

* All jobs in the `main.yml` workflow will not automatically cancel the
  entire run if they fail. Previously this fail-fast behavior was only
  part of the matrix runs (and just for that matrix), but this is
  required to make the merge queue expedient. The gate of the merge
  queue is the final "join" step which is only executed once all
  dependencies have finished. This means, for example, that if rustfmt
  fails quickly then the tests which take longer might run for quite
  awhile before the join step reports failure, meaning that the PR sits
  in the queue for longer than needed being tested when we know it's
  already going to fail. By having all jobs cancel the run this means
  that failures immediately bail out and mark the whole job as
  cancelled.

* A new "determine" CI job was added to determine what CI actually needs
  to run. This is a "choke point" which is scheduled at the start of CI
  that quickly figures out what else needs to be run. This notably
  indicates whether large swaths of ci (the `run-full` flag) like the
  build matrix are executed. Additionally this dynamically calculates a
  matrix of tests to run based on a new `./ci/build-test-matrix.js`
  script. Various inputs are considered for this such as:

  1. All pushes, meaning merge queue branches or release-branch merges,
     will run full CI.
  2. PRs to release branches will run full CI.
  3. PRs to `main`, the most common, determine what to run based on
     what's modified and what's in the commit message.

  Some examples for (3) above are if modifications are made to
  `cranelift/codegen/src/isa/*` then that corresponding builder is
  executed on CI. If the `crates/c-api` directory is modified then the
  CMake-based tests are run on PRs but are otherwise skipped.
  Annotations in commit messages such as `prtest:*` can be used to
  explicitly request testing.

Before this PR merges to `main` would perform two full runs of CI: one
on the PR itself and one on the merge to `main`. Note that the one as a
merge to `main` was quite frequently cancelled due to a merge happening
later. Additionally before this PR there was always the risk of a bad
merge where what was merged ended up creating a `main` that failed CI to
to a non-code-related merge conflict.

After this PR merges to `main` will perform one full run of CI, the one
as part of the merge queue. PRs themselves will perform one test job
most of the time otherwise. The `main` branch is additionally always
guaranteed to pass tests via the merge queue feature.

For release branches, before this PR merges would perform two full
builds - one for the PR and one for the merge. A third build was then
required for the release tag itself. This is now cut down to two full
builds, one for the PR and one for the merge. The reason for this is
that the merge queue feature currently can't be used for our
wildcard-based `release-*` branch protections. It is now possible,
however, to turn on required CI checks for the `release-*` branch PRs so
we can at least have a "hit the button and forget" strategy for merging
PRs now.

Note that this change to CI is not without its risks. The Merge Queue
feature is still in beta and is quite new for GitHub. One bug that
Trevor and I uncovered is that if a PR is being tested in the merge
queue and a contributor pushes to their PR then the PR isn't removed
from the merge queue but is instead merged when CI is successful, losing
the changes that the contributor pushed (what's merged is what was
tested). We suspect that GitHub will fix this, however.

Additionally though there's the risk that this may increase merge time
for PRs to Wasmtime in practice. The Merge Queue feature has the ability
to "batch" PRs together for a merge but this is only done if concurrent
builds are allowed. This means that if 5 PRs are batched together then 5
separate merges would be created for the stack of 5 PRs. If the CI for
all 5 merged together passes then everything is merged, otherwise a PR
is kicked out. We can't easily do this, however, since a major purpose
for the merge queue for us would be to cut down on usage of CI builders
meaning the max concurrency would be set to 1 meaning that only one PR
at a time will be merged. This means PRs may sit in the queue for awhile
since previously many `main`-based builds are cancelled due to
subsequent merges of other PRs, but now they must all run to 100%
completion.

[testrepo]: https://github.com/bytecodealliance/wasmtime-merge-queue-testing
2023-02-16 19:18:42 +00:00

90 lines
3.2 KiB
Bash
Executable File

#!/bin/bash
# A small script used for assembling release tarballs for both the `wasmtime`
# binary and the C API. This is executed with two arguments, mostly coming from
# the CI matrix.
#
# * The first argument is the name of the platform, used to name the release
# * The second argument is the "target", if present, currently only for
# cross-compiles
#
# This expects the build to already be done and will assemble release artifacts
# in `dist/`
set -ex
platform=$1
target=$2
rm -rf tmp
mkdir tmp
mkdir -p dist
tag=dev
if [[ $GITHUB_REF == refs/heads/release-* ]]; then
tag=v${GITHUB_REF:19}
fi
bin_pkgname=wasmtime-$tag-$platform
api_pkgname=wasmtime-$tag-$platform-c-api
mkdir tmp/$api_pkgname
mkdir tmp/$api_pkgname/lib
mkdir tmp/$api_pkgname/include
mkdir tmp/$bin_pkgname
cp LICENSE README.md tmp/$api_pkgname
cp LICENSE README.md tmp/$bin_pkgname
cp -r crates/c-api/include tmp/$api_pkgname
cp crates/c-api/wasm-c-api/include/wasm.h tmp/$api_pkgname/include
fmt=tar
if [ "$platform" = "x86_64-windows" ]; then
cp target/release/wasmtime.exe tmp/$bin_pkgname
cp target/release/{wasmtime.dll,wasmtime.lib,wasmtime.dll.lib} tmp/$api_pkgname/lib
fmt=zip
# Generate a `*.msi` installer for Windows as well
export WT_VERSION=`cat Cargo.toml | sed -n 's/^version = "\([^"]*\)".*/\1/p'`
"$WIX/bin/candle" -arch x64 -out target/wasmtime.wixobj ci/wasmtime.wxs
"$WIX/bin/light" -out dist/$bin_pkgname.msi target/wasmtime.wixobj -ext WixUtilExtension
rm dist/$bin_pkgname.wixpdb
elif [ "$platform" = "x86_64-mingw" ]; then
cp target/x86_64-pc-windows-gnu/release/wasmtime.exe tmp/$bin_pkgname
cp target/x86_64-pc-windows-gnu/release/{wasmtime.dll,libwasmtime.a,libwasmtime.dll.a} tmp/$api_pkgname/lib
fmt=zip
elif [ "$platform" = "x86_64-macos" ]; then
# Postprocess the macOS dylib a bit to have a more reasonable `LC_ID_DYLIB`
# directive than the default one that comes out of the linker when typically
# doing `cargo build`. For more info see #984
install_name_tool -id "@rpath/libwasmtime.dylib" target/release/libwasmtime.dylib
cp target/release/wasmtime tmp/$bin_pkgname
cp target/release/libwasmtime.{a,dylib} tmp/$api_pkgname/lib
elif [ "$platform" = "aarch64-macos" ]; then
install_name_tool -id "@rpath/libwasmtime.dylib" target/aarch64-apple-darwin/release/libwasmtime.dylib
cp target/aarch64-apple-darwin/release/wasmtime tmp/$bin_pkgname
cp target/aarch64-apple-darwin/release/libwasmtime.{a,dylib} tmp/$api_pkgname/lib
elif [ "$target" = "" ]; then
cp target/release/wasmtime tmp/$bin_pkgname
cp target/release/libwasmtime.{a,so} tmp/$api_pkgname/lib
else
cp target/$target/release/wasmtime tmp/$bin_pkgname
cp target/$target/release/libwasmtime.{a,so} tmp/$api_pkgname/lib
fi
mktarball() {
dir=$1
if [ "$fmt" = "tar" ]; then
# this is a bit wonky, but the goal is to use `xz` with threaded compression
# to ideally get better performance with the `-T0` flag.
tar -cvf - -C tmp $dir | xz -9 -T0 > dist/$dir.tar.xz
else
# Note that this runs on Windows, and it looks like GitHub Actions doesn't
# have a `zip` tool there, so we use something else
(cd tmp && 7z a ../dist/$dir.zip $dir/)
fi
}
mktarball $api_pkgname
mktarball $bin_pkgname