WASI prototype design, implementation, and documentation.

This adds documents describing the WASI Core API, and an implementation in Wasmtime.
2019-03-27 08:00:00 -07:00
parent b0243b212f
commit b2fefe7714
53 changed files with 12801 additions and 23 deletions
--- a/docs/WASI-api.md
+++ b/docs/WASI-api.md
--- a/docs/WASI-background.md
+++ b/docs/WASI-background.md
@@ -0,0 +1,179 @@
+One of the biggest challenges in WebAssembly is figuring out what it's
+supposed to be.
+
+## A brief tangent on some related history
+
+The LLVM WebAssembly backend has gone down countless paths that it has
+ended up abandoning. One of the early questions was whether we should use
+an existing object file format, such as ELF, or design a new format.
+
+Using an existing format is very appealing. We'd be able to use existing
+tools, and be familiar to developers. It would even make porting some
+kinds of applications easier. And existing formats carry with them
+decades of "lessons learned" from many people in many settings, building,
+running, and porting real-world applications.
+
+The actual WebAssembly format that gets handed to platforms to run is
+its own format, but there'd be ways to make things work. To reuse existing
+linkers, we could have a post-processing tool which translates from the
+linker's existing output format into a runnable WebAssembly module. We
+actually made a fair amount of progress toward building this.
+
+But then, using ELF for example, we'd need to create a custom segment
+type (in the `PT_LOPROC`-`PT_HIPROC` range) instead of the standard
+`PT_LOAD` for loading code, because WebAssembly functions aren't actually
+loaded into the program address space. And same for the `PT_LOAD` for the
+data too, because especially once WebAssembly supports threads, memory
+initialization will need to
+[work differently](https://github.com/WebAssembly/bulk-memory-operations/blob/master/proposals/bulk-memory-operations/Overview.md#design).
+And we could omit the `PT_GNU_STACK`, because WebAssembly's stack can't
+be executable. And maybe we could omit `PT_PHDR` because unless
+we replicate the segment headers in data, they won't actually be
+accessible in memory. And so on.
+
+And while in theory everything can be done within the nominal ELF
+standard, in practice we'd have to make major changes to existing ELF
+tools to support this way of using ELF, which would defeat many of the
+advantages we were hoping to get. And we'd still be stuck with a custom
+post-processing step. And it'd be harder to optimize the system to
+take advantage of the unique features of WebAssembly, because everything
+would have to work within this external set of constraints.
+
+So while the LLVM WebAssembly backend started out trying to use ELF, we
+eventually decided to back out of that and design a
+[new format](https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md).
+
+## Now let's talk APIs
+
+It's apparent to anyone who's looked under the covers at Emscripten's interface
+between WebAssembly and the outside world that the current system is particular
+to the way Emscripten currently works, and not well suited for broader adoption.
+This is especially true as interest grows in running WebAssembly outside
+of browsers and outside of JS VMs.
+
+It's been obvious since WebAssembly was just getting started that it'd eventually
+want some kind of "system call"-like API, which could be standardized, and
+implemented in any general-purpose WebAssembly VM. 
+
+And while there are many existing systems we could model this after, [POSIX]
+stands out, as being a vendor-neutral standard with considerable momentum. Many
+people, including us, have been assuming that WebAssembly would eventually
+have some kind of POSIX API. Some people have even started experimenting with
+what
+[this](https://github.com/WAVM/Wavix/)
+[might](https://github.com/jfbastien/musl)
+[look](https://github.com/golang/go/blob/e5489cfc12a99f25331831055a79750bfa227943/misc/wasm/wasm_exec.js)
+[like](https://github.com/emscripten-core/emscripten/blob/incoming/src/library_syscall.js).
+
+But while a lot of things map fairly well, some things are less clear. One of
+the big questions is how to deal with the concept of a "process". POSIX's IPC
+mechanisms are designed around processes, and in fact, the term "IPC" itself
+has "process" baked into it. The way we even think about what "IPC" means
+bakes in understandings about what processes are and what communication
+between them looks like.
+
+Pipes, Unix-domain sockets, POSIX shared memory, signals, and files with `fcntl`
+`F_SETLK`/`F_GETLK`-style locking (which is process-associated) are all tied
+to processes. But what *is* a process, when we're talking about WebAssembly?
+
+## Stick a fork in it
+
+Suppose we say that a WebAssembly instance is a "process", for the purposes
+of the POSIX API. This initially seems to work out well, but it leaves us
+with several holes to fill. Foremost is `fork`. `fork` is one of the pillars
+of Unix, but it's difficult to implement outside of a full Unix-style OS. We
+probably *can* make it work in all the places we want to run WebAssembly, but
+do we want to? It'd add a bunch of complexity, inefficiency, subtle behavioral
+differences, or realistically, a combination of all three.
+
+Ok, so maybe we can encourage applications to use `posix_spawn` instead. And
+some already do, but in doing so we do lose some of the value of POSIX's
+momentum. And even with `posix_spawn`, many applications will explicitly do
+things like `waitpid` on the resulting PID. We can make this work too, but
+we should also take a moment and step back to think about IPC in general.
+
+In WebAssembly, instances can synchronously call each other, and it can be
+very efficient. This is not something that typical processes can do. Arguably,
+a lot of what we now think of as "IPC" is just working around the inability
+of processes to have calls between each other. And WebAssembly instances will
+be able to import each other's memories and tables, and eventually even pass
+around slices to their memories. In WebAssembly circles we don't even tend to
+think of these as IPC mechanisms, because the process metaphor just doesn't
+fit very well here. We're going to want applications to use these mechanisms,
+because they're efficient and take advantage of the platform, rather than
+using traditional Unix-style IPC which will often entail emulation and
+inefficiencies.
+
+Of course, there will always be a role for aiding porting of existing
+applications. Libraries that emulate various details of Unix semantics are
+valuable. But we can consider them tools for solving certain practical
+problems, rather than the primary interfaces of the system, because they
+miss out on some of the platform's fundamental features.
+
+## Mm-Mm Mmap
+
+Some of the fundamental assumptions of `mmap` are that there exists a
+relatively large virtual address space, and that unmapped pages don't
+occupy actual memory. The former doesn't tend to hold in WebAssembly,
+where linear address spaces tend to be only as big as necessary.
+
+For the latter, would it be possible to make a WebAssembly engine capable
+of unmapping pages in the middle of a linear memory region, and releasing
+the resources? Sure. Is this a programming technique we want WebAssembly
+programs doing in general, requiring all VMs to implement this?
+Probably not.
+
+What's emerging is a sense that what we want is a core set of
+APIs that can be implemented very broadly, and then optional API
+modules that VMs can opt into supporting if it makes sense for them.
+And with this mindset, `mmap` feels like it belongs in one of these
+optional sets, rather than in the core.
+
+(Note, though, that even for the use case of reading files quickly,
+`mmap`
+[isn't always better than just reading into a buffer](https://blog.burntsushi.net/ripgrep/).)
+
+## A WebAssembly port of Debian?
+
+This is a thought-experiment. Debian is ported to numerous hardware
+architectures. WebAssembly in some settings is presented as a hardware
+architecture. Would it make sense to port the Debian userspace to
+WebAssembly? What would this look like? What would it be useful for?
+
+It would be kind of cool to have a WebAssembly-powered Unix shell
+environment or even a graphical desktop environment running inside a
+browser. But would it be *really* cool? Significantly more cool than,
+say, an SSH or VNC session to an instance in the cloud? Because to do
+much with it, you'll want a filesystem, a network stack, and so on,
+and there's only so much that browsers will let you do.
+
+To be sure, it certainly would be cool. But there's a tendency in
+some circles to think of something like Debian as the natural end goal
+in a system API and toolchain for WebAssembly. We feel this tendency
+too ourselves. But it's never really been clear how it's supposed to
+work.
+
+The insight here is that we can split the design space, rather than
+trying to solve everything at once. We can have a core set of APIs
+that will be enough for most applications, but that doesn't try to
+support all of Debian userland. This will make implementations more
+portable, flexible, testable, and robust than if we tried to make
+every implementation support everything, or come up with custom
+subsets.
+
+As mentioned above, there is room for additional optional APIs to be
+added beyond the core WASI set. And there's absolutely a place for
+tools and libraries that provide features that aren't in the standard
+platform. So people interested in working on a Debian port can still
+have a path forward, but we don't need to let this become a focus for
+the core WASI design.
+
+## A picture emerges
+
+While much of what's written here seems relatively obvious in
+retrospect, this clarity is relatively new. We're now seeing many of the
+ideas which have been swirling around, some as old as WebAssembly
+itself, come together into a cohesive overall plan, which makes this
+an exciting time.
+
+[POSIX]: http://pubs.opengroup.org/onlinepubs/9699919799/
--- a/docs/WASI-capabilities.md
+++ b/docs/WASI-capabilities.md
@@ -0,0 +1,78 @@
+# Additional background on Capabilities
+
+## Unforgeable references
+
+One of the key words that describes capabilities is *unforgeable*.
+
+A pointer in C is forgeable, because untrusted code could cast an integer
+to a pointer, thus *forging* access to whatever that pointer value points
+to.
+
+MVP WebAssembly doesn't have unforgeable references, but what we can do instead
+is just use integer values which are indices into a table that's held outside
+the reach of untrusted code. The indices themselves are forgeable, but
+ultimately the table is the thing which holds the actual capabilities, and
+its elements are unforgeable. There's no way to gain access to a new resource
+by making up a new index.
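
The index-into-a-table idea can be sketched in a few lines. This is only an illustrative model, not Wasmtime's implementation: `CapabilityTable`, `insert`, and `lookup` are hypothetical names, and a plain string stands in for a real host resource.

```python
class CapabilityTable:
    """Host-side table; untrusted code only ever sees integer indices."""

    def __init__(self):
        self._entries = {}
        self._next_index = 0

    def insert(self, resource):
        # Only the host calls this, when it decides to grant a capability.
        index = self._next_index
        self._next_index += 1
        self._entries[index] = resource
        return index

    def lookup(self, index):
        # Any integer can be passed in, but only granted indices resolve.
        try:
            return self._entries[index]
        except KeyError:
            raise PermissionError(f"no capability at index {index}") from None


table = CapabilityTable()
log_fd = table.insert("handle for an open log file")

# A granted index resolves to its resource.
granted = table.lookup(log_fd)

# A made-up index is just an integer; it resolves to nothing.
try:
    table.lookup(log_fd + 41)
    forged_ok = True
except PermissionError:
    forged_ok = False
```

The index itself can be guessed or copied, but guessing never creates a new table entry, which is the property the paragraph above relies on.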
+
+When the reference-types proposal lands, references will be unforgeable, and
+will likely subsume the current integer-based APIs, at the WASI API layer.
+
+## Static vs dynamic capabilities
+
+There are two levels of capabilities that we can describe: static and dynamic.
+
+The static capabilities of a wasm module are its imports. These essentially
+declare the set of "rights" the module itself will be able to request.
+An important caveat though is that this doesn't consider capabilities which
+may be passed into an instance at runtime.
+
+The dynamic capabilities of a wasm module are a set of boolean values
+associated with a file descriptor, indicating individual "rights". This
+includes things like the right to read, or to write, using a given file
+descriptor.
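
As a rough sketch of per-descriptor rights, assuming a hypothetical `Rights` flag set and `Descriptor` wrapper (these names are illustrative and are not the identifiers used by the WASI API):

```python
from enum import Flag, auto


class Rights(Flag):
    """Hypothetical per-descriptor rights; each member is one boolean."""
    READ = auto()
    WRITE = auto()


class Descriptor:
    def __init__(self, resource, rights):
        self.resource = resource
        self.rights = rights

    def require(self, needed):
        # Every requested right must be present on this descriptor.
        if (self.rights & needed) != needed:
            raise PermissionError(f"descriptor lacks {needed}")


read_only = Descriptor("config file", Rights.READ)
read_only.require(Rights.READ)  # permitted, returns quietly

try:
    read_only.require(Rights.WRITE)
    write_allowed = True
except PermissionError:
    write_allowed = False
```

Here the check happens on each operation, so the same underlying resource can be handed out with different rights through different descriptors.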
+
+## Filesystem rules
+
+It happens that using integer indices to represent capabilities is the same
+thing that POSIX does, except that POSIX calls these indices *file descriptors*.
+
+One difference though is that POSIX normally allows processes to request
+a file descriptor for any file in the entire filesystem hierarchy, which is
+granted based on whatever security policies are in place. This doesn't
+violate the capability model, but it doesn't take full advantage of it.
+
+CloudABI, Fuchsia, and other capability-oriented systems prefer to take
+advantage of the hierarchical nature of the filesystem and require untrusted
+code to have a capability for a directory in order to access things inside
+that directory.
+
+So you can launch untrusted code, and at runtime give it access to specific
+directories, without having to set permissions in the filesystem or in
+per-application or per-user configuration settings.
+
+## Berkeley socket rules
+
+Sockets aren't naturally hierarchical though, so we'll need to decide what
+capabilities look like. This is an area that isn't yet implemented.
+
+In CloudABI, users launch programs with the sockets they need already
+created. That's a potential starting point, which might be enough for
+simple cases.
+
+We also anticipate an eventual extension to that, where we create a capability
+that represents a set of possible sockets that can be created. A set
+might be described by ranges of permitted ports, ranges of permitted
+addresses, or sets of permitted protocols. In this case the actual socket
+wouldn't be created until the application actually requests it.
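
Since this area isn't implemented yet, the following is purely a thought-experiment sketch of what such a "set of possible sockets" could look like. `SocketSetCapability` and `permits` are hypothetical names, and the check only models the decision, without creating any socket.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class SocketSetCapability:
    """Hypothetical capability describing sockets that may be created later."""
    networks: tuple       # permitted address ranges
    ports: range          # permitted port range
    protocols: frozenset  # permitted protocol names

    def permits(self, address, port, protocol):
        addr = ipaddress.ip_address(address)
        return (
            protocol in self.protocols
            and port in self.ports
            and any(addr in network for network in self.networks)
        )


cap = SocketSetCapability(
    networks=(ipaddress.ip_network("10.0.0.0/8"),),
    ports=range(8000, 8100),
    protocols=frozenset({"tcp"}),
)

inside = cap.permits("10.1.2.3", 8080, "tcp")
wrong_port = cap.permits("10.1.2.3", 22, "tcp")
wrong_network = cap.permits("192.168.0.1", 8080, "tcp")
```

A host could consult a check like this at the moment the application asks for a socket, which matches the idea that the socket itself doesn't exist until requested.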
+
+## Other info
+
+CloudABI's intro to capability-based OS security provides additional background info:
+
+https://github.com/NuxiNL/cloudabi#capability-based-security
+
+
+The Fuchsia project has a blog post on the topic of capability-based OS security:
+
+https://fuchsia.googlesource.com/docs/+/HEAD/the-book/dotdot.md
--- a/docs/WASI-documents.md
+++ b/docs/WASI-documents.md
@@ -0,0 +1,22 @@
+# WASI Document Guide
+
+To get started using WASI, see [the intro document](WASI-intro.md) and
+[the tutorial](WASI-tutorial.md).
+
+For more detail on what WASI is, see [the overview](WASI-overview.md).
+
+For specifics on the API, see the [API documentation](https://github.com/CraneStation/wasmtime-wasi/blob/wasi/docs/WASI-api.md).
+Additionally, a C header file describing the WASI API is
+[here](https://github.com/CraneStation/reference-sysroot-wasi/blob/misc/libc-bottom-half/headers/public/wasi/core.h).
+
+For some discussion of capability-based design, see the [Capabilities document](WASI-capabilities.md).
+
+For some discussion of WASI's design inspiration, see the [Background document](WASI-background.md).
+
+For background on some of the design decisions in WASI, see [the rationale](WASI-rationale.md).
+
+For some ideas of things that we may want to change about WASI in the
+short term, see the [possible changes](WASI-some-possible-changes.md) document.
+For longer-term ideas, see the [possible future features](WASI-possible-future-features.md)
+document.
+
--- a/docs/WASI-intro.md
+++ b/docs/WASI-intro.md
@@ -0,0 +1,83 @@
+# Welcome to WASI!
+
+WASI stands for WebAssembly System Interface. It's an API designed by
+the [Wasmtime] project that provides access to several operating-system-like
+features, including files and filesystems, Berkeley sockets, clocks, and
+random numbers, that we'll be proposing for standardization.
+
+It's designed to be independent of browsers, so it doesn't depend on
+Web APIs or JS, and isn't limited by the need to be compatible with JS.
+And it has integrated capability-based security, so it extends
+WebAssembly's characteristic sandboxing to include I/O.
+
+See the [WASI Overview](WASI-overview.md) for more detailed background
+information, and the [WASI Tutorial](WASI-tutorial.md) for a walkthrough
+showing how various pieces fit together.
+
+Note that everything here is a prototype, and while a lot of stuff works,
+there are numerous missing features and some rough edges. One big thing
+that's not done yet is the actual mechanism to provide a directory as a
+pre-opened capability, to allow files to be opened. Some of the pieces
+are there (`__wasilibc_register_preopened_fd`) but they're not used yet.
+Networking support is also incomplete.
+
+## How can I write programs that use WASI?
+
+The two toolchains that currently work well are the Rust toolchain and
+a specially packaged C and C++ toolchain. Of course, we hope other
+toolchains will be able to implement WASI as well!
+
+### Rust
+
+To install a WASI-enabled Rust toolchain, follow the instructions here:
+
+https://github.com/alexcrichton/rust/releases/tag/wasi3
+
+Until now, Rust's WebAssembly support has had two main options, the
+Emscripten-based option, and the wasm32-unknown-unknown option. The latter
+option is lighter-weight, but only supports `no_std`. WASI enables a new
+wasm32-unknown-wasi target, which is similar to wasm32-unknown-unknown in
+that it doesn't depend on Emscripten, but it can use WASI to provide a
+decent subset of libstd.
+
+### C/C++
+
+All the parts needed to support wasm are included in upstream clang, lld, and
+compiler-rt, as of the LLVM 8.0 release. However, to use it, you'll need
+to build WebAssembly-targeted versions of the library parts, and it can
+be tricky to get all the CMake invocations lined up properly.
+
+To make things easier, we provide
+[prebuilt packages](https://github.com/CraneStation/wasi-sdk/releases)
+that provide builds of Clang and sysroot libraries.
+
+Note that C++ support has a notable
+[bug](https://bugs.llvm.org/show_bug.cgi?id=40412) in clang which affects
+`<iostream>` in libcxx. This will be fixed in future versions.
+
+## How can I run programs that use WASI?
+
+Currently the options are [Wasmtime] and the [browser polyfill], though we
+intend WASI to be implementable in many wasm VMs.
+
+[Wasmtime]: https://github.com/CraneStation/wasmtime
+[browser polyfill]: https://wasi.dev/polyfill/
+
+### Wasmtime
+
+[Wasmtime] is a non-Web WebAssembly engine which is part of the
+[CraneStation project](https://github.com/CraneStation/). To build
+it, download the code and build with `cargo build --release`. It can
+run WASI-using wasm programs by simply running `wasmtime foo.wasm`,
+or `cargo run --bin wasmtime foo.wasm`.
+
+### The browser polyfill
+
+The polyfill is online [here](https://wasi.dev/polyfill/).
+
+The source is [here](https://github.com/CraneStation/wasmtime-wasi/tree/wasi/lib/wasi/sandboxed-system-primitives/polyfill).
+
+## Where can I learn more?
+
+Beyond the [WASI Overview](WASI-overview.md), take a look at the
+various [WASI documents](WASI-documents.md).
--- a/docs/WASI-overview.md
+++ b/docs/WASI-overview.md
@@ -0,0 +1,163 @@
+# WASI: WebAssembly System Interface
+
+WebAssembly System Interface, or WASI, is a new family of APIs being
+designed by the [Wasmtime] project to propose as a standard engine-independent
+non-Web system-oriented API for WebAssembly. Initially, the focus is on
+WASI Core, an API module that covers files, networking, and a few other
+things. Additional modules are expected to be added in the future.
+
+WebAssembly is designed to run well on the Web, but it's
+[not limited to the Web](https://github.com/WebAssembly/design/blob/master/NonWeb.md).
+The core WebAssembly language is independent of its surrounding
+environment, and WebAssembly interacts with the outside world
+exclusively through APIs. On the Web, it naturally uses the
+existing Web APIs provided by browsers. However, outside of
+browsers, there's currently no standard set of APIs that
+WebAssembly programs can be written to. This makes it difficult to
+create truly portable non-Web WebAssembly programs.
+
+WASI is an initiative to fill this gap, with a clean set of APIs
+which can be implemented on multiple platforms by multiple engines,
+and which don't depend on browser functionality (although they
+still can run in browsers; see below).
+
+## Capability-Oriented
+
+The design follows
+[CloudABI](https://cloudabi.org/)'s
+(and in turn
+[Capsicum](https://www.cl.cam.ac.uk/research/security/capsicum/)'s) concept of
+[capability-based security](https://en.wikipedia.org/wiki/Capability-based_security),
+which fits well into WebAssembly's sandbox model. Files,
+directories, network sockets, and other resources are identified
+by UNIX-like file descriptors, which are indices into external
+tables whose elements represent capabilities. Similar to how core
+WebAssembly provides no ability to access the outside world without
+calling imported functions, WASI APIs provide no ability to access
+the outside world without an associated capability.
+
+For example, instead of a typical
+[open](http://pubs.opengroup.org/onlinepubs/009695399/functions/open.html)
+system call, WASI provides an
+[openat](https://linux.die.net/man/2/openat)-like
+system call, requiring the calling process to have a file
+descriptor for a directory that contains the file, representing the
+capability to open files within that directory. (These ideas are
+common in capability-based systems.)
+
+However, the WASI libc implementation still does provide an
+implementation of open, by taking the approach of
+[libpreopen](https://github.com/musec/libpreopen).
+Programs may be granted capabilities for directories on launch, and
+the library maintains a mapping from their filesystem path to the
+file descriptor indices representing the associated capabilities.
+When a program calls `open`, the library looks up the file name in the map
+and automatically supplies the appropriate directory capability. It
+also means WASI doesn't require the use of CloudABI's `program_main`
+construct. This eases porting of existing applications without
+compromising the underlying capability model. See the diagram below
+for how libpreopen fits into the overall software architecture.
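
The path lookup described above can be modeled roughly as follows. This is a simplified sketch, not the code in libpreopen or WASI libc: `resolve` is a hypothetical helper, and the descriptor numbers in the map are made up for illustration.

```python
import posixpath


def resolve(preopens, path):
    """Return (dir_fd, relative_path) for the longest preopened prefix of path."""
    path = posixpath.normpath(path)
    best = None
    for prefix, dir_fd in preopens.items():
        prefix = posixpath.normpath(prefix)
        inside = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if inside and (best is None or len(prefix) > len(best[0])):
            best = (prefix, dir_fd)
    if best is None:
        raise PermissionError(f"no preopened directory contains {path}")
    prefix, dir_fd = best
    # The pair can then be handed to an openat-like call.
    return dir_fd, posixpath.relpath(path, prefix)


# Directories granted at launch, mapped to their descriptor indices.
preopens = {"/data": 3, "/data/cache": 4}

nested = resolve(preopens, "/data/cache/x.bin")
plain = resolve(preopens, "/data/notes/a.txt")

try:
    resolve(preopens, "/data/../etc/passwd")
    escaped = True
except PermissionError:
    escaped = False
```

Normalizing the path first is a design choice in this sketch so that `..` components can't be used to claim a prefix the path doesn't really live under; a real implementation would also have to consider symlinks, which this model ignores.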
+
+WASI also automatically provides file descriptors for standard
+input and output, and WASI libc provides a normal `printf`. In
+general, WASI is aiming to support a fairly full-featured libc
+implementation, with the current implementation work being based on
+[musl](http://www.musl-libc.org/).
+
+## Portable System Interface for WebAssembly
+
+WASI is being designed from the ground up for WebAssembly, with
+sandboxing, portability, and API tidiness in mind, making natural
+use of WebAssembly features such as i64, import functions with
+descriptive names and typed arguments, and aiming to avoid being
+tied to a particular implementation.
+
+We'll often call functions in these APIs "syscalls", because they
+serve an analogous purpose to system calls in native executables.
+However, they're just functions that are provided by the
+surrounding environment that can do I/O on behalf of the program.
+
+WASI is starting with a basic POSIX-like set of syscall functions,
+though adapted to suit the needs of WebAssembly: it excludes
+functions such as `fork` and `exec`, which aren't easily
+implementable in some of the places people want to run WebAssembly,
+and it adopts a capabilities-oriented design.
+
+And, as WebAssembly grows support for
+[host bindings](https://github.com/webassembly/host-bindings)
+and related features, capabilities can evolve to being represented
+as opaque, unforgeable
+[reference typed values](https://github.com/WebAssembly/reference-types),
+which can allow for finer-grained control over capabilities, and
+make the API more accessible beyond the C-like languages that
+POSIX-style APIs are typically aimed at.
+
+## WASI Software Architecture
+
+To facilitate use of the WASI API, a libc
+implementation called WASI libc is being developed, which presents
+a relatively normal musl-based libc interface, implemented on top
+of a libpreopen-like layer and a system call wrapper layer (derived
+from the "bottom half" of
+[cloudlibc](https://github.com/NuxiNL/cloudlibc)).
+The system call wrapper layer makes calls to the actual WASI
+implementation, which may map these calls to whatever the
+surrounding environment provides, whether it's native OS resources,
+JS runtime resources, or something else entirely.
+
+[This libc is part of a "sysroot"](https://github.com/WebAssembly/reference-sysroot),
+which is a directory containing compiled libraries and C/C++ header
+files providing standard library and related facilities laid out in
+a standard way to allow compilers to use it directly.
+
+With the [LLVM 8.0](http://llvm.org/)
+release, the WebAssembly backend is now officially stable, but LLVM
+itself doesn't provide a libc - a standard C library, which you
+need to build anything with clang. This is what the WASI-enabled
+sysroot provides, so the combination of clang in LLVM 8.0 and the
+new WASI-enabled sysroot provides usable Rust and C compilation
+environments that can produce wasm modules that can be run in
+[Wasmtime] with WASI support, in browsers with the WASI polyfill,
+and in the future other engines as well.
+
+![WASI software architecture diagram](wasi-software-architecture.png "WASI software architecture diagram")
+
+## Future Evolution
+
+The first version of WASI is relatively simple, small, and
+POSIX-like in order to make it easy for implementers to prototype
+it and port existing code to it, making it a good way to start
+building momentum and allow us to start getting feedback based on
+experience.
+
+Future versions will change based on experience
+and feedback with the first version, and add features to address
+new use cases. They may also see significant architectural
+changes. One possibility is that this API could
+evolve into something like
+[Fuchsia](https://en.wikipedia.org/wiki/Google_Fuchsia)'s
+low-level APIs, which are more complex and abstract, though also
+more capable.
+
+We also expect that whatever WASI evolves into in the future, it
+should be possible to implement this initial API as a library
+on top.
+
+## Can WASI apps run on the Web?
+
+Yes! We have a polyfill which implements WASI and runs in browsers.
+At the WebAssembly level, WASI is just a set of callable functions that
+can be imported by a .wasm module, and these imports can be implemented
+in a variety of ways, including by a JavaScript polyfill library running
+within browsers.
+
+And in the future, it's possible that
+[builtin modules](https://github.com/tc39/ecma262/issues/395)
+could take these ideas even further allowing easier and tighter
+integration between .wasm modules importing WASI and the Web.
+
+## Work in Progress
+
+WASI is currently experimental. Feedback is welcome!
+
+[Wasmtime]: https://github.com/CraneStation/wasmtime
--- a/docs/WASI-possible-future-features.md
+++ b/docs/WASI-possible-future-features.md
@@ -0,0 +1,49 @@
+# Possible Future Features
+
+These are features we're interested in, but don't have yet, and which will require
+some amount of design work.
+
+## File Locking
+
+POSIX's answer is `fcntl` with `F_SETLK`/`F_GETLK`/etc., which provide advisory
+record locking. Unfortunately, these locks are associated with processes, which
+means that if two parts of a program independently open a file and try to lock
+it, if they're in the same process, they automatically share the lock.
+
+Other locking APIs exist on various platforms, but none is widely standardized.
+
+POSIX `F_SETLK`-style locking is used by SQLite.
+
+## File change monitoring
+
+POSIX has no performant way to monitor many files or directories for changes.
+
+Many popular operating systems have system-specific APIs to do this though, so
+it'd be desirable to come up with a portable API to provide access to this
+functionality.
+
+## Scalable event-based I/O
+
+POSIX's `select` and `poll` have the property that each time they're called,
+the implementation has to scan through all the file descriptors to report if any
+of them has I/O ready, which is inefficient when there are large numbers of
+open files or sockets.
+
+Many popular operating systems have system-specific APIs that provide
+alternative ways to monitor large numbers of I/O streams though, so it'd be
+desirable to come up with a portable API to provide access to this
+functionality.
+
+## Crash recovery
+
+POSIX doesn't have clear guidance on what applications can expect their
+data will look like if the system crashes or the storage device is otherwise
+taken offline abruptly.
+
+We have `fsync` and `fdatasync`, but even these have been a topic of
+[much discussion].
+
+[much discussion]: https://wiki.postgresql.org/wiki/Fsync_Errors
+
+Also, currently WASI's docs don't make any guarantees about things like
+`path_rename` being atomic.
--- a/docs/WASI-rationale.md
+++ b/docs/WASI-rationale.md
@@ -0,0 +1,160 @@
+## Why not a more traditional set of POSIX-like syscalls?
+
+In related work, the LLVM wasm backend started out trying to use ELF object
+files for wasm, to be as conventional as possible. But wasm doesn't fit into
+ELF in some very fundamental ways. Code isn't in the address space, callers
+have to know their callee's exact signatures, imports and exports don't have
+ELF semantics, function pointers require tables to be populated, index 0 is
+valid in some contexts where it isn't in ELF, and so on. It ultimately got
+to the point where the work we were considering doing to *emulate* ELF
+interfaces to make existing tools happy looked like more than the work that
+would be required to just build new tools.
+
+The analogy isn't perfect, but there are some parallels to what we're now
+figuring out about system calls. Many people, including us, had initially
+assumed that at least some parts of the wasm ecosystem would eventually
+standardize on a basic map of POSIX-like or Linux-like system calls into wasm
+imports. However, this turns out to be more complex than it initially seems.
+
+One of WebAssembly's unique attributes is the ability to run sandboxed
+without relying on OS process boundaries. Requiring a 1-to-1 correspondence
+between wasm instances and heavyweight OS processes would take away this key
+advantage for many use cases. Fork/exec are the obvious example of an API
+that's difficult to implement well if you don't have POSIX-style processes,
+but a lot of other things in POSIX are tied to processes too. So it isn't
+a simple matter to take POSIX, or even a simple subset of it, to WebAssembly.
+
+We should note that Spectre concerns are relevant here, though for now we'll
+just observe that actual security depends on the details of implementations
+and use cases, and it's not necessarily a show-stopper.
+
+Another area where WebAssembly differs from traditional POSIX-like platforms
+is in its capability-oriented approach to security. WebAssembly core has no
+ability to address the outside world, except through interacting with
+imports/exports. And when reference types are added, they'll be able to
+represent very fine-grained and dynamic capabilities.
+
+A capability-oriented system interface fits naturally into WebAssembly's
+existing sandbox model, by extending the simple story that a wasm module
+can't do anything until given capabilities. There are ways to sandbox
+traditional OS filesystem APIs too, but in a multiple-implementation
+ecosystem where the methods for setting up path filtering will likely
+differ between implementations, designing the platform around capabilities
+will make it easier for people to consistently configure the capabilities
+available to wasm modules.
+
+This is where we see WASI heading.
+
+## Why not non-blocking?
+
+This is an open question. We're using blocking APIs for now because that's
+*by far* the simpler way to get the overall system to a usable state, on
+both the wasm runtime side and the toolchain side. But one can make an
+argument that non-blocking APIs would have various advantages, so we
+look forward to discussing this topic with the WebAssembly CG subgroup
+once it's set up.
+
+## Why not async?
+
+We have some ideas about how the current API could be extended to be async.
+In particular, we can imagine making a distinction between WebAssembly
+programs which are *Commands* and those which we'll call *Reactors*.
+Commands have a `main` function which is called once, and when `main`
+exits, the program is complete. Reactors have a setup function, but
+once that completes, the instance remains live and is called from callbacks.
+In a Reactor, there's an event loop which lives outside of the nominal
+program.
+
+With this distinction, we may be able to say things like:
+ - In a Reactor, WASI APIs are available, but all functions have an
+   additional argument, which specifies a function to call as a continuation
+   once the I/O completes. This way, we can use the same conceptual APIs,
+   but adapt them to run in a callback-based async environment.
+ - In a Command, WASI APIs don't have callback parameters. Whether or not
+   they're non-blocking is an open question (see the previous question).
+
+Reactors might then be able to run in browsers on the main thread,
+while Commands in browsers might be limited to running in Workers.
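+
+As a rough sketch of the distinction above, the same conceptual read could
+take two shapes. All names here (`command_read`, `reactor_read`,
+`io_callback_t`) are hypothetical and not part of WASI, and an in-memory
+string stands in for a real I/O source:
+
+```c
+#include <stddef.h>
+#include <string.h>
+
+/* Hypothetical continuation type: invoked once the I/O completes. */
+typedef void (*io_callback_t)(int error, size_t nread, void *user_data);
+
+/* Command style: the call returns only once the data is in `buf`. */
+static int command_read(const char *src, char *buf, size_t len, size_t *nread) {
+    size_t n = strlen(src);
+    if (n > len)
+        n = len;
+    memcpy(buf, src, n);
+    *nread = n;
+    return 0;
+}
+
+/* Reactor style: the same operation plus a continuation argument. A real
+   host would return immediately and call `done` later from its event
+   loop; this stand-in simply calls it right away. */
+static void reactor_read(const char *src, char *buf, size_t len,
+                         io_callback_t done, void *user_data) {
+    size_t nread = 0;
+    int error = command_read(src, buf, len, &nread);
+    done(error, nread, user_data);
+}
+
+/* Example continuation: store the byte count where the caller can see it. */
+static void record_nread(int error, size_t nread, void *user_data) {
+    if (error == 0)
+        *(size_t *)user_data = nread;
+}
+```
+
+The point of the sketch is only that the argument list grows by a
+continuation while the conceptual operation stays the same.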
+
+## Why no mmap and friends?
+
+True mmap support is something that could be added in the future,
+though it is expected to require integration with the core language.
+See "Finer-grained control over memory" in WebAssembly's
+[Future Features] document for an overview.
+
+Ignoring the many non-standard mmap extensions out there,
+the core mmap behavior is not portable in several respects, even
+across POSIX-style systems. See
+[LevelDB's decision to stop using mmap], for one example in
+practice, and search for the word "unspecified" in the
+[POSIX mmap spec] for some others.
+
+And, some features of mmap can lead to userspace triggering
+signals. Accessing memory beyond the end of the file, including in
+the case where someone else changes the size of the file, leads to a
+`SIGBUS` on POSIX-style systems. Protection modes other than
+`PROT_READ|PROT_WRITE` can produce `SIGSEGV`. While some VMs are
+prepared to catch such signals transparently, this is a burdensome
+requirement for others.
+
+Another issue is that while WASI is a synchronous I/O API today,
+this design may change in the future. `mmap` can create situations
+where doing a load can entail blocking I/O, which can make it
+harder to characterize all the places where blocking I/O may occur.
+
+And lastly, WebAssembly linear memory doesn't support the semantics
+of mapping and unmapping pages. Most WebAssembly VMs would not
+easily be able to support freeing the memory of a page in the middle
+of a linear memory region, for example.
+
+To make things easier for people porting programs that just use
+mmap to read and write files in a simple way, WASI libc includes a
+minimal userspace emulation of `mmap` and `munmap`.
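+
+To illustrate what such an emulation can look like, here is a minimal
+sketch that copies the requested file range into heap memory with `pread`.
+It is an assumption about the general technique, not WASI libc's actual
+code, and the names `emulated_mmap_read` and `emulated_munmap` are
+hypothetical:
+
+```c
+#include <stdlib.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+/* Hypothetical helper: emulate a read-only mapping by copying `len` bytes
+   starting at `offset` of `fd` into zero-initialized heap memory. */
+static void *emulated_mmap_read(int fd, off_t offset, size_t len) {
+    char *mem = calloc(1, len ? len : 1);
+    if (mem == NULL)
+        return NULL;
+
+    size_t done = 0;
+    while (done < len) {
+        ssize_t n = pread(fd, mem + done, len - done, offset + (off_t)done);
+        if (n < 0) {
+            free(mem);
+            return NULL;
+        }
+        if (n == 0)
+            break; /* past end of file: the remainder stays zero-filled */
+        done += (size_t)n;
+    }
+    return mem;
+}
+
+/* Hypothetical counterpart to `munmap`: just release the copy. */
+static void emulated_munmap(void *mem) {
+    free(mem);
+}
+```
+
+A copy like this never observes later changes to the file and cannot raise
+`SIGBUS`, which is why it only covers the simple read-a-file use case.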
+
+[POSIX mmap spec]: http://pubs.opengroup.org/onlinepubs/7908799/xsh/mmap.html
+[LevelDB's decision to stop using mmap]: https://groups.google.com/forum/#!topic/leveldb/C5Hh__JfdrQ
+[Future Features]: https://webassembly.org/docs/future-features/
+
+## Why no UNIX-domain sockets?
+
+UNIX-domain sockets can communicate three things:
+ - bytes
+ - file descriptors
+ - user credentials
+
+The concept of "users" doesn't fit within WASI, because many implementations
+won't be multi-user in that way.
+
+It can be useful to pass file descriptors between wasm instances; however, in
+wasm this can be done by passing them as arguments in plain function calls,
+which is much simpler and quicker. And, in WASI implementations where file
+descriptors don't correspond to an underlying Unix file descriptor concept,
+it's not feasible to do this if the other side of the socket isn't a
+cooperating WebAssembly engine.
+
+We may eventually want to introduce a concept of a WASI-domain socket, for
+bidirectional byte-oriented local communication.
+
+## Why no dup?
+
+The main use case for `dup` is the classic Unix dance of setting up file
+descriptors in advance of performing a `fork`. Since WASI has no `fork`,
+this doesn't apply.
+
+And avoiding `dup` for now avoids committing to the POSIX concepts of
+descriptors being distinct from file descriptions in subtle ways.
+
+## Why are `path_remove_directory` and `path_unlink_file` separate syscalls?
+
+In POSIX, there's a single `unlinkat` function, which has a flag word,
+and with the `AT_REMOVEDIR` flag one can specify whether one wishes to
+remove a file or a directory. However, there really are two distinct
+functions being performed here, and having one system call that can
+select between two different behaviors doesn't simplify the actual API
+compared to just having two system calls.
+
+More importantly, in WASI, system call imports represent a static list
+of the capabilities requested by a wasm module. Therefore, WASI prefers
+each system call to do just one thing, so that it's clear what a wasm
+module that imports it might be able to do with it.
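+
+A higher-level library can still offer the POSIX shape on top of the two
+calls. The sketch below shows one way such a dispatcher might look. The two
+`path_*` functions are hypothetical stand-ins that only record which call
+was made, and `sketch_unlinkat` is an illustrative name, not a real libc
+symbol:
+
+```c
+#include <fcntl.h> /* AT_REMOVEDIR */
+
+enum last_call { CALL_NONE, CALL_UNLINK_FILE, CALL_REMOVE_DIRECTORY };
+static enum last_call last = CALL_NONE;
+
+/* Hypothetical stand-in for the file-removal import. */
+static int path_unlink_file(int dirfd, const char *path) {
+    (void)dirfd;
+    (void)path;
+    last = CALL_UNLINK_FILE;
+    return 0;
+}
+
+/* Hypothetical stand-in for the directory-removal import. */
+static int path_remove_directory(int dirfd, const char *path) {
+    (void)dirfd;
+    (void)path;
+    last = CALL_REMOVE_DIRECTORY;
+    return 0;
+}
+
+/* POSIX-style entry point: the flag word selects one of two behaviors. */
+static int sketch_unlinkat(int dirfd, const char *path, int flags) {
+    if (flags & AT_REMOVEDIR)
+        return path_remove_directory(dirfd, path);
+    return path_unlink_file(dirfd, path);
+}
+```
+
+A module that only ever removes files would import just one of the two
+functions, which makes its requested capabilities visible in its import list.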
--- a/docs/WASI-some-possible-changes.md
+++ b/docs/WASI-some-possible-changes.md
@@ -0,0 +1,114 @@
+# Possible changes
+
+The following is a list of relatively straightforward changes
+to WASI core that should be considered.
+
+## Split file/networking/random/clock from args/environ/exit.
+
+Currently everything is mixed together in one big "core" module. But we can
+split them out to allow minimal configurations that don't support this style
+of files and networking.
+
+## Move higher-level and unused errno codes out of the core API.
+
+The core API currently defines errno codes such as `EDOM` which are
+not used for anything. POSIX requires them to be defined, however
+that can be done in the higher-level libraries, rather than in the
+WASI core API itself.
+
+## Detecting EOF from read/recv explicitly.
+
+POSIX's `read` returns 0 if and only if it reaches the end of a file or stream.
+
+Say you have a read buffer of 1024 bytes, and are reading a file that happens
+to be 7 bytes long. The first `read` call will return 7, but unless you happen
+to know how big the file is supposed to be, you can't distinguish between
+that being all there is, and `read` getting interrupted and returning less
+data than you requested.
+
+Many applications today do an extra `read` when they encounter the end of a
+file, to ensure that they get a `read` that returns 0 bytes read, to confirm
+that they've reached the end of the file. If `read` instead had a way to
+indicate that it had reached the end, this extra call wouldn't be necessary.
+
+And, `read` on a socket is almost equivalent to `recv` with no flags -- except for
+one surprising special case: on a datagram socket, if there's a zero-length
+datagram, `read` can't consume it, while `recv` can. This is because `read` can't
+indicate that it successfully read 0 bytes, because it has overloaded the meaning
+of 0 to indicate end-of-file.
+
+So, it would be tidier from multiple perspectives if `read` could indicate
+that it had reached the end of a file or stream, independently of how many
+bytes it has read.
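+
+One possible shape for such an interface is sketched below, using an
+in-memory stream so the example is self-contained. The `read_with_eof`
+function and the `mem_stream` type are hypothetical illustrations, not a
+proposed WASI signature:
+
+```c
+#include <stdbool.h>
+#include <stddef.h>
+#include <string.h>
+
+/* Hypothetical in-memory stand-in for a file or stream. */
+struct mem_stream {
+    const char *data;
+    size_t len;
+    size_t pos;
+};
+
+/* Returns the number of bytes read and reports end-of-stream separately,
+   so a return value of 0 no longer has to double as the EOF signal. */
+static size_t read_with_eof(struct mem_stream *s, char *buf, size_t cap,
+                            bool *eof) {
+    size_t avail = s->len - s->pos;
+    size_t n = avail < cap ? avail : cap;
+    memcpy(buf, s->data + s->pos, n);
+    s->pos += n;
+    *eof = (s->pos == s->len);
+    return n;
+}
+```
+
+With the 7-byte file and 1024-byte buffer from the example above, a single
+call would return 7 and set `*eof`, so no second call is needed.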
+
+## Merging read and recv
+
+These are very similar, and differ only in subtle ways. It'd make the API
+easier to understand if they were unified.
+
+## Trap instead of returning EFAULT
+
+POSIX system calls return EFAULT when given invalid pointers, however from an
+application perspective, it'd be more natural for them to just segfault.
+
+## More detailed capability error reporting
+
+Replace `__WASI_ENOTCAPABLE` with error codes that indicate *which* capabilities
+were required but not present.
+
+## Split `__wasi_path_open` into `__wasi_path_open_file` and `__wasi_path_open_directory`?
+
+We could also split `__WASI_RIGHT_PATH_OPEN` into file vs directory,
+(obviating `__WASI_O_DIRECTORY`).
+
+## Fix the y2554 bug
+
+In some places, timestamps are measured in nanoseconds since the UNIX epoch,
+so our calculations indicate an unsigned 64-bit counter will overflow on
+Sunday, July 21, 2554, at 11:34:33 pm UTC.
+
+These timestamps aren't used in that many places, so it wouldn't cost that
+much to widen these timestamps. We can either just extend the current type to
+128 bits (two i64's in wasm) or move to a `timespec`-like `tv_sec`/`tv_nsec`
+pair.
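+
+The overflow time above can be checked with a few lines of integer
+arithmetic. This is a small sketch assuming an unsigned 64-bit nanosecond
+counter starting at the UNIX epoch; the function names are illustrative:
+
+```c
+#include <stdint.h>
+
+/* Whole seconds representable by an unsigned 64-bit nanosecond counter. */
+static uint64_t overflow_seconds(void) {
+    return UINT64_MAX / UINT64_C(1000000000);
+}
+
+/* Whole days after 1970-01-01 at the moment of overflow. */
+static uint64_t overflow_days(void) {
+    return overflow_seconds() / 86400;
+}
+
+/* Seconds past midnight UTC at the moment of overflow. */
+static uint64_t overflow_second_of_day(void) {
+    return overflow_seconds() % 86400;
+}
+```
+
+This gives 18446744073 seconds, or 213503 days (about 584.5 years, which
+lands in 2554) plus 84873 seconds, and 84873 seconds past midnight is
+23:34:33.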
+
+## Remove `fd_allocate`?
+
+Darwin doesn't implement `posix_fallocate`, the POSIX function behind
+`fd_allocate`, despite it being in POSIX since 2001. So we don't currently know
+any way to implement `fd_allocate` on Darwin that's safe from race conditions.
+Should we remove it from the API?
+
+## Redesign `fstflags_t`
+
+The relationship between `*_SET_*TIM` and `*_SET_*TIM_NOW` is non-obvious.
+We should look at this again.
+
+## readdir
+
+Truncating entries that don't fit into a buffer may be error-prone. Should
+we redesign how directory reading works?
+
+## symlinks
+
+Symlinks are fairly UNIX-specific. Should we remove `__wasi_path_symlink`
+and `__wasi_path_readlink`?
+
+Also, symlink resolution doesn't benefit from libpreopen-style path
+translation. Should we move symlink resolution into the libpreopen layer
+and do it entirely in "userspace"?
+
+## Remove the `path_len` argument from `__wasi_fd_prestat_dir_name`
+
+The buffer should be sized to the length returned from `__wasi_fd_prestat_get`,
+so it's not necessary to pass the length back into the runtime.
+
+## Add a `__wasi_path_filestat_set_size` function?
+
+Along with libc/libpreopen support, this would enable implementing the
+POSIX `truncate` function.
+
+## errno values returned by `path_open`
+
+We should specify the errno value returned when `path_open` is told
+to open a directory and `__WASI_LOOKUP_SYMLINK_FOLLOW` isn't set, and
+the path refers to a symbolic link.
--- a/docs/WASI-tutorial.md
+++ b/docs/WASI-tutorial.md
@@ -0,0 +1,159 @@
+# WASI tutorial
+
+Let's start with a simple C program which performs a file copy, which will
+show how to compile and run programs, as well as perform simple sandbox
+configuration. The C code here uses standard POSIX APIs, and doesn't have
+any knowledge of WASI, WebAssembly, or sandboxing.
+
+```c
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <errno.h>
+
+int
+main(int argc, char **argv) {
+    int n, m;
+    char buf[BUFSIZ];
+
+    if (argc != 3) {
+        fprintf(stderr, "usage: %s <from> <to>\n", argv[0]);
+        exit(1);
+    }
+
+    int in = open(argv[1], O_RDONLY);
+    if (in < 0) {
+        fprintf(stderr, "error opening input %s: %s\n", argv[1], strerror(errno));
+        exit(1);
+    }
+
+    int out = open(argv[2], O_WRONLY | O_CREAT, 0660);
+    if (out < 0) {
+        fprintf(stderr, "error opening output %s: %s\n", argv[2], strerror(errno));
+        exit(1);
+    }
+
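+    while ((n = read(in, buf, BUFSIZ)) > 0) {
+        /* write may be partial, so track how far into buf we've written */
+        char *ptr = buf;
+        while (n > 0) {
+            m = write(out, ptr, n);
+            if (m < 0) {
+                fprintf(stderr, "write error: %s\n", strerror(errno));
+                exit(1);
+            }
+            n -= m;
+            ptr += m;
+        }
+    }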
+
+    if (n < 0) {
+        fprintf(stderr, "read error: %s\n", strerror(errno));
+        exit(1);
+    }
+
+    return EXIT_SUCCESS;
+}
+```
+
+We'll put this source in a file called `demo.c`.
+
+The [wasi-sdk](https://github.com/CraneStation/wasi-sdk/releases) provides a clang
+which is configured to target WASI and use the WASI sysroot by default, so we can
+compile our program like so:
+
+```
+$ clang demo.c
+```
+
+A few things to note here. First, this is just regular clang, configured to use
+a WebAssembly target and sysroot. The name `a.out` is the traditional default
+output name that C compilers use, and can be overridden with the "-o" flag in the
+usual way. And, the output of clang here is a standard WebAssembly module:
+
+```
+$ file a.out
+a.out: WebAssembly (wasm) binary module version 0x1 (MVP)
+```
+
+It's a single file containing a self-contained wasm module that doesn't require
+any supporting JS code.
+
+We can execute it with wasmtime directly, like so:
+
+```
+$ wasmtime a.out
+usage: a.out <from> <to>
+```
+
+Ok, this program needs some command-line arguments. So let's give it some:
+
+```
+$ echo hello world > test.txt
+$ wasmtime a.out test.txt /tmp/somewhere.txt
+error opening input test.txt: Capabilities insufficient
+```
+
+Aha, now we're seeing the sandboxing in action. This program is attempting to
+access a file by the name of `test.txt`, however it hasn't been given the
+capability to do so.
+
+So let's give it capabilities to access files in the requisite directories:
+
+```
+$ wasmtime --dir=. --dir=/tmp a.out test.txt /tmp/somewhere.txt
+$ cat /tmp/somewhere.txt
+hello world
+```
+
+Now our program runs as expected!
+
+As a brief aside, note that we used the path `.` above to grant the program
+access to the current directory. This is needed because the mapping from
+paths to associated capabilities is performed by libc, so it's part of the
+WebAssembly program, and we don't expose the actual current working
+directory to the WebAssembly program. So providing a full path doesn't work:
+
+```
+$ wasmtime --dir=$PWD --dir=/tmp a.out test.txt /tmp/somewhere.txt
+error opening input test.txt: Capabilities insufficient
+```
+
+So, we always have to use `.` to refer to the current directory.
+
+Speaking of `.`, what about `..`? Does that give programs a way to break
+out of the sandbox? Let's see:
+
+```
+$ wasmtime --dir=. --dir=/tmp a.out test.txt /tmp/../etc/passwd
+error opening output /tmp/../etc/passwd: Capabilities insufficient
+```
+
+The sandbox says no. And note that this is the capabilities system saying no
+here ("Capabilities insufficient"), rather than Unix access controls
+("Permission denied"). Even if the user running wasmtime had write access to
+`/etc/passwd`, WASI programs don't have the capability to access files outside
+of the directories they've been granted. This is true when resolving symbolic
+links as well.
+
+Wasmtime also has the ability to remap directories, with the `--mapdir`
+command-line option:
+
+```
+$ wasmtime --dir=. --mapdir=/tmp:/var/tmp a.out test.txt /tmp/somewhere.txt
+$ cat /var/tmp/somewhere.txt
+hello world
+```
+
+This maps the name `/tmp` within the WebAssembly program to `/var/tmp` in the
+host filesystem. So the WebAssembly program itself never sees the `/var/tmp` path,
+but that's where the output file goes.
+
+See [here](WASI-capabilities.md) for more information on the capability-based
+security model.
+
+The capability model is very powerful, and what's shown here is just the beginning.
+In the future, we'll be exposing much more functionality, including finer-grained
+capabilities, capabilities for network ports, and the ability for applications to
+explicitly request capabilities.