WebAssembly, bytecode for the browser
This joint project between browser vendors has defined an intermediate language for the Web, with asm.js as a starting point.
Presented by Brendan Eich, creator of JavaScript and, in part, of asm.js, WebAssembly (or wasm) is a new intermediate representation (IR) language compatible with all browsers. It will allow programs written in different high-level languages to run in browsers once compiled to wasm. The initial goal is to compile programs and libraries from C and C++ to WebAssembly; other languages will follow.
In June 2015, development had just begun, but it was made easier by using asm.js as a starting point. This subset of JavaScript is compatible with all browsers.
The goal was to maintain compatibility between asm.js and wasm for a few years, not only for browser support but also because asm.js is itself still in development, and the new features added to it will be just as indispensable to wasm: for example, threads with shared memory and support for multi-core processors.
Later, once all browsers support the new WebAssembly language, the two languages may begin to diverge. Wasm will then be changed without modifying asm.js, and therefore JavaScript.
Wasm notably has the support of Google, with the involvement of the NaCl team, so we can anticipate the abandonment of NaCl, whose goal was similar: to run native applications in the browser. Furthermore, the V8 team announced the integration of wasm into its JavaScript compiler, so the wasm engine can produce IR code to be interpreted by the JS JIT engine.
Experiments have shown that parsing the new code (as opposed to executing it) is 20 times faster than parsing asm.js. The binary code generated is also three times more compact than asm.js.
According to Brendan Eich, wasm is in fact a "compressed AST encoding" and not a true bytecode (AST stands for abstract syntax tree).
Some other advantages of wasm:
- It will be possible to use modules written in wasm in JavaScript programs (as is the case for asm.js). The ES6 module system will be used.
- It will have a garbage collector and DOM access.
- Two notations of the code are possible: binary for execution and text for reading. The text form looks like an assembly language, but it is not one.
- A wasm LLVM backend is proposed initially. Wasm code can be generated from C/C++ with a compiler option. It will then be available for other languages.
- A polyfill: a JavaScript program that converts wasm code to asm.js for older browsers that do not support wasm.
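To make the two notations in the list above concrete: in the binary format as later standardized (version 1), every module begins with an 8-byte preamble, while the smallest module in the text notation is simply `(module)`. The sketch below only checks for that preamble. The helper name `looks_like_wasm` is hypothetical and not part of any toolchain.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// The 8-byte preamble of a version 1 wasm binary:
// the magic "\0asm" followed by the version number 1 in little-endian order.
constexpr std::array<std::uint8_t, 8> kWasmPreamble = {
    0x00, 0x61, 0x73, 0x6D,  // "\0asm"
    0x01, 0x00, 0x00, 0x00   // version 1
};

// Hypothetical helper: returns true when `bytes` starts with the preamble.
// It does not validate the rest of the module.
bool looks_like_wasm(const std::vector<std::uint8_t>& bytes) {
    if (bytes.size() < kWasmPreamble.size()) return false;
    for (std::size_t i = 0; i < kWasmPreamble.size(); ++i) {
        if (bytes[i] != kWasmPreamble[i]) return false;
    }
    return true;
}
```

A check like this is enough to tell a binary module apart from its text notation, which is plain readable characters.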
And a disadvantage:
- It does not yet support the compilation of dynamic languages (like JavaScript).
Although languages compiled to WebAssembly become a possible alternative to JavaScript, JavaScript will remain supported and favored by browsers. Wasm will be a part of the JS compiler.
Why a new intermediate language?
Why not just use LLVM, or the Java bytecode, or .NET?
The primary objective of wasm is to be able to run source code written in C and C++ in web applications. An immense number of functions have been written in these languages, and being able to reuse them with comparable execution time is a real benefit. 3D games are mostly written in C++: they become portable with wasm.
Wasm aims to be to browser applications what Vulkan is to OpenGL and DirectX: a universal and portable intermediate code with a speed of execution close to native.
LLVM
The LLVM bitcode was not retained because it is not portable. It contains metadata and is designed to produce executable binaries, not to work directly on all systems.
Moreover, Google has already tried to use LLVM in the browser with NaCl, and this was not adopted by other vendors.
However, Emscripten is used to compile the LLVM intermediate language to wasm, so we keep the benefit of its toolchain.
.NET
The .NET runtime has an intermediate code that can run on a virtual machine or be compiled. But it is not suitable for the C and C++ languages. Microsoft uses C++/CLI, which differs from native C++.
JVM
The Java virtual machine is not an option either: although many languages have been ported to this platform, this is not the case for C and C++. That is without even counting Oracle's propensity to sue for a reason as trivial as the use of its API. The entire Web could find itself facing the firm's army of surly lawyers, which would be disastrous. WebAssembly will allow you to develop in peace.
Quotes from Brendan Eich about WebAssembly
The continued evolution of ASM.js is wasm.
At first, WebAssembly starts out just like ASM.js, but with a compressed syntax, that’s a binary syntax. But once all the browsers support both wasm and ASM.js, and after a decent interval of browser updates, then wasm can start to grow extra semantics that need not be put into JavaScript.
There are lots of languages you might compile to wasm.
Assuming stasis on the web — it’s not a good assumption, I think that was the mistake that happened long ago with projects like Portable Native Client and Dart, too
WebAssembly vs Java
The creators of the language saw much further than the browser when they defined its specification. In fact it is divided into two layers:
- The core of the language, which defines the syntax and instructions independently of any environment. It is a compact language that is easy to integrate.
- APIs to define interfaces between wasm and different environments. The first was made for JavaScript.
The language itself can be used on all kinds of platforms: on the Web, on the desktop, and on any operating system. This depends on the APIs added to the language, and developers are currently multiplying them. There is even an operating system, Nebulet, where everything is written in wasm and compiled to binary.
The WASI project aims to create a standard interface for all operating systems so that the same wasm binary code can be compiled once and used everywhere. We can then compare WebAssembly to Java. WebAssembly has the advantage of operating in isolation with its own memory, which allows safer applications. There is also no shadow of Oracle and its restrictions on "intellectual property".
Another advantage over Java is that, being designed to work even on the small processors of IoT devices, the runtime is much lighter than Java's. It can nevertheless be used on the desktop and on servers, where it will make containers like Docker useless.
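The isolation mentioned above comes from wasm code addressing a private, bounds-checked block of bytes (its linear memory) by offset, rather than through host pointers. The following is a deliberately simplified model of that idea, not how any particular runtime is implemented. The class `LinearMemory` and its methods are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a wasm-style linear memory: a private byte array
// that guest code reaches only through offsets checked on every access.
class LinearMemory {
public:
    explicit LinearMemory(std::size_t size) : bytes_(size, 0) {}

    // Stores one byte. Returns false when the offset is out of range,
    // where a real runtime would trap (abort the guest program).
    bool store(std::size_t offset, std::uint8_t value) {
        if (offset >= bytes_.size()) return false;
        bytes_[offset] = value;
        return true;
    }

    // Loads one byte into `out`. Returns false when the offset is out of
    // range and leaves `out` untouched.
    bool load(std::size_t offset, std::uint8_t& out) const {
        if (offset >= bytes_.size()) return false;
        out = bytes_[offset];
        return true;
    }

private:
    std::vector<std::uint8_t> bytes_;  // never exposed to the guest directly
};
```

Because every access goes through the check, a faulty or hostile program can corrupt only its own array, never the memory of the host.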
How to create and use WebAssembly code
To build and run wasm programs, you have to install emsdk along with a runtime such as wasmer or wasmtime.
You can then compile a C or C++ program to wasm with this command:
emcc demo.cpp -o demo.wasm
and run it with this command:
wasmtime demo.wasm
A full tutorial is provided here: Developer's guide to WebAssembly, or on webassembly.fr.
Tools to generate wasm code and run it are available on GitHub.
- binaryen. Script to run wasm code on the command line.
- asm2wasm. Compile asm.js to WebAssembly.
- wasm2asm. Compile WebAssembly to asm.js.
- s2wasm. Compiles the code specially produced by LLVM to WebAssembly. LLVM has its own format for this.
- wasm.js. A JavaScript version of binaryen: when included in a web page, it runs the wasm code even if the browser does not support it yet.
- Emscripten. Compiles C and C++ to asm.js and, with the BINARYEN option, then compiles asm.js to wasm.
- Ilwasm. Converts a subset of the intermediate .NET code to wasm.
- Lucet. A compiler on Linux for producing local applications from C code.
- InNative. Wasm compiler to make stand-alone executables on Windows or Linux. Compatible with C.
References
- WebAssembly Community Group. At the W3C, the group dedicated to the new language is open to new participants. Currently its members are Google's V8 and NaCl teams, the Emscripten team from Mozilla, Microsoft, and Apple's languages and runtime team.
- From Asm.js to WebAssembly. Presentation by Brendan Eich.
- Why we need WebAssembly. Interview with Brendan Eich.
- FAQ of wasm. On GitHub. See especially "Why not use LLVM bitcode".
- WebAssembly design. On GitHub, all details about wasm.
- Future Features. What the language will become.
- Which programming language for WebAssembly?