Last month, I was at REcon Montreal to give my training on WebAssembly security, and after some discussion people always ask me the same question: is WebAssembly already used in the wild?
The answer is of course YES, and some WebAssembly modules are potentially running in your browser right now if you are using Google web services. Google has recently used WebAssembly for the beta version of Google Earth, and also in production for services like Google Keep.
In this blog post, we are going to partially reverse the WebAssembly module loaded by Google Keep, determine its purpose, and extract as much information as possible for a future complete analysis. Let’s go!
Just a quick reminder before we start: if you are interested in WebAssembly security, I decided to convert my 4-day live training into recorded courses.
More details about my courses here.
Usually, in order to run a WebAssembly module in your web page, you fetch a wasm file and instantiate the module using the dedicated JavaScript API. Once that is done, you can call the module's exported functions directly from JavaScript.
In the Google Keep web app, the WebAssembly module “ink.wasm” is fetched (image below – left) and instantiated by the minified JavaScript file ink-loader.js (image below – right). Based on the JS function names, this JavaScript file seems to have been generated automatically by emscripten.
One of the first steps when reversing a WebAssembly module is to convert the wasm binary (.wasm) to its text format (.wat/.wast) representation. wasm2wat is the perfect tool for this job.
wasm2wat ink.wasm -o ink.wat
The output file (ink.wat) is a text file of around 1.5 million lines. Based on the minified imported and exported function names (image – right), we can confirm that the module was compiled by emscripten with the -O3 optimization flag.
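The export names can also be listed without going through the text format. The following is a minimal sketch, assuming only the standard WebAssembly binary layout (8-byte header, then sections made of an id byte and a LEB128 size); the helper names `read_uleb` and `export_names` are hypothetical, and the demo module is synthetic rather than taken from ink.wasm.

```python
def read_uleb(buf, pos):
    """Decode an unsigned LEB128 integer; return (value, next_position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def export_names(wasm):
    """Return (name, kind) pairs from the Export section (section id 7)."""
    if wasm[:4] != b"\0asm":
        raise ValueError("not a WebAssembly binary")
    kinds = {0: "func", 1: "table", 2: "memory", 3: "global"}
    exports = []
    pos = 8  # skip the 4-byte magic and the 4-byte version
    while pos < len(wasm):
        section_id = wasm[pos]
        size, pos = read_uleb(wasm, pos + 1)
        end = pos + size
        if section_id == 7:
            count, pos = read_uleb(wasm, pos)
            for _ in range(count):
                length, pos = read_uleb(wasm, pos)
                name = wasm[pos:pos + length].decode("utf-8")
                pos += length
                kind = kinds.get(wasm[pos], "unknown")
                _, pos = read_uleb(wasm, pos + 1)  # skip the exported index
                exports.append((name, kind))
        pos = end  # jump to the next section whatever its id
    return exports


# Synthetic module (not ink.wasm): header plus one export named "a".
demo = b"\0asm\x01\0\0\0" + bytes([0x07, 0x05, 0x01, 0x01, 0x61, 0x00, 0x00])
print(export_names(demo))
```

Against the real file, you would pass `open("ink.wasm", "rb").read()` instead of `demo`; a large share of one- or two-character names is the minification pattern described above.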
This module contains a Data section, and the content of this section is used to initialize the linear memory, i.e. an ArrayBuffer shared between the module and the loader script (ink-loader.js).
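To look at the raw bytes that end up in linear memory, the Data section can be walked directly. This is a rough sketch under a few assumptions: the offset expression of each active segment is a plain `i32.const`, which is the common case but not guaranteed, and the function names are hypothetical. The demo bytes are synthetic.

```python
def read_uleb(buf, pos):
    """Decode an unsigned LEB128 integer; return (value, next_position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_sleb(buf, pos):
    """Decode a signed LEB128 integer; return (value, next_position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:  # sign bit of the last group
                result -= 1 << shift
            return result, pos


def data_segments(wasm):
    """Return (memory_offset, payload) pairs from the Data section (id 11)."""
    if wasm[:4] != b"\0asm":
        raise ValueError("not a WebAssembly binary")
    segments = []
    pos = 8  # skip magic and version
    while pos < len(wasm):
        section_id = wasm[pos]
        size, pos = read_uleb(wasm, pos + 1)
        end = pos + size
        if section_id == 11:
            count, pos = read_uleb(wasm, pos)
            for _ in range(count):
                flags, pos = read_uleb(wasm, pos)
                offset = None  # stays None for passive segments (flags == 1)
                if flags == 2:
                    _, pos = read_uleb(wasm, pos)  # explicit memory index
                if flags in (0, 2):
                    # Assumption: the offset is "i32.const N; end".
                    if wasm[pos] != 0x41:
                        raise ValueError("unsupported offset expression")
                    offset, pos = read_sleb(wasm, pos + 1)
                    if wasm[pos] != 0x0B:
                        raise ValueError("unsupported offset expression")
                    pos += 1
                length, pos = read_uleb(wasm, pos)
                segments.append((offset, wasm[pos:pos + length]))
                pos += length
        pos = end
    return segments


# Synthetic module: one segment placing b"ink" at memory offset 1024.
demo = b"\0asm\x01\0\0\0" + bytes(
    [0x0B, 0x0A, 0x01, 0x00, 0x41, 0x80, 0x08, 0x0B, 0x03]) + b"ink"
print(data_segments(demo))
```

Each returned offset is the address in linear memory where the payload is copied at instantiation, which is what lets you later relate a string to the memory accesses in the code.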
At the beginning of this module's Data section, we get a lot of details about how the module was built:
At this point, I first tried to retrieve the source code of the WebAssembly module by searching for the project path on the web. I found the repository of Chromium 66 (66.0.3359.158, about 1 year old), but without the C/C++ source code inside. On the master branch, there is no longer any reference to sketchology, but we get information about what Ink is. Finally, the GitHub repository (https://github.com/google/ink) returns a 404 error.
After some research, it turns out that Sketchology refers to an iOS application called “Sketchology Review”. This application isn’t available anymore, the Twitter account is inactive, and the official website (sketchologyapp.com) is down, but you can find a copy using the Wayback Machine. On the creator's LinkedIn profile, we can read that “Sketchology is the first vector drawing app with realtime natural media brush effects like blur or watercolor.”
On the other hand, “Ink is a software library enabling Google applications to let their users express themselves using freehand drawing and handwriting”. This library is also used in Google Canvas, released at the end of 2018 (source).
So, it seems that Ink is the evolution/successor of Sketchology, and Google Keep uses this ink.wasm module when the user wants to draw a note (image on top).
To verify this hypothesis, you can debug the WebAssembly module and set breakpoints using the developer console. In the image below, my breakpoint was triggered when I tried to create a new drawing note.
Still inside the module data section, you will find multiple chunks of Google Protobuf-encoded blobs (image on the top).
“Protocol buffers are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler.” – source
Those chunks of bytes can be reversed/deserialized using tools such as protobuf-inspector (image at the bottom). The source code of the more generic protocol buffer files can be found directly in the GitHub repository of the protobuf project (like descriptor.proto).
This kind of information is particularly useful if you are doing pentesting/vulnerability research on the server-side web API.
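To give an idea of what a schema-less decoder does with such a blob, here is a minimal sketch of a protobuf wire-format walker. It is not protobuf-inspector's implementation, just an illustration of the documented encoding (key = field number and wire type, followed by a varint, fixed-size, or length-delimited value); the function names are hypothetical and the sample bytes are the classic examples from the protobuf encoding documentation, not data from ink.wasm.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint; return (value, next_position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_fields(blob):
    """Return (field_number, wire_type, value) tuples for one message level."""
    fields = []
    pos = 0
    while pos < len(blob):
        key, pos = read_varint(blob, pos)
        field_no, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(blob, pos)
        elif wire_type in (1, 2, 5):  # 64-bit, length-delimited, 32-bit
            if wire_type == 2:
                length, pos = read_varint(blob, pos)
            else:
                length = 8 if wire_type == 1 else 4
            if pos + length > len(blob):
                raise ValueError("truncated field")
            value = blob[pos:pos + length]
            pos += length
        else:  # deprecated group markers or garbage
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_no, wire_type, value))
    return fields


# Field 1 = varint 150, field 2 = the bytes of "testing".
sample = bytes.fromhex("089601120774657374696e67")
print(decode_fields(sample))
```

A length-delimited value may itself be a nested message, a string, or packed numbers; without the .proto schema the decoder can only guess, for example by trying `decode_fields` recursively on the payload and falling back to raw bytes when it raises.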
Another part of the data section contains complete pieces of code (bottom image) with variables and main functions. This code is a WebGL “Vertex shader structure”, and it will be loaded by WebGL's shader-building functions at runtime.
Finally, we reach the last part of this module's data section, which is for me the most interesting one. Inside, you will find more than 5,000 strings like:
With those strings alone, we can reconstruct the project tree (image on the left) and associate the corresponding error messages, mangled names and constants with each file.
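The reconstruction described above can be sketched in a few lines: pull printable strings out of the data bytes, keep the substrings that look like source paths, and fold them into a nested dictionary. This is only an illustration; the regular expression, the minimum length, the function names, and the `example/dir/...` paths in the demo blob are all assumptions, not strings taken from ink.wasm.

```python
import re

# Assumption: a source path is one or more "dir/" parts plus a C/C++ file name.
PATH_RE = re.compile(r"(?:[\w.-]+/)+[\w.-]+\.(?:cc|cpp|h)\b")


def ascii_strings(data, min_len=6):
    """Return runs of printable ASCII characters of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def source_paths(strings):
    """Collect the unique path-looking substrings, sorted."""
    found = set()
    for text in strings:
        found.update(PATH_RE.findall(text))
    return sorted(found)


def build_tree(paths):
    """Fold 'a/b/c.cc' style paths into nested dicts (files map to {})."""
    root = {}
    for path in paths:
        node = root
        for part in path.split("/"):
            node = node.setdefault(part, {})
    return root


# Hypothetical data-section excerpt with two debug strings.
blob = (b"\x00\x01example/dir/a.cc:42 check failed\x00"
        b"\x02example/dir/sub/b.h\x00ab\x00")
paths = source_paths(ascii_strings(blob))
print(paths)
print(build_tree(paths))
```

Keeping the full string next to each path (rather than only the path) is what lets you attach error messages and constants to a given file afterwards.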
If you want to completely reverse this module, you will first need to match the previous information (WebGL, debug strings, …) with memory accesses/offsets (image on the top).
Then, you can determine the function prototypes (mangled names + arguments) and associate each WebAssembly function with its C++ source file. Finally, you can try to decompile your newly labeled module into C code using a tool like wasm2c.
Nevertheless, by the end of this blog post, we have:
All the files (wasm, js) and extracted information are available in this GitHub repository.
If you want to learn about WebAssembly security, from module reversing to WebAssembly VM vulnerability research, I decided to convert my 4-day live training into recorded courses.
More details about my courses here.