Zero Day Initiative — Seeing Double: Exploiting a Blind Spot in MemGC

December 17, 2018 | Simon Zuckerbraun

This is the first in our series of Top 5 interesting cases from 2018. Each of these bugs has some element that sets it apart from the approximately 1,400 advisories released by the program this year. We begin with a Pwn2Own winner exploiting Microsoft Edge in a way that shouldn’t be possible.

At Pwn2Own 2018, Richard Zhu (fluorescence) successfully compromised several targets to claim the title of Master of Pwn. One of his targets was Microsoft Edge, which he dispatched using an exploit chain including two Use-After-Free (UAF) vulnerabilities. One of those UAF vulnerabilities is so remarkable that it qualifies as one of our top five bugs of the year, which we are detailing in this series of blog posts. The identifier for this vulnerability is CVE-2018-8179.

Let’s dive right into some proof-of-concept code for this vulnerability, and see what makes it so astonishing:

Figure 1 - Annotated PoC

Figure 1 shows the proof-of-concept together with some annotations indicating the order of operations. The main action begins in step 3 with the invocation of setRemoteCandidates. This API expects a JavaScript array. As noted in Figure 1, things go awry during iteration through this array. Upon access of arr2[1], the getter method executes (steps 4-5), and this script is able to release the memory used by the object that arr2[0] originally referred to. The script then reclaims and overwrites the memory with attacker-specified data (step 5). The crash occurs when setRemoteCandidates continues executing and attempts to access the object originally referred to by arr2[0].

To complete this picture, we will need to understand a bit more of what happens under the hood when setRemoteCandidates processes its parameter. Its operation is as follows:

1 - Create an internal array structure called a CModernArray<>.
2 - Iterate over arr2. For each element, get a pointer to the element and add it to the CModernArray<>.
3 - Iterate over the CModernArray<> and process each JavaScript object in turn.

CModernArray<> is a C++ class defined in edgehtml.dll. Crucially, it stores its data in a buffer allocated from the MemGC heap. To summarize the operation of this proof-of-concept: edgehtml.dll iterates over arr2. In doing so, it first copies the pointer held in arr2[0] to a MemGC-controlled buffer belonging to a CModernArray<>. Then, when edgehtml.dll accesses arr2[1], the JavaScript object whose pointer was already copied from arr2[0] gets freed and reclaimed, even though an outstanding pointer is present in the CModernArray<>. The crash occurs when retrieving this pointer from index 0 of the CModernArray<>.
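The three-step operation above can be sketched as a toy model in Python. This is only an illustration of the pattern (copy raw pointers first, process them later, while element access can run script in between), not the edgehtml.dll code. ToyHeap, ScriptArray, and set_remote_candidates are hypothetical names, addresses are plain integers, and the free is an explicit call here, whereas in the real bug it is performed by garbage collection.

```python
class ToyHeap:
    """A tiny manual heap: integer 'addresses' map to payloads."""

    def __init__(self):
        self.slots = {}        # address -> payload
        self.free_list = []    # freed addresses, reused most-recent-first
        self.next_addr = 0x1000

    def _fresh(self):
        addr = self.next_addr
        self.next_addr += 0x10
        return addr

    def alloc(self, payload):
        # Reusing a freed address is what lets a script "reclaim" memory.
        addr = self.free_list.pop() if self.free_list else self._fresh()
        self.slots[addr] = payload
        return addr

    def free(self, addr):
        del self.slots[addr]
        self.free_list.append(addr)


class ScriptArray:
    """Stands in for arr2: index 1 is backed by a getter with side effects."""

    def __init__(self, first_addr, getter):
        self.first_addr = first_addr
        self.getter = getter

    def __len__(self):
        return 2

    def __getitem__(self, index):
        if index == 0:
            return self.first_addr
        if index == 1:
            return self.getter()   # arbitrary script runs here
        raise IndexError(index)


def set_remote_candidates(heap, script_array):
    internal = []                            # step 1: internal array
    for i in range(len(script_array)):       # step 2: copy raw pointers
        internal.append(script_array[i])
    return [heap.slots[addr] for addr in internal]   # step 3: process


heap = ToyHeap()
victim = heap.alloc("candidate object")

def getter():
    heap.free(victim)                  # release the object behind arr2[0]
    heap.alloc("attacker data")        # reclaim the same address
    return heap.alloc("second element")

processed = set_remote_candidates(heap, ScriptArray(victim, getter))
# processed[0] is now "attacker data": the pointer copied in step 2 is stale.
```

The point of the sketch is the ordering: the pointer for index 0 is captured before the getter for index 1 runs, so anything the getter does to that memory is invisible to the code in step 3.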

Taking all this information together, we can now appreciate how incredible it is that there is a vulnerability here at all. Throughout the process, all objects involved (JavaScript arrays, etc.) are allocated on the MemGC heap. Furthermore, all pointers to those objects are stored within MemGC-allocated arrays and buffers. How, then, could a UAF arise? A UAF of this sort is precisely what MemGC is supposed to prevent. MemGC is designed to be aware of all pointers present within the MemGC heap, so that no MemGC allocation can be freed while a pointer still exists. Within the realm of MemGC heap allocations, MemGC is supposed to be all-knowing and all-seeing. Why, then, does MemGC not detect the fact that there is an outstanding pointer present in the CModernArray<>?

Why, indeed.

I will now let you in on a dread secret.

There is no such thing as “the MemGC heap”.

There are two MemGC heaps.

And they don’t have visibility into one another’s allocations.

The two MemGC heaps arise as follows. One MemGC heap is used internally by Chakra, which is the browser’s JavaScript engine. All heap-based JavaScript objects, as well as many internal Chakra data structures, are stored on this heap. We will call this the “Chakra heap”. The other MemGC heap is provided by chakra.dll for the use of external consumers, most notably edgehtml.dll. We will call this the “DOM heap”, and it supersedes the older MemoryProtection mechanism that was first introduced in Internet Explorer in July of 2014. The DOM heap is used for all DOM objects as well as most other heap allocations performed from edgehtml.dll.

These two heaps share an implementation, but they are represented by two separate instances of the class chakra!Memory::Recycler. When garbage collection occurs, during the “mark” phase, the recycler scans all live heap allocations, as well as the stack and processor registers, for pointers to additional heap allocations so those can also be marked as live. Since not every value in memory is a bona fide pointer value, this scan will turn up some extraneous results. To help filter these out, Recycler will automatically reject any value that is outside the range of allocations belonging to the heap. However, this determination is only made on a per-heap basis. A Recycler instance has no knowledge of the address regions that may be in use by other Recycler instances. Furthermore, it has no ability to place “marks” on MemGC allocations belonging to other Recycler instances.
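To make the per-heap filtering concrete, here is a minimal Python sketch of a conservative mark-and-sweep collector that owns a single address range. It is a toy under stated assumptions, not the chakra!Memory::Recycler implementation: the class layout, the owns and collect methods, and the address values are all hypothetical, and the "stack" is just a list of integers.

```python
class Recycler:
    """Toy conservative mark-and-sweep collector owning one address range."""

    def __init__(self, name, base, size):
        self.name = name
        self.base = base
        self.limit = base + size
        self.next_addr = base
        self.allocations = {}   # address -> list of words stored in the block

    def alloc(self, words=()):
        addr = self.next_addr
        self.next_addr += 0x10
        assert self.next_addr <= self.limit, "toy heap exhausted"
        self.allocations[addr] = list(words)
        return addr

    def owns(self, value):
        # Per-heap filter: a value only counts as a pointer if it falls in
        # THIS heap's range and names one of THIS heap's live allocations.
        return self.base <= value < self.limit and value in self.allocations

    def collect(self, roots):
        marked = set()
        worklist = [v for v in roots if self.owns(v)]
        while worklist:                       # mark phase
            addr = worklist.pop()
            if addr in marked:
                continue
            marked.add(addr)
            worklist.extend(v for v in self.allocations[addr] if self.owns(v))
        freed = [a for a in self.allocations if a not in marked]
        for addr in freed:                    # sweep phase
            del self.allocations[addr]
        return freed


chakra_heap = Recycler("Chakra heap", base=0x10000, size=0x1000)
dom_heap = Recycler("DOM heap", base=0x20000, size=0x1000)

js_object = chakra_heap.alloc()
buffer = dom_heap.alloc(words=[js_object])   # DOM-heap buffer points at js_object
stack = [buffer]                             # only the buffer is reachable from the stack

# The Chakra recycler rejects `buffer` (wrong heap), never looks inside it,
# and therefore frees js_object despite the outstanding pointer.
freed = chakra_heap.collect(roots=stack)

# Contrast: with both allocations on one heap, the same shape is safe.
single_heap = Recycler("single heap", base=0x30000, size=0x1000)
obj = single_heap.alloc()
buf = single_heap.alloc(words=[obj])
freed_single = single_heap.collect(roots=[buf])
```

In this model the DOM-heap buffer still holds the address of js_object after the Chakra-side collection, which is exactly the dangling-pointer condition the two-heap design allows.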

As an aside, there are even more than two MemGC heaps. Each thread of JavaScript execution has its own Chakra heap instance. Generally, this does not cause trouble because a JavaScript object does not interact with code on any thread except the one on which it was created.

We can now understand what occurs when running the proof-of-concept code in Figure 1. In step 5 (see figure), memory pressure forces garbage collection for the Chakra heap. This is performed by the Recycler instance associated with the Chakra heap. While scanning the stack, the recycler encounters the pointer to the CModernArray<> buffer. However, it immediately rejects this pointer, because the CModernArray<> buffer has been allocated on the DOM heap and not the Chakra heap. As a result, the Chakra heap’s Recycler never scans the contents of the CModernArray<> buffer, and consequently, it misses the outstanding pointer to the Chakra-heap-allocated object contained therein.

Conclusion

We have shown that the concept of an all-knowing, all-seeing MemGC is something of a misconception. While MemGC is a great success as a mitigation, it is not entirely without shortcomings.

It is also instructive to consider the patch. Now, before adding each object to the CModernArray<>, edgehtml makes a call to chakra!JsVarAddRef to explicitly pin the object in memory (see edgehtml!ORTC::UnpackArrayObjectVar). MemGC falls on its sword in this case, and the code in edgehtml must resort to manual object-lifetime management.
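The effect of pinning can be illustrated with a small Python sketch in which explicitly referenced objects are treated as additional roots. This is an assumption-laden toy, not the behavior of the real JsVarAddRef: PinningRecycler, add_ref, and release are hypothetical names, and whether and when the patched code drops its reference is not described here, so the release call below is purely illustrative.

```python
class PinningRecycler:
    """Toy collector in which pinned objects are always treated as roots."""

    def __init__(self):
        self.allocations = {}   # address -> list of words stored in the block
        self.pin_counts = {}    # address -> explicit reference count
        self.next_addr = 0x10000

    def alloc(self, words=()):
        addr = self.next_addr
        self.next_addr += 0x10
        self.allocations[addr] = list(words)
        return addr

    def add_ref(self, addr):
        # Hypothetical stand-in for an explicit pin on an object.
        self.pin_counts[addr] = self.pin_counts.get(addr, 0) + 1

    def release(self, addr):
        self.pin_counts[addr] -= 1
        if self.pin_counts[addr] == 0:
            del self.pin_counts[addr]

    def collect(self, roots):
        # Pinned objects are roots regardless of what the scan can see.
        candidates = list(roots) + list(self.pin_counts)
        worklist = [v for v in candidates if v in self.allocations]
        marked = set()
        while worklist:
            addr = worklist.pop()
            if addr not in marked:
                marked.add(addr)
                worklist.extend(
                    v for v in self.allocations[addr] if v in self.allocations
                )
        freed = [a for a in self.allocations if a not in marked]
        for addr in freed:
            del self.allocations[addr]
        return freed


chakra_heap = PinningRecycler()
js_object = chakra_heap.alloc()
foreign_buffer = [js_object]   # stands in for the DOM-heap CModernArray<> buffer

chakra_heap.add_ref(js_object)                        # pin before storing the pointer
freed_while_pinned = chakra_heap.collect(roots=[])    # nothing is freed
chakra_heap.release(js_object)                        # unpin once processing is done
freed_after_release = chakra_heap.collect(roots=[])   # object is now collectable
```

The pin makes the object's lifetime independent of what the collector can discover by scanning, which is why it closes the gap left by the foreign buffer, at the cost of the caller having to manage the reference by hand.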

You can find me on Twitter at @HexKitchen, and follow the team for the latest in exploit techniques and security patches. Stay tuned for the next Top 5 bug blog, which will be released tomorrow.