Looking at Patch Gap Vulnerabilities in the VMware ESXi TCP/IP Stack
July 27, 2022 | Reno RobertOver the last few years, multiple VMware ESXi remote, unauthenticated code execution vulnerabilities have been publicly disclosed. Some were also found to be exploited in the wild. Since these bugs were found in ESXi’s implementation of the SLP service, VMware provided workarounds to turn off the service. VMware also disabled the service by default starting with ESX 7.0 Update 2c. In this blog post, we explore another remotely reachable attack surface: ESXi’s TCP/IP stack implemented as a VMkernel module. The most interesting outcome of this analysis is that ESXi’s TCP/IP stack is based on FreeBSD 8.2 and does not include security patches for the vulnerabilities disclosed over the years since that release of FreeBSD.
This result also prompted us to analyze the nature of vulnerabilities disclosed in other open-source components used by VMware, such as OpenSLP and ISC-DHCP. Once again, we observed that most of the disclosed vulnerabilities had upstream patches before the disclosure.
The FreeBSD Code in the ESXi TCP/IP Stack
An initial analysis was done to locate the VMkernel module implementing the TCP/IP stack. In ESXi, the module details can be enumerated using esxcfg-info
or esxcli system module
commands as seen below:
Checking the function names and strings available in the module, it is understood that the code is coming from FreeBSD, as shown in the following:
While the exact version of FreeBSD was not initially known, comparing the binary against the patches mentioned in advisories revealed multiple missing fixes. I performed a manual diff of FreeBSD branches 8.x, 9.x, and 10.x against the VMkernel module to narrow down the version of code used. Consider a couple of MFC (Merge From CURRENT) commits for example: 213158 and 215207. The 213158 commit was merged to FreeBSD 8.2 and later, whereas 215207 was merged to FreeBSD 8.3 and later. While 213158 was found in the VMkernel module, the 215207 commit was missing. This gave an approximate timeline of when the code was forked, which is around FreeBSD 8.2.
Porting Type Information from FreeBSD to the ESXi VMkernel Module
Once the version of the FreeBSD source is known, we can port type information from FreeBSD 8.2 to the VMkernel module. The goal here was to make function prototype and structure information available in IDA. Since creating type libraries for multiple dependent headers can be complex, I decided to rely on the FreeBSD kernel binary compiled from source with DWARF information. My first choice was to use IDA Pro’s Dump typeinfo to IDC feature. However, there were still missing structures because only assembler-level types were exported to the IDC.
The workaround for the problem is to use the TILIB idea by @rolfrolles. Using the above-mentioned technique, it is possible to create a type library from the FreeBSD 8.2 kernel binary and then load it into the VMkernel module’s IDB. This imports all the C-level types required for further analysis with the decompiler. While the Structure and Local Types view does not list all imported type information, we can apply the function prototypes as detailed in a previous blog post.
As a recap, begin by extracting the function prototype from the FreeBSD kernel as a key value pair of function_name:function_type
, then iterate through the extracted type information and apply it to the VMkernel module having symbols.
Once the prototypes are applied, the Structure and Local Types views are automatically populated. Any other types required for local variables can be further added manually using the workflow mentioned in Importing types into IDB.
Analyzing VMkernel Debug Information from VMware Workbench
VMware Workbench is an Eclipse IDE plugin provided by VMware. Amongst many other features, it also provides debug symbols for VMkernel. Jan Harrie and Matthias Luft have documented the usage of the VMkernel debug information in their research entitled Attacking VMware NSX [Slides 34 – 37 in the PDF]. The UI of the Eclipse plugin is shown below:
Kernel modules in ESXi communicate with VMkernel using VMK APIs. By using a VMkernel build that has debug information, it becomes easier to understand the VMK API calls made by any VMkernel module, which includes the TCP/IP stack. Moreover, the type information from VMkernel can be ported to any desired VMkernel module or even to the Virtual Machine Monitor. Here is an example of type information ported to the TCP/IP module found in ESXi 6.7 Update 3 using debug information from VMkernel version 14320388:
As soon as the function prototypes are applied to the binary, IDA helpfully propagates the type information to many of the variables found in the code. When working with code forked from FreeBSD, it is useful to place the Hex-Rays decompilation side-by-side with known FreeBSD source code to rename variables and add type information to variables that IDA did not detect automatically . However, in this case, we do not have access to local variable names or types. While some variables are given meaningful names by IDA, the rest were manually added using context information from the detected enums, structure names, and so forth.
VMware seems to have stopped publishing VMkernel debug information for more than a year now. The last public release for which I could find the debug information goes back to ESXi 6.7 16773714.
Missing FreeBSD Patches
At this point, we had enough information to perform a diff of the TCP/IP VMkernel module against the FreeBSD security advisories. Patches provided in the advisories were compared against ESXi’s version of the module to identify the missing bug fixes. Based on static analysis, patches for 16 vulnerabilities fixed in FreeBSD were found to be missing in ESXi. We reported these missing patches to VMware for further evaluation. Shown below is the set of missing patches that we reported to VMware and their corresponding applicability to ESXi as evaluated by VMware:
CVE | Disclosure Date | Bug | Applicable |
CVE-2013-3077 | 8/22/13 | Integer overflow in IP_MSFILTER | Yes |
CVE-2013-5691 | 9/10/13 | Insufficient credential checks in network IOCTL | No |
CVE-2004-0230 | 9/16/14 | Denial of Service in TCP packet processing | Yes |
CVE-2015-1414 | 2/25/15 | Integer overflow in IGMP protocol | Yes |
CVE-2015-2923 | 4/7/15 | Denial of Service with IPv6 Router Advertisements | Yes |
CVE-2015-5358 | 7/21/15 | Resource exhaustion due to sessions stuck in LAST_ACK state | Yes |
CVE-2015-1417 | 7/28/15 | Resource exhaustion in TCP reassembly | No |
CVE-2018-6916 | 3/7/18 | IPSec validation and use-after-free | Yes |
CVE-2018-6918 | 4/4/18 | IPSec crash or denial of service | Yes |
CVE-2018-6922 | 8/6/18 | Resource exhaustion in TCP reassembly | No |
CVE-2018-6923 | 8/14/18 | Resource exhaustion in IP fragment reassembly | No |
CVE-2019-5608 | 8/6/19 | ICMPv6 / MLDv2 out-of-bounds memory access | Yes |
CVE-2019-5611 | 8/20/19 | IPv6 remote Denial-of-Service | Yes |
CVE-2020-7451 | 3/19/20 | TCP IPv6 SYN cache kernel information disclosure | Yes |
CVE-2020-7457 | 7/8/20 | IPv6 socket option race condition and use after free | Yes |
CVE-2020-7469 | 12/1/20 | ICMPv6 use-after-free in error message handling | Yes |
These issues were addressed in ESXi 7.0 Update 3f as well as ESXi 6.5 and ESXi 6.7.
Analysis of OpenSLP in ESXi
The next component considered for analysis is the SLPD service in ESXi. ESXi’s SLP service is a fork of OpenSLP version 1.0.0. The version information and build configuration details can be fetched as below:
Just like the VMkernel modules, the SLPD service executable has symbols but not type information. Using the available function names, we observed that VMware’s forked code is not entirely based on version 1.0.0. It also has some of the functions added later, including some from version 2.0.0. There are also differences in some of the structure definitions as well as function prototypes. These observations can be made as soon as you port type information from OpenSLP and certain functions won’t decompile cleanly.
To fetch the type information, build OpenSLP from source and then use IDA Pro’s Create C header + Parse C header feature to populate the local types. The C header parses without much error. This is a quick workaround compared to creating type libraries from multiple dependent headers. Note that ESXi versions 7.0 and above run the SLPD service as a 64-bit executable. Because of this, care must be taken while using the created C header interchangeably between ESXi 7.0 and other versions. This is especially true as the header may require some modifications on data sizes, such as size_t being unsigned int vs unsigned long long.
Timeline of Upstream Patches vs Disclosures:
In the case of OpenSLP, there is enough public information regarding the vulnerabilities affecting ESXi and their relevant upstream patches, most notable being the one published by Lucas Leong on CVE-2020-3992 and CVE-2021-21974. For the scope of this blog, we will consider the following set of remote code execution vulnerabilities:
CVE | VMSA | Bug |
CVE-2015-5177 | VMSA-2015-0007 | Double Free |
CVE-2019-5544 | VMSA-2019-0022 | Heap Overflow |
CVE-2020-3992 | VMSA-2020-0023 | Use-After-Free |
CVE-2021-21974 | VMSA-2021-0002 | Heap Overflow |
CVE-2015-5177 is a double-free vulnerability that occurs when processing a message buffer across functions: SLPDProcessMessage, ProcessDAAdvert, and SLPDKnownDAAdd. An upstream patch to fix the issue in SLPDKnownDAAdd() was already available.
Similarly, CVE-2020-3992 is a use-after-free vulnerability, again found in the handling of message buffer across functions SLPDProcessMessage, ProcessDAAdvert, and SLPDKnownDAAdd. However, this vulnerability does not affect the upstream version of OpenSLP. This bug can be traced back to CVE-2015-5177 based on the information provided in VMSA.
VMware released updates in the form of the VIB (vSphere Installation Bundle). Installing the VIBs one by one on ESXi to perform patch diffing of SLPD binary can be a tedious process. Instead, binaries can be directly extracted from the VIB to speed up the process. The relevant VIBs can be downloaded from the ESXi Patch Tracker, which lists them based on the timeline of release. Consider a diff between ESXi 6.7 build 8169922 and 8941472:
To replicate this, download the esx-base VIB to extract the slpd binary using the vmtar utility. The vmtar utility can be copied from ESXi and executed in Ubuntu since it does not have any ESXi-specific dependencies:
Similarly, in the case of the ESXi 6.7 GA ISO image, extract the S.V00 file from VMware-VMvisor-Installer-6.7.0-8169922.x86_64.iso:
A patch diff between slpd binaries confirms that CVE-2020-3992 was introduced when patching CVE-2015-5177 in ESXi versions 6.0 and above. The fix for CVE-2015-5177, in this case, was different from that of the working upstream fix in SLPDKnownDAAdd():
CVE-2020-3992 was in fact fixed twice (VMSA-2020-0023.1) since the initial patch did not completely address the issue. Additionally, this bug was also reported to be exploited in the wild by the Cybersecurity and Infrastructure Security Agency (CISA).
CVE-2021-21974, a heap overflow in SLPParseSrvUrl() leading to RCE, also had an upstream patch. This fix was part of a large code change in OpenSLP.
CVE-2019-5544 was the only bug that affected both the upstream and the VMware fork of OpenSLP at the time of disclosure. Though the bug has already been disclosed and fixed by many distros, the patch for this bug is still missing in the OpenSLP GitHub repository. This was another bug exploited in the wild.
Putting it all together, here is the timeline of disclosure vs availability of patches:
Analysis of ISC-DHCP in Workstation
The next component considered for analysis is the DHCP service of VMware Workstation, which is based on ISC-DHCP version 2.0. You can refer to the blog post Wandering through the Shady Corners of VMware Workstation/Fusion for an earlier analysis of this attack surface. It is noted that, though the codebase is a couple of decades old, VMware has backported fixes for known vulnerabilities. Keeping this in mind, only the recently disclosed bugs, CVE-2019-5540 and CVE-2020-3947, were analyzed.
Unlike earlier analysis where the executables had symbols, the vmnet-dhcp
is completely stripped. The first task is to identify as many function names as possible to carry out the analysis. Looking for binaries with symbols led us to the very first release – VMware 1.0 Build 160. The vmnet-dhcpd
binary from the first release of VMware had symbols and type information in stabs format.
For matching the available symbols with recent releases of the Workstation DHCP server, I relied on the updated version of Rizzo published by Quarkslab. Since Rizzo can be very slow on large binaries, only the string reference signatures were used in detection. With this, Rizzo identified around 69 functions using 226 matching strings available in VMware 1.0 Build 160 vmnet-dhcpd
binary. A somewhat similar result can be achieved using the strings in ISC-DHCP 2.0. For anyone interested, the historical releases of ISC-DHCP can be found here.
During the patch diffing process, bug fixes in vmnet-dhcpd
are easy to spot since the binary undergoes very limited changes per release. The first bug, CVE-2019-5540, is an information disclosure vulnerability due to an uninitialized field in the ICMP packet structure. This issue can be narrowed down to an upstream patch in the function icmp_echorequest().
The next bug, CVE-2020-3947, is a use-after-free vulnerability when handling the DHCPRELEASE packet. During the handling of a DHCPDISCOVER packet from a client, ISC-DHCP allocates memory in the heap. If the client identifier is long, it stores the resultant pointer as part of the lease
structure. Otherwise, uid_buf
is used for storage. The relevant code can be found in the ack_lease
function of the server/dhcp.c source file:
Later when a DHCPRELEASE packet is received, the lease
structure is adjusted. First, a copy of lease
is made in the function release_lease
and then supersede_lease
is called. This code can be found in common/memory.c:
Note that the uid
pointer is now found in both the old and the newly copied lease structures when it is passed to supersede_lease
. In supersede_lease
, the comp->uid
pointer is freed (provided that it points to memory other than uid_buf
, that is, in the case of a long client identifier):
Later, lease->uid
entry is reused. See the code below assigning comp -> uid = lease -> uid
, which by this time points to freed memory:
Finally, the comp
structure that has a pointer to the freed uid
buffer is passed to write_lease
where the use-after-free happens. The bug can be triggered by adding a long dhcp-client-identifier
entry to the DHCP client configuration and then initiating a DHCP request followed by a DHCP release using dhclient
:
This issue was fixed in the release_lease function as part of a bigger code change by introducing a lease_copy
function. The copied lease structure no longer holds a reference to the uid
buffer of the old lease. Instead, a new allocation is requested and assigned. The release_lease
function received further changes during later updates for handling the failover protocol. VMware’s fix for this bug is different from that of the upstream, possibly due to the divergence of the codebase.
Here is the timeline of disclosure vs availability of patches:
The IDA Python scripts used for the analysis can be found here.
Conclusion
Over the last few months, we analyzed three different VMware components: the ESXi networking stack, the ESXi SLPD service, and the Workstation DHCP server. The three components share the fact that they are forks of open-source code. Unlike open-source packages that can be updated to a new version with relative ease, updating a fork takes significantly more effort. This is because various challenges arise when cherry-picking patch commits.
In the ESXi FreeBSD networking stack, even the vulnerabilities with known CVE identifiers remained unfixed – the forked code was never tracked for vulnerabilities. Better tracking of FreeBSD disclosures is a good place to start and can help resolve some of the issues. Also, considering the criticality of the component, it might be possible for VMware to work with FreeBSD Security Officers to get prior information on vulnerabilities as per their Information handling policies:
“The FreeBSD Security Officer has close working relationships with a number of other organizations, including third-party vendors that share code with FreeBSD…..Frequently vulnerabilities may extend beyond the scope of the FreeBSD implementation, and (perhaps less frequently) may have broad implications for the global networking community. Under such circumstances, the Security Officer may wish to disclose vulnerability information to these other organizations.”
In most cases, vendor advisories only list the supported versions of the affected software. Even if a legacy version is not listed in the advisory, the patch might still be applicable for the forked code.
In the case of OpenSLP and ISC-DHCP forks, the patch gap vulnerabilities come from bugs that do not have CVE identifiers. Some of the bug fixes have clear commit messages about the issue, whereas other fixes are done as part of large code changes or refactoring. Either way, it is a challenging process and requires a trained set of eyes to pick up security patches when the vulnerabilities are not explicitly called out. This also shows the significance of requesting CVE identifiers for security fixes since companies largely rely on them for cherry-picking patches.
As we showed, a couple of ESXi SLP patches had further issues despite the availability of working upstream fixes. To avoid this, it is important to evaluate a bug report not just against the fork but also against the upstream project. If a fix is found, developers should evaluate the upstream fix. Upstream fixes in most cases are a safe bet compared to that of custom patches, which may be written without having a comprehensive understanding of the code. However, this process of adopting patches can be further complicated if the codebases have diverged considerably.
The most important concern with forked code is the lifetime of bugs. When open-source code is used and updated as a package to the latest version, all the bug fixes are applied (i.e., vulnerabilities with or without CVE identifiers). In a forked codebase, when a patch is missed during cherry-picking, it will remain unfixed until the code is updated as a whole or someone finds and reports the issue. For example, the ISC-DHCP bugs were fixed in VMware’s DHCP server a couple of decades after the upstream fixes were available. In fact, the ISC-DHCP uninitialized memory information disclosure was fixed upstream even before the launch of CVE. Overall, forked code requires more attention compared to that of packages.
Considering the challenges of adopting security fixes and gaps in the tracking process, exploitable bugs might be still lying around without a fix. There are different types of patch gaps as well, so expect to hear more about these in the future. Until then, you can find me on Twitter @RenoRobertr, and follow the team for the latest in exploit techniques and security patches.