CVE-2020-0729: Remote Code Execution Through .LNK Files
March 26, 2020 | Trend Micro Research Team
In this excerpt of a Trend Micro Vulnerability Research Service vulnerability report, John Simpson and Pengsu Cheng of the Trend Micro Research Team detail a recent remote code execution bug in Microsoft Windows .LNK files. The following is a portion of their write-up covering CVE-2020-0729, with a few minimal modifications.
For Microsoft’s February 2020 Patch Tuesday, the company released security fixes for 99 CVEs, an unusually large number to fix in a single month. While a lot of attention has been paid to the Scripting Engine vulnerability that has been actively exploited, another vulnerability rated as critical really stood out. CVE-2020-0729 has been deemed a remote code execution vulnerability involving Windows LNK files, also known as shortcut files. Part of what makes this vulnerability so compelling is that, historically, exploits for vulnerabilities in LNK files have been used to spread malware such as the famed Stuxnet. In the vast majority of those cases, simply browsing to a folder containing a malicious LNK file, whether local or on a network share, is sufficient to trigger the vulnerability. The question then becomes: does this vulnerability have the same potential for exploitation as some of the past LNK vulnerabilities? Because LNK files are a binary file format that only has documentation for a few top-level structures, answering this question requires a lot of digging.
Beginning the Analysis
For Microsoft’s Patch Tuesday, it is standard procedure for our research team to begin analyzing a vulnerability by unpacking the “security only” patch bundle for a given Windows platform and, based on the information from Microsoft’s advisory, attempt to locate files in the patch that are likely associated with the vulnerability. The February patch bundle did not contain updates to any of the usual DLLs typically associated with processing LNK files, such as shell32.dll and windows.storage.dll, leaving us scratching our heads as to where the problem might lie. However, after a closer look through the list of files, one particular DLL stood out to us: StructuredQuery.dll. Part of the reason this stood out is that we have seen vulnerabilities explicitly named as involving StructuredQuery in the past, such as CVE-2018-0825, but in this particular Patch Tuesday no such advisory exists. So, what’s the connection between LNK files and StructuredQuery? A quick search for StructuredQuery on Microsoft’s Windows Dev Center leads us to documentation for the structuredquery.h header, which tells us that it is used by Windows Search. This is precisely where LNK files and StructuredQuery connect.
The Many Talents of LNK Files
LNK files are mostly known for containing binary structures that create a shortcut to a file or folder, but a lesser-known feature is that they can directly contain a saved search. Ordinarily, when a user searches for a file in Windows 10, the “Search Tools” tab appears in the Explorer ribbon, allowing a user to refine their search and select advanced options for the search query. The tab also allows users to save the existing search for re-use at a later time, which saves an XML file with the extension “.search-ms”, a file format that is only partially documented.
However, this is not the only way to save a search. If you click and drag the search results icon from the address bar, highlighted in the image below, to another folder, it creates a LNK file containing a serialized version of the data that would be contained in a “search-ms” XML file.
With this in mind, let’s take a look at the patch diff for StructuredQuery using BinDiff.
As we can see, only one function has changed, StructuredQuery1::ReadPROPVARIANT(), and it appears to have changed quite extensively based on the similarity of only 81%. A quick comparison of the flow graphs would appear to confirm the changes are fairly extensive:
What exactly does this function do in the context of an LNK file? The answer requires a detailed look at the undocumented structures contained in a saved search LNK file, so let’s dive in and take a look.
Windows shell link files have several required and optional components as defined in the Shell Link (.LNK) Binary File Format specification. Each shell link file must contain, at a minimum, a Shell Link Header which has the following format:
All multi-byte fields are represented in little-endian byte order unless otherwise specified.
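As a minimal sketch, the fixed 76-byte Shell Link Header can be unpacked with Python's struct module. The field order and sizes below follow the public Shell Link (.LNK) Binary File Format specification rather than anything specific to this report, and the function name parse_shell_link_header is only illustrative.

```python
import struct
import uuid

# Field order and sizes per the public Shell Link (.LNK) Binary File Format
# specification: HeaderSize, LinkCLSID, LinkFlags, FileAttributes,
# CreationTime, AccessTime, WriteTime, FileSize, IconIndex, ShowCommand,
# HotKey, Reserved1, Reserved2, Reserved3. "<" selects little-endian, no padding.
HEADER_FORMAT = "<I16sIIQQQIiIHHII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 76 bytes (0x4C)
LINK_CLSID = uuid.UUID("00021401-0000-0000-C000-000000000046")


def parse_shell_link_header(data: bytes) -> dict:
    """Validate the fixed header and return a few of its fields."""
    if len(data) < HEADER_SIZE:
        raise ValueError("data is shorter than a Shell Link Header")
    fields = struct.unpack_from(HEADER_FORMAT, data, 0)
    header_size, clsid_raw, link_flags, file_attributes = fields[:4]
    if header_size != HEADER_SIZE:
        raise ValueError("unexpected HeaderSize")
    # The CLSID is stored in the mixed-endian layout used for GUIDs in LNK files.
    if uuid.UUID(bytes_le=clsid_raw) != LINK_CLSID:
        raise ValueError("unexpected LinkCLSID")
    return {
        "link_flags": link_flags,
        "file_attributes": file_attributes,
        "show_command": fields[9],
    }
```

The sketch only validates the two fields that have fixed values and leaves interpretation of the flags to the caller.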
The LinkFlags field specifies the presence or absence of optional structures as well as various options such as whether the strings in the shell link are encoded in Unicode or not. The following is an illustrative layout of the LinkFlags field:
One flag that is set in the majority of cases, HasLinkTargetIDList, is represented by position “A”, the least significant bit of the first byte of the LinkFlags field. If set, the LinkTargetIDList structure must follow the Shell Link Header. The LinkTargetIDList structure specifies the target of the link and has the following layout:
The IDList structure contained within specifies the format of a persisted item ID list:
The ItemIDList serves the same purpose as a file path, where each ItemID structure corresponds to one path component in a path-like hierarchy. ItemIDs can refer to actual filesystem folders, virtual folders such as Control Panel or Saved Searches, or other forms of embedded data that serve as a “shortcut” to perform a specific function. For more general information on ItemIDs and ItemIDLists, see Microsoft’s Common Explorer Concepts. Of particular importance to the vulnerability are the ItemIDList and ItemID structures present in an LNK file that contains a saved search query.
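The path-like nature of the ItemIDList is easier to see in code. The following sketch checks the HasLinkTargetIDList bit and then walks the ItemID entries until it reaches the terminating zero-length entry. It assumes the layout given in the public Shell Link specification (a 2-byte IDListSize, then ItemIDs that each begin with a 2-byte size that includes the size field itself); the function name read_item_ids is hypothetical.

```python
import struct
from typing import List

HAS_LINK_TARGET_ID_LIST = 0x00000001  # bit "A" of LinkFlags
SHELL_LINK_HEADER_SIZE = 0x4C
LINK_FLAGS_OFFSET = 0x14              # after HeaderSize (4) and LinkCLSID (16)


def read_item_ids(lnk: bytes) -> List[bytes]:
    """Return the data portion of each ItemID in the LinkTargetIDList."""
    (link_flags,) = struct.unpack_from("<I", lnk, LINK_FLAGS_OFFSET)
    if not link_flags & HAS_LINK_TARGET_ID_LIST:
        return []

    offset = SHELL_LINK_HEADER_SIZE
    (id_list_size,) = struct.unpack_from("<H", lnk, offset)
    offset += 2
    end = offset + id_list_size
    if end > len(lnk):
        raise ValueError("IDListSize runs past the end of the file")

    items = []
    while offset + 2 <= end:
        (item_size,) = struct.unpack_from("<H", lnk, offset)
        if item_size == 0:            # TerminalID: end of the list
            return items
        if item_size < 2 or offset + item_size > end:
            raise ValueError("malformed ItemID size")
        items.append(lnk[offset + 2:offset + item_size])
        offset += item_size
    raise ValueError("IDList is missing its TerminalID")
```

Each returned element corresponds to one path component; for a saved search, the first two would be the Delegate Folder and User Property View ItemIDs.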
When a user creates a shortcut that contains search information, the resulting file contains an IDList structure that begins with a Delegate Folder ItemID, followed by a User Property View ItemID specific to search queries. In general, ItemIDs begin with the following structure:
The value of the two bytes beginning at offset 0x0004 is used in combination with the ItemSize and ItemType to help determine the type of the ItemID. For example, if the ItemSize is 0x14 and the ItemType is 0x1F, the 2 bytes at 0x0004 are checked to see if their value is greater than ItemSize. If so, it is assumed that the remaining ItemID data will consist of a 16-byte Globally Unique Identifier (GUID). This is typically the structure of the first ItemID found in a LNK file pointing to a file or folder. If the ItemSize is larger than the size required to contain a GUID but smaller than the bytes at 0x0004, the remaining data after the GUID is considered an ExtraDataBlock, beginning with a 2-byte size field followed by that many bytes of data.
For a Delegate Folder ItemID, the same 2 bytes correspond to a size field for the remaining structure, leading to the following overall structure:
All GUIDs in LNK files are stored using the RPC IDL representation for GUIDs. RPC IDL representation means the first three segments of the GUID are stored as little-endian representations of the entire segment (i.e., a DWORD followed by 2 WORDs), whereas the bytes in the last 2 segments are stored individually, in the order written. For example, the GUID {01234567-1234-ABCD-9876-0123456789AB} has the following binary representation:
\x67\x45\x23\x01\x34\x12\xCD\xAB\x98\x76\x01\x23\x45\x67\x89\xAB
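Python's standard uuid module exposes this same mixed-endian layout through its bytes_le attribute, so the conversion can be checked with a short sketch. The two helper names here are illustrative.

```python
import uuid


def guid_to_lnk_bytes(guid: str) -> bytes:
    """Serialize a GUID string the way it is stored in an LNK file."""
    # bytes_le stores the first three fields little-endian and the
    # remaining eight bytes in the order they are written.
    return uuid.UUID(guid).bytes_le


def lnk_bytes_to_guid(raw: bytes) -> str:
    """Turn 16 stored bytes back into the familiar braced string form."""
    return "{" + str(uuid.UUID(bytes_le=raw)).upper() + "}"


raw = guid_to_lnk_bytes("{01234567-1234-ABCD-9876-0123456789AB}")
print(raw.hex())  # 674523013412cdab98760123456789ab
```

The same helpers are handy when searching a hex dump for any of the GUIDs mentioned in this write-up.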
The precise function of Delegate Folder ItemIDs is undocumented. However, it is likely that such an entry is intended to cause subsequent ItemIDs to be handled by the class specified by the Item GUID field, thus establishing that class as the root namespace for the hierarchy. In the case of a LNK file containing embedded search data, the Item GUID will be {04731B67-D933-450A-90E6-4ACD2E9408FE}, corresponding to CLSID_SearchFolder, a reference to Windows.Storage.Search.dll.
The Delegate Folder ItemID is followed by a User Property View ItemID, which has a structure similar to the structure of a Delegate Folder ItemID:
Of particular importance to this report is the PropertyStoreList field, which, if present, contains one or more serialized PropertyStore items each having the following structure:
The Property Store Data field is a sequence of properties. All properties in a given PropertyStore belong to the class identified by the Property Format GUID. Each specific property is identified by a numeric ID known as a Property ID or PID which, when combined with the Property Format GUID, is known as a property key or PKEY.
The PKEY is determined in a slightly different manner if the Property Format GUID is equal to {D5CDD505-2E9C-101B-9397-08002B2CF9AE}. Each property is then considered to be part of a “Property Bag” and has the following structure:
Property bags will generally contain elements with the names “Key:FMTID” and “Key:PID”, identifying the specific PKEY that determines the interpretation of the other elements. Specific Property Bag implementations will also require that other elements are present in order to be valid.
If the Property Format GUID is not equal to the previously mentioned GUID for Property Bags, each property is identified by an integer value for the PID and has the following structure:
The TypedPropertyValue field corresponds to the typed value of a property in a property set as defined in section 2.15 of the Microsoft Object Linking and Embedding (OLE) Property Set Data Structures specification.
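To make the two property flavors concrete, here is a rough sketch of a reader for one serialized property store. It assumes the layout from Microsoft's public Property Store Binary File Format documentation (a 4-byte storage size, the 4-byte version value 0x53505331, the 16-byte format GUID, then properties that each start with a 4-byte value size, a 4-byte name size or integer ID, and one reserved byte). The store as embedded in a User Property View ItemID may differ in detail, and the TypedPropertyValue bytes are returned raw rather than decoded.

```python
import struct
import uuid
from typing import List, Tuple, Union

PROPERTY_BAG_FMTID = uuid.UUID("D5CDD505-2E9C-101B-9397-08002B2CF9AE")
VERSION_1SPS = 0x53505331
STORE_HEADER_SIZE = 24  # storage size (4) + version (4) + format GUID (16)

Property = Tuple[Union[int, str], bytes]


def parse_property_store(buf: bytes, offset: int = 0) -> Tuple[uuid.UUID, List[Property], int]:
    """Return (format GUID, [(PID or name, raw value bytes)], offset of next store)."""
    storage_size, version = struct.unpack_from("<II", buf, offset)
    end = offset + storage_size
    if version != VERSION_1SPS or storage_size < STORE_HEADER_SIZE or end > len(buf):
        raise ValueError("not a serialized property store")
    fmtid = uuid.UUID(bytes_le=buf[offset + 8:offset + 24])
    is_bag = fmtid == PROPERTY_BAG_FMTID

    props: List[Property] = []
    pos = offset + STORE_HEADER_SIZE
    while pos + 4 <= end:
        (value_size,) = struct.unpack_from("<I", buf, pos)
        if value_size == 0:          # a zero size terminates the property list
            break
        if value_size < 9 or pos + value_size > end:
            raise ValueError("malformed property size")
        (key_field,) = struct.unpack_from("<I", buf, pos + 4)
        body = pos + 9               # skip value size, key field, reserved byte
        if is_bag:
            # Property bag entries are keyed by a UTF-16 name of key_field bytes.
            if body + key_field > pos + value_size:
                raise ValueError("malformed property name size")
            key: Union[int, str] = buf[body:body + key_field].decode("utf-16-le").rstrip("\x00")
            body += key_field
        else:
            key = key_field          # integer PID
        props.append((key, buf[body:pos + value_size]))
        pos += value_size
    return fmtid, props, end
```

Returning the offset of the next store lets a caller loop over a PropertyStoreList until it runs out of data.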
Various PKEYs are defined in the headers provided with the Windows SDK. However, many are undocumented and only identifiable by examining references in the debugging symbols for the associated libraries. For LNK files containing embedded search data, the first PropertyStore in the User Property View ItemID has a Property Format GUID of {1E3EE840-BC2B-476C-8237-2ACD1A839B22} containing a property with an Id of 2, which corresponds to PKEY_FilterInfo.
The TypedPropertyValue field of PKEY_FilterInfo consists of a VT_STREAM property. Ordinarily, a VT_STREAM property consists of a type of 0x0042, 2 padding bytes, and an IndirectPropertyName that specifies the name of an alternate stream containing either a PropertySetStream packet for simple property set storage or the “CONTENTS” stream element for non-simple property set storage as per Microsoft’s documentation. This name is specified with the wide character string “prop” followed by a decimal string corresponding to a property identifier in a PropertySet packet. However, because LNK files use serialized property stores embedded in VT_STREAM properties, the IndirectPropertyName is only checked to see if it begins with “prop”. The value itself is ignored. This results in the following TypedPropertyValue structure:
The contents of the Stream Data field are dependent on the specific PKEY that the stream property belongs to. For PKEY_FilterInfo, the Stream Data essentially contains an embedded PropertyStoreList with more serialized PropertyStore structures and has the following structure:
The nested PropertyStoreList in the PKEY_FilterInfo stream is a serialized version of the “conditions” tag in a .search-ms file. The following is an example structure of the conditions tag:
The precise functionality of the attribute element is not publicly documented. However, an attribute element contains a GUID that corresponds to CONDITION_HISTORY, and a CLSID that corresponds to the CConditionHistory class in StructuredQuery, which would imply that the nested condition and attribute structures represent the history of the search query before it was saved. In general, it appears that the chs attribute of the attribute element determines whether any additional history is present. When this structure is serialized into a property store, it is placed into the PKEY_FilterInfo PropertyStoreList, which takes the form of a property bag with the aforementioned Property Format GUID. More specifically, the serialized Conditions structure is contained in a VT_STREAM Property which is identified by the name “Condition”. This results in a PropertyStore item having the following structure:
The Condition object is generally either a “Leaf Condition” or a “Compound Condition” object that contains a number of nested objects generally including one or more Leaf Condition objects and possibly additional Compound Condition objects. Both condition objects begin with the following structure:
The Condition GUID will be {52F15C89-5A17-48E1-BBCD-46A3F89C7CC2} for a Leaf Condition and {116F8D13-101E-4FA5-84D4-FF8279381935} for a Compound Condition. The Attributes field consists of attribute structures, where the number of attribute structures is defined by the field “Number of Attributes”. Each attribute structure corresponds to an attribute element from the .search-ms file and begins with the following structure:
The remaining structure of an Attribute depends on the AttributeID and CLSID. For the aforementioned CONDITION_HISTORY attribute, the AttributeID is set to {9554087B-CEB6-45AB-99FF-50E8428E860D} and has a CLSID of {C64B9B66-E53D-4C56-B9AE-FEDE4EE95DB1}. The remaining structure will be a ConditionHistory object having the following structure. Note that the fields are named the same as the matching attributes of the attribute XML element:
If the value of has_nested_condition is greater than zero, the CONDITION_HISTORY attribute will have a nested condition object, which may itself have nested attributes that have nested conditions and so on.
Once the top-level attribute is fully read including all nested structures, the Compound Condition and Leaf Condition structures begin to differ. The remaining structure of a Compound Condition is as follows, with offsets relative to the end of the Attributes field:
The numFixedObjects field determines how many additional conditions (typically Leaf Conditions) will immediately follow.
The remaining structure of a Leaf Condition is as follows, with offsets relative to the end of the Attributes field:
The presence of the TokenInformationComplete structures depends on whether the preceding flag is set. If it is not set, the structure is not present, and the next flag immediately follows. If it is set, the following structure is present:
In summary, the following tree shows the simplest possible structure of an LNK file with a saved search, with irrelevant structures removed for simplicity:
Keep in mind a search with a single Leaf Condition results in the simplest structure. More often than not, a saved search LNK file will begin with a Compound Condition and many nested structures including many Leaf Conditions.
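As a rough triage aid, the GUIDs named in this section can be searched for directly in a file's bytes, relying on the statement above that GUIDs in LNK files use the RPC IDL representation. The sketch below is only that: the labels are ours, the function name find_guid_offsets is hypothetical, and a hit is a hint that a saved-search structure may be present rather than proof of it.

```python
import uuid
from typing import Dict, List

# GUIDs quoted in this write-up; the labels are informal descriptions.
KNOWN_GUIDS = {
    "CLSID_SearchFolder": "04731B67-D933-450A-90E6-4ACD2E9408FE",
    "Property Bag format": "D5CDD505-2E9C-101B-9397-08002B2CF9AE",
    "PKEY_FilterInfo format": "1E3EE840-BC2B-476C-8237-2ACD1A839B22",
    "Leaf Condition": "52F15C89-5A17-48E1-BBCD-46A3F89C7CC2",
    "Compound Condition": "116F8D13-101E-4FA5-84D4-FF8279381935",
    "CONDITION_HISTORY attribute ID": "9554087B-CEB6-45AB-99FF-50E8428E860D",
    "CConditionHistory CLSID": "C64B9B66-E53D-4C56-B9AE-FEDE4EE95DB1",
}


def find_guid_offsets(data: bytes) -> Dict[str, List[int]]:
    """Map each known GUID label to the byte offsets where it occurs."""
    hits: Dict[str, List[int]] = {}
    for label, text in KNOWN_GUIDS.items():
        needle = uuid.UUID(text).bytes_le  # mixed-endian form stored on disk
        offsets = []
        start = data.find(needle)
        while start != -1:
            offsets.append(start)
            start = data.find(needle, start + 1)
        if offsets:
            hits[label] = offsets
    return hits
```

Run against the bytes of a shortcut, for example find_guid_offsets(open(path, "rb").read()) with a path of your choosing, the result gives a quick map of where the condition objects sit before any deeper parsing.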
The Vulnerability
Now that we’ve explained the core structure of a saved search LNK file, we can take a look at the vulnerability itself, which lies in how the PropertyVariant field of a Leaf Condition is handled.
The PropertyVariant field of a Leaf Condition roughly corresponds to a PROPVARIANT structure. PropertyVariant structures consist of a 2-byte type followed by data specific to that type. It is important to note that StructuredQuery appears to have a slightly custom implementation of the PROPVARIANT structure as the padding bytes specified in Microsoft’s specification are generally not present.
It is also important to note that a value of 0x1000, or VT_VECTOR, combined with another type, means that there will be several values of the specified type.
Parsing of the PropertyVariant field is handled by our previously mentioned vulnerable function, StructuredQuery1::ReadPROPVARIANT(). The function first reads the 2-byte type and checks to see if VT_ARRAY (0x2000) is set, because it is not supported in StructuredQuery:
The function then checks if the type is VT_UI4 (0x0013) and if not, enters a switch statement to handle all other types.
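The type checks described so far amount to splitting the 2-byte type into a base type and modifier flags. The sketch below mirrors that description, not the disassembly: the constants are the standard VARENUM values, while the function name and the small lookup table are ours.

```python
import struct

VT_TYPEMASK = 0x0FFF   # low bits select the base type
VT_VECTOR = 0x1000     # several values of the base type follow
VT_ARRAY = 0x2000      # rejected by the parser described above

# Only the base types named in this write-up.
BASE_TYPE_NAMES = {
    0x000C: "VT_VARIANT",
    0x0013: "VT_UI4",
    0x0042: "VT_STREAM",
    0x0047: "VT_CF",
}


def decode_variant_type(stream: bytes, offset: int = 0) -> dict:
    """Read the little-endian 2-byte type and split it into base type and flags."""
    (vt,) = struct.unpack_from("<H", stream, offset)
    if vt & VT_ARRAY:
        raise ValueError("VT_ARRAY is not supported")
    base = vt & VT_TYPEMASK
    return {
        "base": base,
        "name": BASE_TYPE_NAMES.get(base, hex(base)),
        "is_vector": bool(vt & VT_VECTOR),
    }
```

For instance, the bytes 0C 10 decode to VT_VARIANT with the vector flag set, which is the usual pairing discussed next, while 0C 00 is the bare VT_VARIANT case at the heart of the bug.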
The vulnerability itself lies in how a PropertyVariant with a type of VT_VARIANT (0x000C) is handled. The VT_VARIANT type is typically used in combination with VT_VECTOR which effectively results in a series of PropertyVariant structures. In other words, this is like having an array where members of the array may be of any data type.
When the type of the PropertyVariant is set to VT_VARIANT (0x000C), the full type field is checked to see if VT_VECTOR is set.
If VT_VECTOR is not set, a 24-byte buffer is allocated with a call to CoTaskMemAlloc() and the buffer is passed to a recursive call to ReadPROPVARIANT(), with the intention that the buffer will be filled with the property that immediately follows the VT_VARIANT field. However, the buffer is not initialized (e.g. filled with NULL bytes) before it is passed to ReadPROPVARIANT().
If the nested property has a type of VT_CF (0x0047), a property that is intended to contain a pointer to clipboard data, ReadPROPVARIANT() performs the same check for VT_VECTOR and if it is not set, attempts to write the next 4 bytes of the stream into a location pointed to by an 8-byte value in the previously allocated 24-byte buffer.
Because the buffer was not initialized, the data will be written to an undefined memory location, potentially leading to arbitrary code execution. The attempted data write can be seen in the following exception and partial stack trace from WinDBG with Page Heap enabled on explorer.exe:
Essentially, if an attacker can manipulate the memory layout in just the right way so that the uninitialized buffer contains a value they control, they can write any data 4 bytes at a time to a memory address of their choosing.
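The role of the stale buffer contents can be illustrated with a toy model. This is not the StructuredQuery code and performs no real memory writes: ToyHeap is a made-up allocator that hands back freed blocks unchanged, the "write" is only recorded as an (address, value) pair, and placing the 8-byte pointer at offset 8 of the 24-byte buffer is an assumption of the model.

```python
import struct
from typing import List, Tuple

VT_VARIANT, VT_CF = 0x000C, 0x0047
POINTER_OFFSET = 8  # assumed position of the 8-byte pointer in the buffer


class ToyHeap:
    """Hypothetical allocator that recycles freed blocks without clearing them."""

    def __init__(self) -> None:
        self._free: List[bytearray] = []

    def alloc(self, size: int) -> bytearray:
        for i, block in enumerate(self._free):
            if len(block) == size:
                return self._free.pop(i)   # stale contents come along
        return bytearray(size)

    def free(self, block: bytearray) -> None:
        self._free.append(block)


def read_propvariant(stream: bytes, pos: int, out: bytearray, heap: ToyHeap,
                     writes: List[Tuple[int, int]], zero_fill: bool) -> int:
    """Toy reader for the two types involved; returns the new stream position."""
    (vt,) = struct.unpack_from("<H", stream, pos)
    pos += 2
    if vt == VT_VARIANT:
        nested = heap.alloc(24)
        if zero_fill:                      # models the behavior of the fix
            nested[:] = bytes(24)
        return read_propvariant(stream, pos, nested, heap, writes, zero_fill)
    if vt == VT_CF:
        (target,) = struct.unpack_from("<Q", out, POINTER_OFFSET)
        (value,) = struct.unpack_from("<I", stream, pos)
        writes.append((target, value))     # stands in for the 4-byte write
        return pos + 4
    raise ValueError("type not modeled")


def simulate(zero_fill: bool) -> List[Tuple[int, int]]:
    heap = ToyHeap()
    stale = heap.alloc(24)
    struct.pack_into("<Q", stale, POINTER_OFFSET, 0x4141414141414141)
    heap.free(stale)                       # earlier use leaves data behind
    stream = struct.pack("<HHI", VT_VARIANT, VT_CF, 0xDEADBEEF)
    writes: List[Tuple[int, int]] = []
    read_propvariant(stream, 0, bytearray(24), heap, writes, zero_fill)
    return writes
```

In this model, simulate(False) records a write aimed at the leftover value 0x4141414141414141, while simulate(True) sees a zero pointer because the buffer was cleared before use.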
Conclusion
Analysis of a patched vulnerability wouldn’t be complete without mentioning how the vulnerability was resolved. In this particular case, the solution was simple: fill the newly allocated 24-byte buffer with NULL bytes, ensuring that an attacker is unable to utilize data in the buffer left over from previous uses of that memory location. Microsoft released their patch in February. It should be noted that Microsoft addressed another LNK vulnerability in March, but the March patch is unrelated to this particular bug.
Special thanks to John Simpson and Pengsu Cheng of the Trend Micro Research Team for providing such a thorough analysis of this vulnerability. For an overview of Trend Micro Research services please visit http://go.trendmicro.com/tis/.
The threat research team will be back with other great vulnerability analysis reports in the future. Until then, follow the ZDI team for the latest in exploit techniques and security patches.