Fuzzing Image Parsing in Windows, Part Two: Uninitialized Memory


Continuing our discussion of image parsing vulnerabilities in
Windows, we take a look at a comparatively less popular vulnerability
class: uninitialized memory. In this post, we will examine Windows’
built-in image parsers, looking specifically for vulnerabilities that
involve the use of uninitialized memory.

The Vulnerability: Uninitialized Memory

In unmanaged languages, such as C or C++, local variables and heap
allocations are not initialized by default. Using uninitialized
variables causes undefined behavior and may cause a crash. There are
roughly two variants of uninitialized memory:

  • Direct uninitialized memory usage: An uninitialized pointer or
    an index is used in read or write. This may cause a crash.
  • Information leakage (info leak) through usage of uninitialized
    memory: Uninitialized memory content is accessible across a security
    boundary. An example: an uninitialized kernel buffer accessible from
    user mode, leading to information disclosure.
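
The second variant can be sketched in a deterministic way. Reading
truly uninitialized memory is undefined behavior, so the sketch below
models the heap with a hypothetical arena that recycles a block without
clearing it. All names (RecyclingArena, HandleSecret,
DecodeWithoutWriting) are illustrative assumptions and do not come from
any Windows component.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical allocator standing in for a heap that recycles memory
// without clearing it. Every name in this sketch is illustrative.
class RecyclingArena {
public:
    // Hands out the same backing storage on every call and never clears
    // it, the way a real heap may return a recently freed block.
    unsigned char* Allocate(std::size_t size) {
        assert(size <= sizeof(storage_));
        return storage_;
    }

private:
    unsigned char storage_[64] = {};  // zeroed once at construction only
};

// Simulates a component that handles a secret and then releases its
// buffer without wiping it.
void HandleSecret(RecyclingArena& arena, const char* secret) {
    unsigned char* buf = arena.Allocate(32);
    std::strncpy(reinterpret_cast<char*>(buf), secret, 31);
    buf[31] = '\0';
}

// Simulates a decoder that allocates an output buffer, never writes to
// it, and still returns the contents to the caller: the info leak.
std::string DecodeWithoutWriting(RecyclingArena& arena) {
    unsigned char* pixels = arena.Allocate(32);
    return std::string(reinterpret_cast<char*>(pixels));
}
```

Calling HandleSecret and then DecodeWithoutWriting on the same arena
hands the stale secret back to the caller, which is the shape of the
leak discussed in the rest of this post.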

In this post we will be looking closely at the second variant in
Windows image parsers, which will lead to information disclosure in
situations such as web browsers where an attacker can read the decoded
image back using JavaScript.

Detecting Uninitialized Memory Vulnerabilities

Compared to memory corruption vulnerabilities such as heap overflow
and use-after-free, uninitialized memory vulnerabilities on their own
do not access memory out of bounds or out of scope. This makes them
somewhat harder to detect than memory corruption vulnerabilities.
While direct uninitialized memory usage can cause a crash and so be
detected, information leakage usually doesn’t cause any crashes.
Detecting it requires compiler instrumentation such as MemorySanitizer
or binary instrumentation/recompilation tools such as Valgrind.

Detour: Detecting Uninitialized Memory in Linux

Let’s take a little detour and look at detecting uninitialized
memory in Linux, and compare it with Windows’ built-in capabilities.
Even though compilers warn about some uninitialized variables, most of
the complicated cases of uninitialized memory usage are not detected
at compile time. For these, we need a run-time detection mechanism.
MemorySanitizer is a compiler instrumentation available in Clang (GCC
does not implement it) that detects uninitialized memory reads. A
sample of how it works is given in Figure 1.

$ cat sample.cc
#include <stdio.h>

int main()
{
    int *arr = new int[10];
    if(arr[3] == 0)
    {
         printf("Yay!\n");
    }
    printf("%08x\n", arr[3]);
    return 0;
}

$ clang++ -fsanitize=memory -fno-omit-frame-pointer -g sample.cc

$ ./a.out
==29745==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x496db8  (/home/dan/uni/a.out+0x496db8)
    #1 0x7f463c5f1bf6  (/lib/x86_64-linux-gnu/libc.so.6+0x21bf6)
    #2 0x41ad69  (/home/dan/uni/a.out+0x41ad69)

SUMMARY: MemorySanitizer: use-of-uninitialized-value (/home/dan/uni/a.out+0x496db8)
Exiting

Figure 1: MemorySanitizer detection of uninitialized memory

Similarly, Valgrind can also be used to detect uninitialized memory
during run-time.

Detecting Uninitialized Memory in Windows

Compared to Linux, Windows lacks any built-in mechanism for
detecting uninitialized memory usage. While Visual Studio and Clang-cl
recently introduced AddressSanitizer support, MemorySanitizer and
other sanitizers are not implemented as of this writing.

Some of the useful tools in Windows to detect memory corruption
vulnerabilities, such as PageHeap, do not help in detecting
uninitialized memory. On the contrary, PageHeap fills the memory
allocations with patterns, which essentially makes them initialized.

There are a few third-party tools, including Dr.Memory, that use
binary instrumentation to detect memory safety issues such as heap
overflows, uninitialized memory usages, use-after-frees, and others.

Detecting Uninitialized Memory in Image Decoding

Detecting uninitialized memory in Windows usually requires binary
instrumentation, especially when we do not have access to source code.
One of the indicators we can use to detect uninitialized memory usage,
specifically in the case of image decoding, is the resulting pixels
after the image is decoded.

When an image is decoded, it results in a set of raw pixels. If
image decoding uses any uninitialized memory, some or all of the
pixels may end up with arbitrary values. In simpler words, decoding an
image multiple times may produce a different output each time if
uninitialized memory is used. This difference in output can be used to
detect uninitialized memory and to help write a fuzzing harness
targeting Windows image decoders. An example fuzzing harness is
presented in Figure 2.

#define ROUNDS 20

unsigned char* DecodeImage(char *imagePath)
{
      unsigned char *pixels = NULL;

      // use GDI or WIC to decode image and get the resulting pixels
      …
      …

      return pixels;
}

void Fuzz(char *imagePath)
{
      unsigned char *refPixels = DecodeImage(imagePath);

      if(refPixels != NULL)
      {
            for(int i = 0; i < ROUNDS; i++)
            {
                  unsigned char *currPixels = DecodeImage(imagePath);
                  if(!ComparePixels(refPixels, currPixels))
                  {
                        // the reference pixels and current pixels don't match
                        // crash now to let the fuzzer know of this file
                        CrashProgram();
                  }
                  free(currPixels);
            }
            free(refPixels);
      }
}

Figure 2: Diff harness
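
The harness in Figure 2 leaves ComparePixels and CrashProgram
undefined. A minimal sketch of what they could look like follows. It
is an assumption, not the author's code: in particular, the
sizeInBytes parameter is added here because a raw pointer carries no
length, and treating a missing buffer as a mismatch is a design choice
of this sketch.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Returns true when both decodes produced identical pixel data.
// A missing buffer on either side is treated as a mismatch
// (an assumption of this sketch).
bool ComparePixels(const unsigned char* refPixels,
                   const unsigned char* currPixels,
                   std::size_t sizeInBytes) {
    if (refPixels == nullptr || currPixels == nullptr) {
        return false;
    }
    return std::memcmp(refPixels, currPixels, sizeInBytes) == 0;
}

// Terminates abnormally so that a crash-monitoring fuzzer records the
// input file that triggered the mismatch.
[[noreturn]] void CrashProgram() {
    std::abort();
}
```

A byte-exact comparison is enough here because the same file is decoded
by the same code path each round, so any difference points to
nondeterministic input to the decoder.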

The idea behind this fuzzing harness is not entirely new;
previously, lcamtuf used a similar idea to detect uninitialized memory
in open-source image parsers and used a web page to display the pixel
differences.

Fuzzing

With the diffing harness ready, one can proceed to identify the
supported image formats and gather corpora. Gathering image files is
fairly easy given their near-unlimited availability on the internet,
but it is harder to pick out, from millions of files, a good corpus in
which each file contributes unique code coverage. Code coverage
information for Windows image parsing is tracked from WindowsCodecs.dll.
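
One common way to reduce a large set of files to those with unique
coverage is a greedy selection over per-file coverage sets. The sketch
below is an assumption about how that step could be done, not a
description of the tooling used in this research. CoverageEntry and
MinimizeCorpus are hypothetical names, and how the block IDs are
collected is left out.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// One corpus candidate: a file name plus the set of basic-block IDs it
// reached in the target module.
struct CoverageEntry {
    std::string file;
    std::set<unsigned> blocks;
};

// Greedy selection: repeatedly keep the file that adds the most blocks
// not yet covered, and stop when no remaining file adds anything new.
std::vector<std::string> MinimizeCorpus(
        const std::vector<CoverageEntry>& entries) {
    std::set<unsigned> covered;
    std::vector<bool> used(entries.size(), false);
    std::vector<std::string> kept;
    for (;;) {
        std::size_t best = entries.size();
        std::size_t bestGain = 0;
        for (std::size_t i = 0; i < entries.size(); ++i) {
            if (used[i]) continue;
            std::size_t gain = 0;
            for (unsigned b : entries[i].blocks) {
                if (covered.count(b) == 0) ++gain;
            }
            if (gain > bestGain) {
                bestGain = gain;
                best = i;
            }
        }
        if (best == entries.size()) break;  // nothing adds new coverage
        used[best] = true;
        covered.insert(entries[best].blocks.begin(),
                       entries[best].blocks.end());
        kept.push_back(entries[best].file);
    }
    return kept;
}
```

The greedy pass does not guarantee the smallest possible corpus, but it
is simple and drops every file whose coverage is already provided by
the files kept so far.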

Note that unlike regular Windows fuzzing, we will not be enabling
PageHeap this time as PageHeap “initializes” the heap allocations with patterns.
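
Why a fill pattern defeats the diff harness can be shown with a small
model. In the sketch below, stale stands in for whatever a recycled
heap block held before the allocation, and 0xC0 is an arbitrary
stand-in for a fill pattern. DecodeUnwritten is a hypothetical name
used only for illustration.

```cpp
#include <cstdint>
#include <vector>

// Models one decode that forgets to write its output buffer. `stale`
// stands in for the previous contents of the recycled heap block. When
// `fillOnAlloc` is true the allocator overwrites the block with a fixed
// pattern first, which is the behavior attributed to PageHeap above.
std::vector<std::uint8_t> DecodeUnwritten(
        const std::vector<std::uint8_t>& stale, bool fillOnAlloc) {
    std::vector<std::uint8_t> block = stale;  // recycled, uncleared memory
    if (fillOnAlloc) {
        for (std::uint8_t& b : block) {
            b = 0xC0;  // arbitrary stand-in pattern
        }
    }
    // Bug being modeled: the decoder never writes to `block`.
    return block;
}
```

Without the fill, two rounds that start from different stale contents
return different pixels and the harness fires. With the fill, every
round returns the same pattern and the bug goes unnoticed.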

Results

During my research, I found three cases of uninitialized memory
usage while fuzzing Windows built-in image parsers. Two of them are
explained in detail in the next sections. Root cause analysis of
uninitialized memory usage is non-trivial. There is no crash location
to trace back from, so we have to work backward from the resulting
pixel buffer to find the root cause, or use other tricks to find where
execution deviates.

CVE-2020-0853

Let’s look at the rendering of the proof of concept (PoC) file
before going into the root cause of this vulnerability. For this we
will use lcamtuf’s HTML, which loads the PoC image multiple times and
compares the pixels with reference pixels.



Figure 3: CVE-2020-0853

As we can see from the resulting images (Figure 3), the output
varies drastically in each decoding and we can assume this PoC leaks a
lot of uninitialized memory.

To identify the root cause of these vulnerabilities, I used Time
Travel Debugging (TTD) extensively. Tracing back the execution and
keeping track of the memory address is a tedious task, but TTD makes
it only slightly less painful by keeping the addresses and values
constant and providing unlimited forward and backward executions. 

After spending quite a bit of time debugging the trace, I found the
source of uninitialized memory in windowscodecs!CFormatConverter::Initialize. Even
though the source was found, it was not initially clear why this
memory ends up in the calculation of pixels without getting
overwritten at all. To solve this mystery, additional debugging was
done by comparing PoC execution trace against a normal TIFF file
decoding. The following section shows the allocation, copying of
uninitialized value to pixel calculation and the actual root cause of
the vulnerability.

Allocation and Use of Uninitialized Memory

windowscodecs!CFormatConverter::Initialize
allocates 0x40 bytes of memory, as shown in Figure 4.

0:000> r
rax=0000000000000000 rbx=0000000000000040 rcx=0000000000000040
rdx=0000000000000008 rsi=000002257a3db448 rdi=0000000000000000
rip=00007ffaf047a238 rsp=000000ad23f6f7c0 rbp=000000ad23f6f841
 r8=000000ad23f6f890  r9=0000000000000010 r10=000002257a3db468
r11=000000ad23f6f940 r12=000000000000000e r13=000002257a3db040
r14=000002257a3dbf60 r15=0000000000000000
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
windowscodecs!CFormatConverter::Initialize+0x1c8:
00007ffa`f047a238 ff15ea081200    call    qword ptr [windowscodecs!_imp_malloc (00007ffa`f059ab28)] ds:00007ffa`f059ab28={msvcrt!malloc (00007ffa`f70e9d30)}
0:000> k
 # Child-SP          RetAddr               Call Site
00 000000ad`23f6f7c0 00007ffa`f047c5fb     windowscodecs!CFormatConverter::Initialize+0x1c8
01 000000ad`23f6f890 00007ffa`f047c2f3     windowscodecs!CFormatConverter::Initialize+0x12b
02 000000ad`23f6f980 00007ff6`34ca6dff     windowscodecs!CFormatConverterResolver::Initialize+0x273

//Uninitialized memory after allocation:
0:000> db @rax
00000225`7a3dbf70  d0 b0 3d 7a 25 02 00 00-60 24 3d 7a 25 02 00 00  ..=z%...`$=z%...
00000225`7a3dbf80  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
00000225`7a3dbf90  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
00000225`7a3dbfa0  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
00000225`7a3dbfb0  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
00000225`7a3dbfc0  00 00 00 00 00 00 00 00-64 51 7c 26 c3 2c 01 03  ........dQ|&.,..
00000225`7a3dbfd0  f0 00 2f 6b 25 02 00 00-f0 00 2f 6b 25 02 00 00  ../k%...../k%...
00000225`7a3dbfe0  60 00 3d 7a 25 02 00 00-60 00 3d 7a 25 02 00 00  `.=z%...`.=z%...

Figure 4: Allocation of memory

The memory never gets written and the uninitialized values are
inverted in windowscodecs!CLibTiffDecoderBase::HrProcessCopy
and further processed in windowscodecs!GammaConvert_16bppGrayInt_128bppRGBA
and in later called scaling functions.

As there is no read or write into the uninitialized memory before
HrProcessCopy, I traced the execution back from HrProcessCopy and
compared the execution traces with a normal TIFF decoding trace.
windowscodecs!CLibTiffDecoderBase::UnpackLine behaved differently with
the PoC file than with a normal TIFF file, and one of the function
parameters in UnpackLine was a pointer to the uninitialized buffer.

The UnpackLine function has a series of
switch-case statements working with bits per sample (BPS) of TIFF
images. In our PoC TIFF file, the BPS value is 0x09—which is not
supported by UnpackLine—and the control flow
never reaches a code path that writes to the buffer. This is the root
cause of the uninitialized memory, which gets processed further down
the pipeline and finally shown as pixel data.
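
The shape of this bug can be sketched as follows. This is a heavily
simplified, hypothetical stand-in and not the windowscodecs code:
UnpackLineSketch, its parameters, the two handled cases, and the
little-endian 16-bit layout are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a per-scanline unpack routine. It only
// mirrors the shape of the bug: a switch on bits per sample that has no
// handling for unexpected values.
void UnpackLineSketch(const std::vector<std::uint8_t>& src,
                      int bitsPerSample,
                      std::vector<std::uint16_t>& dst) {
    switch (bitsPerSample) {
    case 8:
        for (std::size_t i = 0; i < dst.size() && i < src.size(); ++i) {
            dst[i] = src[i];
        }
        break;
    case 16:  // assumes little-endian sample layout
        for (std::size_t i = 0;
             i < dst.size() && 2 * i + 1 < src.size(); ++i) {
            dst[i] = static_cast<std::uint16_t>(
                src[2 * i] | (src[2 * i + 1] << 8));
        }
        break;
    // No default case: for bitsPerSample == 9, control falls out of the
    // switch and dst keeps whatever it held before the call.
    }
}
```

If dst is a freshly allocated, unwritten buffer, a call with an
unhandled bits-per-sample value returns normally and the later pipeline
stages treat the stale contents as pixel data.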

Patch

After I presented my analysis to Microsoft, they decided to patch the
vulnerability by treating files with unsupported BPS values as
invalid. This avoids all decoding and rejects the file at a very early
phase of loading.
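
An early rejection of this kind could look like the sketch below. It is
not the actual patch: the function names, the LoadResult type, and the
set of accepted BPS values are hypothetical and chosen only to
illustrate the idea of validating the header before any decoding
starts.

```cpp
// Result of the hypothetical header check.
enum class LoadResult { Ok, UnsupportedFormat };

// Illustrative allow-list; the values are NOT taken from the real patch.
bool IsSupportedBitsPerSample(int bitsPerSample) {
    switch (bitsPerSample) {
    case 1:
    case 2:
    case 4:
    case 8:
    case 16:
    case 32:
        return true;
    default:
        return false;
    }
}

// Rejects the file before any buffers are allocated or decoded into.
LoadResult ValidateHeader(int bitsPerSample) {
    return IsSupportedBitsPerSample(bitsPerSample)
               ? LoadResult::Ok
               : LoadResult::UnsupportedFormat;
}
```

Rejecting at load time means no later stage ever sees a buffer that the
unpacking code skipped.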

CVE-2020-1397



Figure 5: Rendering of CVE-2020-1397

Unlike the previous vulnerability, the difference in the output is
quite limited in this one, as seen in Figure 5. One of the simpler
root cause analysis techniques for a specific type of uninitialized
memory usage is to compare the execution traces of runs that produce
two different outputs. This technique helps when an uninitialized
variable causes a control flow change in the program and that change
causes the difference in the outputs. For this, a binary
instrumentation script was written, which logged all the instructions
executed along with their registers and accessed memory values.
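
The comparison step can be sketched as a search for the first position
where two instruction-pointer traces disagree. The sketch assumes each
trace has already been reduced to a list of instruction addresses;
FirstDivergence and kNoDivergence are hypothetical names, not part of
the script used in this research.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Sentinel returned when the two traces are identical.
constexpr std::size_t kNoDivergence =
    std::numeric_limits<std::size_t>::max();

// Returns the index of the first position where two instruction-pointer
// traces differ. If one trace is a strict prefix of the other, the
// divergence is reported at the shorter trace's length.
std::size_t FirstDivergence(const std::vector<std::uint64_t>& traceA,
                            const std::vector<std::uint64_t>& traceB) {
    const std::size_t common =
        traceA.size() < traceB.size() ? traceA.size() : traceB.size();
    for (std::size_t i = 0; i < common; ++i) {
        if (traceA[i] != traceB[i]) {
            return i;
        }
    }
    if (traceA.size() != traceB.size()) {
        return common;
    }
    return kNoDivergence;
}
```

When the divergence comes from a control flow change, the last entry
the two traces share is the branch whose outcome differed, which makes
it a natural starting point for tracing the value backward.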

Diffing two distinct execution traces by comparing the instruction
pointer (RIP) values, I found a control flow change in
windowscodecs!CCCITT::Expand2DLine due to the use of an uninitialized
value. Tracing the uninitialized value back using the TTD trace was
exceptionally useful for finding the root cause. The
following section shows the allocation, population and

[…]

