Tuesday, August 17, 2010

Skeletons Hidden in the Linux Closet: r00ting your Linux Desktop for Fun and Profit

A couple of months ago, while working on Qubes GUI virtualization, Rafal came up with an interesting privilege escalation attack on Linux (a user-to-root escalation) that exploits a bug in... well, actually it doesn't exploit any concrete bug, which makes it so much more interesting.

The attack allows an (unprivileged) user process that has access to the X server (so, any GUI application) to unconditionally escalate to root (but again, it doesn't take advantage of any bug in the X server!). In other words: any GUI application (think e.g. a sandboxed PDF viewer), if compromised (e.g. via a malicious PDF document), can bypass all of Linux's fancy security mechanisms, escalate to root, and compromise the whole system. The attack even allows escaping from SELinux's "sandbox -X" jail. To make it worse, the attack has been possible for at least several years, most likely since the introduction of kernel 2.6.

You can find the details of the attack, as well as a discussion of possible solutions, including the one that was eventually implemented, in Rafal's paper.

One important thing the attack demonstrates is how difficult it is to bring security to a desktop platform, where one of the biggest challenges is to let applications talk to the GUI layer (e.g. the X server in the case of Linux), which usually involves a very fat GUI protocol (think X protocol, or Win32 GUI API) and a very complex GUI server, while at the same time keeping things secure. This was one of the key priorities for us when designing the Qubes OS architecture. (So, we believe Qubes is much more secure than other sandboxing mechanisms, such as BSD jails or SELinux-based sandboxes, because it not only eliminates kernel-level exploits, but also dramatically slims down the GUI-level attack surface.)

The kernel-level "patch" was implemented last week by Linus Torvalds, and pushed upstream into recent stable kernels. Red Hat has also released an advisory for this attack, in which they rated its severity as "high".

ps. Congrats to Brad Spengler for some good guessing :)

29 comments:

Anonymous said...

Unusually good for a guess, wouldn't you say? ;)

Linus screwed up the fix (that's what happens when you try to push a silent fix out before it's been reviewed and tell -stable to pull it too) so people will have to wait till the next stable release or pull specific commits.

-Brad

Anonymous said...

More importantly, how have you "eliminate[d] kernel-exploits"? Seems like you're getting a little ahead of yourselves.

It takes actual work and code to accomplish this, not hand-waving and an additional (buggy) layer of abstraction.

-Brad

Joanna Rutkowska said...

@Brad: For more detailed analysis see chapter 8 in the Qubes arch spec:

http://qubes-os.org/files/doc/arch-spec-0.3.pdf

In short: we don't care about user->ring0 attacks in Qubes' AppVMs.

We do care about VM->Xen attack of course, but these are quite a different animal.

Joanna Rutkowska said...

@Brad: In Qubes we're not *adding* an additional layer of abstraction -- we're *replacing* the buggy Linux monolithic kernel, with something orders of magnitude less buggy (Xen, guid, and a few more discussed in the spec).

Joanna Rutkowska said...

Correction:

"In short: we don't care about user->ring0 attacks in Qubes' AppVMs."

should be:

"In short: we don't care about user->kernel attacks in Qubes' AppVMs."

Of course, the kernel runs in ring 3 on Xen x64.

Anonymous said...

Root in a VM is still root.
Which user maintains extra-security twice or more often within the VMs?

For modern client sided attacks it doesn't matter whether you isolate your VMs. If an attacker is after the data he'll simply get the mails/files/whatever from the work/home online VM. So what's the difference... you have to keep your data accessible to make the VMs able to interact with each other (on a desktop operating system).

Currently Desktops evolve to a semantic usability experience, which practically means: people want their data indexed, available, and easily accessible. And not isolated within some VM that's offline or not interacting.

Last but not least, one of the main problems is that you will introduce improved reliability for exploits. Within the VMs kernel hardening is practically impossible. So it'll be the (older) standard kernel, with the standard devices.
Means for me: nice for my home lab ;) Great project.

Joanna Rutkowska said...

@Anonymous:

1) Root in a VM is still root.

In Qubes OS we assume the attacker already has root in the AppVM that they have compromised (e.g. "random" AppVM). We don't care. Root is *not* root any more in Qubes!

2) If an attacker is after the data he'll simply get the mails/files/whatever from the work/home online VM. So what's the difference... you have to keep your data accessible to make the VMs able to interact with each other (on a desktop operating system).

That's simply not true, and I encourage you to check the previously mentioned Qubes spec for explanations why.

3) Currently Desktops evolve to a semantic usability experience, which practically means: people want their data indexed, available, and easily accessible. And not isolated within some VM that's offline or not interacting.

I would argue that there is no need to mix e.g. my work data with my personal data, with my healthcare data, with my banking data, etc...

But Qubes (in a future version) will allow presenting all the user data as if it were all on one system, while still maintaining decent isolation (right now it does this with apps, but the user must use a different file manager for each VM; in the future there will be just one file manager).

4) Last but not least, one of the main problems is that you will introduce improved reliability for exploits. Within the VMs kernel hardening is practically impossible.

We *really* don't care about the attacker going from user-to-root in a VM.

Anonymous said...

I don't think anything useful will come of any discussion on this topic, as Joanna is playing a security shell game, normally reserved for confused academics and SELinux zealots.

Let's count it up:
1. Redefining "kernel" away from "most privileged code"
2. Applying very odd threat models to Linux desktops
3. Ignoring attacker characteristics associated with said threat models (see 5,6.)
4. Passing TCB problem onto buggy hypervisor
5. Hypervisor isn't protected, except by code removal -- assumption being also that bugs weren't introduced by the modifications
6. Actual hypervisor (host->self, host->guest) protection left as an academic-style "future work"
7. Ignoring persistent VM-compromise (unless you have unique *stateless* VMs for each individual site or document you process; sorry no saving bookmarks)
8. Ignoring userland compromise, ignoring kernel guest compromise (even though this opens up more Xen attack surface)
9. Claiming to not care about "root" while redefining "root" to something most attackers don't care or need to obtain, particularly on desktops
(people care about their mailspools, their saved usernames/passwords, the privacy of their encrypted communications, etc)
10. Some functionality dependent upon hardware not used by most people (desktop environment)
11. Isolation provided highly dependent upon proper use in opposition to usability
(this is useful as the architecture document can be pointed to to inform the compromised user how they don't care about security because they didn't split
up their usage into 10+ AppVMs, aka blame the user for any failure of the system/inapplicability to real life)
12. Old system: buggy userland -> buggy privileged linux kernel
New system: buggy userland -> same buggy unprivileged (sortof) linux kernel -> buggy xen -> buggy privileged linux kernel

(splitting this up as the comment's too long)

Anonymous said...

(continued...)

Not only is it false to say kernel-level exploits are eliminated, but the whole sentence doesn't say anything. I can just as accurately and vacuously say "I believe a physical air-gap is much more secure than other sandboxing mechanisms." You miss something very important if you miss this extension of the logic. Talking about security of sandboxes where you've just shifted around the attack surface demonstrates nothing. Usability, performance, complex applications, real life -- these things matter.

So again I say, it takes actual work and code to accomplish what you've claimed, not hand-waving and an additional (buggy) layer of abstraction.

Or as Wittgenstein said, "now you are only playing with words."

-Brad

PaX Team said...

Hi Joanna,

i think there's a misunderstanding regarding 'kernel exploits'. to explain i'll give you an analogy.

so-called return-to-libc attacks have been publicly known since at least solar_diz's famous post on bugtraq in the previous millennium. i think it's obvious to anyone skilled in the art that the name of the technique itself has never prevented anyone from writing an exploit that relied on another piece of code, or even arbitrary chunks of code (vs. libc API calls). in fact we know it for a fact since there're even public exploits that do just that, many years before the current marketing distraction called ROP/JOP/etc entered the public mind as if they were something new. so what we can see here is an exploit technique named after its first public incarnation but that doesn't mean it wasn't meant to cover the whole set of techniques that rely on executing existing code, wherever it may be in memory.

now the analogy is that 'kernel exploits' never meant 'guest kernel exploits' (regardless of ring-0 or ring-3, or whatever), but they always meant 'exploits against bugs in the TCB'. and since in a traditional OS where such exploits were written originally the kernel code running in ring-0 happens to be part of the TCB, the name stuck.

now what this means for Qubes is that your claim about eliminating 'kernel-level exploits' would require no exploitable bugs in Xen (the hypervisor) and dom0 at least (be that by provable construction or just intrusion prevention techniques). and i think we all agree that you don't have such a thing, nor does anyone else to date ;). the alternative interpretation of your claim would be that you solved the problem by declaring it not to be a problem (e.g., "we don't care about user->ring0 attacks in Qubes' AppVMs", "In Qubes OS we assume the attacker already has root in the AppVM" or "We *really* don't care about the attacker going from user-to-root in a VM.") and i don't think you really meant that ;).

Joanna Rutkowska said...

@PaX Team:

I don't agree that one can extend the meaning of "kernel exploits" to also cover "hypervisor exploits", especially in case of such a thin hypervisor as Xen, or even "TCB exploits" (in Qubes TCB is not much more than the Xen hypervisor).

The main difference is that Xen has at least 2 orders of magnitude less code than a typical monolithic kernel (Linux, Windows, Mac), and this is possible because, unlike a normal OS, the (decent) hypervisor doesn't need to implement a whole lot of services (filesystems, hundreds of drivers, networking stacks, etc). This difference should not be underestimated -- it's a huge security win.

In Qubes architecture we don't care whether the attacker has "just" user privileges, or full root privileges in the AppVM -- because there is no gain for the attacker in having root over having "just" user. If the attacker gained user access to the AppVM, this AppVM is considered to be *fully* compromised, period.

One can argue, like Brad, that having root is usually needed to start a further attack from VM to Xen (or e.g. guid perhaps), but Xen has been designed with the assumption that it should withstand attacks from fully compromised (root-ed) VMs. So, if there was a bug in Xen that required root in the AppVM to compromise Xen, it would be a fatal bug in Xen, and the only way around this would be to patch it. If I were to make bets, I would say we would see "a few more" VM->Xen bugs over the next two years. So far we have seen just one: the heap overflow in the (optional) NSA's Flask extension to Xen found by Rafal in 2008.

As for "exploitable bugs in Dom0" -- please note that the interface between AppVMs and Dom0 is very thin -- it comes down to 2 (3) things:
1) Qubes GUI protocol (served by some 2k LOC server, guid, in Dom0 side)
2) xenstored (similarly simple)
3*) storage backends, which will be eliminated in Qubes 2.0 when TXT finally becomes widespread (and fixed -- think: STM) and we are able to have a truly unprivileged storage domain.

Mikael Ståldal said...

> In Qubes architecture we don't care
> whether the attacker has "just" user
> privileges, or full root privileges
> in the AppVM -- because there is no
> gain for the attacker in having root
> over having "just" user.

You imply that we should care about root privilege on a normal Linux machine. But why should we?

On a normal Linux desktop machine with only one user account, there is little gain for an attacker to get root privilege, much damage can be done with only user privilege.

(Servers and desktop machines with several user accounts are a different story though.)

The main security problem here is in the PDF/whatever viewer if it is possible to do malicious stuff with a PDF/whatever document.

Joanna Rutkowska said...

@Mikael:

Well, one might argue that when the attacker has root access on a normal Linux desktop, it would be easier for her to install all sorts of persistent rootkits, even ones that would survive a reinstall. On Qubes this is not possible, because each VM gets a fresh root filesystem upon start.

Joanna Rutkowska said...

@Mikael: ps. obviously the (widespread) model of a desktop system where all the apps run with the same user account/domain/privilege is sick. Yes, all the mainstream OSes implement it.

kryptart said...

I use FreeBSD.
Is the attack also possible under FreeBSD (without jails)?

Rafal Wojtczuk said...

@kryptart: Most probably the attack is not possible under FreeBSD; last time I looked it provided stack/heap separation, so is immune to the root issue. It would be good to check the current version, though.

Joanna Rutkowska said...

@kryptart, @rafal:

Nevertheless, please note that neither FreeBSD nor any other UNIX system (AFAIK) makes any effort to protect against GUI-level attacks performed by one GUI application against others, such as input sniffing/spoofing, screen capturing, etc.

Also, there is this fat X protocol that represents a rather big attack surface, that, when successfully attacked, gives the attacker root access (unless the X server is somehow priv-sep'ed, which might be the case on OpenBSD).

Anonymous said...

"On Qubes this is not possible, because each VM get a fresh root filesystem upon start."

It's patently false that rootkits can't exist with persistence on the AppVMs simply because you provide a fresh root filesystem upon start. If you expected a real security boundary to exist in Qubes in this sense, then not only is it fundamentally flawed, but you also have to give up your "not caring" about what goes on in the AppVMs.

Not only that, but pretending this anti-persistence exists opens up an entirely new class of attacks to achieve persistence: crafted modification of "trusted" application data. Mail is being stored, bookmarks are being stored, browser caches, etc. Now a bug in the parsing of any of these things, which previously was never considered security relevant, can be abused to achieve persistence.

-Brad

Joanna Rutkowska said...

@Brad:

Sigh... we write exactly about those attack vectors in the Qubes spec in chapter 4.6.

(Is re-discovering somebody else's findings your speciality? :)

The statement you quoted (out of context) referred to such persistent infections that survive reinstall (in Qubes terminology: VM-recreation).

BTW, one way to deal with this problem (in some scenarios) is Disposable VMs:

http://theinvisiblethings.blogspot.com/2010/06/disposable-vms.html

(code already in the git).

Mikael Ståldal said...

>Well, one might argue that when the
>attacker has root access on a normal
>Linux desktop, it would be easier for
>her to install all sorts of various
>persistent rootkits, even such that
>would survive reinstall.

It is no problem to install a malicious daemon which is auto-started when the user logs in (and thus survives reboot) on a Linux desktop machine without root priv. And it can do pretty much everything except listen on ports <1024.

If it is discreet, the user may not discover it and is not likely to reinstall anything.

And if the user reinstalls everything except his home directory, it may well survive that.

But I guess it will be hard to be "stealth" like a rootkit without root priv, so it will be easier to find if you go looking for it.

PaX Team said...

> I don't agree that one can extend the
> meaning of "kernel exploits" to also
> cover "hypervisor exploits" [...]

well then you should back up your disagreement with some solid arguments ;).

first you'll have to explain the ret2libc analogy (as in, why it doesn't apply). as far as i know, history backs up my story so far, not yours, i.e. it's not a matter of agreement, it's what it was/is.

also think about your own statement in your comments: you said there were no privilege boundaries within a domU as far as your threat model is concerned, but then it makes no sense to talk about guest kernel exploitability vs. that of guest userland since they're the same privilege domain (if they are not then you have some contradiction to resolve here as you can't have it both ways). therefore you must have meant something else other than 'guest kernel exploits' which of course naturally leads to the actual privilege boundaries in Qubes: the hypervisor and dom0 and the other special VMs.

> especially in case of such a thin hypervisor
> as Xen, or even "TCB exploits"

i don't understand why the size of the hypervisor matters for the purposes of 'kernel exploit' ;). hint: the smaller the hypervisor's size, the bigger the rest of the TCB (dom0 and its equivalents in Qubes) needs to be. i.e., your argument works against you in fact.

> (in Qubes TCB is not much more than the Xen hypervisor).

except it's not true ;). your TCB isn't just the hypervisor but also all the code running in dom0 and the special VMs (such as your network or storage VMs). why that is the case is easy to see: your system's ultimate goal is to provide protection for end user data and if you don't have mechanisms to ensure that compromised VMs cannot communicate with each other (under the watchful eyes of the hypervisor), your system is broken by design. as far as i know, Qubes cannot ensure this VM separation except by assuming that certain VMs cannot be compromised - i.e., they are all part of the TCB. it follows then that guest kernel bugs are very important in the Qubes security model.

so all in all, whichever way you interpret 'kernel exploit', you very much need to care about them and i don't see where Qubes has any more protections against them than what already exists in linux. and we know how effective those protections are, a far cry from 'elimination'.

[continued in the next post]

Joanna Rutkowska said...

@Mikael: All true. I guess it shows how sick this security model is.

@PaX:

> well then you should back up your disagreement with some solid arguments ;).

And that's exactly what I did.

/.../

> also think about your own statement in your comments: you said there
> were no privilege boundaries within a domU as far as your threat model
> is concerned, but then it makes no sense to talk about guest kernel
> exploitability vs. that of guest userland since they're the same
> privilege domain

Exactly! And you should tell it to Brad, as he started this (futile) discussion.

> your TCB isn't just the hypervisor but also all the code running in dom0 and the special VMs (such as your network or storage VMs).

The above statement is simply false. RTM.

/.../

> so all in all, whichever way you interpret 'kernel exploit', you very much need to care about them

I still don't see why...

[continued in the next post]

Sorry, I got your comment in 4 different pieces most of them overlapping, I thought I moderated the longest piece, but apparently I clicked on the wrong one...

But, I think, in the interest of the readers, we should do EOT, as this discussion isn't going anywhere. If you think Qubes architecture is buggy, feel free to write some proof of concepts to back up your opinion -- we would be thrilled to see some exploits for Qubes, really!

Anonymous said...

> If you think Qubes architecture is
> buggy, feel free to write some proof of
> concepts to back up your opinion

Hehe, in order to prove that Xen is indeed just as buggy as the Linux kernel, the PaX Team or Spengler would have to write *lots* of exploits for Qubes, I guess ;)

Anonymous said...

Good work and good constructive challenges also.

Anonymous said...

I'm curious if you actually wrote a POC for the SELinux sandbox scenario?

Anonymous said...

i have had a brief read of your architecture document.

can you explain the difference between

1. Traditional - opening a PDF containing an exploit which then leads to the exploitation of a secondary bug (say, kernel priv esc) to compromise the whole system.

2. Qubes - opening a PDF containing an exploit which then leads to the exploitation of a secondary bug in the hypervisor resulting in a compromise of the whole system.

If I compromise the storage VM (which presumably has the ability to decrypt/access all data) what's stopping me from abusing this to read the data from another app? Ditto network.

At the end, seems like the same impact - just a different way of getting there.

If I'm missing a massive point I'm sorry.

Joanna Rutkowska said...

@Anonymous:

> 2. Qubes - opening a PDF containing an exploit which then leads to the exploitation of a secondary bug in the hypervisor resulting in a compromise of the whole system.

1) The numbers: Xen hypervisor has around 200k LOC, while Linux/Windows/Mac kernels have tens of millions. Get it?

2) GUI isolation

> If I compromise the storage VM (which presumably has the ability to decrypt/access all data) what's stopping me from abusing this to read the data from another app?

Nah, you didn't read the spec carefully, did you? The storage domain CANNOT read data of other VMs. So, the answer to your question is: crypto (and also Intel TXT, so you could not compromise the system via MBR attack).

> Ditto network.

Have you heard about this cool thing called SSL, SSH? :)

Anonymous said...

Hello ITL people & Joanna !

I use different user accounts to restrict data access to only the programs that need it (like you do in Qubes with VMs). Sure, I already know this is not top security, but 1) it is better than nothing, 2) it is a quite light, easy solution to set up, and 3) it protects against almost every security issue not involving privilege escalation to root.

A security issue like this one, which gives an attacker access to all of my data, just warns and reminds me to find a more serious solution. And after reading the Qubes documentation, Qubes fits my security needs so well that it seems as if I had written the "Why?" part myself. You guessed it: I'm really waiting for Qubes to be ready :)

PS: Half-explaining in comments what's already well explained in the Qubes documentation does a disservice to your work on Qubes and is a waste of time. It's quite obvious that many comments here come from maybe-skilled people, but from skilled people who didn't read the Qubes documentation and/or understand its goal.

-- neskouik

James Pannozzi said...

Many thanks both for alerting us to something that was for me, completely unexpected (as I had put, perhaps unduly, faith in my fedora se-linux enhancement)
and for posting a link to the Qubes project, an idea which is, I believe, LONG overdue.

In the drive for performance and compatibility, more holes have been left behind than in most large pieces of swiss cheese. The design of an operating system which has its basis in security is refreshing, and my only fear is that if it gets "too" good, you will have a visit from some government "official" or other warning you how dangerous, perhaps even illegal, it is to allow private citizens to have fully secure (rather than PRETEND SECURE) desktops.