Thursday, September 09, 2010

(Un)Trusting your GUI Subsystem

Why do we need secure desktop systems? Why is support from hardware necessary to build secure desktop OSes? Does virtualization make things more, or less complex? Why is Dynamic RTM (Intel TXT) better than Static RTM? Can we have an untrusted GUI domain/subsystem?

I tried to cover those questions in my recent keynote at ETISS, and you can grab the slides here.

In particular, slide #18 presents an idealistic view of an OS that could be achieved through the use of hardware virtualization and trusted boot technologies. It might look very similar to many other pictures of virtualized systems one can see these days, but what makes it special is that all the dark gray boxes represent untrusted domains (so, their compromise is not security-critical, except for the potential of a denial-of-service).

No OS currently implements this architecture, not even Qubes. We still have the Storage and GUI subsystems in Dom0 (so they are both trusted), although we already know (we think) how to implement the untrusted storage domain (this is described in detail in the arch spec). The main reason we don't have it now is that TXT market adoption is so poor that very few people could make use of it.

The GUI subsystem is, however, a much bigger challenge. When we think about it, it should really feel impossible to have an untrusted GUI subsystem, because the GUI subsystem really "sees" all the pixmaps that are to be displayed to the user, and so also all the confidential emails, documents, etc. The GUI is different in nature from the networking subsystem, where we can use encrypted protocols to prevent the netvm from sniffing or meaningfully intercepting the application-generated traffic, and from the storage subsystem, where we can use fs-encryption and trusted boot technologies to keep the storage domain from reading or modifying the files used by apps in a meaningful way. We cannot really encrypt the pixmaps (in the apps, or AppVMs), because for this to work we would need graphics cards able to do the decryption and key exchange (note how this is different from the case of an untrusted storage domain, where there is no need for internal hardware encryption!). The idea of putting what is essentially an HTTPS webserver on your GPU is doubtful at best, because it would just move the target from the GUI domain to the GPU, and there is really no reason why lots-of-code in the GPU would be any harder to attack than lots-of-code in the GUI domain...

So we recently came up with the idea of a Split I/O model, also presented in my slides, where we separate the user input (keyboard, mouse), which stays in dom0 (the trusted domain), from the output (GUI, audio), which is moved into an untrusted GUI domain. We obviously need to make sure that the GUI domain cannot "talk" to other domains, so that it cannot "leak out" the secrets that it "sees" while processing the various pixmaps. For this we need the hypervisor to ensure that all the inter-domain shared pages mapped into the GUI domain are read-only for the GUI domain, which implies that the GUI protocol, exposed by the GUI domain to other AppVMs, must be unidirectional.
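To make the "read-only pages, unidirectional protocol" idea a bit more concrete, here is a toy sketch in Python. It is not how Qubes or Xen implement anything: the `OneWayChannel` class, the header layout, and the sequence-counter trick (borrowed from seqlocks) are all assumptions made for illustration. The read-only `memoryview` merely stands in for a hypervisor-enforced read-only mapping. The point it shows is that the consumer can never acknowledge or write back, so the producer has to publish data in a way the reader can validate on its own.

```python
import struct

# Assumed header layout: (sequence counter, payload length), little-endian.
HEADER = struct.Struct("<II")


class OneWayChannel:
    """Toy model of a shared page that only the producer (an AppVM) may write."""

    def __init__(self, size=4096):
        self._page = bytearray(size)

    def reader_view(self):
        # The consumer (GUI domain) only ever gets a read-only view,
        # standing in for a read-only page mapping set up by the hypervisor.
        return memoryview(self._page).toreadonly()

    def publish(self, payload: bytes):
        if len(payload) > len(self._page) - HEADER.size:
            raise ValueError("payload does not fit in the shared page")
        seq, _ = HEADER.unpack_from(self._page, 0)
        # An odd counter marks "write in progress"; the reader cannot signal
        # back, so this is the only synchronization available.
        HEADER.pack_into(self._page, 0, seq + 1, len(payload))
        self._page[HEADER.size:HEADER.size + len(payload)] = payload
        HEADER.pack_into(self._page, 0, seq + 2, len(payload))


def read_frame(view):
    """Return (seq, payload), or None if the producer was mid-write."""
    seq_before, length = HEADER.unpack_from(view, 0)
    if seq_before % 2:
        return None
    payload = bytes(view[HEADER.size:HEADER.size + length])
    seq_after, _ = HEADER.unpack_from(view, 0)
    if seq_after != seq_before:
        return None
    return seq_before, payload
```

Any attempt by the reader to write through its view (e.g. `view[0] = 1`) raises `TypeError`, which is the Python-level analogue of a page fault on a read-only mapping. Note that this only closes the explicit channel; it does nothing about timing channels.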

There are more challenges though, e.g. how to keep the bandwidth of timing covert channels, such as those through the CPU caches, between the GUI domain and other AppVMs at a reasonably low level. Please note the distinction between a covert channel, which requires the cooperation of two domains, and a side channel, which requires just one domain to be malicious. The latter are much more of a theoretical problem, and are of concern only in some very high security military systems, while the former are usually easy to implement in practice, and present a practical problem in this very scenario.
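One rough way to reason about "reasonably low" bandwidth is to model the covert channel as a binary symmetric channel: the sender signals one bit per time slot, and noise flips each bit with some probability. Under that assumption (which is a textbook simplification, not a measurement of any real cache channel), Shannon's formula gives an upper bound on the usable bit rate. The function names and the example numbers below are made up for illustration.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def covert_channel_capacity(symbols_per_second, error_rate):
    """Upper bound (bits/s) for a binary symmetric channel:
    C = rate * (1 - H(error_rate))."""
    return symbols_per_second * (1.0 - binary_entropy(error_rate))


# Hypothetical example: 1000 signalling slots per second.
for p in (0.0, 0.11, 0.5):
    print(p, covert_channel_capacity(1000, p))
```

The takeaway is that a mitigation does not need to eliminate the channel: pushing the bit error rate towards 50% (e.g. by adding noise or coarsening timers) drives the bound towards zero, while lowering the slot rate scales it down linearly.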

Another problem, which was immediately pointed out by the ETISS audience, is that an attacker who has compromised the GUI domain can manipulate the pixmaps being processed in the GUI subsystem to present a false picture to the user (remember, the attacker should have no way to send them out anywhere). This includes attacks such as button relabeling ("OK" becomes "Cancel" and the other way around), content manipulation ("$1,000,000" instead of "$100", and vice versa), security label spoofing ("red"-labeled windows becoming "green"-labeled), and so on. It's an open question how practical these attacks are, at least when we consider automated attacks, as they require the ability to extract some semantics from the pixmaps (where is the button, where is the decoration), as well as an understanding of the user's actions, intentions, and behavior (just automatically relabeling my Firefox label to "green" would be a poor attack, as I would immediately realize something is going wrong). Nevertheless this is a problem, and I'm not sure how it could be solved with the current hardware architecture.

But do we really need an untrusted GUI domain? That depends. Currently in Qubes the GUI subsystem is located in dom0, and thus it is fully trusted, which also means that a potential compromise of the GUI subsystem is considered fatal. We try to make an attack on the GUI as hard as possible, and this is the reason we have designed and implemented a special, very simple GUI protocol that is exposed to other AppVMs (instead of e.g. using the X protocol or VNC). But if we wanted to add some more "features", such as 3D hardware acceleration for the apps (3D acceleration is already available to the Window Manager in Qubes, but not to the apps), then we would not be able to keep the GUI protocol so simple anymore, and this might result in introducing exploitable fatal bugs. So, in that case it would be great to have an untrusted GUI domain, because we would be able to provide feature-rich GUI protocols, with all the OpenGL-like things, without worrying that somebody might exploit the GUI backend. We would also not need to worry about putting all the various 3rd party software in the GUI domain, such as KDE, Xorg, and various 3rd party GPU drivers, e.g. NVIDIA's closed source ones, or about the possibility that some of it might be malicious.
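To illustrate why a very simple protocol is easier to defend than something like X or OpenGL, here is a sketch of what parsing one message might look like. The message format below is entirely hypothetical (it is not the actual Qubes GUI protocol): a single fixed-size "damage region" message carrying a window id and a rectangle. The names `DAMAGE_MSG`, `MAX_DIM`, and `parse_damage` are invented for this example.

```python
import struct

# Hypothetical fixed-size message: window id, x, y, width, height
# (all unsigned 32-bit, little-endian). Not the real Qubes wire format.
DAMAGE_MSG = struct.Struct("<IIIII")
MAX_DIM = 8192  # assumed upper bound on any window dimension


def parse_damage(raw: bytes, win_w: int, win_h: int):
    """Parse one damage-region message, rejecting anything out of bounds."""
    if len(raw) != DAMAGE_MSG.size:
        raise ValueError("unexpected message size")
    win_id, x, y, w, h = DAMAGE_MSG.unpack(raw)
    if not (0 < w <= MAX_DIM and 0 < h <= MAX_DIM):
        raise ValueError("bad dimensions")
    # Fields are unsigned, so only the upper edges need checking.
    if x + w > win_w or y + h > win_h:
        raise ValueError("region outside window")
    return win_id, x, y, w, h
```

With fixed-size messages and a handful of integer fields, every input can be validated in a few lines that a reviewer can check exhaustively. A protocol rich enough to carry 3D command streams has no such small validation surface, which is the trade-off described above.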

So, generally, yes, we would like to have an untrusted GUI domain - we can live without it, but then we will not have all the fancy 3D acceleration for games, and will also need to carefully choose and verify the GUI-related software (which is lots of software).

But perhaps in the next 5 years everybody will have a computer with a few dozen cores, and CPU-to-DRAM bandwidth will be orders of magnitude higher than today, so there will no longer be a need to offload graphics-intensive work to a specialized GPU, because one of our 64 cores will happily do the work? Wouldn't that be a nicer architecture, also for many other reasons (e.g. better utilization of power/circuit real estate)? In that case nobody will need OpenGL, and so there will be no need for a richer GUI protocol than what is already implemented in Qubes...

It's quite exciting to see what will happen (and what we will come up with for Qubes) :)

BTW, some people might confuse X server de-privileging efforts, i.e. making the X server run without root privileges, which is being done in some Linux distros and BSDs, with what has been described in this article, namely making the GUI subsystem untrusted. Please note that a de-privileged X server doesn't really solve any major security problems related to the GUI subsystem, as whoever controls ("0wns") the X server (de-privileged or not) can steal or manipulate all the data that this X server is processing/displaying. Apparently there are some reasons why people want to run Xorg as non-root, but in the case of typical desktop OSes this provides little security benefit (unless you want to run a few X servers with different user accounts, and on different vt's, which most people would never do anyway).


Anonymous said...

Hello Joanna,

you made me think of 8 1/2.
this is an example of a "window manager" in userspace; you get access to a pixmap, a keyboard and a mouse and you can delegate the handling of part of that to another process.

this is done in a "real" os that you can boot on a pc (or at least a virtual machine).

do you know about the security system of plan9? what do you think about it?

Joanna Rutkowska said...


What specific aspect of 8 1/2 did my article remind you of?

Anonymous said...

did you get a chance to talk with David about the somewhat neglected trusted IO/ trusted sprite model in Trusted computing?

Max Tiktin said...

hi, Joanna

sorry if i'm posting to an inappropriate post :) but still..
its the most recent one.

i've read some of the interesting things you've posted, especially security trends (3 of them). seen Qubes.

what do you think about ThinClients as a security concept (encrypted env, ipsec etc)? (security by secure network?)

and what do you think about total separation of environments (physical security)?

there were also concepts of anonymization, have you any thoughts on this? what about anonymizing the internet via anonymization protocols, so that the traditional "traffic" will no longer exist?

with all your mastery with the Rings do you think traditional OS will survive in the Internet ?
maybe Internet OS is better for public use? and good-old-os like windows and linux along with mac should be out of Internet connection :)

any philosophical stuff you provide will be highly appreciated,


Joanna Rutkowska said...


The question about anonymization is off-topic, so skipping it.

Martim said...

Interesting thoughts.

But if the GUI is assumed untrusted, then indeed an attacker can completely fake what the user sees. Think of the implication that this could have for security notifications.

Perhaps the ideal option would be to basically have two GUI subsystems: an untrusted, fully-featured one for general use and a very simple, trusted one (which could be formally verified) which can act as a communication channel between the trusted computing base and the user. The videocard framebuffer could then be partitioned between these two GUIs by having the hypervisor virtualize access to it.

Almost sounds like the "trusted path" concept from the Orange Book...

Anonymous said...

another paper you should have already read:

A Nitpicker’s guide to a minimal-complexity secure GUI:

Joanna Rutkowska said...

Yes, I read about Nitpicker, and I even tried their Live CD, as we have considered its use for Qubes at the very beginning. Unfortunately Nitpicker doesn't really solve any of the problems I described in the article, and has more disadvantages than the current Qubes GUI implementation. Specifically:

1) It doesn't provide a common Window Manager for all VMs -- instead it just lets each VM use its own manager. This results in a bug mess on the desktop, and IMO is really an unacceptable solution in any production-quality desktop environment.

2) It uses its own X-like server (very small though) and its own drivers -- this approach doesn't scale to commercial hardware. Of course, they don't make use of H/W acceleration at all.

3) Even if they finally implemented H/W acceleration for their trusted X-server, still there is a problem of how to *securely* expose this H/W acceleration to all the apps in the VMs (in terms of avoiding exploitable bugs in the backend).

So, all in all, my impression was that it was (is?) more of an academic proof-of-concept project, with few practical applications. Its main advantage is the very small code base, covering not only the GUI backend, but also the X-like server and VESA drivers. Quite impressive in this respect.

Hal Finney said...

I wanted to ask a question on the Qubes post but comments weren't allowed.

Recently there has been discussion of Internet voting. The conventional wisdom is that no mass market software could be secure enough to serve as the foundation of our democracy. What do you think? Could Qubes evolve to where it would be secure enough to serve as a voting client?

And how about attestation to prove that a secure VM is being used, could/should that be part of the solution?

Joanna Rutkowska said...

@Hal: I don't think you need something as complex as Qubes for just voting software. I can very well imagine a one-purpose OS and software stack being distributed on CDs/USBs, and the users would be required to shut down their normal systems and boot from this "voting OS" in order to vote. Because it's a rather rare activity, it's perfectly reasonable to require users to reboot for voting.

Of course, for this to work, a reasonably secure trusted boot technology is needed, and the only one that comes to mind is Intel TXT. But even TXT is not there yet -- we need TXT that would be immune to SMM attacks (has anybody seen an STM yet?), ACPI attacks (is Intel gonna do something about it?), and perhaps with more open SINIT code (would you bet there are no more integer overflows in Intel's code?). But that seems like the way to go in the long run.

Anonymous said...

You could use Xen's multi GPU passthrough support for this. Each VM that needs 3D could get a 3D GPU dedicated to it. The VM driver would use the adapter in the standard way and render whatever it wants to a framebuffer which would then be passed to the GUI VM. This driver may be non-trivial but it may not be more difficult than passing video.

I expect newer CPU/GPUs like AMD's Fusion to support virtualization of the GPU functions directly. Of course, Intel could resurrect their 80-core CPU/GPU too and switch everything to ray-tracing but this seems doubtful right now.

Nicolas Wagrez

Christian said...

I stumbled upon this blog post via Google - so sorry for commenting so late...

Joanna, are you aware of the dissertation of the original author of Nitpicker? Also, the work on Nitpicker kept moving and you may get an impression here.

Besides, I'm curious why you think: [Using the VMs' window managers] results in a bug mess on the desktop.