Intel Software Guard Extensions (SGX)
might very well be The Next Big Thing coming to our industry since
the introduction of Intel VT-d, VT-x, and TXT technologies in the
previous decade. It apparently seems to promise what so far has never
been possible: the ability to create a secure enclave
within a potentially compromised OS. It sounds just too great, so I
decided to take a closer look and share some early thoughts on this
technology.
Intel SGX – secure enclaves within untrusted world!
Intel SGX is an upcoming technology,
and there are very few public documents about it at the moment. In
fact, the only public papers and presentations about SGX can be found
in the agenda of one security workshop that took place some two
months ago.
The three papers from Intel engineers presented there provide a
reasonably good technical introduction to those new processor extensions.
You might think of SGX as a next
generation of Intel TXT, a technology that has never really taken
off, and which has had a long history of security problems disclosed
by a certain team of researchers ;) Intel TXT has also been perhaps the
most misunderstood technology from Intel. In fact, many people
thought of TXT as if it could already provide secure enclaves
within an untrusted OS. This, however, was not really true (even
ignoring our multiple attacks), and I have spoken and written many
times about that in the past years.
It's not clear to me when SGX will make
it to the CPUs that we could buy in local shops around the corner. I
would assume we're talking about 3-5 years from now, because
SGX is not even described in the Intel SDM at this moment.
Intel SGX is essentially a new mode of
execution on the CPU, a new memory protection semantic, plus a couple
of new instructions to manage all this. So, you create an enclave by
filling its protected pages with the desired code, then you lock it down,
measure the code there, and if everything's fine, you ask the
processor to start executing the code inside the enclave. From then
on, no entity, including the kernel (ring 0), the hypervisor (ring
“-1”), SMM (ring “-2”), or AMT (ring “-3”), has any
right to read or write the memory pages belonging to the enclave.
Simple as that!
Why have we had to wait so long for
such technology? Ok, it's not really that simple, because we need
some form of attestation or sealing to make sure that the enclave was
really loaded with good code.
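
To make the life cycle described above a bit more concrete, here is a toy model in Python. It is only a sketch under loud assumptions: the class name, the method names, the 4096-byte page size and the use of SHA-256 as the measurement are all made up for illustration and are not taken from the Intel papers. It shows only the ordering: fill pages, lock down, measure, and refuse to treat the enclave as good unless the measurement matches the expected one.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size for this toy model


class ToyEnclave:
    """Toy model of the enclave life cycle; this is NOT real SGX."""

    def __init__(self):
        self._pages = []
        self._locked = False
        self._hash = hashlib.sha256()  # running "measurement" of added pages

    def add_page(self, data: bytes) -> None:
        """Fill one protected page with code; only allowed before lock-down."""
        if self._locked:
            raise RuntimeError("enclave is locked down; no more pages can be added")
        if len(data) > PAGE_SIZE:
            raise ValueError("page content exceeds PAGE_SIZE")
        page = data.ljust(PAGE_SIZE, b"\x00")  # pad to a full page
        self._pages.append(page)
        self._hash.update(page)

    def lock_down(self) -> str:
        """Freeze the enclave contents and return the final measurement."""
        self._locked = True
        return self._hash.hexdigest()

    def enter(self, expected_measurement: str) -> bool:
        """Report whether the locked enclave matches the expected measurement."""
        if not self._locked:
            raise RuntimeError("enclave must be locked down before entry")
        return self._hash.hexdigest() == expected_measurement


good = ToyEnclave()
good.add_page(b"trusted code")
reference = good.lock_down()

tampered = ToyEnclave()
tampered.add_page(b"trusted code with a backdoor")
tampered.lock_down()

print(good.enter(reference))      # True
print(tampered.enter(reference))  # False
```

In this model the comparison is done by the caller; the point of attestation or sealing is precisely that somebody who is not the untrusted OS gets to make that comparison.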
The cool thing about an SGX enclave is
that it can coexist (and so, co-execute) together with other code,
such as all the untrusted OS code. There is no need to stop or pause the
main OS and boot into a new stub mini-OS, as was the case with TXT
(this is what e.g. Flicker tried to do, and
which was very clumsy). Additionally, there can be multiple enclaves,
mutually distrusting, all executing at the same time.
No more stinkin' TPMs nor BIOSes to trust!
A nice surprise is that the SGX
infrastructure no longer depends on the TPM to do measurements,
sealing and attestation. Instead, Intel has a special enclave that
essentially emulates the TPM. This is a smart move, and doesn't
decrease security in my opinion. It means we now trust only
Intel, instead of trusting Intel plus some-asian-TPM-vendor. While it might
sound like a good idea to spread the trust between two or more
vendors, this only really makes sense if an attacker needs all of
those vendors to fail (an “AND” relation), while in this case
the relation is, unfortunately, of the “OR” type: if the private EK
key gets leaked from the TPM manufacturer, we can bypass any remote
attestation, and we no longer need any failure on Intel's side.
Similarly, if Intel were to have a backdoor in their processors, this
would be enough to sabotage all our security, even if the TPM
manufacturer was decent and played fair.
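
The “AND” vs. “OR” argument can be put into a few lines of arithmetic. This is only an illustration: the failure probabilities below are hypothetical numbers picked for the example, and the two vendors are assumed to fail independently, which is an assumption of this sketch and not a claim about real vendors.

```python
def p_compromise_or(p_intel: float, p_tpm: float) -> float:
    """Attack succeeds if EITHER vendor fails (the TPM-based situation)."""
    return 1 - (1 - p_intel) * (1 - p_tpm)


def p_compromise_and(p_intel: float, p_tpm: float) -> float:
    """Attack succeeds only if BOTH vendors fail (when spreading trust helps)."""
    return p_intel * p_tpm


# Hypothetical, made-up failure probabilities for illustration only.
p_intel = 0.01
p_tpm = 0.01

print(f"OR  relation: {p_compromise_or(p_intel, p_tpm):.4f}")   # 0.0199
print(f"AND relation: {p_compromise_and(p_intel, p_tpm):.4f}")  # 0.0001
print(f"Intel only:   {p_intel:.4f}")                           # 0.0100
```

With an “OR” relation, adding a second vendor makes things worse than trusting Intel alone; only with an “AND” relation does the extra vendor help.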
Because of this, it's generally good
that SGX allows us to shrink the number of entities we need to trust
down to just one: the Intel processor (which these days includes the CPU
cores as well as the memory controller and, often, also a GPU). Just to
remind: today, even with a sophisticated operating system
architecture like the one we use in Qubes OS, which is designed with
decomposition and minimizing trust in mind, we still need to trust
the BIOS and the TPM, in addition to the processor.
And, of course, because the memory of SGX
enclaves is protected against access from any other processor mode, an
SMM backdoor can no longer compromise our protected code (in contrast
to TXT, where SMM can subvert
a TXT-loaded hypervisor), and neither should any other entity, such as the
infamous AMT or a malicious GPU, be able to do that.
So, this is all very good. However...
Secure Input and Output (for Humans)
For any piece of code to be somehow
useful, there must be a secure way to interact with it. In the case of
servers, this could be implemented by e.g. including the SSL endpoint
inside the protected enclave. However, for most applications that run
on a client system, the ability to interact with the user via screen and
keyboard is a must. So, one of the most important questions is how
Intel SGX secures output to the screen from an SGX enclave, as
well as how it ensures that the input the enclave gets is indeed
the input the user intended.
Interestingly, this subject is not very
thoroughly discussed in the Intel papers mentioned above. In fact,
only one paper briefly mentions the Intel Protected Audio Video Path
(PAVP) technology that apparently could be used to provide secured
output to the screen. The paper then references... a consumer FAQ on
Blu-ray Disc Playback using Intel HD graphics. There are no further
technical details, and I was also unable to find any technical
document from Intel about this technology. Additionally, this same
paper admits that, as of now, there is no protected input
technology available, even at the prototype level, although they promise
to work on that in the future.
This might not sound very surprising.
After all, one doesn't need to be a genius to figure out that the main
driving force behind this whole SGX thing is DRM, and
specifically protecting Hollywood media against the pirate industry.
There would be nothing wrong with this in itself, provided that the
technology could also have some other uses that could really
improve the security of the user (in contrast to the security of the
media companies).
We should remember that all the secrets,
keys, tokens, and smart-cards are ultimately there to allow the user to
access some information. And how do people access information? By
viewing it on a computer screen. I know, I know, this is so retro, but
until we have direct PC-brain interfaces, I'm afraid that's the only
way. Without properly securing the graphics output, all the secrets
can ultimately be leaked out.
Also, how do people command their
computers and applications? Well, again using these retro things called
the keyboard and mouse (touchpad). However secure our enclave might be,
without secured input, the app would not be able to distinguish
intended user input from simulated input crafted by malware. Not to
mention such obvious attacks as sniffing of the user input.
Without protected
input and output, SGX might be able to stop the malware from stealing
the user's private keys for email encryption or for issuing bank
transactions, yet the malware will still be able to command this
super-secured software to e.g. decrypt all the user's emails and later
steal the screenshots of all the plaintext messages (with a bit of
simple programming, the screenshots could be turned back into nice
ASCII text to save on bandwidth when leaking them out to a server
in Hong Kong), or better yet, perhaps just forward them to an email
address that the attacker controls (perhaps still encrypted, but
using the attacker's key).
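
The problem described above can be shown with a very small toy model. Everything here is hypothetical: the class, the method name and the key are made up, and an HMAC stands in for whatever private-key operation the real enclave would perform. The sketch only shows that the key never leaves the "enclave" object, and yet the enclave happily serves a request injected by malware, because all it sees is bytes.

```python
import hashlib
import hmac


class ToyMailEnclave:
    """Toy stand-in for an enclave that guards a key; this is NOT real SGX."""

    def __init__(self, key: bytes):
        self._key = key  # in this model, the key never leaves the object

    def authorize(self, request: bytes) -> str:
        # The enclave sees only bytes. Nothing in the request tells it
        # whether a human typed it or malware injected it.
        return hmac.new(self._key, request, hashlib.sha256).hexdigest()


enclave = ToyMailEnclave(key=b"hypothetical-secret-key")

user_request = b"decrypt message 42 and show it to me"
malware_request = b"decrypt ALL messages and forward them to attacker"

user_tag = enclave.authorize(user_request)
malware_tag = enclave.authorize(malware_request)

# Both requests were served with the protected key, although the key
# itself was never exposed to the caller.
print(len(user_tag), len(malware_tag))
```

Protecting the key is therefore necessary but not sufficient: without a protected input path, the enclave acts as a confused deputy for whoever controls the OS.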
But let's ignore
for a moment this “little issue” of the lack of protected input, and
the lack of technical documentation on how secure graphics output is
really implemented. Surely it is thinkable that protected input and
output could be implemented in a number of ways, so let's hope
Intel will do it, and will do it right. We should remember here that
whatever mechanism Intel is going to use to secure the graphics and
audio output, it surely will be an attractive target of attacks, as
there is probably a huge financial incentive for such attacks in the
illegal film copying business.
Securing mainstream client OSes and why this is not so simple?
As mentioned
above, for SGX enclaves to be truly meaningful on client systems we
need protected input and output, to and from the secured enclaves.
Anyway, let's assume for now that Intel has come up with robust
mechanisms to provide these. Let's now consider further how SGX
could be used to turn our current mainstream desktop systems into
reasonably secure bastions.
We start with a
simple scenario: a dedicated application for viewing incoming
encrypted files, say PDFs, performing their decryption and signature
verification, and displaying the final outcome to the user (via a
protected graphics path). The application takes care of all the
key management too. All this happens, of course, inside an SGX
enclave(s).
Now, this all sounds
attractive and surely could be implemented using SGX. But
what if we wanted our secure document viewer to become a bit
more than just a viewer? What if we wanted a secure version of MS
Word or Excel, with its full ability to open complex documents and
edit them?
Well, it's obviously not enough to just put the proverbial
msword.exe into an SGX enclave. It is not, because msword.exe makes
use of a million other things that are provided by the OS and 3rd
party libraries in order to perform all sorts of tasks it is supposed
to do. It is not a straightforward decision to draw a line between
those parts that are security sensitive and those that are not. Is
font parsing security critical? Is drawing proper labels on GUI
buttons and menu lists security critical? Is rendering of various
objects that are part of the (decrypted) document, such as pictures,
security critical? Is spellchecking security critical? Even if the
function of some subsystem seems not to be security critical (i.e. it
does not allow the plaintext document to be easily leaked out of the
enclave), let's not forget that all this 3rd party code would be
interacting very closely with the enclave-contained code. This means
the attack surface exposed to all those untrusted 3rd party modules
will be rather huge. And we already know it is hardly possible to
write a renderer for such complex documents as PDFs, DOCs, XLS, etc.,
without introducing tons of exploitable bugs. And these attacks are
now coming not from potentially malicious documents (against those we
protect, somehow, by parsing only signed documents from trusted
peers), but from the compromised OS.
Perhaps
it would be possible to take Adobe Reader, MS Word, Powerpoint, Excel,
etc., and just rewrite each of those apps from scratch in such a way
that they are properly decomposed into sensitive parts that execute
within SGX enclave(s), and those that are not sensitive and make use of
all the OS-provided functionality, and further define clean and simple
interfaces between those parts, ensuring the “dirty” code cannot
exploit the sensitive code. Somewhat
attractive, but somehow I don't see this happening anytime soon.
But, perhaps, it
would be easier to do something different – just take the whole
msword.exe, all the DLLs it depends on, as well as all the OS
subsystems it depends on, such as the GUI subsystem, and put all of
this into an enclave. This sounds like a more rational approach, and
also more secure.
Only
notice one thing: we just created... a Virtual Machine with
Windows OS inside and the msword.exe that uses this Windows OS.
Sure, it is not a VT-x-based VM, it is an SGX-based VM now, but it is
largely the same animal!
Again,
we have arrived at the reason why the use of VMs is perceived as
such an increase in security (which some people cannot get, claiming
that introducing a VM layer only increases complexity). The use of
VMs is profitable because of one thing: it
packs all the fat library- and OS-exposed APIs and subsystems into
one security domain, reducing all the interfaces between the code in
the VM and the outside world. Reducing the interfaces between two
security domains is ALWAYS desirable.
But our
SGX-isolated VMs have one significant advantage over the other VM
technologies we got used to in the last decade or so, namely that those
VMs can now be impenetrable to any other entity outside of the VM. No
kernel or hypervisor can peek into their memory. Neither can the SMM,
AMT, or even a determined physical attacker with a DRAM emulator,
because SGX automatically encrypts any data that leaves the processor,
so everything that is in the DRAM is encrypted and useless to the
physical attacker.
This is a significant achievement.
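
The idea that data is encrypted whenever it leaves the processor can be illustrated with a toy model. This is a sketch under heavy assumptions and says nothing about how Intel actually implements it: the class and its methods are hypothetical, and an HMAC-SHA256 keystream stands in for a hardware cipher only because the Python standard library has no block cipher. The model also ignores integrity protection and reuses the keystream when the same address is rewritten, both of which a real design would have to handle.

```python
import hashlib
import hmac
import os


class ToyEncryptedDram:
    """Toy model: plaintext exists only 'inside the CPU'; DRAM holds ciphertext."""

    def __init__(self):
        self._cpu_key = os.urandom(32)  # never leaves the "processor"
        self.dram = {}                  # what a physical attacker can observe

    def _keystream(self, address: int, length: int) -> bytes:
        # Derive an address-specific keystream from the CPU-internal key.
        out = b""
        counter = 0
        while len(out) < length:
            block = address.to_bytes(8, "big") + counter.to_bytes(8, "big")
            out += hmac.new(self._cpu_key, block, hashlib.sha256).digest()
            counter += 1
        return out[:length]

    def write(self, address: int, plaintext: bytes) -> None:
        """Encrypt on the way out of the processor and store in DRAM."""
        ks = self._keystream(address, len(plaintext))
        self.dram[address] = bytes(a ^ b for a, b in zip(plaintext, ks))

    def read(self, address: int) -> bytes:
        """Fetch from DRAM and decrypt on the way into the processor."""
        ciphertext = self.dram[address]
        ks = self._keystream(address, len(ciphertext))
        return bytes(a ^ b for a, b in zip(ciphertext, ks))


mem = ToyEncryptedDram()
mem.write(0x1000, b"decrypted document")

print(mem.read(0x1000))  # what code inside the processor sees
print(mem.dram[0x1000])  # what the DRAM emulator sees
```

The attacker with a DRAM emulator gets access only to `mem.dram`, never to `_cpu_key`, which is the whole point.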
Of course SGX, strictly speaking, is not a (full) virtualization
technology, and it's not going to replace VT-x. But remember we don't
always need full virtualization, like VT-x. Often we can use
paravirtualization, and all we need in that case is a good isolation
technology. For example, Xen uses paravirtualization for Linux-based
PV VMs, and uses the good old ring3/ring0 separation mechanism to
implement this, and the level of isolation of such PV domains on Xen
is comparable to the isolation of HVMs, which are virtualized using
VT-x.
To Be Continued
In the next part of this article, we will look into some interesting
unconventional uses of SGX, such as creating malware that cannot be
reverse engineered, or TOR nodes or Bitcoin mixers that should be
reasonably trustworthy, even if we don't trust their operators.
Then we will discuss how SGX might profoundly change the architecture
of future operating systems and virtualization systems, in such a way
that we will no longer need to trust (large portions of) their
kernels or hypervisors, or system admins (Anti Snowden Protection?).
And, of course, how our Qubes OS might embrace this technology in the
future.
Finally, we should
discuss the important issue of whether this whole SGX, while
providing many great benefits for system architects, should really be
blindly trusted. What are the chances of Intel building in backdoors
there and exposing those to the NSA? Is there any difference between
trusting Intel processors today and trusting SGX as the basis of the
security model of all software in the future?