Intel Software Guard Extensions (SGX) might very well be The Next Big Thing coming to our industry, since the introduction of Intel VT-d, VT-x, and TXT technologies in the previous decade. It apparently seem to promise what so far has never been possible – an ability to create a secure enclave within a potentially compromised OS. It sounds just too great, so I decided to take a closer look and share some early thoughts on this technology.
Intel SGX – secure enclaves within untrusted world!
Intel SGX is an upcoming technology, and there is very little public documents about it at the moment. In fact the only public papers and presentations about SGX can be found in the agenda of one security workshop that took place some two months ago. The three papers from Intel engineers presented there provide a reasonably good technical introduction to those new processor extensions.
You might think about SGX as of a next generation of Intel TXT – a technology that has never really took off, and which has had a long history of security problems disclosed by certain team of researchers ;) Intel TXT has also been perhaps the most misunderstood technology from Intel – in fact many people thought about TXT as if it already could provide security enclaves within untrusted OS – this however was not really true (even ignoring for our multiple attacks) and I have spoke and wrote many times about that in the past years.
It's not clear to me when SGX will make it to the CPUs that we could buy in local shops around the corner. I would be assuming we're talking about 3-5 years from now, because the SGX is not even described in the Intel SDM at this moment.
Intel SGX is essentially a new mode of execution on the CPU, a new memory protection semantic, plus a couple of new instructions to manage this all. So, you create an enclave by filling its protected pages with desired code, then you lock it down, measure the code there, and if everything's fine, you ask the processor to start executing the code inside the enclave. Since now on, no entity, including the kernel (ring 0) or hypervisor (ring “-1”), or SMM (ring “-2”) or AMT (ring “-3”), has no right to read nor write the memory pages belonging to the enclave. Simple as that!
Why have we had to wait so long for such technology? Ok, it's not really that simple, because we need some form of attestation or sealing to make sure that the enclave was really loaded with good code.
The cool thing about an SGX enclave is that it can coexist (and so, co-execute) together with other code, such all the untrusted OS code. There is no need to stop or pause the main OS, and boot into a new stub mini-OS, like it was with the TXT (this is what e.g. Flicker tried to do, and which was very clumsy). Additionally, there can be multiple enclaves, mutually untrusted, all executing at the same time.
No more stinkin' TPMs nor BIOSes to trust!
A nice surprise is that SGX infrastructure no longer depends on the TPM to do measurements, sealing and attestation. Instead Intel has a special enclave that essentially emulates the TPM. This is a smart move, and doesn't decrease security in my opinion. It surely makes us now trust only Intel vs. trusting Intel plus some-asian-TPM-vendor. While it might sound like a good idea to spread the trust between two or more vendors, this only really makes sense if the relation between trusting those vendors is expressed as “AND”, while in this case the relation is, unfortunately of “OR” type – if the private EK key gets leaked from the TPM manufacture, we can bypass any remote attestation, and no longer we need any failure on the Intel's side. Similarly, if Intel was to have a backdoor in their processors, this would be just enough to sabotage all our security, even if the TPM manufacture was decent and played fair.
Because of this, it's generally good that SGX allows us to shrink the number of entities we need to trust down to just one: Intel processor (which, these days include the CPUs as well as the memory controller, and, often, also a GPU). Just to remind – today, even with a sophisticated operating system architecture like those we use in Qubes OS, which is designed with decomposition and minimizing trust in mind, we still need to trust the BIOS and the TPM, in addition to the processor.
And, of course, because SGX enclaves memories are protected against any other processor mode's access, so SMM backdoor no longer can compromise our protected code (in contrast to TXT, where SMM can subvert a TXT-loaded hypervisor), nor any other entity, such as the infamous AMT, or malicious GPU, should be able to do that.
So, this is all very good. However...
Secure Input and Output (for Humans)
For any piece of code to be somehow useful, there must be a secure way to interact with it. In case of servers, this could be implemented by e.g. including the SSL endpoint inside the protected enclave. However for most applications that run on a client system, ability to interact with the user via screen and keyboard is a must. So, one of the most important questions is how does Intel SGX secures output to the screen from an SGX enclave, as well as how does it ensure that the input the enclave gets is indeed the input the user intended?
Interestingly, this subject is not very thoroughly discussed in the Intel papers mentioned above. In fact only one paper briefly mentions Intel Protected Audio Video Path (PVAP) technology that apparently could be used to provide secured output to the screen. The paper then references... a consumer FAQ onBlueRay Disc Playback using Intel HD graphics. There is no further technical details and I was also unable to find any technical document from Intel about this technology. Additionally this same paper admits that, as of now, there is no protected input technology available, even on prototype level, although they promise to work on that in the future.
This might not sound very surprising – after all one doesn't need to be a genius to figure out that the main driving force behind this whole SGX thing is the DRM, and specifically protecting Holywwod media against the pirate industry. This would be nothing wrong in itself, assuming, however, the technology could also have some other usages, that could really improve security of the user (in contrast to the security of the media companies).
We shall remember that all the secrets, keys, tokens, and smart-cards, are ultimately to allow the user to access some information. And how does people access information? By viewing in on a computer screen. I know, I know, this so retro, but until we have direct PC-brain interfaces, I'm afraid that's the only way. Without properly securing the graphics output, all the secrets can be ultimately leaked out.
Also, how people command their computers and applications? Well, again using this retro thing called keyboard and mouse (touchpad). However secure our enclave might be, without secured input, the app would not be able to distinguish intended user input from simulated input crafted by malware. Not to mention about such obvious attacks as sniffing of the user input.
Without protected input and output, SGX might be able to stop the malware from stealing the user's private keys for email encryption or issuing bank transactions, yet the malware will still be able to command this super-secured software to e.g. decrypt all the user emails and later steal the screenshots of all the plaintext messages (with a bit of simple programming, the screenshot's could be turned back into nice ASCII text for saving on bandwidth when leaking them out to a server in Hong Kong), or better yet, perhaps just forward them to an email address that the attacker controls (perhaps still encrypted, but using the attackers key).
But, let's ignore for a moment this “little issue” of lack of protected input, and lack of technical documentation on how secure graphics output is really implemented. Surely it is thinkable that protected input and output could be implemented in a number of ways, and so let's hope Intel will do it, and will do right. We should remember here, that whatever mechanism Intel is going to use to secure the graphics and audio output, it surely will be an attractive target of attacks, as there is probably a huge money incentive for such attacks in the film illegal copying business.
Securing mainstream client OSes and why this is not so simple?
As mentioned above, for SGX enclaves to be truly meaningful on client systems we need protected input and output, to and from the secured enclaves. Anyway, lets assume for now that Intel has come up with robust mechanisms to provide these. Let's now consider further, how SGX could be used to turn our current mainstream desktop systems into reasonably secure bastions.
We start with a simple scenario – a dedicated application for viewing of incoming encrypted files, say PDFs, performing their decryption and signature verification., and displaying of the final outcome to the user (via protected graphics path). The application takes care about all the key management too. All this happens, of coruse, inside an SGX enclave(s).
Now, this sounds all attractive and surely could be implemented using the SGX. But what about if we wanted our secure document viewer to become a bit more than just a viewer? What if we wanted a secure version of MS Word or Excel, with its full ability to open complex documents and edit them?
Well it's obviously not enough to just put the proverbial msword.exe into a SGX enclave. It is not, because the msword.exe makes use of million of other things that are provided by the OS and 3rd libraries, in order to perform all sorts of tasks it is supposed to do. It is not a straightforward decision to draw a line between those parts that are security sensitive and those that are not. Is font parsing security critical? Is drawing proper labels on GUI buttons and menu lists security critical? Is rendering of various objects that are part of the (decrypted) document, such as pictures, security critical? Is spellchecking security critical? Even if the function of some of a subsystem seem not security critical (i.e. not allows to easily leak the plaintext document out of the enclave), let's not forget that all this 3rd party code would be interacting very closely with the enclave-contained code. This means the attack surface exposed to all those untrusted 3rd party modules will be rather huge. And we already know it is rather not possible to write a renderer for such complex documents as PDFs, DOCs, XLS, etc, without introducing tons of exploitable bugs. And these attack are not coming now from the potentially malicious documents (against those we protect, somehow, by parsing only signed document from trusted peers), but are coming from the compromised OS.
Perhaps it would be possible to take Adobe Reader, MS Word, Powerpoint, Excel etc, and just rewrite every of those apps from scratch in a way that they were properly decomposed into sensitive parts that execute within SGC enclave(s), and those that are not-sensitive and make use of all the OS-provided functionality, and further define clean and simple interfaces between those parts, ensuring the “dirty” code cannot exploit the sensitive code. Somehow attractive, but somehow I don't see this happening anytime soon.
But, perhaps, it would be easier to do something different – just take the whole msword.exe, all the DLLs it depends on, as well as all the OS subsystems it depends on, such as the GUI subsystem, and put all of this into an enclave. This sounds like a more rational approach, and also more secure.
Only notice one thing – we just created... a Virtual Machine with Windows OS inside and the msword.exe that uses this Windows OS.. Sure, it is not a VT-x-based VM, it is an SGX-based VM now, but it is largely the same animal!
Again, we came to the conclusion why the use of VMs is suddenly perceived as such an increase in security (which some people cannot get, claiming that introducing VM-layer only increases complexity) – the use of VMs is profitable because of one of thing: it suddenly packs all the fat libraries- and OS-exposed APIs and subsystems into one security domain, reducing all the interfaces between this code in the VM and the outside world. Reducing of the interfaces between two security domains is ALWAYS desirable.
But our SGX-isolated VMs have one significant advantage over the other VM technologies we got used to in the last decade or so – namely those VMs can now be impenetrable to any other entity outside of the VM. No kernel or hypervisor can peek into its memory. Neither can the SMM, AMT, or even a determined physical attacker with DRAM emulator, because SGX automatically encrypts any data that leave the processor, so everything that is in the DRAM is encrypted and useless to the physical attacker.
This is a significant achievement. Of course SGX, strictly speaking, is not a (full) virtualization technology, it's not going to replace VT-x.. But remember we don't always need full virtualization, like VT-x, often we can use paravirtualization and all we need in that case is a good isolation technology. For examaple, Xen uses paravirtualization for Linux-based PV VMs, and uses good-old ring3/ring0 separation mechanism to implement this, and the level of isolation of such PV domains on Xen is comparable to the isolation of HVMs, which are virtualized using VT-x.
To Be Continued
In the next part of this article, we will look into some interesting unconventional uses of SGX, such as creating malware that cannot be reversed engineered, or TOR nodes or Bitcoin mixers that should be reasonably trusted, even if we don't trust their operators. Then we will discuss how SGX might profoundly change the architecture of the future operating systems, and virtualization systems, in a way that we will no longer need to trust (large portions of) their kernels or hypervisors, or system admins (Anti Snowden Protection?) And, of course, how our Qubes OS might embrace this technology in the future.
Finally, we should discuss the important issue of whether this whole SGX, while providing many great benefits for system architects, should really be blindly trusted? What are the chances of Intel building in backdoors there and exposing those to the NSA? Is there any difference in trusting Intel processors today vs. trusting the SGX as a basis of security model of all software in the future?