Monday, March 26, 2007

The Game Is Over!

People often say that once an attacker gets access to the kernel, the game is over! That’s true indeed these days, and most of the research I have done over the past two years or so has been about proving just that. Some people, however, go a bit further and say that there is therefore no point in researching ways to detect system compromises: once an attacker has gotten in, you should simply assume everything has been compromised and replace all the components, i.e. buy a new machine (as the attacker might have modified the BIOS or re-flashed PCI EEPROMs), reinstall the OS, all applications, and so on.



However, they miss one little detail: how can they actually know that the attacker got access to the system, that the game is indeed over, and that it is time to reinstall right now?

Well, we simply assume that the attacker had to make some mistake and that, sooner or later, we will find out. But what if she didn’t make a mistake?

There are several schools of thought, though, on how this problem should be addressed in a more general and elegant way. Most of them are based on a proactive approach. Let’s have a quick look at them…
  1. One generic solution is to build prevention technology into the OS. That includes all the anti-exploitation mechanisms, like e.g. ASLR, non-executable memory, Stack Guard/GS, and others, as well as some design changes to the OS, like e.g. an implementation of the least-privilege principle (think e.g. UAC in Vista) and some sort of kernel protection (e.g. securelevel in BSD, grsecurity on Linux, signed drivers in Vista, etc.).

    This has undoubtedly been the most popular approach over the last couple of years, and it has recently become even more popular as Microsoft implemented most of those techniques in Vista.

    However, everybody who has followed security research for at least several years should know that all those clever mechanisms have been bypassed at least once in their history. That includes attacks against the Stack Guard protection presented back in 2000 by Bulba and Kil3r, several ways to bypass PaX ASLR, like those described by Nergal in 2001 and by others several months later, as well as the exploitation of the privilege elevation bug in PaX discovered by its author in 2005. Microsoft's hardware DEP (a.k.a. NX) was likewise demonstrated to be bypassable by skape and Skywing in 2005.

    Similarly, kernel protection mechanisms have also been bypassed over the past years, starting e.g. with the nice attack against grsecurity's /dev/(k)mem protection presented by Guillaume Pelat in 2002. In 2006 Loic Duflot demonstrated that BSD's famous securelevel mechanism can also be bypassed. And, also last year, I showed that Vista x64 kernel protection is not foolproof either.

    The point is: all those hardening techniques are designed to make exploitation harder, or to limit the damage after a successful exploitation, not to be 100% foolproof. On the other hand, it must be said that they probably represent the best prevention solutions available to us these days.

  2. Another approach is to dramatically redesign the whole OS in such a way that all components (like e.g. drivers and services) are compartmentalized, e.g. run as separate processes in usermode, and consequently are isolated not only from each other but also from the OS kernel (a microkernel design). The idea here is that the most critical component, i.e. the microkernel, is very small and can be easily verified. An example of such an OS is Minix3, which is still under development though.

    Undoubtedly this is a very good approach to minimizing the impact of system or driver faults, but it does not protect us against malicious system compromises. After all, if an attacker exploits a bug in a web browser, she may only be interested in modifying the browser’s code. Sure, she probably would not be able to get access to the microkernel, but why would she really need to?

    Imagine, for example, the following common scenario: many online banking systems require users to use smart cards to sign all transaction requests (e.g. money transfers). This usually works by having the browser (more specifically an ActiveX control or a Firefox plugin) display a message to the user saying that he or she is about to make e.g. a wire transfer to a given account number for a given amount of money. If the user confirms that action by pressing an ‘Accept’ button, the browser sends the message to the smart card for signing. The message itself is usually just some kind of formatted text specifying the source and destination account numbers, the amount of money, a date and time stamp, etc. The user is then asked to insert the smart card, which contains his or her private key (issued by the bank), and to enter the PIN code. The latter can be done either via the same browser applet or, in slightly more secure implementations, via the smart card reader itself, if it has a pad for entering PINs.

    Obviously the point here is that malware should not be able to forge the digital signature: only the legitimate user has access to the smart card and knows the card’s PIN, so nobody else should be able to sign that message with the user’s key.

    However, it’s enough for the attacker to replace the message while it’s being sent to the card, while displaying the original message in the browser’s window. All of this can be done by just modifying (“hooking”) the browser’s in-memory code and/or data. No need for kernel malware, yet the system (the browser, more specifically) is compromised! (A minimal sketch of this attack appears right after this list.)


    Still, one good thing about such a system design is that if we don’t allow an attacker to compromise the microkernel then, at least in theory, we can write a detector capable of finding out that some (malicious) changes to the browser’s memory have indeed been introduced. In practice, however, we would have to know exactly what the browser’s memory should look like, e.g. which function pointers in Firefox’s code should be verified, in order to find out whether such a compromise has occurred. Unfortunately, we cannot do that today.

  3. An alternative approach to the above two, which does not require any dramatic changes to the OS, is to make use of so-called sound static code analyzers to verify all sensitive code in the OS and applications. The soundness property assures that the analyzer has been mathematically proven not to miss even a single potential run-time error, which includes e.g. unintentional execution flow modifications. The catch is that soundness does not mean the analyzer generates no false positives. It is actually mathematically proven that we cannot have such an ideal tool (i.e. one with a zero false positive rate), as the problem of analyzing all possible program execution paths is undecidable. Thus, practical analyzers always consider some superset of all possible execution flows, one which is easy to compute yet may introduce some false alarms, and the whole trick is to choose that superset so that the number of false positives is minimal (a small example of this trade-off appears below).

    ASTREE is an example of a sound static code analyzer for the C language (although it doesn’t support programs that make use of dynamic memory allocation), and it has apparently been used to verify the primary flight control software for the Airbus A340 and A380. Unfortunately, there doesn’t seem to be any publicly available sound static analyzer for binary code… (if anybody knows of any, you’re more than welcome to paste the links under this post – just please make sure you’re referring to sound analyzers).

    If we had such a sound and precise (i.e. with a minimal rate of false alarms) binary static code analyzer, it could be a big breakthrough in OS and application security.

    We could imagine, for example, a special authority for signing device drivers for various OSes: it would first perform such formal static validation on submitted drivers and, once they passed the test, would digitally sign them. The OS kernel itself would be validated by its vendor and would accept only those drivers signed by the driver verification authority. The authority could be the OS vendor itself or a separate third-party organization. Additionally, we could require that the code of all security-critical applications, like e.g. the web browser, also be signed by such an authority, and set a special policy in the OS to allow e.g. only signed applications to access the network.

    The one weak point here is that if the private key used by the certification authority gets compromised, then the game is over and nobody really knows it… For this reason it would be good to have more than one certification authority and to require that each driver/application be signed by at least two independent authorities.
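
To make the hooking attack from point 2 concrete, here is a minimal sketch in C. Every function name below (display_to_user, card_sign, etc.) is a hypothetical stand-in for the real browser plugin and smart card APIs; the only point is that the buffer the user approves and the buffer the card signs live in the same writable process memory, so an in-process hook between the two steps defeats the whole scheme.

    #include <stdio.h>

    /* Hypothetical stand-ins for the banking plugin and smart card APIs. */
    static void display_to_user(const char *msg)  { printf("CONFIRM: %s\n", msg); }
    static int  user_pressed_accept(void)         { return 1; /* user clicks Accept */ }
    static void card_sign(const char *msg, char *sig, size_t siglen)
    {
        snprintf(sig, siglen, "SIG(%s)", msg);    /* pretend signature */
    }

    static int sign_transfer(const char *dst_account, const char *amount)
    {
        char msg[256], sig[300];

        /* Step 1: build the human-readable transaction message. */
        snprintf(msg, sizeof msg, "transfer %s to account %s", amount, dst_account);

        /* Step 2: show it to the user and wait for confirmation. */
        display_to_user(msg);
        if (!user_pressed_accept())
            return -1;

        /* Step 3: hand the message to the card for signing. The window
         * between step 2 and step 3 is the attack surface: a usermode hook
         * inside the browser process can overwrite msg here (e.g. swap in
         * the attacker's account number), and the card will happily return
         * a valid signature over the forged message. No kernel access is
         * needed at any point. */
        card_sign(msg, sig, sizeof sig);
        printf("signed: %s\n", sig);
        return 0;
    }

    int main(void) { return sign_transfer("12345678", "100 EUR"); }

Note that nothing in the sketch ever touches ring 0, which is exactly why a microkernel design alone does not save us here.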

From the above three approaches, only the last one can guarantee that our system will never get compromised. The only problem is that… there are no tools today for static binary code analysis that have been proven to be sound and are also precise enough to be used in practice…
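
To illustrate the soundness trade-off from point 3, consider the tiny, made-up C fragment below. On the only feasible path through the division, d is provably non-zero, but an analyzer that over-approximates by merging the two branches sees “d may be 0 here” and, being sound, must raise an alarm:

    int f(int x)
    {
        int d = 0;
        if (x > 0)
            d = x;          /* d is non-zero exactly when x > 0 */
        if (x > 0)
            return 100 / d; /* safe: this path implies d == x > 0, yet a
                             * path-insensitive superset of the execution
                             * flows also includes d == 0 here */
        return 0;
    }

Choosing a tighter superset (e.g. tracking the correlation between the two conditions) removes this particular false alarm, but makes the analysis more expensive; that tension is the whole game.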

So today, as far as proactive solutions are concerned, we’re left only with approaches #1 and #2, which, as discussed above, cannot protect the OS and applications from compromises 100% of the time. And, to make things worse, they do not offer any clue as to whether a compromise has actually occurred.

That’s why I’m trying so hard to promote the idea of Verifiable Operating Systems, which should allow us to at least find out, in a systematic way, whether the system in question has been compromised or not (though, unfortunately, not whether a single-shot incident has occurred). The point is that the number of required design changes should be fairly small. There are some problems with this too, like e.g. verifying JIT-like code, but hopefully they can be solved in the near future. Expect me to write more on this topic soon.
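
To give a rough feeling for what such systematic verification might look like, here is a minimal sketch, assuming we somehow had a trusted baseline hash for every executable region of a process. All the names below are made up, and obtaining such baselines for legitimately dynamic regions (JIT-generated code included) is precisely the open problem mentioned above:

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical baseline entry: a known-good measurement of one
     * executable region, derived from the signed on-disk binary. */
    struct region_baseline {
        const char    *name;       /* e.g. "browser .text section" */
        const uint8_t *base;       /* where the region lives in memory */
        size_t         len;
        uint64_t       good_hash;  /* hash taken at install time */
    };

    /* FNV-1a, standing in for a real cryptographic hash; anything
     * serious would use SHA-256, as FNV is trivially forgeable. */
    static uint64_t fnv1a(const uint8_t *p, size_t n)
    {
        uint64_t h = 0xcbf29ce484222325ULL;
        while (n--) { h ^= *p++; h *= 0x100000001b3ULL; }
        return h;
    }

    /* Re-measure every region and report those that no longer match
     * the baseline: each mismatch is a candidate hook or patch. */
    static int verify_regions(const struct region_baseline *rb, size_t n)
    {
        int mismatches = 0;
        for (size_t i = 0; i < n; i++) {
            if (fnv1a(rb[i].base, rb[i].len) != rb[i].good_hash) {
                printf("modified region: %s\n", rb[i].name);
                mismatches++;
            }
        }
        return mismatches;
    }

    int main(void)
    {
        static const uint8_t code[] = { 0x90, 0x90, 0xc3 };  /* toy "region" */
        struct region_baseline rb = { "toy region", code, sizeof code, 0 };
        rb.good_hash = fnv1a(code, sizeof code);             /* take baseline */
        printf("mismatches: %d\n", verify_regions(&rb, 1));  /* expect 0 */
        return 0;
    }

The detector itself is trivial; the hard part, and what a Verifiable OS would have to provide, is a trustworthy way to obtain the baselines and to read memory without being lied to.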

Special thanks to Halvar Flake for eye-opening discussions about sound code analyzers and OS security in general.

14 comments:

Anonymous said...

And what about MS Singularity, which is 90% managed code, and where each driver and program gets compiled at install time and verified?

Anonymous said...

I found your description of "sound static analysis" interesting. I'm the CTO of Klocwork, a company that provides static analysis tools, and have experienced both sound and pragmatic approaches to this problem.

The challenge with provable scenarios is execution time and general scalability. Proving correctness for an embedded system of 50K LOC is a very different proposition than for a 1M+ LOC modern OS.

Pragmatic approaches, while considering a considerably larger superset of execution paths (by definition), and therefore having the typical precision/coverage challenges, do provide reasonable and valid guidance to a human observer, as opposed to attempting to bypass that observer by producing the "right answer" at all times.

Given that this domain is subject to the halting problem, I would question the validity of any approach to system validation that didn't include an expert observer, frankly.

Finally, whilst one day (rose tinted moment) we will hopefully be able to sit back and ask our compilers not only to tell us where we made syntax mistakes but also where we are subject to security vulnerabilities and design flaws, in the meantime I can't help thinking that any solution (sound or pragmatic) is better than nothing.

Binary validation is another matter entirely and strikes me as somebody looking to cure symptoms rather than root causes (but then, I would think that, wouldn't I ;p).

Anonymous said...

Maybe some day a quantum algorithm will be able to do it right and fast.
What about typed assembly language? Or any language designed directly for OS and Driver development other than C?

Unknown said...

You might want to take a look at Secure64. (http://www.secure64.com) It has a secure operating system that offers compartmentalization and protection ID. It's the only operating system I know of that not only works, but can also support root trust authentication for the entire code base and subsequent applications.

Peter

denis bider said...

I second the comment by David Karnok. Binary verification can be achieved by replacing the x86/x64 instruction set with one that is more amenable to security analysis. Then only the compiler from intermediate binary to machine binary format, and a minimal run-time component, need to be trusted.

Generally though, security analysis can never be fully automated because it can only ever detect accidental faults, not essential faults. Accidental faults are those where an algorithm is misimplemented; essential faults are those where the algorithm itself is wrong. We may eventually get to the point where we can code exactly what we mean and thus eliminate all accidental faults, but we will never eliminate all the possible faults in our thinking.

We can systematically remove a large class of faults, but another large class of faults will remain.

Cd-MaN said...

When you are "rooted", it's pretty much game over because:

-all the available "scanners" can be bypassed (because there is no inherent guarantee - like the Ring0 / Ring3 divide - provided by an external party - like the OS or the hardware)

-the people who would be able to do a competent security analysis are few and far between (I don't mean this as an insult, I just want to state the fact that a very low percentage of security professionals have the skills to do reverse engineering at the level required to trace kernel malware).

The only option for small companies and home users is to wipe / reinstall / patch / pray, and only large corporations can afford to hire a skilled third party (but most probably they don't do it very often).

As for security measures like ASLR or stack canaries: I agree that all of them can be bypassed, but the question is how much of a reduction in exploitation probability they provide. For example, if I have an exploit which is 95% effective against systems with none of these protections and only 5% effective against systems with these protections activated, would you use them? Of course you would! In the end (for businesses and most people) security is not a goal in itself, it is part of the everyday risk, and a solution which reduces a particular risk almost 20-fold is a very good one!

Joanna Rutkowska said...

Dear cdman83,

in order to be able to, as you said, "wipe/reinstall/patch" or even to "hire a skilled third party [to investigate the incident]", you need to first know that something has happened and that you need to take some action. But without detection we simply cannot know that something wrong has happened, so it's a vicious circle!

Or, maybe, we should start relying on probability, as you suggest, and just say that e.g. "this server is compromised with a probability of p = 1 - 0.95^n", where n would be some magic parameter symbolizing the number of exploitation attempts (following your model). And then, of course, we would assume that once p is greater than some threshold value we need to "reinstall"... I even have a sexy name for this: "quantum security"! :)

Anonymous said...

Joanna,

Better than setting up some anticipated intrusion algorithm, why not just wipe, clean, reinstall every Nth moment in RAM...

who needs a "bloody" hard drive anyway...

Personally, I side with the initial premise
"game is never over"

good work

Cd-MaN said...

As somebody living in the real world (and I mean this not as an offense, just wanting to state that I take a more hands-on approach while you take a more research-oriented approach, which could cause our opinions to differ), I want to say that 95% of all malware out there is very primitive (in the sense that it's all usermode code, sometimes with big glaring mistakes - like assuming that Windows is installed in C:\Windows and not calling GetWindowsDirectory :)), another 4.99% starts to get smart (but still relies on techniques which have been known for a - relatively - long time) and less than 0.01% is doing something innovative.

Now admittedly it may be the case that we don't see this "uber-malware" because it hides itself so well, but I still stand by my opinion that there is a very low risk associated with it.

I said it and I'll say it again: there is no such thing as "perfect security". Security is a process. A numbers game. Can you be 100% sure that your machine wasn't rooted? No. But you can get pretty close to it by using layered security. And pretty close is good enough for most people. Or to state it another way: as long as a huge percentage of users / companies use security practices so lax that malware like the example mentioned above (which assumes that Windows is installed in c:\Windows) succeeds, the people / companies who have allocated adequate resources to security can sleep well, because the "bad guys" will go after the easy targets first. There is no reason to lose sleep over theoretical attacks, imho.

PS. I would like to state again that your work is very interesting and this comment isn't meant to be dismissive of it. I just want to clarify that in "the real world" the current security measures (while far from perfect) are good enough - I just wish that everybody would apply them.

Anonymous said...

interesting post. the freebsd project had a google summer-of-code project in either 2005 or 2006 called "securemines", meant to implement something that resembles what you're talking about (a way to know when a foreign intruder is messing with the system).

I don't think it ever got written, but the evolution of it was to create a framework for writing intrusion detection modules.

perhaps it's time to look into it? ;)

Anonymous said...

It seems one of the recurring problems in detecting any kind of compromise is:
How can a BIOS / kernel / OS detect its own compromise?

The answer to this question is, in my opinion, very similar to the answer to the following question:
How can a fool detect his own foolishness?

Short answer: he can't.

Looking at this problem as a hierarchical structure, I think there can be only one 100% tight solution.

A hierarchical structure which consists of a flawless static top (comparable to a company structure that has a utopian flawless director with utopian flawless managers who don't depend on anything they control nor share the same resources).

This way the board should be able to detect any kind of flaw.

The question in that case is: what is the approach when a flaw is detected and thus the system is potentially compromised?

Reboot??
(I wouldn't install Windows in that case)

Q said...

I can guarantee 100% that my system is hack-proof.

It is not accessible to hackers.

With that in mind, I can almost as well guarantee the same even if my system were connected to the Internet, behind at least two firewalls/IPS, and all unnecessary services turned off. Why should I worry about theoretical rootkit installations, when undoubtedly you'd need local access? Or maybe you would like me to install something for you first?

Ancient said...

Q: "I can guarantee 100% that my system is hack-proof. It is not accessible to hackers."

I see. You obviously have some restricted definition of the term 'hacker' which precludes defeating alarm systems, tampering with locks and other B&E. How about TEMPEST? Perhaps you should place said system in a Faraday cage or, perhaps even better, just secure-format it and remove the target completely.

The truth is that if you have a USABLE system, it IS crackable... and this should not be restricted to evaluating only remote intrusions or standard exploit methodologies as you understand them.

But let's accept your definition. Do you stay up to date with your patch levels? Yes? On CD from the manufacturer? In which case I have an attack vector. Using automatic updates? If so, I have an attack vector.

If you would care to make a sizeable wager on the security of your system I'd welcome the opportunity to take you up on it. After all, even an unconnected system in a room is only as secure as the room itself.

And, if there are passcodes that exist SOLELY in your head, well, I guarantee that for the right fee I'd get those too... entered or not. *IF* you allow me, the attacker, carte blanche as in a real scenario. However, you may not like that.

Q: "With that in mind, I can almost as well guarantee the same even if my system were connected to the Internet, behind at least two firewalls/IPS, and all unnecessary services turned off. Why should I worry about theoretical rootkit installations, when undoubtedly you'd need local access? Or maybe you would like me to install something for you first?"

Think OUTSIDE the box, Q. You're making some pretty sweeping assumptions there. Firewalls are not the key issue here, nor is IDS. Much of your traffic is tamperable, and much of it the FW is set up to allow.

An upstream transparent node placed at a telco connection point, for example, can often absolutely wreck the most rigid security schemas. These exist even for modern DMT based comms and cover everything from ISDN to FrameRelay and xDSL to SONET.

Such devices can often turn their victims' strengths against them - particularly when it comes to rigorous patch levels.

I have a funny feeling that if I were to challenge you on this at a convention you'd lay down many restrictions such as 'must be a remote exploit' or 'cannot place hardware upstream'... or perhaps insist on strict time limits or other arbitrary obstacles, such as considering the installation location inviolable. Of course... all things that YOU DO NOT CONTROL in the real world.

But... if you were to offer a 100,000 prize for cracking your box at a 7-day security conference, and you placed that box in a secure room with a modern alarm system and PIRs, well, I can pretty much guarantee the data will come off. Connected or not.

And if it's encrypted? Well... you'd perhaps be required to regularly access the data over the course of the conference. You know, just to give a fighting chance and demonstrate that we are talking about a system which you are unafraid to access.

But for a cool 100,000 and 7 days I can see enthusiastic attempts to place radio cameras in your PIRs, serial bugs in your keyboard and even some overly enthusiastic profiling of your keypress audio.

If online I'd expect your uplink to get spliced too - watching for common apps and OS components performing auto-update checks and subverting them.

Seriously, if you want to posit the security of a system NOT connected to the internet then you'd have to allow the attacker carte-blanche.

But, if the price is right, I guarantee you'll lose that wager.

I've performed full-blackhat pen-tests before and I can safely say that no system I've seen has ever come off as adequately secure.

Indeed, when working corporate espionage jobs there are SO many alternative avenues of attack that the idea of securing a system by unplugging it (PTP Security) is just nonsense.

Anonymous said...

Of course, being behind two firewalls/IDS kinda means you ARE using the network. That being the case, you're vulnerable - period!

So, forget about services... turn 'em all off. They actually carry more rigorous security than clients, which can be a far simpler vector.

So, a client exploit and some escalation code and you CAN get rootkitted.

The poster who says his system is secure has a very simplistic view of security and is making way too many assumptions.