Saturday, May 28, 2011

(Un)Trusting the Cloud

Everybody loves The Cloud these days, and it is not hard to understand why. When every person owns several computers (devices), the cloud is really hard to beat when it comes to syncing your whole digital life back and forth between all those devices, and also sharing it with your family members, friends, and colleagues at work: from task lists, through calendars, through health & fitness data, to work-related documents. And I'm not even mentioning all the unencrypted email that is out there.

One doesn't need to be especially smart or security conscious to realize how much this might be a threat to security and privacy. How much easier would it be to attack somebody's laptop if I knew precisely in which hotel and when he or she is planning to stay? How much more expensive would my health and life insurance be, if they could get a look at my health and fitness progress? Etc.

But we're willing to sacrifice our privacy and security in exchange for ease of syncing and sharing of our data. We decide to trust The Cloud. What specifically does that mean?

First, it means we trust the particular cloud-based service vendor, such as the provider of our training monitoring app and service. We trust that this vendor is: 1) non-malicious and ethical, and so is not going to sell our private data to some other entity, e.g. an insurance company, and 2) that the software written by this vendor is reasonably secure, so it would not be easy for an attacker to break into their cloud service and download all the users' data (and then sell it to health insurance companies).

Next, we trust the cloud infrastructure provider, such as Amazon EC2. We trust that the cloud provider is 1) non-malicious and ethical, and that they won't really read the memory of the virtual machine on which the previously mentioned cloud service is running (and won't make it available to local government officials, e.g. in China), and 2) that they secured their infrastructure properly (e.g. it wouldn't be easy for one customer to “escape” from a VM and read all the memory of the VMs belonging to other customers).

Finally, we trust that all the infrastructure that sits between us and the service provider, such as the networking protocols, is safe to use (e.g. we trust that none of the engineers working at any of the ISPs we use will sniff or spoof our communication, e.g. by using some fake or quasi-fake SSL certs).

So, that's a hell of a lot of trusting! And the stakes are high. Do we really need to make such a sacrifice? Do we really need to hand over all our private data to all those organizations? Of course we don't!

First, notice that in the majority of cases, the cloud is basically used only as on-line storage. No processing, just dumb storage. Indeed, what kind of server-side processing does your task list or calendar require? Or your freestyle swimming results? Or your conference slides? None.

And we have long known how to safely keep secrets on untrusted storage, haven't we? This is achieved via encryption (and digital signatures for integrity/authenticity). So, the idea is very simple: let's encrypt all the data before we send them to the cloud. The point here is that the encryption must be done by the app that is running on our client device. Not in the cloud, of course.
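A minimal sketch of what "encrypt on the client before upload" could look like, using only the Python standard library. Everything here is an assumption made for illustration: the names `seal` and `open_sealed` are hypothetical, the keystream is HMAC-SHA256 in counter mode as a stand-in for a real cipher, and an HMAC tag stands in for the digital signature mentioned above (a MAC proves integrity only to holders of the same key). A real app should rather use a vetted authenticated cipher such as AES-GCM from an established crypto library.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """HMAC-SHA256 in counter mode; an illustrative stand-in for a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest())
        counter += 1
    return bytes(out[:length])


def _subkeys(master_key: bytes):
    """Derive separate encryption and MAC keys from one client-held key."""
    enc_key = hmac.new(master_key, b"enc", hashlib.sha256).digest()
    mac_key = hmac.new(master_key, b"mac", hashlib.sha256).digest()
    return enc_key, mac_key


def seal(master_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then authenticate, a record on the client before it is uploaded."""
    enc_key, mac_key = _subkeys(master_key)
    nonce = os.urandom(16)  # fresh per record, stored alongside the ciphertext
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def open_sealed(master_key: bytes, blob: bytes) -> bytes:
    """Verify the tag first, then decrypt; raise if the stored blob was altered."""
    if len(blob) < 48:
        raise ValueError("blob too short")
    enc_key, mac_key = _subkeys(master_key)
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("record was modified or the key is wrong")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


# The key never leaves the client; only the sealed blob goes to the cloud.
key = os.urandom(32)
blob = seal(key, b"Dentist, Tuesday 10:00")
assert open_sealed(key, blob) == b"Dentist, Tuesday 10:00"
```

The cloud only ever sees `blob`, so it can store and return it, but it can neither read the record nor change it without the client noticing.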

Ok, so let's say I have my calendar records encrypted in the cloud. How do I share them with my other devices and with other people, such as my partner and colleagues at work? Very simple: you encrypt each record with a random symmetric key and then, for every other device or person to whom you want to grant access to your calendar, you make the symmetric key available by encrypting it with their public key (if you're paranoid, you can even verify fingerprints using some out-of-band communication channel, such as the phone, to ensure the cloud/service provider didn't mount a MITM attack on you). What if you want to share only some events (or some details) with some group of people (e.g. only your availability info)? Very simple: just encrypt those records you want to share with limited access with some other symmetric key, and publish only this key to those people/devices you want to grant such limited access.
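The sharing scheme can be sketched as follows. This is a toy under loudly stated assumptions, not a usable implementation: the Python standard library has no public-key encryption, so the sketch uses textbook RSA via `pow()` (Python 3.8+) with well-known Mersenne primes and no padding, which is completely insecure, and a SHA-256 counter-mode keystream as a stand-in symmetric cipher. All names (`toy_keypair`, `wrap_key`, `unwrap_key`, `xor_stream`, the `cloud` layout, the recipients) are hypothetical. Real code would use something like RSA-OAEP or an ECDH-based scheme from a vetted library.

```python
import hashlib
import os

E = 65537  # public exponent shared by all toy key pairs


def toy_keypair(p: int, q: int):
    """Textbook RSA from two given primes. TOY ONLY: known primes, no padding."""
    n = p * q                           # public modulus
    d = pow(E, -1, (p - 1) * (q - 1))   # private exponent
    return n, d


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Stand-in symmetric cipher (SHA-256 counter keystream); same call decrypts."""
    stream = b"".join(
        hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        for i in range(len(data) // 32 + 1)
    )
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap_key(record_key: bytes, recipient_n: int) -> int:
    """Encrypt a 32-byte symmetric key to a recipient's public key."""
    return pow(int.from_bytes(record_key, "big"), E, recipient_n)


def unwrap_key(wrapped: int, n: int, d: int) -> bytes:
    """Recover the symmetric key with the recipient's private exponent."""
    return pow(wrapped, d, n).to_bytes(32, "big")


# Hypothetical recipients: my own laptop (full access) and a colleague (availability only).
laptop_n, laptop_d = toy_keypair(2**127 - 1, 2**521 - 1)
colleague_n, colleague_d = toy_keypair(2**107 - 1, 2**607 - 1)

full_key = os.urandom(32)  # protects the full event details
busy_key = os.urandom(32)  # protects the availability-only view

# What the cloud stores: ciphertexts plus per-recipient wrapped keys, nothing readable.
cloud = {
    "full": xor_stream(full_key, b"Tue 10:00-11:00 dentist, Main St 5"),
    "busy": xor_stream(busy_key, b"Tue 10:00-11:00 busy"),
    "keys": {
        "laptop": {
            "full": wrap_key(full_key, laptop_n),
            "busy": wrap_key(busy_key, laptop_n),
        },
        "colleague": {"busy": wrap_key(busy_key, colleague_n)},
    },
}

# The colleague can unwrap only the availability key, so only that view decrypts.
colleague_key = unwrap_key(cloud["keys"]["colleague"]["busy"], colleague_n, colleague_d)
print(xor_stream(colleague_key, cloud["busy"]))
```

Granting access to one more person means adding one more wrapped key under `"keys"`; the records themselves do not need to be re-encrypted.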

Implementing the above would require writing new end-user apps, or plugins for existing apps (such as Outlook), so that they do encryption/decryption/signing/verification before sending the data out to the cloud. But what stops a malicious vendor from offering apps that leak our secrets, e.g. the keys? Well, nothing actually. But this time, the vendor would need to explicitly build some kind of backdoor into the app. The same could be done by any other vendor, in any other, non-cloud-based app. After all, how do we know that MS Word, which is not cloud-based yet, is not sending out fragments of our texts to Agent Smith? Note how different this is from a situation where the vendor already owns all our data, unencrypted, brought legitimately to their servers, and all they need to do is read them from their own disks. No need to plant and distribute any backdoors!

In practice, few vendors would risk their reputation by building a backdoor into an app that is then made available to customers, because every backdoor in such client-exposed code will sooner or later be found (You would really not believe what great lengths all those young people armed with a disassembler and a debugger would go to, to win an economy class ticket to the middle of the desert in the hottest summer season, just to be able to deliver a presentation on how evil/stupid a company X is ;).

One problem remains, however: accessing our encrypted cloud through a Web browser. In contrast to apps, web browser content is much less identifiable. An app can have a digital signature, so everybody knows it's App v1.1, published by X. As explained above, it would be rather stupid for X to plant a backdoor into such an app. But Web-delivered JavaScript is much more ephemeral, and it's entirely possible for X to e.g. deliver different versions of scripts to different customers. Digital signatures on client-side scripts, paired with the ability to whitelist allowed client-side scripts, would likely solve this problem.

So, why do we still not have client-side-encrypted cloud services? The question is rhetorical, of course. Most vendors actually love the idea of having unlimited access to their customers' data. Do you think Google would be happy to give up an opportunity to data mine all your data? This might affect their ad business, health research, or just Secret Plan To 0wn The World. Over our dead bodies, I can almost hear them yelling! After all, they have just come up with Chrome OS to bring even more data into their data mining machine...

To sum it up, there is no technical reason we must entrust all those people with our most private data. Sooner or later somebody will start selling client-side-encrypted cloud services, and I would be the first person to sign up for them. Hopefully it will happen sooner rather than later (too late?).

This post also hopefully shows, again, one more aspect: that we can, relatively easily, move most of the IT infrastructure out of the “TCB” (Trusted Computing Base, used as a metaphor here). In other words, we can design our systems and services so that we don't need to trust a whole lot of things, including servers and the networking infrastructure (except for their reliability, but not for their security). But there always remains one element that we must trust: our client devices. If they are compromised, the attacker can steal everything.

Strangely, most people still don't get it, or get it backwards. Just the fact that “information is not stored on the iPad but kept safe on the corporate network” doesn't change anything! Really. If the attacker owns your iPad, then she can also do anything that the legitimate user could do from this iPad. So if you could get to the company's secret trade data from your iPad's Receiver, so could the malware/attacker.


Simon said...

Great, as usual.

Anonymous said...

Great writing.

Anonymous said...

You might be interested in Firefox Sync (formerly Mozilla Weave), the protocol used in Firefox 4 for synchronization of user bookmarks, browsing history, etc., between multiple instances of Firefox on different systems. It doesn't exactly match your proposed system but does have the same general goal of storing only encrypted data in the cloud.

Google "firefox sync" for how it's explained to end users, and "how does weave use cryptography" for an explanation of the underlying encryption scheme.

Anon said...

As for online/cloud storage there are already a few services which provide a "client-side-encrypted cloud service".
Take the Swiss "Wuala" for example: You can store and back up your files, sync them between your computers and also share them with other people. It's based on "Cryptree, a cryptographic tree structure which facilitates access control in file systems operating on untrusted storage."
Every file that leaves the client gets encrypted, split into smaller fragments (Reed-Solomon) and then sent into the distributed cloud (meaning: some Wuala servers in Europe and many peers).
You can also mount the online storage into the filesystem to access your files faster. It also has file versioning and deduplication ("global deduplication" unfortunately, but that's presumably gonna change/become optional afaik).

More infos about the security here:

It's my first try "clouding" my data because this time I don't need to trust the company but only their client software. Unfortunately, encrypting everything before uploading also means there's no way to retrieve the password if you forget it, but we will have to become accustomed to that I think, because that's simply how encryption rolls.

Shmerl said...

I agree fully, and like the previous commenter also want to point out Mozilla's sync:

Such kind of encrypted services are supposed to be the norm, not the exception.

ciastek said...

LastPass does that -

Anonymous said...

With all the API buzz for each service would it be hard for security community to develop tools for client side encryption?

Ryan M. Ferris said...

It's difficult not to be convinced of a plot against privacy. Whether it is location-tracking cell phones (being abused by the feds), "cloud computing", Operating Systems and Hardware with obvious security flaws, long-term 'co-operation' between telcos and governments...

Do we each need to invent our own protocols, cellular technologies and encryption to achieve privacy?

zintia said...

very interesting but.. for instance i have a lot of documents in
how can I encrypt them?
is it possible?

Anonymous said...

More passwords to manage? It will never become mainstream.

Joanna Rutkowska said...

@Anonymous: Whether you must manage 3 passwords, or 33, or 333, or 3333 passwords -- it all takes the same amount of effort. In each case you should remember only the master password/PIN.

Anonymous said...


Any comments on Wuala service?

C. Brocas said...

As previously said, client-side encryption efforts exist in mainstream software/services like Firefox (Sync). You are able to rely on Mozilla Sync cloud storage or on your own sync server.

Another effort is syncany, where plugins provide you with different types of storage backends (imap, rackspace, amazon etc) for your encrypted file chunks.

You are totally right Joanna, the last thing that owns all the keys is ... the client side device.

But nowadays, in companies, when you are speaking about security aspect of a cloud project, even client side encryption/protection is not seen as a requirement. It is often seen as a cost for a non real threat. So speaking about client device security, you are right, it is not as easy as it should be.

Great article.

Bye Christophe

Anonymous said...

Good read, thanks Joanna !

Do you know about Syncany? It's a cloud storage with client-side encryption and multiple storage types (from buckets to imap or images).

Joanna Rutkowska said...

What I really would like to see is a client-encrypted calendar/task list service. With apps for iOS, Mac, Linux.

Cool that such things as Wuala, Firefox Sync, or Lastpass exist, although I cannot say anything about their security. Why don't they publish the sources of their client code (except for Firefox Sync I guess)? Do they at least sign their client apps?

Dieter Adriaenssens said...

Interesting read.

Some time ago I was thinking about protecting data you share on social network websites, and I came to a similar conclusion : encrypt everything before it leaves your PC/device :

I guess this principle not only applies to clouds, social networking, internet data storage, ... but to all data you are willing to share with others on a public network (if you are conscious/paranoid enough to protect it from being read by anyone else).

Anonymous said...

A partial approach would be to have some kind of proxy that encrypts any files that go to "the cloud" on the fly. This would avoid modifying the application and even allow you to move this encryption engine to a different VM. Of course, cross-platform support might be tricky, but it could possibly be done at the router level (with an open source router).

Nicolas Wagrez

d2 said...

add passpack to the list of encrypted cloud storage apps you didn't know about.

As for publication of code, how'll you ever ensure that the live code is the same as the published code (what's the point, in other words)?

Joanna Rutkowska said...

@d2: the usual way to ensure code matches the source code is to build the code (compile it) and compare the hashes.
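A minimal sketch of that comparison, assuming the build is reproducible (the same sources and toolchain yield byte-identical output, which in practice often requires pinning compiler versions and timestamps). The function names and the two file paths are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(local_build: Path, vendor_binary: Path) -> bool:
    """True if the binary built from published source equals the shipped one."""
    return sha256_file(local_build) == sha256_file(vendor_binary)
```

A mismatch does not by itself prove a backdoor, since non-reproducible builds differ for benign reasons too, but a match does tie the shipped binary to the published source.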

Anonymous said...


Surely the purpose of social networking is 'to share' - thus, the concept of putting data on a social network which you wish to keep private is kind of an oxymoron, surely?

Joanna Rutkowska said...

@Anonymous: I might want to share my holiday photos with just a group of friends, and not necessarily with the whole rest of the world.

Anonymous said...

What program do you use as keychain?