Ari Bader-Natal

User-Initiated Privacy for Web Applications

A version of this blog post is included in Chris Seibold's new O'Reilly book, Big Book of Apple Hacks.

Web-based applications are becoming increasingly popular, offering a variety of compelling advantages over desktop-based applications, both to developers and to users. These applications are platform-independent, accessible from any Internet-connected computer, offer offsite data storage, and often provide integrated tools for collaboration and sharing. One major tradeoff, however, is a loss of privacy. As users adopt web-based applications, their personal data (e.g. emails, address books, calendars, to-do lists) slowly migrates from the privacy of their own computers to the servers of various web-app providers scattered across the Internet. While some of these web applications allow a user to flag certain data as "private", this is a very limited notion of privacy, referring only to whether the web-application provider will share user data with other parties (such as other users). The implicit message here is that the user's data is always accessible (i.e. not private) to the company or individuals providing the web application. This is a step back from the level of privacy afforded by desktop-based applications and should be recognized as such. But this doesn't mean that we need to give up on privacy (or give up on web applications). We just need to think more creatively.

After reading Peter Wayner's book, Translucent Databases, on Jon Udell's recommendation (see Achieving translucency), I saw that "translucent" database designs could directly address this issue. Unfortunately, many (most?) web-application databases are not being designed translucently (Udell has written more about this elsewhere). But if your web application's database wasn't designed for translucency, is this a lost cause? I'm going to argue that it isn't, and will show how you, the user of web applications, can initiate database translucency yourself, and thereby protect the privacy of your hosted personal data whenever you desire.

What do I mean by user-initiated database translucency? Think of it as BYOC: Bring Your Own Crypto. The idea here is for you, the user, to encrypt your personal data before it finds its way onto the web application server. As long as the encrypted data is considered valid by the application (e.g. doesn't violate string-length or legal-character limitations), the application will continue to work as it did before, but the personal data will remain private. Then, when you later want to view some of this data, the decryption and viewing can be done offline. If you do this right, your data will remain usable to you in the context of the web application without ever being visible (unencrypted) to the web application provider.
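To make the "considered valid by the application" condition concrete, here is a minimal sketch in Python. It assumes the ciphertext comes from a salted, Base64-encoded AES-CBC encryption (a 16-byte salt header plus the padded ciphertext), and that the web application's field limits are known. The names `encrypted_length`, `fits_field`, `max_length`, and `legal_chars` are hypothetical, introduced only for illustration.

```python
import math
import string

BLOCK_SIZE = 16    # AES block size in bytes
SALT_HEADER = 16   # assumed header: 8-byte "Salted__" marker + 8-byte salt

# Characters that can appear in standard Base64 output.
BASE64_CHARS = set(string.ascii_letters + string.digits + "+/=")


def encrypted_length(plaintext_bytes: int) -> int:
    """Estimate the Base64 length (without line breaks) of the ciphertext."""
    # CBC padding always adds between 1 and 16 bytes.
    padded = (plaintext_bytes // BLOCK_SIZE + 1) * BLOCK_SIZE
    raw = SALT_HEADER + padded
    # Base64 turns every 3 bytes (rounded up) into 4 characters.
    return 4 * math.ceil(raw / 3)


def fits_field(ciphertext: str, max_length: int, legal_chars=BASE64_CHARS) -> bool:
    """Check ciphertext against a field's (hypothetical) length and character limits."""
    compact = "".join(ciphertext.split())  # ignore line breaks and spaces
    return len(compact) <= max_length and set(compact) <= set(legal_chars)
```

Under these assumptions even a very short note grows to 44 characters, so fields with tight length limits (or ones that reject `+`, `/`, or `=`) may not be usable this way.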

I'll describe one approach to implementing this idea, which you can download and test out. Many other approaches are possible, and I'll throw out a few ideas to get things started. If you implement one, please email me or post a comment below.

Page Axe (http://code.aribadernatal.com/PageAxe/) is a Mac OS X (i.e. "offline") application that I wrote to demonstrate this idea. Upon running for the first time, Page Axe generates and saves a random 256-byte key (via openssl rand -base64 -out /path/to/key 256). After that, any text typed into the Page Axe text box is encrypted with this key using AES-256 in CBC mode (via openssl enc -aes-256-cbc -a -salt -pass file:/path/to/key). This encrypted text is copied to the clipboard, ready to be pasted into a text field in your web application. Page Axe also allows for viewing of this encrypted data. Copy and paste the encrypted text from the web application to Page Axe, and the text is decrypted (via openssl enc -d -aes-256-cbc -a -pass file:/path/to/key) and displayed for you to read again. At its core, it's simply moving text between trusted desktop-land and untrusted browser-land in a way that keeps the data private.
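The three openssl invocations above can be wrapped in a few lines of Python. This is only a sketch of the same command-line flow, not Page Axe's actual source; it assumes the `openssl` binary is on the PATH, and the function names and `key_path` parameter are made up for the example.

```python
import subprocess


def generate_key(key_path: str) -> None:
    """Write 256 random bytes, Base64-encoded, to key_path."""
    subprocess.run(
        ["openssl", "rand", "-base64", "-out", key_path, "256"],
        check=True,
    )


def encrypt(plaintext: str, key_path: str) -> str:
    """Return Base64 ciphertext suitable for pasting into a web form."""
    result = subprocess.run(
        ["openssl", "enc", "-aes-256-cbc", "-a", "-salt",
         "-pass", f"file:{key_path}"],
        input=plaintext.encode("utf-8"),
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("ascii")


def decrypt(ciphertext: str, key_path: str) -> str:
    """Recover the plaintext from text produced by encrypt()."""
    result = subprocess.run(
        ["openssl", "enc", "-d", "-aes-256-cbc", "-a",
         "-pass", f"file:{key_path}"],
        input=ciphertext.encode("ascii"),
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8")
```

One detail worth knowing: the OpenSSL documentation describes `-pass file:` as reading the first line of the file as the passphrase, so with a multi-line Base64 key file only that first line takes part in key derivation. Newer OpenSSL releases may also print a warning suggesting `-pbkdf2`; the commands here are kept exactly as given above.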

Here are a few screenshots of the Page Axe application in action:

Simple interface for encrypting and decrypting text.

Encrypted text is pasted into a web application.

The encrypted text can then be decrypted offline.

Select a block of text from the web application that includes encrypted private data, and Page Axe will locate, decrypt, and display the private data in the context of the entire text block using Growl.
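How the encrypted portions are located is not spelled out here, so the following is a guess at one workable approach rather than a description of Page Axe. It relies on the fact that salted `openssl enc -a` output starts with the Base64 encoding of the "Salted__" marker, which is always `U2FsdGVkX1`. It further assumes each encrypted run contains no line breaks, and it takes the decryption routine as a parameter; `reveal_private_text` and `decrypt` are hypothetical names.

```python
import re
from typing import Callable

# Salted OpenSSL output begins with Base64("Salted__"), i.e. "U2FsdGVkX1",
# followed by more Base64 characters and up to two "=" padding characters.
ENCRYPTED_RUN = re.compile(r"U2FsdGVkX1[A-Za-z0-9+/]*={0,2}")


def reveal_private_text(text: str, decrypt: Callable[[str], str]) -> str:
    """Replace every encrypted run in text with its decrypted form."""
    return ENCRYPTED_RUN.sub(lambda match: decrypt(match.group(0)), text)
```

Everything outside the encrypted runs is passed through untouched, which is what lets the private data be shown "in the context of the entire text block".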

Alternative UI hooks are possible, such as this Quicksilver trigger for text decryption.

Page Axe is only one implementation of this concept of user-initiated privacy for web applications, written as a full-blown desktop application. Alternatively, one might implement this as a Firefox Add-On, a bookmarklet, a platform-independent Java application stored on a portable USB (flash) drive, or perhaps something else. As long as the user's private data is never accessible through the DOM to the "untrusted" web application, you've got a valid implementation.

I think one fascinating possibility here would be to incorporate this technique into applications designed to automatically sync offline and online data. Consider, for example, Spanning Sync, an application designed to provide two-way syncing between Apple's iCal desktop application and Google's web-based Calendar application. Imagine a new "Keep data private" checkbox, which causes offline data to be encrypted before being uploaded to Google's servers and causes online data to be decrypted again after being downloaded. (For access to Google's web application data on-the-go, a mobile tool like Page Axe would provide access.) This example shows how data translucency can be initiated post-hoc via the web application's published API! Many interesting possibilities exist here.
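The "Keep data private" idea boils down to a pair of transformations applied on either side of the sync. The sketch below is purely illustrative: it is not based on Spanning Sync or on any real calendar API, the record is a plain dictionary, and `before_upload`, `after_download`, `private_fields`, and the `encrypt`/`decrypt` callables are all hypothetical. It also assumes that structural fields such as dates stay in the clear so the calendar can still place the event.

```python
from typing import Callable, Dict, Set


def before_upload(record: Dict[str, str], private_fields: Set[str],
                  encrypt: Callable[[str], str]) -> Dict[str, str]:
    """Encrypt only the private fields; leave the rest readable by the server."""
    return {
        name: encrypt(value) if name in private_fields else value
        for name, value in record.items()
    }


def after_download(record: Dict[str, str], private_fields: Set[str],
                   decrypt: Callable[[str], str]) -> Dict[str, str]:
    """Reverse before_upload once the record is back on the trusted machine."""
    return {
        name: decrypt(value) if name in private_fields else value
        for name, value in record.items()
    }
```

With a title marked private, the server would store an opaque string as the event title while the start time remains usable for scheduling and reminders.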

In summary, the move towards web-based applications comes at the expense of our privacy, but with the techniques outlined here, you can reclaim the privacy of your data any time you like!