Ian Ardouin-Fumat

Overview / Process

Deface is an app that prevents Facebook from reading people’s messages.

While messaging apps such as Signal and WhatsApp have successfully brought encryption technology to people’s private conversations, privacy is still lacking on public feeds themselves. To address this issue, Deface leverages encryption and peer-to-peer technologies in order to protect people’s content, facilitate conversations with their friends, and keep algorithms at bay.

09/18/2018

TL; DR

Deface is a work in progress. This page is a comprehensive list of resources for understanding the work produced so far and the road ahead of us. At this stage of the project, we hope to open our process to a wider community of programmers, designers, lawyers, ethicists, and funders. The source code will be made public on GitHub in a few days.

09/19/2018

Privacy abuses are not a bug but a feature

Facebook's online privacy abuses have been known to the public for more than a decade. Every passing year has brought new chilling revelations of data misuse. Some of the most recent include the ways Facebook manipulates its news feed for research purposes, its racially-biased protections against hate speech, and cases of redlining in ad targeting. The Cambridge Analytica scandal and subsequent senate hearing in 2018 revealed these abuses were not bugs, but integral features of the platform.

Photo: Jim Watson

In a similarly abusive fashion, governments in the United States and many other countries have been ramping up their efforts to collect their citizens' information. Amidst the rise of populism and authoritarianism, legislators and law enforcement agencies have been advocating for personal data retention and encryption backdoors. Recent examples include FBI chief Christopher Wray equating unbreakable encryption with an 'urgent public safety issue,' or ICE calling for tech companies to develop algorithms that track visa holders on social media.

Several tech companies have been under fire for actively collaborating with anti-immigration efforts. The most prominent offenders include Microsoft and Palantir, but it turns out many more are involved.

The future looks grim, but the threat is also very real at the present time, as major tech companies comply with governments by turning in large swaths of personal data records. Under the false pretense of fighting extremism and fake news, there has been a historic effort across the political spectrum to undermine the last safeguards against mass surveillance.

Thankfully, the rising popularity of encrypted messaging applications like Signal means more and more people can communicate securely. However, this is far from sufficient: recent news has shown these apps are pressured to hand over user metadata to law enforcement, and are sometimes entirely censored by authoritarian governments. Perhaps more importantly, their efforts are inherently limited to private communications. If we are serious about protecting privacy, we need to think about how to cover the full range of people's online presence.

Given the ever-growing threat posed by corporations and governments to online privacy, how can we appropriate social media feeds to make the internet more secure, decentralized, and open? Deface is the beginning of an answer to this question.

Deface is a browser extension that prevents Facebook from reading people’s messages.

While encryption usually happens at the private conversation level, Deface protects content meant to be shared with many friends. It captures messages posted on Facebook, encrypts them, and posts encryption keys to a decentralized database shared by all users. People who have installed the extension can read those messages as they normally would, while others can only see a garbled string of characters. Facebook's algorithms are kept at bay by a captcha system enforced by the community.

A Deface message as seen by Facebook (left), and as seen by a user (right).
Deface could benefit the public in several ways
  • The adversarial approach taken by the project would spark a debate. It would question Facebook's unethical practices, and challenge the established view that there's no possible privacy in public online spaces.
  • It would effectively prevent Facebook and third-party organizations from sweeping through millions of messages with their algorithms. This could drastically increase the cost of mass surveillance. For instance, law enforcement agencies like ICE would have to track immigrants on a case-by-case basis instead of monitoring millions of people automatically.
  • By disrupting Facebook's ability to read people's messages, Deface would also prevent the company's algorithms from overly curating people's feeds and passively promoting extremism and hate speech.
  • As an open source tool and research project, Deface would provide a replicable framework for decoupling user-generated content from the online platforms where it is posted. This could pave the way for many other projects of appropriation of social media spaces.

Deface has been in the making for several years and is now reaching a point where I can confidently share it with the world. For a long time I have been reluctant to make this project public before its actual release, but this past year made me realize the necessity to open the process to a wider community in order to tackle the many social, philosophical, legal, and technical challenges ahead of us.

09/20/2018

How does it work?

Deface augments Facebook's interface with a single feature: encrypted timeline posts. When enabled, Deface captures and encrypts messages as they are written, right before they are posted to Facebook's servers. Deface turns content into garbled strings of characters, and makes people's communications unreadable by Facebook and third-party organizations, including advertisers, government agencies, and malicious actors. Under the hood, the application stores encryption keys in a decentralized database shared by all users.

Deface encrypts people's posts and distributes within the community the keys that unlock them.
Privacy comes at a (relatively small) cost
  • First, encrypted messages can only be read on devices where Deface is installed. This means that people who don't have access to Deface will not be able to read content posted by their friends. This could potentially break people's experience of Facebook, which is why we will always leave the option to post unencrypted messages available to all. Additionally, a link to download Deface appears at the bottom of every encrypted message, which we hope will drive user adoption.
  • Second, in order to protect people's content from bots and algorithms, the peer-to-peer network underlying the application enforces a verification system based on ReCaptcha. This assures that only humans can access content, at the cost of users having to solve captcha puzzles every once in a while.

It is critical to understand that Deface doesn't provide bulletproof encryption of the sort offered by tools like Signal or PGP. All it takes for a malicious actor to read someone's Deface messages is to install the app, and become Facebook friends with them. What Deface provides is protection from algorithms that sweep through millions of messages at once. In other words, Deface doesn't make any assumption as to who should access anyone's content, it only enforces a "humans only, no algorithm" policy. The purpose of Deface is to raise the cost of mass surveillance by providing soft encryption to wide audiences.

09/22/2018

An in-depth look at Deface's distributed architecture

From a technical standpoint, one of the main challenges in developing Deface is designing and implementing its peer-to-peer architecture. Deface relies on blockchain technolo... Just kidding. Deface relies on js-libp2p, a networking stack currently developed by the amazing folks at Protocol Labs. In this post, I detail my process in designing a system that is secure and scalable. I hope it will inspire new uses for these technologies, which I believe open truly exciting opportunities for subverting social media spaces. However, please take everything written here with a grain of salt: the technology I am working with is relatively experimental, and I have a lot to learn when it comes to distributed computing.

To this day, Deface's architecture is still a work in progress, but it has come a long way since its initial prototype in 2015. Initially built as a centralized, symmetric encryption system (yikes), Deface quickly shaped up to become distributed. I explored a number of technical solutions, including the asymmetric encryption program PGP and the mesh network protocol Telehash, before realizing a Distributed Hash Table (DHT) was the kind of system I was looking for.

Distributed Hash Tables, a crash course

A DHT can be described as a decentralized database in which each user manages a small segment of the full data set. DHTs have been around for a while: they have powered file-sharing services — think BitTorrent or eMule — for well over a decade. Only recently, however, have they become a tool for web applications, thanks to the progress made on Libp2p's JavaScript implementation based on WebRTC. In Deface's case, a DHT is useful for storing the encryption keys to each message, as it provides a system that doesn't have a single point of failure (i.e. the FBI cannot force us to compromise the service) and operates with limited resources on the admin's end.

Here's how a DHT works in principle:

1. uniform id space

To organize data, a DHT assigns addresses to both users (nodes) and data within a uniform index space: every node and data point has its own id ranging from 0 to n (15 in the diagram above). In the Kademlia DHT, users and content are distributed within this space to form a binary tree structure, which makes it easier to find which node carries which piece of information.

2. routing table

This organizing principle enables each node to maintain a routing table (i.e. a peer directory) that acts as a map for looking up nodes and data. Every node has knowledge of only a few peers in the DHT, but this knowledge, combined with that of other nodes, is sufficient to store and access data all over the network.
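Here is a minimal sketch of the XOR metric and bucket assignment that underpin Kademlia's routing table (simplified for illustration; real implementations like libp2p-kad-dht add bucket size limits and eviction policies):

```javascript
// Kademlia measures closeness with the XOR metric: the distance
// between two ids is their bitwise XOR, read as an integer.
function xorDistance(a, b) {
  const out = Buffer.alloc(a.length);
  for (let i = 0; i < a.length; i++) out[i] = a[i] ^ b[i];
  return out;
}

// The routing table groups peers into buckets indexed by the
// position of the highest differing bit, so a node knows many
// nearby peers and only a few distant ones.
function bucketIndex(selfId, peerId) {
  const d = xorDistance(selfId, peerId);
  for (let i = 0; i < d.length; i++) {
    if (d[i] !== 0) return (d.length - 1 - i) * 8 + Math.floor(Math.log2(d[i]));
  }
  return -1; // identical ids
}
```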

3. communications

Once connected together, peers can start storing and querying content within the distributed hash table. In the case of Deface, this means users can push newly generated encryption keys, and request those of other people. Nodes route 'put' and 'get' requests throughout the network until they reach the address they are targeting.
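Here is a toy simulation of that 'put'/'get' behavior. It cheats by using a global node list instead of iterative routing, so it only illustrates the "closest node stores the record" idea; the node ids and record value are made up:

```javascript
// Each node stores the records whose ids are closest to its own
// (by XOR distance). Real DHTs discover the closest node
// iteratively via the routing table; here we use a global list
// for brevity.
const nodes = new Map(); // node id (hex byte) -> local key/value store
['a1', '5f', 'e3', '09'].forEach(id => nodes.set(id, new Map()));

const closest = keyId =>
  [...nodes.keys()].sort((x, y) =>
    (parseInt(x, 16) ^ parseInt(keyId, 16)) -
    (parseInt(y, 16) ^ parseInt(keyId, 16)))[0];

function put(keyId, value) { nodes.get(closest(keyId)).set(keyId, value); }
function get(keyId) { return nodes.get(closest(keyId)).get(keyId); }

put('a0', 'encryption key for post #1'); // routed to node 'a1'
```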

A more thorough introduction to Distributed Hash Tables. Content from SAFE Pod Montreal
Deface's very own distributed database

Over the course of three years, I played around with several implementations of Kademlia DHT with little success, and ended up finding out about IPFS and its networking stack full or promises: Libp2p. A couple years later, Libp2p's DHT implementation had reached a stage I could start working with, which led me to spend a year experimenting with it. Here again the process involved trial and error. I initially relied on libp2p's content routing system, until I realized tapping directly into its DHT tools would help with content distribution and verification (besides, Deface's data is so small it easily fits in a DHT's key/value system). I then designed a system where each node constantly monitored a few others, but quickly found this scheme had critical security flaws. I ended up customizing libp2p-kad-dht with my own features, including append-only logs, a peer verification method, and a more selective ping system.

Below are the changes I've made to accommodate Deface's constraints:

4. need for verification

In order to prevent Facebook and other undesirable actors from sucking all the content out of the database, every node in the network must verify the good standing of the peers it shares content with. The system must also monitor each node's activity, to prevent it from querying unreasonable amounts of information.

5. every user verifies themselves before posting requests

To tackle this, Deface users solve ReCaptcha puzzles and store the resulting verification tokens locally. Then, any content request from them is prefaced by posting a record to the DHT that contains two identifiers: one for their verification token, and another for the current request.

6. content providers check for verification

In order to verify the requesting node's identity, data-holding peers fetch the request log at the address it was pushed to during step 5, and check two things: whether the verification token's signature matches, and whether the user has sent too many requests using this token. This mechanism makes sure nodes are not abusing the database with an unreasonable number of queries.
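The provider-side check could be sketched like this. The quota constant and the token-membership test are placeholders for illustration; the real system checks the ReCaptcha token's signature rather than a simple set lookup:

```javascript
// A data-holding peer decides whether to serve a request:
// the token must be recognized, and the per-token request count
// must stay under a quota.
const MAX_REQUESTS_PER_TOKEN = 50;  // assumed limit for the sketch
const seenRequests = new Map();     // tokenId -> number of requests served

function shouldServe(record, knownTokenIds) {
  // stand-in for the real signature check on the token
  if (!knownTokenIds.has(record.tokenId)) return false;
  const count = (seenRequests.get(record.tokenId) || 0) + 1;
  seenRequests.set(record.tokenId, count);
  // reject nodes querying unreasonable amounts of information
  return count <= MAX_REQUESTS_PER_TOKEN;
}
```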

7. not everyone should be allowed to store content

Another concern of Deface is to avoid trusting non-participating peers with data storage, since malicious actors could simply launch a bunch of nodes and end up having passive access to a large portion of the data records.

8. DHT ping

This is addressed during the process of refreshing one's peer list, called pinging. When a node discovers a new peer, it pings a portion of its routing table and filters out the contacts that have provided the oldest verification tokens. This enables the network to organically weed out connections with peers that are not a legitimate part of the community.
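The token-age filter might look something like this minimal sketch (the peer data shape is an assumption):

```javascript
// When refreshing the routing table, keep only the peers whose
// verification tokens are freshest; peers that never solve
// captchas fall to the bottom and are weeded out.
function pruneStalePeers(peers, keep) {
  return [...peers]
    .sort((a, b) => b.tokenIssuedAt - a.tokenIssuedAt) // freshest first
    .slice(0, keep);
}
```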

9. unverified peers are denied a role in storing data

Additionally, nodes that seek peers to store new data they're sharing with the network will deny any transaction if it appears the receiving node can't provide any recent verification token.
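And a sketch of that final gate, denying the storage role to peers without a recent token (the one-hour freshness window is an assumption for illustration):

```javascript
// Before handing a record to a peer for storage, check that it
// can show a recent verification token.
const MAX_TOKEN_AGE_MS = 60 * 60 * 1000; // assumed: one hour

function canStore(peer, now = Date.now()) {
  return peer.tokenIssuedAt != null &&
         now - peer.tokenIssuedAt <= MAX_TOKEN_AGE_MS;
}
```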

Towards open source

The source code for these changes to the libp2p-kad-dht and interface-datastore modules is available as a WIP here and here. As of version 0.2.0, the changes mentioned are part of forks of those two modules; in the future we will consider including them as part of wrapper modules for the sake of easier maintenance.

Do you have opinions about this? We need help! The codebase is currently being refactored and made open source, and I am very much willing to open up this process to a wider community of developers and computer scientists. If you have any thoughts, I'd love to hear them: you can find me on Twitter or via email.

10/09/2018

The challenges ahead of us

Today, Deface exists as a working prototype for Google Chrome. There are many technical challenges ahead of a public release, including backend architecture, frontend development, and a more thoughtful pass at user experience.

Technical problems...

Most urgently, I need confirmation that the peer-to-peer backbone of the application is sound. Building distributed computing systems is notoriously hard and I am not expecting to get it right on the first shot, which is why I hope to connect with peer-to-peer experts and together work out the most secure and scalable architecture possible. Beyond the shortcomings of my own code, I am currently concerned about the lack of decentralized signaling for the WebRTC protocol. In other words, Deface nodes currently need to go through WebRTC's handshake process using a server as a middleman, which in some ways defeats the purpose of creating a decentralized application. In a similar fashion, captcha puzzles are currently solved in a centralized way, since user verification tokens have to route through a server before coming back to the client. In fact, I am generally concerned that, to my knowledge, there is no open source mechanism for user verification out there. This means that Google holds the keys to the castle and could technically backdoor Deface; in the era we live in, the absence of an independent way to verify users is deeply problematic. Finally, testing Deface's peer-to-peer architecture at scale will prove very challenging.

There's also a lot of fun to be had with frontend development. The most critical task ahead will be porting Deface's experience to mobile platforms, where most people consume Facebook content, but also where it's hardest to break the rules. If Deface remains a desktop browser extension, it will simply not work. I have some ideas on how to address this issue, but it will require a fully dedicated effort. In addition, we will need to develop a system of script injection robust enough to withstand Facebook's obfuscation of their website's HTML structure, and any potential countermeasures. Once again: if you have ideas, I would love to hear them.

To be successful, Deface needs to provide a best-in-class user experience. This doesn't only mean smooth usability from setup onwards, but also teaching people about encryption technology and the tradeoffs involved in security systems. It means giving people ways to back out if things go wrong, and never forcing Deface on those who want to opt out.

Deface was first presented at IAM Weekend in Barcelona, where I was honored to share the stage with Joana Moll and Katarzyna Szymielewicz. Video to come soon.

But of course, all of this will not be enough. Deface raises issues that can only be addressed by rigorous research grounded in fields including law, sociology, and advocacy. These challenges are two-fold: ethics and impact.

People problems...

First, I hope to gather a diverse group of experts and audience members together, in order to discuss some of these issues and start defining a framework for responsible data use. Some questions already stand out in my mind. What are the best ways of telling people what's at stake with their data? How to inform them about the security trade-offs made by Deface's approach to encryption? Does this project obfuscate other types of surveillance like metadata collection? How to mitigate risks the application poses to populations who live in police states? What strategies can be used to prevent crowdforcing? How to anticipate any impact on people with special accessibility needs?

Second, we need to strategize and campaign hard to make this project impactful. Except in a few instances, browser extensions don't get a lot of traction. Success in this field requires talking to the press, coordinating with tech communities as well as politically active groups, and preparing localized material in order to be clearly heard. We also need to prepare for Facebook's possible legal reaction and make the case for data ownership. This means seeking legal counsel, and possibly planning for strategic litigation. If you have experience with this sort of campaigning, I would love to talk to you.

... And many other things I have not anticipated.

When I started this project, I naively thought I would be done with it in a week's time. Three years later, as I write this, I am becoming aware of the amount of work ahead of us and wondering whether this might be a little too ambitious. This project will likely not hit all the targets I set for it, but that's fine. I don't consider Deface the ultimate tech solution to online privacy abuses, nor do I think of it as a one-off advocacy stunt. Deep down this is a research effort, and I truly believe every aspect of it that we get right will pave the way for future projects of digital space appropriation. Deface's principles can be applied to any platform, for any sort of purpose, and I am very excited to see where it takes us. In the meantime though, I want to make sure this first attempt is as thorough and impactful as possible.

This is why I need your help. Many of the challenges listed above can only be tackled by an open and diverse community of contributors. This is why I am in the process of contacting experts in the human and computer sciences. I plan to make the entire codebase open source, organize workshop sessions, and include the perspectives of non-technical audiences.

In the short term, I need to meet with lawyers, ethicists, distributed computing experts, mobile platform hackers, and people who have experience with open source project management. In the longer run, we will need privacy advocates, journalists, security experts, designers who can create a brand identity for Deface, and developers who will maintain its cross-platform experience. I'm sure there's a lot to be done that I have not even considered, so please get in touch if you have ideas. If you have access to funding, or work spaces in New York, or a fellowship program we should apply to, I would love to hear about them. Finally, spreading the word helps a LOT.