Taking Back Our Data
Updated: Jul 19, 2020
As I go about my day interacting with people and businesses, I am regularly astonished by the amount of effort it takes for me to share structured information about myself with them. I get this feeling that's hard to describe - that my personal data, the information that represents me digitally, doesn't really belong to me - instead, it's fragmented and distributed among a massive number of third parties. I'm not so much talking about traditional data storage, like documents, pictures, and videos (although I am starting to lose track of how many different cloud services host my files), but more about the information that describes me as an individual - my history and my preferences.
I'll give some examples from the last few weeks.
I'm starting grad school in September and the school needed to know my immunization history. Given that I've moved several times in my life (who hasn't these days?) and I've changed doctors and insurers, this became an absurdly complex affair. I had to:
Scan the collection of random pieces of immunization history papers I've gathered over the years
Contact my old doctor and have them fax a copy of my TB test results
Contact a separate doctor's office and have them reproduce proof of my Hep B vaccine
Compile all these together and set up an appointment with my current doctor to look at them and fill out a sheet that replicates all the information in one
This whole process took many hours out of my life, spread over several weeks. I couldn't believe that, in the digital age we live in, this wasn't a one-button process:
Go to my health profile page (wherever that is)
Click "Share with" and put in the school's name
Why on earth is it more complicated than that?
A few weeks ago I signed up for Amazon HBO (so I could watch Silicon Valley!). Upon signing up, I was immediately greeted with a slew of movie recommendations that had nothing to do with what I like to watch and were just an amalgamation of everyone else's preferences. But I have a Netflix account and a Google Play account, both with ratings and video purchases. Why on earth do I have to go through and rate videos on Amazon now? I should be able to:
Go to my movies page (wherever that is)
Click "Share with" and put in Amazon
Why on earth doesn't this exist?
I signed up for TSA Pre-check and had to give them information about where I've lived and my background. Doesn't this information already exist out there?
I needed to get a background check for my job and had to spend hours compiling lists of previous addresses and contacts, all information that I've needed many times in my life. Why can't I just share it?
I'm applying for a new job and have to re-type my resume and employment history in their format
I'm signing up for a savings account with Capital One and I have to re-share with them my contact and banking information
I'm visiting a new tea shop and I have to fill out a form to get a card for their loyalty program so I can get my 7th tea free
This sort of stuff happens on a daily basis. We've all become data entry experts of our own data, repeatedly giving information to new people and businesses when needed.
This raises the question: why are we, as a sophisticated technological species, spending so much of our time and energy on entering repetitive data? And why aren't we getting anything out of it? The reason is that when we give out information about ourselves, it's always in the context of giving it to some third party. They then own the information we give them, and they have no incentive to make it easy to get back out.
This workflow of information transfer has several concerning side effects:
We don't store or have ready access to our own data
There is no motivation for data consistency between platforms
There is a gross level of data redundancy
Aggregated data about people is fragmented along company lines
There is a data war going on out there, with the Internet giants like Facebook, Google, Amazon, and Apple coming out on top, fiercely fighting each other to try to gain the largest market share of personal data. We, as individuals, are the civilian casualties of this data war, going about our lives with our digital information parceled up into bits and spread around by the tide of the most compelling market forces.
I want this to change, and I believe it's not particularly technically challenging to do.
I envision a world where each individual has a personal API that serves up their information and that they control. This API can be provided by a server they host at home, or by a cloud hosting provider, but either way, the server would be owned by the individual. This API would serve as the interface between third parties and an individual's digital information. Such an infrastructure, if done properly, could yield interactions of this form:
You find a new service you want to try, e.g. a video streaming service.
You sign up for the service by giving the streaming service your domain name (e.g. myname.com)
The service queries your API to get permission to access your basic information (name, phone number, e-mail) as well as your movie preferences
You give it permission
The service queries your API to get permission to add new movie watching history and/or movie ratings
You give it permission (optionally)
Movie recommendations from this service now incorporate all movie preferences you have, and future services can now access movie watching/rating history from this service
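To make the steps above a little more concrete, here is a minimal Python sketch of the permission flow. Everything in it is an assumption made for illustration: the `PersonalAPI` class, the `approve` callback that stands in for the owner clicking "allow", the category names, and the `"read"`/`"write"` modes are all hypothetical, and a real version would sit behind a network endpoint with proper authentication rather than live in one process.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class PersonalAPI:
    """Hypothetical in-memory stand-in for one person's data API."""
    owner_domain: str
    # Called as approve(service, category, mode); returns the owner's decision.
    approve: Callable[[str, str, str], bool]
    data: Dict[str, List[dict]] = field(default_factory=dict)
    grants: Set[Tuple[str, str, str]] = field(default_factory=set)

    def request_access(self, service: str, category: str, mode: str) -> bool:
        """A service asks for read or write access; the owner decides."""
        if mode not in ("read", "write"):
            raise ValueError("mode must be 'read' or 'write'")
        if self.approve(service, category, mode):
            self.grants.add((service, category, mode))
            return True
        return False

    def read(self, service: str, category: str) -> List[dict]:
        """Return a copy of the records, only if a read grant exists."""
        if (service, category, "read") not in self.grants:
            raise PermissionError(f"{service} may not read {category}")
        return list(self.data.get(category, []))

    def append(self, service: str, category: str, record: dict) -> None:
        """Add a record tagged with its source, only if a write grant exists."""
        if (service, category, "write") not in self.grants:
            raise PermissionError(f"{service} may not write {category}")
        self.data.setdefault(category, []).append({**record, "source": service})


def owner_decision(service: str, category: str, mode: str) -> bool:
    # Stand-in for a real prompt: allow everything except writes to contact info.
    return not (category == "contact" and mode == "write")


api = PersonalAPI(
    owner_domain="myname.com",
    approve=owner_decision,
    data={"movie_ratings": [
        {"title": "Silicon Valley", "rating": 5, "source": "old-service"},
    ]},
)

# The new streaming service asks for permission, then reads existing ratings
api.request_access("new-streaming", "movie_ratings", "read")
api.request_access("new-streaming", "movie_ratings", "write")
ratings = api.read("new-streaming", "movie_ratings")

# ...and later writes back a new rating that future services can also see.
api.append("new-streaming", "movie_ratings", {"title": "Example Movie", "rating": 4})
```

The point of the sketch is only that the grants live with the individual, not with the service: a second service that has not been approved gets a `PermissionError` instead of the data.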
You can imagine this kind of workflow for all kinds of information - health, social media (imagine - API to API interactions between friends), demographics, payment, contact information, etc.
Of course, there will be a great deal of resistance from the Internet giants to adopting this kind of workflow - what incentive do they have for interfacing with your API? But I think the solution is straightforward here - if consumers tend to choose services that interface with their API, larger companies will have to adopt it if they hope to continue to compete. The hardest part is getting started - getting consumers to like this infrastructure, giving consumers benefits for using it, and getting a few services with high traction to adopt the workflow.
Another benefit of this workflow, for both individuals and companies, is that online services are no longer in the business of hoarding data, but rather of providing unique and desirable services to individuals. The leader in social media, health data analysis, or video streaming will no longer be the one with the most data, but the one that provides the best user experience. And isn't that what we all want? This is already the trend in history - to always move towards better consumer goods, with more production effort spent on novelty and less on repetition.
Fast forward a few decades, and picture for a moment what kinds of services and interactions could exist, if each individual had their own personal API:
Data entry becomes an outdated chore, both for individuals and for businesses interfacing with consumer data. Applying for a job, or a credit card, or a home loan, becomes one button.
You sign up for a new service (new augmented reality interactive TV shows!), and it can immediately give you recommendations by looking at your other preferences (books, movies, etc.) and recommending what other people like you enjoy
Health analytics platforms can make recommendations about lifestyle choices based on what other people with health backgrounds similar to yours have found to be helpful
Marketing firms, instead of paying a third party to market to people based on demographics, could now query, advertise to, and pay individuals directly, if they choose to enroll
When your calendar and entertainment preferences can be shared with your friends, finding fun things to do becomes trivial (e.g. "Give me movie recommendations and times to go see with some friends this weekend")
I'm sure there are countless other applications, but this is the flavor of things that I foresee as possible, in a world where we own and manage our data.
This is what I hope to accomplish. And I'm going to need a lot of help. Who wants to join me?