In the future will security metadata be larger than the data it is protecting?
I'm no expert, but... there is a problem that I've been trying to figure out for months: ownership of data. We post pictures to Flickr and status updates about our lives to Facebook, companies store customer information in Salesforce.com, and we store documents in Box.net. The cloud... we are putting more and more information in the cloud. It is fantastic!
We are trusting the YouTubes, the Facebooks, the Salesforces of the world to be the guardians of our information, and this comes at a risk. There have been incidents where bugs in the software of such services have accidentally exposed data. For many organizations this poses a question: what happens when laws and regulations oblige them to control access to this data, yet they don't actually own the systems where it resides?
Then think of the hackers, attackers and script kiddies trying to get access to this data. Over the past 10 years the local computer has been the target of viruses, worms and trojan horses. This has led to the rise of companies like Symantec and McAfee, developing software to secure your local computer. Eventually they turned their security software to the servers and the network. As we store more and more of this information in the cloud, *cough* I mean internet, we are relying more and more on the security of those systems and we no longer have any real control over the files, yet we still "own" the data, don't we?
What is the definition of ownership of data? Some would even argue that nobody owns data and that we are simply stewards of it. If I take a photo with my digital camera, transfer it to my laptop and then upload it to Picasa, do I not own this image no matter where it lives? If I own something, should I have the right to define who can access it? If anyone? There have been many examples where kids have posted a photo onto Facebook and within seconds regretted it, but once information is out there, it's gone. Wikileaks is a classic example. One supposedly trusted person removes data from a trusted source and gives it to an untrusted party, and then once it hits the internet, hits the BitTorrent networks, it's gone. Forever beyond the control of the owner of the information. A recent ruling by the Press Complaints Commission (PCC) in the UK stated that Twitter posts are not private, which further extends the acceptance that we have little to no ability to really control who owns and accesses our data.
Of course there are technologies which address some of these problems now. Information Rights Management (IRM) is one technology that wraps documents and emails in encryption and access controls, so that you can share a document beyond the traditional enterprise security perimeter but only those with authorization can access it. But IRM has limitations: it only supports a specific list of document formats on a limited set of platforms. Whilst it does bring a lot of security and data ownership functionality over and above what can be done with unprotected content, there is still a lot to improve. On similar lines, a group of German researchers has created some software called X-Pire, which uses cryptography with images to allow people to control access to photos they upload to social networking sites. Of course the idea requires software that isn't built into the browser, operating system or device, and that limits the potential for widespread use.
If I think into the future, what would be the answer? What if every single piece of digital information a user generates carried a plethora of security metadata?
Imagine this: I pick up my iPad and authenticate with it. In some way I prove I am who I say I am. Now I'm going to quickly gloss over this stage, because that is an entirely different subject I'll be trying to understand another time. But let's say I've successfully created a trust relationship with my iPad, so it knows that when I type, it is me typing.
Now consider that EVERY character I type has my identity associated with it. Say I type into a Facebook status update "I love my wife." As I punch the "I" key, the device also stores, WITH the data, a time stamp, my GPS location, a unique identity for me and details of the device I was using. To ensure all of this is secured, let's say we sign it with a cryptographic key that is unique to me, let's say my DNA sequence. A single character, stored as ASCII, would use 8 bits of data. Now if we attached my DNA sequence and the other pieces of data, GPS location and such, that would be over 700MB of metadata for each 8 bits! This seems insane, a stupid idea... but is it?
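To put a rough number on that, here is a back-of-the-envelope sketch in Python. Every figure in it is my assumption for illustration: a human genome of roughly 3.1 billion base pairs packed at 2 bits per base, plus made-up field sizes for the time stamp, GPS location and device details.

```python
# Rough size estimate for the per-character metadata described above.
# All figures are assumptions for illustration, not measurements.

BASE_PAIRS = 3_100_000_000  # approximate length of a human genome (assumption)
BITS_PER_BASE = 2           # A, C, G, T fit in 2 bits each

TIMESTAMP_BYTES = 8         # 64-bit time stamp (assumption)
GPS_BYTES = 16              # latitude + longitude as two 64-bit floats (assumption)
DEVICE_ID_BYTES = 16        # 128-bit device identifier (assumption)
CHAR_BYTES = 1              # one ASCII character stored in 8 bits


def genome_bytes(base_pairs: int = BASE_PAIRS) -> int:
    """Bytes needed to store a raw DNA sequence at 2 bits per base."""
    return base_pairs * BITS_PER_BASE // 8


def metadata_bytes() -> int:
    """Total metadata attached to a single typed character."""
    return genome_bytes() + TIMESTAMP_BYTES + GPS_BYTES + DEVICE_ID_BYTES


def overhead_ratio() -> float:
    """How many bytes of metadata travel with each byte of actual data."""
    return metadata_bytes() / CHAR_BYTES


if __name__ == "__main__":
    print(f"{metadata_bytes() / 1e6:.0f} MB of metadata per character")
    print(f"overhead ratio: {overhead_ratio():,.0f} to 1")
```

Under those assumptions the DNA sequence completely dominates; the time stamp, location and device details are a rounding error next to it.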
Imagine if that much data was associated with everything I did and was stored everywhere that data went. In a court of law you would have absolute proof that someone wrote something, because there would be so much evidence tied to the data.
It could get crazy very quickly. What if someone cuts and pastes the status update from Facebook to their own blog? Would it copy all my data and add their data and action to it, so that we could have a record of who copied what, from whom, when and where to? Using the entire DNA sequence would be overkill; we could use a hash to generate some unique data and reduce the overhead. But still we would be talking about a 100 or 1000 times increase in data size for a simple piece of text.
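Here is a minimal sketch of that hashed, copy-tracking idea in Python, using only the standard library. Everything in it is hypothetical: the field names, the `make_record` and `verify_record` helpers, and the placeholder identity blob are mine. An HMAC with a shared secret stands in for a real digital signature, which would need a public/private key pair.

```python
import hashlib
import hmac
import json
from typing import Optional


def identity_digest(identity_blob: bytes) -> str:
    """Shrink a huge identity blob (e.g. a DNA sequence) to a 32-byte hash."""
    return hashlib.sha256(identity_blob).hexdigest()


def _payload(body: dict) -> bytes:
    # Canonical JSON so the same record always produces the same bytes.
    return json.dumps(body, sort_keys=True).encode("utf-8")


def make_record(text: str, identity_blob: bytes, key: bytes, action: str,
                timestamp: float, location: list, device: str,
                parent: Optional[dict] = None) -> dict:
    """Wrap text in security metadata and tag it. HMAC is a stand-in for a signature."""
    body = {
        "text": text,
        "actor": identity_digest(identity_blob),
        "action": action,                      # "create" or "copy"
        "timestamp": timestamp,
        "location": location,                  # [latitude, longitude]
        "device": device,
        "parent": parent["tag"] if parent else None,  # link back to the source record
    }
    body["tag"] = hmac.new(key, _payload(body), hashlib.sha256).hexdigest()
    return body


def verify_record(record: dict, key: bytes) -> bool:
    """Recompute the tag over everything except the tag itself."""
    body = {k: v for k, v in record.items() if k != "tag"}
    expected = hmac.new(key, _payload(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["tag"])


def overhead(record: dict) -> float:
    """Size of the whole record relative to the text it protects."""
    return len(_payload(record)) / len(record["text"].encode("utf-8"))


if __name__ == "__main__":
    original = make_record("I love my wife.", b"<my identity data>", b"my-secret",
                           "create", 1300000000.0, [51.5, -0.12], "iPad")
    copy = make_record(original["text"], b"<their identity data>", b"their-secret",
                       "copy", 1300000600.0, [40.7, -74.0], "laptop", parent=original)
    print(verify_record(original, b"my-secret"), copy["parent"] == original["tag"])
    print(f"overhead: {overhead(original):.1f}x")
```

Each copy records who did it and points back at the tag of the record it came from, so the chain of who copied what from whom can be walked backwards. The exact overhead depends on how long the text is and which fields you decide to keep.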
Then extend this idea to images. Imagine each and every pixel has an enormous amount of metadata associated with it. Google Maps already has to respond to people wanting their faces removed from images. I noticed that all the car number plates are automatically (I hope, or does Google have a large room of people looking for car number plates?) removed from images. Imagine if each pixel was able to carry metadata that identified it as being part of an image which represented you and therefore you had a way to access and verify this data.
Then what about moving images? TV and film?
Is this even possible? Increases in storage would need to be immense, CPU computing power would need to increase, and how do we verify all this information? Where does the verifying service live? Do I own it? But let's go back 50 years and explain to someone that, one day, they will be able to capture an image in digital form, put it on a website and within seconds have it indexed by a search engine and available for viewing by millions of connected people all over the world. 50 years ago it would have been hard to comprehend how such a system could be in place, never mind one simple enough that a 70-year-old could use it.
In 50 years, if the laws of progression we see today hold true, storing megabytes of metadata for bits of information may seem possible when petabit storage is standard on your cell phone. It could be possible that the metadata which describes and defines the security of the data itself may outweigh the data by a significant factor. Such an idea is also going to need significant changes to the hardware to allow this metadata to be created, processed and acted upon. It would need a big hardware vendor to team up with a security vendor to bring these worlds closer together... Possible? Maybe...
Into the wilderness…
I've spent 12 years working with security technologies. I started out my professional career building the first steps into the world of ecommerce for one of the world's largest computer manufacturers. I then went on to a start-up company as a developer working on a document security (IRM) solution. Over 8 years I went from development, to leading a QA team, to support, to consulting and finally technical sales. In this time I've learned a few things here and there about security and picked up a CISSP. The start-up was ultimately acquired by one of the big technology companies, where I worked for a few years. Recently I needed a change of scene, a breath of fresh air, and did the rounds interviewing with many security companies for a variety of security related roles. Over a few months people kept asking me questions like these:
- Where do you see yourself in 5/10/15/100 years?
- Explain your ideas for improving/selling/developing our security technology.
- What are your views on identity/cloud security?
I started to wonder: what DID I really think?
I ended up with some odd ideas, some stupid ideas and some I thought might have some legs. So I wanted to start writing down some of these ideas and in the process of formatting, editing and reviewing my notes I could see if I was onto something new. Now in this age of technology it would be a shame not to let others comment, applaud or most likely, abuse the writings and hence this blog exists.
So don't expect anything other than my wild opinion and don't expect anything written to be "correct". Do expect me to explore things that may, or may not be sensible, plausible or possible.
So... here we go...