What is Data Portability?

Recently, during the Data Portability Steering Group telephone conference, and earlier in various postings on the DP Action Group’s mailing lists and Skype chats, one thing that wasn’t really clear was what Data Portability actually is. Many people were invited to submit their view, either via a video (deadline March 31st) or by posting about it.
Now we have quite a range of definitions and ideas, and also a somewhat philosophical discussion about what “Data” and “Portability” mean.

Defining what Data Portability actually is matters, of course, because it enables people to finally work on a technical and policy blueprint. It will also make discussions easier, as there (hopefully) won’t be 10 people in the room with 10 different views on what they want to achieve.

My concern, though, is that if we try to define Data Portability very generally, we will still have these different views, because Data Portability may mean something different in different areas. Looking back, it seems that Data Portability actually evolved from ideas around making profile and social graph data more portable, so that a) you can more easily move your friends/contacts to another service, or b) signup becomes easier because your profile is fetched from somewhere else and your social graph is automatically prepopulated. This, at least, is what Michael Pick’s video is about.

Fields of Data Portability

Now I would like to take a more pragmatic route in defining what we work on: looking at specific fields and deciding which problems need to be solved for each of them. Examples of fields to look at might be:

  • Profile Data
  • Social Graph
  • Multimedia Assets (photos, videos, text, etc.)
  • Virtual World assets (objects, scripts)

and maybe many more.

Levels of Data Portability

Then we might also have some levels of Data Portability. For instance, my vision is what I described in my first podcast episode on the topic: being able to automatically synchronize my social graph across all sorts of networks. The same might be true for my profile data and so on. The problem, of course, is that making this happen is a big undertaking, and we are probably not able to tackle it right at the beginning. Thus I was thinking of different levels at which Data Portability could be implemented, such as:

  • Export (describe standards and policies for how and when data should be exportable)
  • Import (probably mostly a policy decision on when to allow people to import data from other services)
  • Discovery (provide some mechanism which takes note of where you have data stored and how to retrieve it)
  • Transfer (allow people to move their data from service A to B. Maybe mostly a policy decision)
  • Synchronization (as described above, automatically synchronize data sources with each other)

With these levels in place, we might then be able to apply them to the fields defined before and choose one of the resulting combinations to start with (like exporting profile data). Some of these levels are of course geared more towards the Policy Group (such as allowing import) and some more towards the technical group (what standards to use).
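To make the idea of combining fields and levels a bit more concrete, here is a minimal Python sketch. It is only an illustration: the names `Field`, `Level`, `MOSTLY_POLICY`, `all_combinations` and `describe` are made up for this post, and which levels count as "mostly policy" is simply my reading of the list above, not anything the groups have agreed on.

```python
from enum import Enum
from itertools import product


class Field(Enum):
    """Kinds of data we might want to make portable (from the list of fields)."""
    PROFILE_DATA = "Profile Data"
    SOCIAL_GRAPH = "Social Graph"
    MULTIMEDIA_ASSETS = "Multimedia Assets"
    VIRTUAL_WORLD_ASSETS = "Virtual World Assets"


class Level(Enum):
    """How far portability goes for a field (from the list of levels)."""
    EXPORT = "Export"
    IMPORT = "Import"
    DISCOVERY = "Discovery"
    TRANSFER = "Transfer"
    SYNCHRONIZATION = "Synchronization"


# Assumption: the levels described above as "mostly a policy decision".
MOSTLY_POLICY = {Level.IMPORT, Level.TRANSFER}


def all_combinations():
    """Return every (field, level) pair we could pick as a starting point."""
    return list(product(Field, Level))


def describe(field, level):
    """Give a one-line summary of a combination and who would mainly work on it."""
    group = "mostly policy" if level in MOSTLY_POLICY else "technical and policy"
    return f"{level.value} of {field.value} ({group})"


if __name__ == "__main__":
    # The starting point suggested in the text: exporting profile data.
    print(describe(Field.PROFILE_DATA, Level.EXPORT))
```

Picking a starting point then just means choosing one pair from `all_combinations()`, for example `(Field.PROFILE_DATA, Level.EXPORT)`.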

Here is a little graphic I made with this idea of fields and levels in mind. I also added some technical standards where I think they might play a part, but of course it’s only a sketch. There is also nothing under import, because I think import is mostly a policy problem once export exists. Of course Discovery (which is missing here right now) can also be part of the import process, for querying the places to import from.

Then there might be general principles we can apply: basic guidelines we should always follow, such as considering privacy implications, and that exported data should maybe not be publicly visible unless the user has chosen so. There are also questions like “who owns what data”, but I am not sure we can answer them. Maybe we can say “profile data is owned by the user associated with this profile”, but what then about social graph data? It is, after all, shared data. So maybe we can sidestep this difficult question altogether and instead discuss what settings a user should have available, like defining for each field at least whether it’s public or visible only to authorized users, and whether it should be exportable.
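Such per-field settings could look roughly like the following Python sketch. Everything in it is hypothetical: `FieldSetting`, `visible_fields` and `exportable_fields` are names I made up, and treating a field without explicit settings as private and not exportable is an assumption that follows the "not publicly visible unless the user has chosen so" guideline.

```python
from dataclasses import dataclass


@dataclass
class FieldSetting:
    """User-chosen settings for a single profile field."""
    public: bool = False       # False: only authorized users can see the field
    exportable: bool = False   # False: the field is left out of exports


def visible_fields(profile, settings, viewer_authorized):
    """Return the part of the profile that a given viewer is allowed to see."""
    result = {}
    for name, value in profile.items():
        # Assumption: fields without explicit settings fall back to the defaults.
        setting = settings.get(name, FieldSetting())
        if setting.public or viewer_authorized:
            result[name] = value
    return result


def exportable_fields(profile, settings):
    """Return only the fields the user has marked as exportable."""
    return {
        name: value
        for name, value in profile.items()
        if settings.get(name, FieldSetting()).exportable
    }


if __name__ == "__main__":
    profile = {"name": "Alice", "email": "alice@example.org", "city": "Aachen"}
    settings = {
        "name": FieldSetting(public=True, exportable=True),
        "email": FieldSetting(public=False, exportable=True),
    }
    print(visible_fields(profile, settings, viewer_authorized=False))
    print(exportable_fields(profile, settings))
```

The nice thing about this approach is that it never has to say who owns a field; it only records what the user has allowed for it.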

So to summarize all this, my proposal would be to

  1. define general principles which we want to follow
  2. define fields and levels and select one for now
  3. define technical and policy blueprints to cover it.

So much for my $0.02 ;-) What do you think? Comment here or call in at +1 (206) 350-4566 and be part of my next podcast episode :-)

Oh, and of course comment also on the Data Portability mailing list!
