[midPoint] Blog: Plans for Data Provenance

Radovan Semancik radovan.semancik at evolveum.com
Tue Jan 28 10:14:18 CET 2020


Dear midPoint community,

Today is Data Protection Day 
<https://en.wikipedia.org/wiki/Data_Privacy_Day>, which is a very 
symbolic day for midPoint. We take data protection and privacy 
very seriously. We believe that privacy in cyberspace is necessary 
for a free society to flourish. Despite that belief, we acknowledge 
that implementing privacy and data protection may not be easy. But 
we are not afraid of challenges. We are fully committed to implementing 
privacy and data protection features in midPoint.

MidPoint was still quite young when we realized that data 
protection and identity management are in a very intimate relationship. 
Identity management and governance systems are in a perfect position to 
control the flow of identity data. And the essence of data protection is 
about controlling the flow and especially the /use/ of data. In fact, we 
believe that any practical data protection solution must be supported by 
identity management infrastructure. Many people see data protection as 
a liability. But we believe that data protection can be turned into a 
substantial advantage when it is implemented properly.

This belief led us to several experiments with data protection 
functionality. We started several years ago. We presented some of 
the results at FOSDEM’18 <https://evolveum.com/fosdem-2018/>. We 
implemented several experimental features for data protection, such as 
consent management and even more general management of lawful bases for 
data processing 
<https://wiki.evolveum.com/pages/viewpage.action?pageId=24675100>. 
Unfortunately, there was almost no interest in those features in the 
industry and we were not able to secure sufficient funding to finish all 
of them. Some smaller pieces are implemented, but there is still a long 
way to go to get a complete set of data protection functionality.

However, we are not giving up. Now we plan to implement a very important 
feature that has many facets and many practical uses: Data Provenance 
<https://wiki.evolveum.com/display/midPoint/Data+Provenance>. There is 
one big problem that is common to data protection and identity 
management. It is the problem of data /origin/ or /provenance/. The problem 
can be described by something that every identity engineer knows only 
too well: /In a sufficiently large system nobody has any idea where the 
data came from and how they ended up here./ There are so many source 
systems, mappings, data transformations and information flows that the 
resulting system resembles the proverbial labyrinth.

The provenance problem is causing a lot of troubleshooting nightmares. 
This problem slows down IDM deployments and complicates maintenance. 
But it is a complete disaster for data protection. /Accountability/ is 
one of the basic pillars of data protection. And how good is your 
accountability if you have no idea where your data came from?

We have had the provenance problem in our sights for a really long time. 
In fact, one of the earliest data structures we use to manage 
identity data contains a notion of /origin/. But we realized quite 
early that this is much more difficult than it seems. The ideas were 
brewing in our minds for quite a long time. But now we hope it is finally 
the time to do this, and to do it properly. Therefore, we plan to implement 
data provenance features in the next couple of midPoint versions. This is 
still not completely certain. There are still some variables, including 
the most important enabler: funding. But our hopes are high. Because 
some things /are/ certain. Such as the importance of data protection. 
For all of us.

(Reposted from Evolveum blog 
<https://evolveum.com/plans-for-data-provenance/>)

-- 
Radovan Semancik
Software Architect
evolveum.com
