[midPoint] Blog: Plans for Data Provenance
Radovan Semancik
radovan.semancik at evolveum.com
Tue Jan 28 10:14:18 CET 2020
Dear midPoint community,
Today is Data Protection Day
<https://en.wikipedia.org/wiki/Data_Privacy_Day>, which is a very
symbolic day for midPoint. We take data protection and privacy
very seriously. We believe that privacy in cyberspace is necessary
for a free society to flourish. Despite that belief, we acknowledge
that implementing privacy and data protection may not be easy. But
we are not afraid of challenges. We are fully committed to implementing
privacy and data protection features in midPoint.
MidPoint was still quite young when we realized that data
protection and identity management are in a very intimate relationship.
Identity management and governance systems are in a perfect position to
control the flow of identity data. And the essence of data protection is
about controlling the flow and especially the /use/ of data. In fact, we
believe that any practical data protection solution must be supported by
identity management infrastructure. Many people see data protection as
a liability. But we believe that data protection can be turned into a
substantial advantage when it is implemented properly.
This belief led us to several experiments with data protection
functionality. We started several years ago. We presented some of
the results at FOSDEM’18 <https://evolveum.com/fosdem-2018/>. We
implemented several experimental features for data protection, such as
consent management and even more general management of lawful bases for
data processing
<https://wiki.evolveum.com/pages/viewpage.action?pageId=24675100>.
Unfortunately, there was almost no interest in those features in the
industry and we were not able to secure sufficient funding to finish all
of them. Some smaller pieces are implemented, but there is still a long
way to go to get a complete set of data protection functionality.
However, we are not giving up. Now we plan to implement a very important
feature that has many facets and many practical uses: Data Provenance
<https://wiki.evolveum.com/display/midPoint/Data+Provenance>. There is
one big problem that is common to data protection and identity
management. It is the problem of data /origin/ or /provenance/. The problem
can be described by something that every identity engineer knows only
too well: /In a sufficiently large system nobody has any idea where the
data came from and how they ended up here./ There are so many source
systems, mappings, data transformations and information flows that the
resulting system resembles the proverbial labyrinth.
The provenance problem is causing a lot of troubleshooting nightmares.
This problem slows down IDM deployments and complicates the maintenance.
But it is a complete disaster for data protection. /Accountability/ is
one of the basic pillars of data protection. And how good is your
accountability if you have no idea where your data came from?
We have had the provenance problem in our sights for a really long time. In
fact, one of the earliest data structures we use to manage
identity data contains a notion of /origin/. But we realized quite
early that this is much more difficult than it seems. The ideas were brewing
in our minds for quite a long time. But now we hope it is finally the
time to do this, and to do it properly. Therefore, we plan to implement
data provenance features in the next couple of midPoint versions. This is
still not completely certain. There are still some variables, including
the most important enabler: funding. But our hopes are high. Because
some things /are/ certain. Such as the importance of data protection.
For all of us.
(Reposted from Evolveum blog
<https://evolveum.com/plans-for-data-provenance/>)
--
Radovan Semancik
Software Architect
evolveum.com