[midPoint] The Story of MidPoint's Super Scalability
Evolveum Marketing
vera at evolveum.com
Fri Dec 17 17:01:01 CET 2021
Dear midPoint community,
As we announced earlier, the midScale project has recently finished.
Step into the weekend with a thrilling story of midPoint’s super
scalability possibilities and the challenges we met on the way!
The MidScale <https://docs.evolveum.com/midpoint/projects/midscale/>
project has recently finished. The project aimed to increase midPoint
scalability, performance and manageability to support large and complex
midPoint deployments. The project was a success! Yet, it was far from
easy.
When midPoint started a decade ago, the primary target was a mid-size
enterprise with thousands of identities to manage. It made perfect
sense back then: that was the scale we could handle, from both a business
and a technology perspective. However, the world is a different place now.
Deployments reaching beyond millions of managed identities are much more
common. As our customers have changed, we have changed as well. MidPoint
had to adapt to the new environment.
We have been working on midPoint performance improvements for years.
The first results were delivered in 2018, when “Watt
<https://docs.evolveum.com/midpoint/release/3.8/>” was released.
However, at that time we fully realized that there was a component
limiting our potential. The midPoint data storage layer (which we call
“repository”) was built in a generic way, supporting several database
engines. However, every abstraction has its cost. Supporting many
databases with the same code meant that we were doomed to mediocrity. It
was very difficult to take advantage of any database-specific features.
Every improvement we made had to be implemented and tested for all the
supported databases. The effort was prohibitively high, and the results
were somewhat disappointing. We realized that this was not the way to go.
The way forward was quite clear. As the support for many databases
dragged us down, we had to specialize in a single database. The choice
of the database engine was quite clear as well. MidPoint is an open
source platform, therefore we had to choose an open source database.
PostgreSQL was an obvious choice. The approach was clear as well. A
decade ago, when midPoint was designed, we anticipated that we might
need to rework our “repository” code. In fact, that had already happened
once. Therefore, the plan was to do it again. This time, we would take
full advantage of PostgreSQL features. We had everything we needed.
Except for two little things, those two notorious troublemakers: time
and money.
Fortune favors the prepared. In 2019 we came across NGI_TRUST
<https://www.ngi.eu/ngi-projects/ngi-trust/>. We had very little
experience with European community funding and, coming from Eastern
Europe, most of our experiences had been quite negative. Therefore we did
not know what to expect. However, NGI_TRUST looked good, and we decided
to submit a proposal. The proposal was accepted, and the MidPrivacy:
Data Provenance Prototype
<https://docs.evolveum.com/midpoint/projects/midprivacy/phases/01-data-provenance-prototype/>
project started. The project went well, and it was a success. After
that, we were prepared for a bigger challenge. We took the chance, and
we submitted a proposal for MidScale
<https://docs.evolveum.com/midpoint/projects/midscale/>. The proposal
was not accepted immediately and the committee kept us in suspense for
quite some time. Fortunately, the proposal was accepted at last, and the
project took off.
The project was a challenge from the beginning. For various reasons,
we got the green light a month later than originally planned. This was a
complication, as the original plan was to synchronize the project with
the midPoint development cycle. Also, midScale was meant to be the very
last project of the funding program, therefore our project had to be
finished exactly on time, not a day later. The delay disrupted the
project plan from the very start. Yet, due to the rules of the funding
program, we were not able to change the plan. This created a challenge
that rolled through the entire project, from milestone to milestone. We
added a few more people (including myself) to the project, fully funded
by Evolveum, on top of the original budget. This helped to smooth out
the project progress, and we were back on track. With a good deal of
flexibility, management acrobatics, and a dash of personal heroism, we
managed to keep things going according to plan.
Of course, the repository replacement was the most challenging part of
the project. We never expected this to be easy. However, the amount
of work was still surprising, well beyond what we had planned for. More
flexibility, management acrobatics and heroism did it, and in the end we
had a brand-new, lemon-scented, native PostgreSQL repository
implementation
<https://docs.evolveum.com/midpoint/reference/repository/native-postgresql/>.
While the repository was a crucial part, it would not boost midPoint
scalability by itself. We have significantly improved (read:
reworked beyond recognition) management of distributed tasks, improving
horizontal scalability. There were performance improvements in almost
every part of midPoint, from the low-level data representation libraries
all the way to the user interface. Error detection and handling were
improved and many bugs were fixed, including those nasty multi-threading
issues, improving robustness. MidPoint is now a much more scalable,
faster and more reliable system.
However, much more than raw power is needed to run a large-scale
identity management and governance deployment. Identity management is,
quite obviously, all about management of identities. Therefore we had to
improve manageability and overall visibility of midPoint. There are
numerous diagnostic improvements in many parts of the system, most
notably in the task management subsystem. A brand-new Axiom Query
Language
<https://docs.evolveum.com/midpoint/reference/concepts/query/axiom-query-language/>
was designed and implemented, providing the ability to construct complex
queries in a (reasonably) human-friendly way. The user interface was
improved, providing a much better user experience. On top of the original
project plan, there are improved dashboards and native reports. New
connectors can be auto-loaded now, reducing downtime. Large midPoint
deployments are much easier to manage than they were a year ago.
None of this would be possible without testing. We have had automated
tests for ages. However, the tests mostly focused on functionality.
There were only a handful of performance-oriented tests, and we could
not do much more in our rudimentary testing environment. Designing and
building the new testing environment
<https://docs.evolveum.com/midpoint/projects/midscale/infrastructure/>
was an essential activity in the midScale project. The environment
turned out to be much better than we expected, yet it was also much
harder to build. It took a lot of time, with several improvement
rounds. This was
supplemented with major improvements to Schrödinger
<https://docs.evolveum.com/midpoint/tools/schrodinger/>, the framework
for automated testing of the user interface. The midPoint user interface
is quite a big and complicated piece, and Schrödinger was crucial to
keeping it in working condition. In the end, we got excellent testing
results
<https://docs.evolveum.com/midpoint/projects/midscale/performance-scalability-test-results/>.
It is officially confirmed that midPoint is much better now and ready
for the future.
The midScale project was finished on time and with excellent results.
Due to the management acrobatics, the project did not end with a
midPoint release; instead, the last milestone was a release candidate.
There were still some bugfixes to do before midPoint could be released.
At last, midPoint 4.4 “Tesla”
<https://docs.evolveum.com/midpoint/release/4.4/> has recently been
released. Tesla follows up on the Faraday
<https://docs.evolveum.com/midpoint/release/4.3/> release, which brought
some results of the midScale project to the community. MidPoint 4.4 “Tesla”
<https://docs.evolveum.com/midpoint/release/4.4/> will be a major
milestone in midPoint history. It is also a long-term support
<https://docs.evolveum.com/support/long-term-support/> release,
therefore Tesla will be with us for quite a long time.
The midScale project has been completed, yet the work continues. This is
only the start. We will further improve midPoint in the following
releases. There is also a lot of work on the business side,
documentation, practices, and a lot of other things. Software
development never ends.
(Written by Radovan Semancik, reposted from Evolveum blog
<https://evolveum.com/midscale-is-finished/>)
--
Veronika Kolpascikova
Marketing Specialist
evolveum.com