[midPoint] Updates can get lost during a running recomputation task (SOLVED)
Pavol Mederly
mederly at evolveum.com
Wed Feb 7 15:04:55 CET 2018
Wow. This is a bit beyond my current competence. I am afraid that the
only way to check it is to try it. :)
If you can build midPoint from sources, could you just switch the scroll
mode to TYPE_SCROLL_SENSITIVE and try that? I could write a special test
to do that, but ... :) it is hard to find the time.
Pavol Mederly
Software developer
evolveum.com
On 07.02.2018 14:41, Arnošt Starosta - AMI Praha a.s. wrote:
> That might be the root of the problem!
>
> The searchObjectsIterative method starts a search operation, and
> each object - as soon as it's returned from the repository - is
> handled by the resultHandler. (Using ScrollableResults - as can be
> seen here
> <https://github.com/Evolveum/midpoint/blob/42a1a66e93347d8c8b30624a574e7dfaf3743e88/repo/repo-sql-impl/src/main/java/com/evolveum/midpoint/repo/sql/helpers/ObjectRetriever.java#L680>.)
>
> where 'here' is
>
> ScrollableResults results = rQuery.scroll(ScrollMode.FORWARD_ONLY);
>
> The JDBC spec says this about TYPE_FORWARD_ONLY:
>
> "The rows contained in the result set depend on how the underlying
> database materializes the results. That is, it contains the rows that
> satisfy the query at either the time the query is executed or as the
> rows are retrieved"
>
> If we wanted to see the changes, we would have to use
> TYPE_SCROLL_SENSITIVE.
>
> What I don't understand is how this plays together with transaction
> isolation settings. Does specifying ResultSet type override them or is
> it the other way around? No time to read the whole spec :/
>
> arnost
>
> I do not know how this works internally in Hibernate, the JDBC driver,
> and the DBMS itself. But I suppose that if there's any
> caching/chunking/prefetching there, it does not gather all objects
> before processing them.
>
> Anyway, I think we can implement the OID processing. (But it's not
> me who decides about the budgets :))
>
> Pavol Mederly
> Software developer
> evolveum.com
>
> On 07.02.2018 13:45, Arnošt Starosta - AMI Praha a.s. wrote:
>> Hi Pavol,
>>
>> that unintended workaround saved my life for the moment :)
>>
>> Not sure if "fetches objects one-after-another" makes the picture
>> clear. As I understand it, the default reading workflow runs as a
>> single query: all objects with full details come back in a single
>> result set that the handlers process one by one. I don't know how
>> fetching rows from the result set works.
>>
>> Tweaking the transaction isolation did not really help, even with
>> the default set to 'read committed'. That's why I think the object
>> 'fetching' happens in larger chunks and may not be affected by
>> weaker transaction isolation. Or maybe I just misconfigured it.
>>
>> Working with OIDs in iterative tasks would be great! You want the
>> worker threads to process 'that object', not 'this chunk of data'.
>>
>> The Jira issue is already there:
>> https://jira.evolveum.com/browse/MID-4414
>>
>> arnost
>>
>> 2018-02-07 12:17 GMT+01:00 Pavol Mederly <mederly at evolveum.com>:
>>
>> Hello Arnošt,
>>
>> this is a good observation.
>>
>> To be honest, iterative search by paging was meant as a
>> workaround for databases that do not support search with
>> subsequent modify operations on the returned objects. But, as
>> we see from your message, it can be used to avoid these
>> problems as well :)
>>
>> Just a slight correction:
>>
>>> Midpoint in default configuration recomputes objects by
>>> first retrieving them ALL from repository, then passing each
>>> object to a worker thread.
>> This is not quite true. MidPoint fetches objects
>> one-after-another, and just after fetching each one from the
>> repository it passes the object to a worker thread (or
>> processes it directly if there are no worker threads
>> defined). However, because of the quite strong transaction
>> isolation setting (serializable), the DBMS ensures that
>> changes made to objects after the transaction started
>> (i.e. after the search was started) are not reflected in
>> the values the search returns.
>>
>> I can imagine an option that would make this more optimized.
>> E.g. by retrieving just a list of OIDs and reading each
>> object just before its processing. If you have a second of
>> free time, you could create a jira for this.
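
The OID-based idea above can be sketched with a toy model. This is only an illustration under stated assumptions: `Repo`, `recompute`, and the object values are hypothetical stand-ins, not midPoint classes, and the concurrent update is simulated inline rather than by a second thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a recomputation task; not midPoint code. */
final class RecomputeByOidSketch {

    /** Hypothetical stand-in for the repository: OID -> object value. */
    static final class Repo {
        private final Map<String, String> objects = new LinkedHashMap<>();
        void put(String oid, String value) { objects.put(oid, value); }
        String get(String oid) { return objects.get(oid); }
        List<String> oids() { return new ArrayList<>(objects.keySet()); }
        Map<String, String> snapshot() { return new LinkedHashMap<>(objects); }
    }

    /** "Recomputation" here only appends a marker to the value it was given. */
    static String recompute(String value) { return value + "+recomputed"; }

    static Repo newRepo() {
        Repo repo = new Repo();
        repo.put("u1", "v1");
        repo.put("u2", "v1");
        return repo;
    }

    /** Works on full objects read when the search started. */
    static String runWithSnapshot() {
        Repo repo = newRepo();
        Map<String, String> snapshot = repo.snapshot();
        repo.put("u2", "v2"); // simulated concurrent update (e.g. live sync)
        for (Map.Entry<String, String> e : snapshot.entrySet()) {
            // Writes back a value derived from the stale copy.
            repo.put(e.getKey(), recompute(e.getValue()));
        }
        return repo.get("u2");
    }

    /** Reads only the OIDs up front and fetches each object just before processing. */
    static String runWithOids() {
        Repo repo = newRepo();
        List<String> oids = repo.oids();
        repo.put("u2", "v2"); // same simulated concurrent update
        for (String oid : oids) {
            repo.put(oid, recompute(repo.get(oid)));
        }
        return repo.get("u2");
    }

    public static void main(String[] args) {
        System.out.println("snapshot: " + runWithSnapshot()); // v1+recomputed (update lost)
        System.out.println("by OID:   " + runWithOids());     // v2+recomputed (update kept)
    }
}
```

Even in this model a small window remains between reading an object and writing it back, so fetching by OID narrows the problem rather than removing it.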
>>
>> Moreover, in 3.8 we loosen transaction isolation a bit, from
>> serializable to repeatable_read. But I think this will not
>> change this behavior.
>>
>> Pavol Mederly
>> Software developer
>> evolveum.com
>>
>> On 29.01.2018 13:22, Arnošt Starosta - AMI Praha a.s. wrote:
>>> *Problem:*
>>>
>>> Midpoint in default configuration recomputes objects by
>>> first retrieving them ALL from repository, then passing each
>>> object to a worker thread. If the object is updated
>>> meanwhile (e.g. live-synced or updated from the GUI) before it
>>> is recomputed by the worker thread, this update can be
>>> overwritten by the object version retrieved when the
>>> recompute task started. This happened on my deployment several
>>> times.
>>>
>>> *Is your deployment affected?*
>>>
>>> Hard to say; I don't see any relevant log message to check.
>>> I had to check by debugging the running recompute task and
>>> verifying that
>>> SqlRepositoryServiceImpl.searchObjectsIterative calls
>>> ObjectRetriever.searchObjectsIterativeByPaging (ok) and not
>>> ObjectRetriever.searchObjectsIterativeAttempt (can lose
>>> updates).
>>>
>>> Deployments with a MySQL or H2 backend should be ok with the
>>> default configuration (check the sources of
>>> SqlRepositoryConfiguration.computeDefaultIterativeSearchParameters).
>>> I did not verify this at runtime.
>>>
>>> *Solution:*
>>>
>>> Configure iterativeSearchByPaging and
>>> iterativeSearchByPagingBatchSize in the midpoint/repository
>>> element of config.xml. I don't know if all backends
>>> support this setting, but PostgreSQL (which I use) does.
>>>
>>> <configuration>
>>>   <midpoint>
>>>     <repository>
>>>       …
>>>       <iterativeSearchByPaging>true</iterativeSearchByPaging>
>>>       <iterativeSearchByPagingBatchSize>17</iterativeSearchByPagingBatchSize>
>>>       …
>>>     </repository>
>>>   </midpoint>
>>> </configuration>
>>>
>>> After setting these parameters, the objects to recompute are
>>> read in 'pages' and fed to worker threads until the request
>>> queue between the reader thread and worker threads is full;
>>> then the reader is blocked. The size of the queue is
>>> hardcoded as 2 * number-of-worker-threads.
>>>
>>> With iterativeSearchByPagingBatchSize set you can
>>> still lose updates, but the window in which this can
>>> happen shrinks from number-of-objects to max(page size,
>>> 2 * number-of-worker-threads) objects. Without much thought I set
>>> the page size to (2 * number-of-worker-threads) + 1.
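
The sizing arithmetic above can be sketched as follows. This is a minimal illustration, not midPoint code: the class and method names are hypothetical, and the figure of 8 worker threads is an assumption chosen only because it makes (2 * workers) + 1 equal to the page size of 17 shown in the configuration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Toy model of the reader/worker hand-over queue; names are hypothetical. */
final class ReaderQueueSketch {

    /** Queue size as described in the message: 2 * number-of-worker-threads. */
    static int queueCapacity(int workerThreads) {
        return 2 * workerThreads;
    }

    /** Upper bound on objects that may be stale: max(page size, queue size). */
    static int staleWindow(int pageSize, int workerThreads) {
        return Math.max(pageSize, queueCapacity(workerThreads));
    }

    /**
     * Counts how many objects of one page the reader can hand over before it
     * would block, assuming no worker has consumed anything yet.
     */
    static int acceptedBeforeBlocking(int pageSize, int workerThreads) {
        BlockingQueue<Integer> queue =
                new ArrayBlockingQueue<>(queueCapacity(workerThreads));
        int accepted = 0;
        for (int i = 0; i < pageSize; i++) {
            // offer() returns false when the queue is full; a real reader
            // would call put() here and block until a worker takes an item.
            if (!queue.offer(i)) {
                break;
            }
            accepted++;
        }
        return accepted;
    }

    public static void main(String[] args) {
        int workers = 8;                      // assumed, not stated in the thread
        int pageSize = 2 * workers + 1;       // 17
        System.out.println("queue capacity: " + queueCapacity(workers));
        System.out.println("accepted before blocking: "
                + acceptedBeforeBlocking(pageSize, workers));
        System.out.println("stale window: " + staleWindow(pageSize, workers));
    }
}
```

With these assumed numbers the queue holds 16 objects, the reader blocks on the 17th, and at most 17 objects at a time are exposed to a concurrent update.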
>>>
>>> good luck
>>> arnost
>>>
>>> --
>>>
>>> Arnošt Starosta
>>> solution architect
>>>
>>> gsm: [+420] 603 794 932
>>> e-mail: arnost.starosta at ami.cz
>>>
>>> AMI Praha a.s.
>>> Pláničkova 11
>>> 162 00 Praha 6
>>> tel.: [+420] 274 783 239
>>> web: www.ami.cz
>>>
>>> By the text of this e-mail, the signatory neither promises to
>>> conclude nor concludes any contract on behalf of AMI Praha a.s.
>>> Any contract, if concluded, must be made exclusively in writing.
>>>
>>> _______________________________________________
>>> midPoint mailing list
>>> midPoint at lists.evolveum.com
>>> http://lists.evolveum.com/mailman/listinfo/midpoint