[midPoint] Updates can get lost during a running recomputation task (SOLVED)

Arnošt Starosta - AMI Praha a.s. arnost.starosta at ami.cz
Wed Feb 7 14:41:16 CET 2018


That might be the root of the problem!

> The searchObjectsIterative method starts a search operation, and each
> object - as soon as it's returned from the repository - is handled by the
> resultHandler. (Using ScrollableResults - as can be seen here
> <https://github.com/Evolveum/midpoint/blob/42a1a66e93347d8c8b30624a574e7dfaf3743e88/repo/repo-sql-impl/src/main/java/com/evolveum/midpoint/repo/sql/helpers/ObjectRetriever.java#L680>
> .)
>
where 'here' is:

ScrollableResults results = rQuery.scroll(ScrollMode.FORWARD_ONLY);

The JDBC spec says this about TYPE_FORWARD_ONLY:

"The rows contained in the result set depend on how the underlying database
materializes the results. That is, it contains the rows that satisfy the
query at either the time the query is executed or as the rows are retrieved."

If we wanted to see the changes, we would have to ask for
TYPE_SCROLL_SENSITIVE, provided the driver supports it.
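
To make the TYPE_SCROLL_SENSITIVE idea concrete, here is a minimal plain-JDBC
sketch of asking for such a cursor. It is not the midPoint code path (that
goes through Hibernate's ScrollableResults); the class name
SensitiveCursorSketch and both helper methods are made up for illustration.
Whether a sensitive cursor really shows rows changed by other transactions
still depends on the driver and on the isolation level, so treat this as an
experiment, not a fix.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Illustration only: plain JDBC, not the Hibernate path midPoint uses. */
class SensitiveCursorSketch {

    /** Returns a readable name for a java.sql.ResultSet type constant. */
    static String typeName(int type) {
        switch (type) {
            case ResultSet.TYPE_FORWARD_ONLY:
                return "TYPE_FORWARD_ONLY";
            case ResultSet.TYPE_SCROLL_INSENSITIVE:
                return "TYPE_SCROLL_INSENSITIVE";
            case ResultSet.TYPE_SCROLL_SENSITIVE:
                return "TYPE_SCROLL_SENSITIVE";
            default:
                return "unknown(" + type + ")";
        }
    }

    /**
     * Asks the driver for a scroll-sensitive, read-only result set.
     * The caller must close the result set and its statement
     * (reachable through rs.getStatement()).
     */
    static ResultSet openSensitive(Connection conn, String sql) throws SQLException {
        DatabaseMetaData meta = conn.getMetaData();
        if (!meta.supportsResultSetType(ResultSet.TYPE_SCROLL_SENSITIVE)) {
            throw new SQLException("driver does not support TYPE_SCROLL_SENSITIVE");
        }
        Statement st = conn.createStatement(
                ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_READ_ONLY);
        ResultSet rs = st.executeQuery(sql);
        // A driver may still hand back a weaker type, so report what we got.
        System.out.println("result set type: " + typeName(rs.getType()));
        return rs;
    }

    public static void main(String[] args) {
        // No database needed here: just list the three standard types.
        System.out.println(typeName(ResultSet.TYPE_FORWARD_ONLY));
        System.out.println(typeName(ResultSet.TYPE_SCROLL_INSENSITIVE));
        System.out.println(typeName(ResultSet.TYPE_SCROLL_SENSITIVE));
    }
}
```

Calling openSensitive needs a live connection; main only prints the constant
names so that the snippet runs without a database.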

What I don't understand is how this plays together with the transaction
isolation settings. Does specifying the ResultSet type override them, or is
it the other way around? No time to read the whole spec :/

arnost


> I do not know how this works internally in Hibernate, the JDBC driver and
> the DBMS itself. But I suppose that if there's any
> caching/chunking/prefetching there, it does not gather all objects before
> processing them.
>
> Anyway, I think we can implement the OID processing. (But it's not me who
> decides about the budgets :))
>
> Pavol Mederly
> Software developer
> evolveum.com
>
> On 07.02.2018 13:45, Arnošt Starosta - AMI Praha a.s. wrote:
>
> Hi Pavol,
>
> that unintended workaround saved my life for the moment :)
>
> Not sure if "fetches objects one-after-another" makes the picture clear.
> As I understand it, the default reading workflow runs a single query: all
> objects with full details come back in a single result set, which the
> handlers then process one by one. I don't know how fetching rows from the
> result set works.
>
> Tweaking the transaction isolation did not really help, even with the
> default set to 'read committed'. That's why I think the object 'fetching'
> happens in larger chunks and may not be affected by weaker transaction
> isolation. Or maybe I just misconfigured it.
>
> Working with oids in iterative tasks would be great! You want the worker
> threads to process 'that object' not 'this chunk of data'.
>
> The jira is already there - https://jira.evolveum.com/browse/MID-4414
>
> arnost
>
> 2018-02-07 12:17 GMT+01:00 Pavol Mederly <mederly at evolveum.com>:
>
>> Hello Arnošt,
>>
>> this is a good observation.
>>
>> To be honest, iterative search by paging was meant as a workaround for
>> databases that do not support search with subsequent modify operations on
>> the returned objects. But, as we see from your message, it can be used to
>> avoid these problems as well :)
>>
>> Just a slight correction:
>>
>> "Midpoint in default configuration recomputes objects by first retrieving
>> them ALL from repository, then passing each object to a worker thread."
>>
>> This is not quite true. MidPoint fetches objects one after another, and
>> just after fetching each one from the repository it passes the object to a
>> worker thread (or processes it directly if there are no worker threads
>> defined). However, because of the quite strong transaction isolation
>> setting (serializable), the DBMS ensures that changes made to objects
>> after the transaction started (i.e. after the search was started) are not
>> reflected in their values.
>>
>> I can imagine an option that would optimize this, e.g. by retrieving just
>> a list of OIDs and reading each object just before it is processed. If
>> you have a second of free time, you could create a Jira issue for this.
>>
>> Moreover, in 3.8 we loosen transaction isolation a bit, from serializable
>> to repeatable_read. But I think this will not change this behavior.
>>
>> Pavol Mederly
>> Software developer
>> evolveum.com
>>
>> On 29.01.2018 13:22, Arnošt Starosta - AMI Praha a.s. wrote:
>>
>> *Problem:*
>>
>> Midpoint in default configuration recomputes objects by first retrieving
>> them ALL from repository, then passing each object to a worker thread. If
>> an object is updated in the meantime (e.g. live-synced or edited in the
>> GUI) before the worker thread recomputes it, that update can be
>> overwritten by the object version retrieved when the recompute task
>> started. This happened on my deployment several times.
>>
>> *Is your deployment affected?*
>>
>> Hard to say; I don't see any relevant log message to check. I had to
>> check by debugging the running recompute task and verifying that
>> SqlRepositoryServiceImpl.searchObjectsIterative calls
>> ObjectRetriever.searchObjectsIterativeByPaging (ok) and not
>> ObjectRetriever.searchObjectsIterativeAttempt (can lose updates).
>>
>> Deployments with a MySQL or H2 backend should be ok with the default
>> configuration (check the sources:
>> SqlRepositoryConfiguration.computeDefaultIterativeSearchParameters).
>> I did not verify this at runtime.
>>
>> *Solution:*
>>
>> Configure iterativeSearchByPaging and iterativeSearchByPagingBatchSize in
>> the midpoint/repository element of config.xml. I don't know if all
>> backends support this setting, but Postgres (which I use) does.
>>
>> <configuration>
>>    <midpoint>
>>        <repository>
>>            <!-- ... -->
>>            <iterativeSearchByPaging>true</iterativeSearchByPaging>
>>            <iterativeSearchByPagingBatchSize>17</iterativeSearchByPagingBatchSize>
>>            <!-- ... -->
>>        </repository>
>>    </midpoint>
>> </configuration>
>>
>> After setting these parameters the objects to recompute are read in
>> 'pages' and fed to worker threads until the request queue between the
>> reader thread and worker threads is full, then the reader is blocked. The
>> size of the queue is hardcoded as 2 * number-of-worker-threads.
>>
>> Even with iterativeSearchByPagingBatchSize set you can still lose
>> updates, but the time window in which this can happen shrinks from
>> number-of-objects to max(page size, 2 * number-of-worker-threads). Without
>> much thought I set the page size to (2 * number-of-worker-threads) + 1.
>>
>> good luck
>> arnost
>>
>> --
>>
>> Arnošt Starosta
>> solution architect
>>
>> gsm: [+420] 603 794 932
>> e-mail: arnost.starosta at ami.cz
>>
>> AMI Praha a.s.
>> Pláničkova 11
>> 162 00 Praha 6
>> tel.: [+420] 274 783 239
>> web: www.ami.cz
>>
>> By the text of this e-mail, the signatory neither promises to conclude nor
>> concludes any contract on behalf of AMI Praha a.s. Any contract, if
>> concluded, must be exclusively in written form.
>>
>> _______________________________________________
>> midPoint mailing list
>> midPoint at lists.evolveum.com
>> http://lists.evolveum.com/mailman/listinfo/midpoint
>>
>>
>
>

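The difference between "work on the version read when the search started" and
"read each object by OID just before processing" (the idea behind MID-4414)
can be shown with a toy model. This is a sketch under loud assumptions, not
midPoint code: the class LostUpdateDemo, the map standing in for the
repository and the recompute helper are all invented for illustration, and
the "concurrent" update is simply a callback fired after the first object has
been processed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the lost-update scenario; every name here is hypothetical. */
class LostUpdateDemo {

    /** In-memory stand-in for the repository: OID -> stored value. */
    static Map<String, String> newRepo() {
        Map<String, String> repo = new LinkedHashMap<>();
        repo.put("oid-1", "original");
        repo.put("oid-2", "original");
        repo.put("oid-3", "original");
        return repo;
    }

    /** Stand-in for a recompute: derives the new value from the version it was given. */
    static String recompute(String value) {
        return value + " (recomputed)";
    }

    /** All object versions are fixed at search start, then written back one by one. */
    static void recomputeFromSnapshot(Map<String, String> repo, Runnable concurrentUpdate) {
        Map<String, String> snapshot = new LinkedHashMap<>(repo);
        boolean updateFired = false;
        for (Map.Entry<String, String> entry : snapshot.entrySet()) {
            repo.put(entry.getKey(), recompute(entry.getValue()));
            if (!updateFired) {
                concurrentUpdate.run(); // someone else changes an object mid-task
                updateFired = true;
            }
        }
    }

    /** Only OIDs are collected up front; each object is re-read right before processing. */
    static void recomputeByOid(Map<String, String> repo, Runnable concurrentUpdate) {
        List<String> oids = new ArrayList<>(repo.keySet());
        boolean updateFired = false;
        for (String oid : oids) {
            String current = repo.get(oid); // fresh read
            repo.put(oid, recompute(current));
            if (!updateFired) {
                concurrentUpdate.run();
                updateFired = true;
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> a = newRepo();
        recomputeFromSnapshot(a, () -> a.put("oid-2", "updated"));
        System.out.println("snapshot: " + a.get("oid-2")); // original (recomputed)

        Map<String, String> b = newRepo();
        recomputeByOid(b, () -> b.put("oid-2", "updated"));
        System.out.println("by OID:   " + b.get("oid-2")); // updated (recomputed)
    }
}
```

In the snapshot run the update to oid-2 is overwritten; in the by-OID run it
survives. Even the by-OID variant leaves a small gap between the fresh read
and the write, so it narrows the window but does not close it.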

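The page size and queue numbers from the original post can also be checked
with a few lines of Java. Again a rough sketch: HandOffSketch and its methods
are hypothetical names, the queue capacity of 2 * workers and the
max(page size, 2 * workers) estimate are taken from the post, and no real
worker threads are started.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Toy hand-off between one reader and N workers; names are illustrative only. */
class HandOffSketch {

    /** Queue capacity as described in the post: 2 * number of worker threads. */
    static int queueCapacity(int workers) {
        return 2 * workers;
    }

    /** The post's estimate of the window (in objects) in which an update can be lost. */
    static int staleWindow(int pageSize, int workers) {
        return Math.max(pageSize, queueCapacity(workers));
    }

    public static void main(String[] args) {
        int workers = 8;
        int pageSize = queueCapacity(workers) + 1; // the choice made in the post

        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(queueCapacity(workers));
        int accepted = 0;
        // Nothing drains the queue here, so offer() fails once it is full;
        // a real reader thread would call put() and block at that point.
        for (int i = 0; i < pageSize; i++) {
            if (queue.offer(i)) {
                accepted++;
            }
        }

        System.out.println("page size    = " + pageSize);
        System.out.println("accepted     = " + accepted);
        System.out.println("stale window = " + staleWindow(pageSize, workers));
    }
}
```

With 8 workers this gives a queue of 16, a page of 17 and a window of 17
objects, compared with the whole result set when paging is off.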