[midPoint] Updates can get lost during a running recomputation task (SOLVED)
Pavol Mederly
mederly at evolveum.com
Wed Feb 7 14:02:02 CET 2018
> I am not sure that "fetches objects one-after-another" makes the picture
> clear. As I understand it, the default reading workflow runs as a
> single query: all objects with full details come back in one result
> set that is processed one by one by the handlers. I don't know how
> fetching rows from the result set works.
It is quite easy to explain. Please look here
<https://github.com/Evolveum/midpoint/blob/54112a0ad266f8cd3f3024111a195fb064f79ae6/repo/repo-common/src/main/java/com/evolveum/midpoint/repo/common/task/AbstractSearchIterativeTaskHandler.java#L289>:
repositoryService.searchObjectsIterative((Class<O>) type, query,
        resultHandler, searchOptions, false, opResult);
The searchObjectsIterative method starts a search operation, and each
object is handled by the resultHandler as soon as it is returned from
the repository. (It uses ScrollableResults, as can be seen here
<https://github.com/Evolveum/midpoint/blob/42a1a66e93347d8c8b30624a574e7dfaf3743e88/repo/repo-sql-impl/src/main/java/com/evolveum/midpoint/repo/sql/helpers/ObjectRetriever.java#L680>.)
I do not know how this works internally in Hibernate, the JDBC driver and
the DBMS itself. But I suppose that if there is any
caching/chunking/prefetching there, it does not gather all objects
before processing them.
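
To make the callback pattern concrete, here is a minimal generic sketch of
"handle each row as soon as it is read". It is not midPoint code: the class
IterativeSearchSketch, the method searchIterative and the fake "userN" rows
are made up for illustration, and a plain java.util.Iterator stands in for
the database cursor.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Generic illustration only; none of these names are midPoint classes.
final class IterativeSearchSketch {

    // Walks the cursor and hands each row to the handler right after it is read.
    // The handler returns false to stop the search early.
    // Returns the number of rows passed to the handler.
    static <T> int searchIterative(Iterator<T> cursor, Predicate<T> handler) {
        int handled = 0;
        while (cursor.hasNext()) {
            T row = cursor.next();
            handled++;
            if (!handler.test(row)) {
                break;
            }
        }
        return handled;
    }

    // Records the order of fetches and handler calls for three fake rows.
    static List<String> demo() {
        final List<String> log = new ArrayList<>();
        Iterator<String> cursor = new Iterator<String>() {
            private int counter = 1;

            @Override
            public boolean hasNext() {
                return counter <= 3;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                String row = "user" + counter++;
                log.add("fetch " + row);
                return row;
            }
        };
        searchIterative(cursor, row -> {
            log.add("handle " + row);
            return true;
        });
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this sketch the log alternates between "fetch" and "handle" entries, i.e.
no row waits for the rest of the result set to be read. What the driver or
the DBMS buffers underneath is a separate matter that the sketch does not
model.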
Anyway, I think we can implement the OID processing. (But it's not me
who decides about the budgets :))
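
For illustration, the difference between processing objects captured when
the search started and processing by OID could be sketched roughly like
this. Everything here is an assumption made for the example: a plain Map
stands in for the repository, the "concurrent" update is simulated by a
Runnable that runs right after the initial read, and recompute just appends
a marker. It is not how midPoint implements recomputation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only; a Map of oid -> value stands in for the repository.
final class OidProcessingSketch {

    // Stand-in for recomputation: derives a new value from the one it was given.
    static String recompute(String value) {
        return value + " (recomputed)";
    }

    // Variant 1: full objects are captured when the search starts.
    static void processSnapshot(Map<String, String> repo, Runnable concurrentUpdate) {
        Map<String, String> snapshot = new LinkedHashMap<>(repo);
        concurrentUpdate.run(); // e.g. live sync or a GUI edit happens meanwhile
        for (Map.Entry<String, String> entry : snapshot.entrySet()) {
            // The stale value from the snapshot overwrites the concurrent update.
            repo.put(entry.getKey(), recompute(entry.getValue()));
        }
    }

    // Variant 2: only OIDs are captured; each object is re-read just before processing.
    static void processByOid(Map<String, String> repo, Runnable concurrentUpdate) {
        List<String> oids = new ArrayList<>(repo.keySet());
        concurrentUpdate.run();
        for (String oid : oids) {
            String current = repo.get(oid); // fresh read
            if (current != null) {          // object may have been deleted meanwhile
                repo.put(oid, recompute(current));
            }
        }
    }

    static Map<String, String> newRepo() {
        Map<String, String> repo = new LinkedHashMap<>();
        repo.put("oid-1", "alice");
        repo.put("oid-2", "bob");
        return repo;
    }

    public static void main(String[] args) {
        Map<String, String> a = newRepo();
        processSnapshot(a, () -> a.put("oid-2", "robert"));
        System.out.println("snapshot: " + a);

        Map<String, String> b = newRepo();
        processByOid(b, () -> b.put("oid-2", "robert"));
        System.out.println("by OID:   " + b);
    }
}
```

In the toy model the snapshot variant ends with "bob (recomputed)" for
oid-2, so the rename to "robert" is lost, while the OID variant ends with
"robert (recomputed)". A real implementation would still have a small
window between the re-read and the write, so this narrows the problem
rather than eliminating it.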
Pavol Mederly
Software developer
evolveum.com
On 07.02.2018 13:45, Arnošt Starosta - AMI Praha a.s. wrote:
> Hi Pavol,
>
> that unintended workaround saved my life for the moment .)
>
> I am not sure that "fetches objects one-after-another" makes the picture
> clear. As I understand it, the default reading workflow runs as a
> single query: all objects with full details come back in one result
> set that is processed one by one by the handlers. I don't know how
> fetching rows from the result set works.
>
> Tweaking the transaction isolation did not really help, even with
> the default set to 'read committed'. That's why I think the object
> 'fetching' happens in larger chunks and may not be affected by weaker
> transaction isolation. Or maybe I just misconfigured it.
>
> Working with oids in iterative tasks would be great! You want the
> worker threads to process 'that object' not 'this chunk of data'.
>
> The jira is already there - https://jira.evolveum.com/browse/MID-4414
>
> arnost
>
> 2018-02-07 12:17 GMT+01:00 Pavol Mederly <mederly at evolveum.com>:
>
> Hello Arnošt,
>
> this is a good observation.
>
> To be honest, iterative search by paging was meant as a workaround
> for databases that do not support search with subsequent modify
> operations on the returned objects. But, as we see from your
> message, it can be used to avoid these problems as well :)
>
> Just a slight correction:
>
>> Midpoint in default configuration recomputes objects by first
>> retrieving them ALL from repository, then passing each object to
>> a worker thread.
> This is not quite true. MidPoint fetches objects
> one-after-another, and just after fetching each one from the
> repository it passes the object to a worker thread (or processes
> it directly if there are no worker threads defined). However,
> because of the quite strong transaction isolation setting
> (serializable), the DBMS ensures that changes made to objects
> after the transaction started (i.e. after the search was started)
> are not visible in the values the search returns.
>
> I can imagine an option that would improve on this, e.g.
> by retrieving just a list of OIDs and reading each object just
> before its processing. If you have a second of free time, you
> could create a jira for this.
>
> Moreover, in 3.8 we loosen transaction isolation a bit, from
> serializable to repeatable_read. But I think this will not change
> this behavior.
>
>
> On 29.01.2018 13:22, Arnošt Starosta - AMI Praha a.s. wrote:
>> *Problem:*
>>
>> Midpoint in default configuration recomputes objects by first
>> retrieving them ALL from repository, then passing each object to
>> a worker thread. If the object was updated meanwhile (e.g.
>> live-synced or updated from gui) before it is recomputed by the
>> worker thread, this update can be overwritten by the object
>> version retrieved when the recompute task started. It happened on
>> my deployment several times.
>>
>> *Is your deployment affected?*
>>
>> Hard to say; I don't see any relevant log message to check. I had
>> to debug the running recompute task and verify
>> that SqlRepositoryServiceImpl.searchObjectsIterative calls
>> ObjectRetriever.searchObjectsIterativeByPaging (ok) and not
>> ObjectRetriever.searchObjectsIterativeAttempt (can lose updates).
>>
>> Deployments with a MySQL or H2 backend should be ok with the default
>> configuration (see
>> SqlRepositoryConfiguration.computeDefaultIterativeSearchParameters
>> in the sources). I did not verify this at runtime.
>>
>> *Solution:*
>>
>> Configure iterativeSearchByPaging and
>> iterativeSearchByPagingBatchSize in the midpoint/repository
>> element of config.xml. I don't know if all backends support
>> this setting, but PostgreSQL (which I use) does.
>>
>> <configuration>
>>   <midpoint>
>>     <repository>
>>       …
>>       <iterativeSearchByPaging>true</iterativeSearchByPaging>
>>       <iterativeSearchByPagingBatchSize>17</iterativeSearchByPagingBatchSize>
>>       …
>>     </repository>
>>   </midpoint>
>> </configuration>
>>
>>
>> After setting these parameters, the objects to recompute are read
>> in 'pages' and fed to worker threads until the request queue
>> between the reader thread and the worker threads is full; then the
>> reader is blocked. The size of the queue is hardcoded as 2 *
>> number-of-worker-threads.
>>
>> Even with iterativeSearchByPagingBatchSize set you can still
>> lose updates, but the window in which this can happen shrinks
>> from the time needed to process all objects to the time needed
>> to process max(page size, 2 * number-of-worker-threads) objects.
>> Without much thought I set the page size to
>> (2 * number-of-worker-threads) + 1.
>>
>> good luck
>> arnost
>>
>> --
>>
>> Arnošt Starosta
>> solution architect
>>
>> gsm: [+420] 603 794 932
>> e-mail: arnost.starosta at ami.cz
>>
>> AMI Praha a.s.
>> Pláničkova 11
>> 162 00 Praha 6
>> tel.: [+420] 274 783 239
>> web: www.ami.cz
>>
>> By the text of this e-mail, the signatory neither promises to
>> conclude nor concludes any contract on behalf of AMI Praha a.s.
>> Any contract, if concluded, must be exclusively in written form.
>>
>>
>>
>> _______________________________________________
>> midPoint mailing list
>> midPoint at lists.evolveum.com <mailto:midPoint at lists.evolveum.com>
>> http://lists.evolveum.com/mailman/listinfo/midpoint
>> <http://lists.evolveum.com/mailman/listinfo/midpoint>