[midPoint] Users list is slow with large numbers of users
Pavol Mederly
mederly at evolveum.com
Tue Jul 31 11:18:14 CEST 2018
Hello,
the fix is implemented on support-3.7 branch. Please do not forget to
apply DB migration scripts.
Best regards,
Pavol Mederly
Software developer
evolveum.com
On 30.07.2018 19:24, Pavol Mederly wrote:
>
> Andrew,
>
> I think I got it. It seems to be a difference between 3.7 and 3.8, indeed.
>
> The query in 3.7 is like this:
>
> select
> ruser0_.oid as col_0_0_,
> ruser0_2_.fullObject as col_1_0_,
> ruser0_2_.stringsCount as col_2_0_,
> ruser0_2_.longsCount as col_3_0_,
> ruser0_2_.datesCount as col_4_0_,
> ruser0_2_.referencesCount as col_5_0_,
> ruser0_2_.polysCount as col_6_0_,
> ruser0_2_.booleansCount as col_7_0_
> from
> m_user ruser0_
> inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid
> inner join m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid
> *order by **
> ** ruser0_2_.name_orig asc **
> *limit ?
>
> and in 3.8 it is:
>
> select
> ruser0_.oid as col_0_0_,
> ruser0_2_.fullObject as col_1_0_,
> ruser0_2_.stringsCount as col_2_0_,
> ruser0_2_.longsCount as col_3_0_,
> ruser0_2_.datesCount as col_4_0_,
> ruser0_2_.referencesCount as col_5_0_,
> ruser0_2_.polysCount as col_6_0_,
> ruser0_2_.booleansCount as col_7_0_
> from
> m_user ruser0_
> inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid
> inner join m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid
> *order by**
> ** ruser0_.name_orig asc**
> *limit 20
>
> Note the difference. The 3.7 version orders on a row in doubly joined
> table, whereas the 3.8 version orders on a row in the "main" table.
> When executing against my test database (10K rows, each having ~100 KB
> of XML data), the former takes 25 seconds, whereas the latter takes 46
> milliseconds.
>
> Having seen this I remember that our colleague Viliam really
> implemented querying over "local" name column in 3.8, instead of the
> topmost one. :) Maybe this was the reason. Anyway, tomorrow I will try
> to find the least painful way how to bring this change into 3.6/3.7;
> and will let you know.
>
> Best regards,
>
> Pavol Mederly
> Software developer
> evolveum.com
> On 30.07.2018 17:58, Andrew Morgan wrote:
>> Pavol,
>>
>> I have attached the section of midpoint.log as requested. As you can
>> see, it takes about 20 seconds to display the results in the GUI.
>>
>> When I run the same query in the mysql client, is says, "20 rows in
>> set (13.93 sec)".
>>
>> Here is the EXPLAIN for it again:
>>
>> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
>>
>> | id | select_type | table | type | possible_keys |
>> key | key_len | ref | rows |
>> Extra |
>> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
>>
>> | 1 | SIMPLE | ruser0_ | index | PRIMARY |
>> uc_user_name | 768 | NULL | 100970 | Using
>> index; Using temporary; Using filesort |
>> | 1 | SIMPLE | ruser0_1_ | eq_ref | PRIMARY |
>> PRIMARY | 110 | midpointdev.ruser0_.oid | 1 | Using
>> index |
>> | 1 | SIMPLE | ruser0_2_ | eq_ref | PRIMARY |
>> PRIMARY | 110 | midpointdev.ruser0_.oid | 1
>> | |
>> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
>>
>>
>> Our database server is CentOS 7.5 running
>> MariaDB-server-10.2.16-1.el7.centos.x86_64. The VM has 4GB of RAM
>> and 2 CPUs (Xeon Gold 5118 CPU @ 2.30GHz). We haven't done much
>> tuning to it yet:
>>
>> [mysqld]
>> innodb_buffer_pool_size = 2G
>> max_allowed_packet = 256M
>>
>>
>> Thanks for helping us investigate!
>>
>> Andy Morgan
>> Systems Administrator, Identity & Access Management
>> Information Services | Oregon State University
>> 541-737-8877 | is.oregonstate.edu
>>
>> On Mon, 30 Jul 2018, Pavol Mederly wrote:
>>
>>> Andrew, Petr,
>>>
>>> my observations on my developer machine running midPoint (master)
>>> and MySQL 5.7 locally are here:
>>>
>>> I have imported 100K users (almost empty objects, only a name,
>>> nothing more). After listing them via User List page the response is
>>> almost immediate. Profiling says:
>>>
>>> 2018-07-30 16:56:08,973 [MODEL] [http-nio-8080-exec-9] DEBUG
>>> (PROFILING): #### Entry: 602
>>> ...model.impl.controller.ModelController->searchObjects
>>> 2018-07-30 16:56:08,973 [MODEL] [http-nio-8080-exec-9] TRACE
>>> (PROFILING): ###### args: (UserType, Q{null filterPAGING: O: 0,M:
>>> 20,ORD: [name ASCENDING], ,,
>>> [ObjectOperationOptions(/:distinct=true)],
>>> Task(id:1532962568973-0-1, name:null, oid:null),
>>> R(com.evolveum.midpoint.web.component.data.SelectableBeanObjectDataProvider.searchObjects
>>> UNKNOWN null))
>>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] DEBUG
>>> (PROFILING): #### Entry: 604
>>> ...repo.sql.SqlRepositoryServiceImpl->searchObjects
>>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] TRACE
>>> (PROFILING): ###### args: (UserType, Q{null filterPAGING: O: 0,M:
>>> 20,ORD: [name ASCENDING], ,,
>>> [ObjectOperationOptions(/:distinct=true)],
>>> R(com.evolveum.midpoint.model.api.ModelService.searchObjects UNKNOWN
>>> null))
>>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] DEBUG
>>> (org.hibernate.SQL): select ruser0_.oid as col_0_0_,
>>> ruser0_2_.fullObject as col_1_0_, ruser0_2_.stringsCount as
>>> col_2_0_, ruser0_2_.longsCount as col_3_0_, ruser0_2_.datesCount as
>>> col_4_0_, ruser0_2_.referencesCount as col_5_0_,
>>> ruser0_2_.polysCount as col_6_0_, ruser0_2_.booleansCount as
>>> col_7_0_ from m_user ruser0_ inner join m_focus ruser0_1_ on
>>> ruser0_.oid=ruser0_1_.oid inner join m_object ruser0_2_ on
>>> ruser0_.oid=ruser0_2_.oid order by ruser0_.name_orig asc limit ?
>>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] DEBUG
>>> (PROFILING): ##### Exit: 604
>>> ...repo.sql.SqlRepositoryServiceImpl->searchObjects etime: 6.079 ms
>>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] TRACE
>>> (PROFILING): ###### retval:
>>> [user:00000000-0000-0000-0000-000000000002(administrator),user:63e9574b-6fa8-4f72-ab47-d7bc4a9b81b6(user
>>> 100000),...]
>>> 2018-07-30 16:56:08,993 [MODEL] [http-nio-8080-exec-9] DEBUG
>>> (PROFILING): ##### Exit: 602
>>> ...model.impl.controller.ModelController->searchObjects etime:
>>> 21.325 ms
>>> 2018-07-30 16:56:08,993 [MODEL] [http-nio-8080-exec-9] TRACE
>>> (PROFILING): ###### retval:
>>> [user:00000000-0000-0000-0000-000000000002(administrator),user:63e9574b-6fa8-4f72-ab47-d7bc4a9b81b6(user
>>> 100000),...]
>>>
>>> The time needed to get first 20 user records was 21 milliseconds.
>>> The repository processing itself took about 6 milliseconds.
>>>
>>> The SQL command itself was:
>>>
>>> select
>>> ruser0_.oid as col_0_0_,
>>> ruser0_2_.fullObject as col_1_0_,
>>> ruser0_2_.stringsCount as col_2_0_,
>>> ruser0_2_.longsCount as col_3_0_,
>>> ruser0_2_.datesCount as col_4_0_,
>>> ruser0_2_.referencesCount as col_5_0_,
>>> ruser0_2_.polysCount as col_6_0_,
>>> ruser0_2_.booleansCount as col_7_0_
>>> from
>>> m_user ruser0_
>>> inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid
>>> inner join m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid
>>> order by
>>> ruser0_.name_orig asc
>>> limit ?
>>>
>>> So, yes, the command is the same as in your case.
>>>
>>> Please, could you do the same profiling as I did? I.e. in your
>>> config.xml file please set up:
>>>
>>> <configuration>
>>> <midpoint>
>>> ...
>>> <profilingEnabled>true</profilingEnabled>
>>> </midpoint>
>>> </configuration>
>>>
>>> And restart midPoint after the change. Then in the system
>>> configuration (GUI: System -> Profiling) set the profiling:
>>>
>>> [cid:part1.EDEB0394.06F63FFB at evolveum.com]
>>>
>>> You can also set logging for org.hibernate.SQL to ALL (or TRACE) to
>>> see SQL commands. And then please try opening the user list and send
>>> us the relevant parts of midpoint.log file (i.e. text between Entry
>>> and Exit of "...model.impl.controller.ModelController->searchObjects").
>>>
>>> 3.7.2 vs. 3.8 vs. 3.9-SNAPSHOT should make no difference in this
>>> request, as far as I know. And I think that the user objects' size
>>> should not perhaps make such a dramatic difference. I will have a
>>> look at it more, however.
>>>
>>> Best regards,
>>>
>>> Pavol Mederly
>>> Software developer
>>> evolveum.com
>>>
>>>
>>> On 28.07.2018 16:51, Pavol Mederly wrote:
>>> Andrew,
>>>
>>> now I see I am going to have a look at it anyhow:
>>> https://jira.evolveum.com/browse/MID-4776. I will let you know.
>>>
>>> Pavol Mederly
>>> Software developer
>>> evolveum.com
>>>
>>> On 28.07.2018 10:39, Pavol Mederly wrote:
>>> Hello Andrew,
>>>
>>> how long does it take to display the user list (in GUI) in your case?
>>>
>>> I know the query is like you say - and we know it can be optimized -
>>> but in reality the SQL command sent to the database should limit the
>>> results to first 20 records, so it should be faster than processing
>>> SQL query without limits. (Anyway, we can sort this out if needed.)
>>>
>>> As for the columns, please have a look at "Custom columns
>>> configuration" in
>>> https://wiki.evolveum.com/display/midPoint/Admin+GUI+Configuration.
>>>
>>> Hope this helps,
>>>
>>> Pavol Mederly
>>> Software developer
>>> evolveum.com
>>>
>>> On 27.07.2018 18:03, Andrew Morgan wrote:
>>> We are running midPoint v3.7.2 (we delayed upgrading to v3.8 because
>>> the Google connector is broken). We recently loaded about 110,000
>>> users in midPoint. When I click on Users > List Users, midPoint is
>>> waiting for the results of this database query:
>>>
>>> MariaDB [midpointdev]> explain select ruser0_.oid as col_0_0_,
>>> ruser0_2_.fullObject as col_1_0_, ruser0_2_.stringsCount as
>>> col_2_0_, ruser0_2_.longsCount as col_3_0_, ruser0_2_.datesCount as
>>> col_4_0_, ruser0_2_.referencesCount as col_5_0_,
>>> ruser0_2_.polysCount as col_6_0_, ruser0_2_.booleansCount as
>>> col_7_0_ from m_user ruser0_ inner join m_focus ruser0_1_ on
>>> ruser0_.oid=ruser0_1_.oid inner join m_object ruser0_2_ on
>>> ruser0_.oid=ruser0_2_.oid order by ruser0_2_.name_orig asc;
>>> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
>>>
>>> | id | select_type | table | type | possible_keys |
>>> key | key_len | ref | rows |
>>> Extra |
>>> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
>>>
>>> | 1 | SIMPLE | ruser0_ | index | PRIMARY |
>>> uc_user_name | 768 | NULL | 100839 | Using
>>> index; Using temporary; Using filesort |
>>> | 1 | SIMPLE | ruser0_1_ | eq_ref | PRIMARY |
>>> PRIMARY | 110 | midpointdev.ruser0_.oid | 1 | Using
>>> index |
>>> | 1 | SIMPLE | ruser0_2_ | eq_ref | PRIMARY |
>>> PRIMARY | 110 | midpointdev.ruser0_.oid | 1
>>> | |
>>> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
>>>
>>>
>>> Without the aliases, it is a little easier to read:
>>>
>>> MariaDB [midpointdev]> explain select m_user.oid,
>>> m_object.fullObject, m_object.stringsCount, m_object.longsCount,
>>> m_object.datesCount, m_object.referencesCount, m_object.polysCount,
>>> m_object.booleansCount from m_user inner join m_focus on
>>> m_user.oid=m_focus.oid inner join m_object on
>>> m_user.oid=m_object.oid order by m_object.name_orig asc;
>>> +------+-------------+----------+--------+---------------+--------------+---------+------------------------+--------+----------------------------------------------+
>>>
>>> | id | select_type | table | type | possible_keys |
>>> key | key_len | ref | rows |
>>> Extra |
>>> +------+-------------+----------+--------+---------------+--------------+---------+------------------------+--------+----------------------------------------------+
>>>
>>> | 1 | SIMPLE | m_user | index | PRIMARY |
>>> uc_user_name | 768 | NULL | 100839 | Using
>>> index; Using temporary; Using filesort |
>>> | 1 | SIMPLE | m_focus | eq_ref | PRIMARY |
>>> PRIMARY | 110 | midpointdev.m_user.oid | 1 | Using
>>> index |
>>> | 1 | SIMPLE | m_object | eq_ref | PRIMARY |
>>> PRIMARY | 110 | midpointdev.m_user.oid | 1
>>> | |
>>> +------+-------------+----------+--------+---------------+--------------+---------+------------------------+--------+----------------------------------------------+
>>>
>>>
>>> The result of this query is about 440MB of data. "111690 rows in
>>> set (22.84 sec)" (BTW, join on m_focus isn't needed in this query,
>>> right?)
>>>
>>> That is a LOT of data to push around anytime the Users list is
>>> loaded. We haven't even loaded all of our users yet! Our total
>>> user population is around 350,000.
>>>
>>> Is there a way to set some defaults for the User list? For example,
>>> could I change it to not display anything until a search is
>>> performed? Ideally, it wouldn't try to fetch m_object.fullObject,
>>> at least until some filter is added.
>>>
>>> On a somewhat related note, I'd like to change which columns are
>>> displayed in the User list. For example, I would remove Fullname
>>> and Email and add one of our custom attributes
>>> (extension/username). Is that possible?
>>>
>>> Thanks,
>>>
>>> Andy Morgan
>>> Systems Administrator, Identity & Access Management
>>> Information Services | Oregon State University
>>> 541-737-8877 | is.oregonstate.edu
>>> _______________________________________________
>>> midPoint mailing list
>>> midPoint at lists.evolveum.com<mailto:midPoint at lists.evolveum.com>
>>> http://lists.evolveum.com/mailman/listinfo/midpoint
>>>
>>> _______________________________________________
>>> midPoint mailing list
>>> midPoint at lists.evolveum.com<mailto:midPoint at lists.evolveum.com>
>>> http://lists.evolveum.com/mailman/listinfo/midpoint
>>>
>>> _______________________________________________
>>> midPoint mailing list
>>> midPoint at lists.evolveum.com<mailto:midPoint at lists.evolveum.com>
>>> http://lists.evolveum.com/mailman/listinfo/midpoint
>>>
>>>
>>
>>
>> _______________________________________________
>> midPoint mailing list
>> midPoint at lists.evolveum.com
>> http://lists.evolveum.com/mailman/listinfo/midpoint
>
>
>
> _______________________________________________
> midPoint mailing list
> midPoint at lists.evolveum.com
> http://lists.evolveum.com/mailman/listinfo/midpoint
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.evolveum.com/pipermail/midpoint/attachments/20180731/66c8e8bd/attachment.htm>
More information about the midPoint
mailing list