[midPoint] Users list is slow with large numbers of users

Pavol Mederly mederly at evolveum.com
Mon Jul 30 19:24:18 CEST 2018


Andrew,

I think I got it. It seems to be a difference between 3.7 and 3.8, indeed.

The query in 3.7 is like this:

select
   ruser0_.oid as col_0_0_,
   ruser0_2_.fullObject as col_1_0_,
   ruser0_2_.stringsCount as col_2_0_,
   ruser0_2_.longsCount as col_3_0_,
   ruser0_2_.datesCount as col_4_0_,
   ruser0_2_.referencesCount as col_5_0_,
   ruser0_2_.polysCount as col_6_0_,
   ruser0_2_.booleansCount as col_7_0_
from
   m_user ruser0_
   inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid
   inner join m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid
*order by **
**  ruser0_2_.name_orig asc **
*limit ?

and in 3.8 it is:

select
   ruser0_.oid as col_0_0_,
   ruser0_2_.fullObject as col_1_0_,
   ruser0_2_.stringsCount as col_2_0_,
   ruser0_2_.longsCount as col_3_0_,
   ruser0_2_.datesCount as col_4_0_,
   ruser0_2_.referencesCount as col_5_0_,
   ruser0_2_.polysCount as col_6_0_,
   ruser0_2_.booleansCount as col_7_0_
from
   m_user ruser0_
   inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid
   inner join m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid
*order by**
**  ruser0_.name_orig asc**
*limit 20

Note the difference. The 3.7 version orders on a row in doubly joined 
table, whereas the 3.8 version orders on a row in the "main" table. When 
executing against my test database (10K rows, each having ~100 KB of XML 
data), the former takes 25 seconds, whereas the latter takes 46 
milliseconds.

Having seen this I remember that our colleague Viliam really implemented 
querying over "local" name column in 3.8, instead of the topmost one. :) 
Maybe this was the reason. Anyway, tomorrow I will try to find the least 
painful way how to bring this change into 3.6/3.7; and will let you know.

Best regards,

Pavol Mederly
Software developer
evolveum.com

On 30.07.2018 17:58, Andrew Morgan wrote:
> Pavol,
>
> I have attached the section of midpoint.log as requested.  As you can 
> see, it takes about 20 seconds to display the results in the GUI.
>
> When I run the same query in the mysql client, is says, "20 rows in 
> set (13.93 sec)".
>
> Here is the EXPLAIN for it again:
>
> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+ 
>
> | id   | select_type | table     | type   | possible_keys | 
> key          | key_len | ref                     | rows   | 
> Extra                                        |
> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+ 
>
> |    1 | SIMPLE      | ruser0_   | index  | PRIMARY       | 
> uc_user_name | 768     | NULL                    | 100970 | Using 
> index; Using temporary; Using filesort |
> |    1 | SIMPLE      | ruser0_1_ | eq_ref | PRIMARY       | 
> PRIMARY      | 110     | midpointdev.ruser0_.oid |      1 | Using 
> index                                  |
> |    1 | SIMPLE      | ruser0_2_ | eq_ref | PRIMARY       | 
> PRIMARY      | 110     | midpointdev.ruser0_.oid |      1 
> |                                              |
> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+ 
>
>
> Our database server is CentOS 7.5 running 
> MariaDB-server-10.2.16-1.el7.centos.x86_64.  The VM has 4GB of RAM and 
> 2 CPUs (Xeon Gold 5118 CPU @ 2.30GHz).  We haven't done much tuning to 
> it yet:
>
> [mysqld]
> innodb_buffer_pool_size = 2G
> max_allowed_packet = 256M
>
>
> Thanks for helping us investigate!
>
> Andy Morgan
> Systems Administrator, Identity & Access Management
> Information Services | Oregon State University
> 541-737-8877 | is.oregonstate.edu
>
> On Mon, 30 Jul 2018, Pavol Mederly wrote:
>
>> Andrew, Petr,
>>
>> my observations on my developer machine running midPoint (master) and 
>> MySQL 5.7 locally are here:
>>
>> I have imported 100K users (almost empty objects, only a name, 
>> nothing more). After listing them via User List page the response is 
>> almost immediate. Profiling says:
>>
>> 2018-07-30 16:56:08,973 [MODEL] [http-nio-8080-exec-9] DEBUG 
>> (PROFILING): #### Entry: 602 
>> ...model.impl.controller.ModelController->searchObjects
>> 2018-07-30 16:56:08,973 [MODEL] [http-nio-8080-exec-9] TRACE 
>> (PROFILING): ###### args: (UserType, Q{null filterPAGING: O: 0,M: 
>> 20,ORD: [name ASCENDING], ,, 
>> [ObjectOperationOptions(/:distinct=true)], Task(id:1532962568973-0-1, 
>> name:null, oid:null), 
>> R(com.evolveum.midpoint.web.component.data.SelectableBeanObjectDataProvider.searchObjects 
>> UNKNOWN null))
>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] DEBUG 
>> (PROFILING): #### Entry: 604 
>> ...repo.sql.SqlRepositoryServiceImpl->searchObjects
>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] TRACE 
>> (PROFILING): ###### args: (UserType, Q{null filterPAGING: O: 0,M: 
>> 20,ORD: [name ASCENDING], ,, 
>> [ObjectOperationOptions(/:distinct=true)], 
>> R(com.evolveum.midpoint.model.api.ModelService.searchObjects UNKNOWN 
>> null))
>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] DEBUG 
>> (org.hibernate.SQL): select ruser0_.oid as col_0_0_, 
>> ruser0_2_.fullObject as col_1_0_, ruser0_2_.stringsCount as col_2_0_, 
>> ruser0_2_.longsCount as col_3_0_, ruser0_2_.datesCount as col_4_0_, 
>> ruser0_2_.referencesCount as col_5_0_, ruser0_2_.polysCount as 
>> col_6_0_, ruser0_2_.booleansCount as col_7_0_ from m_user ruser0_ 
>> inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid inner join 
>> m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid order by 
>> ruser0_.name_orig asc limit ?
>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] DEBUG 
>> (PROFILING): ##### Exit: 604 
>> ...repo.sql.SqlRepositoryServiceImpl->searchObjects etime: 6.079 ms
>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] TRACE 
>> (PROFILING): ###### retval: 
>> [user:00000000-0000-0000-0000-000000000002(administrator),user:63e9574b-6fa8-4f72-ab47-d7bc4a9b81b6(user 
>> 100000),...]
>> 2018-07-30 16:56:08,993 [MODEL] [http-nio-8080-exec-9] DEBUG 
>> (PROFILING): ##### Exit: 602 
>> ...model.impl.controller.ModelController->searchObjects etime: 21.325 ms
>> 2018-07-30 16:56:08,993 [MODEL] [http-nio-8080-exec-9] TRACE 
>> (PROFILING): ###### retval: 
>> [user:00000000-0000-0000-0000-000000000002(administrator),user:63e9574b-6fa8-4f72-ab47-d7bc4a9b81b6(user 
>> 100000),...]
>>
>> The time needed to get first 20 user records was 21 milliseconds. The 
>> repository processing itself took about 6 milliseconds.
>>
>> The SQL command itself was:
>>
>> select
>>  ruser0_.oid as col_0_0_,
>>  ruser0_2_.fullObject as col_1_0_,
>>  ruser0_2_.stringsCount as col_2_0_,
>>  ruser0_2_.longsCount as col_3_0_,
>>  ruser0_2_.datesCount as col_4_0_,
>>  ruser0_2_.referencesCount as col_5_0_,
>>  ruser0_2_.polysCount as col_6_0_,
>>  ruser0_2_.booleansCount as col_7_0_
>> from
>>  m_user ruser0_
>>  inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid
>>  inner join m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid
>> order by
>>  ruser0_.name_orig asc
>> limit ?
>>
>> So, yes, the command is the same as in your case.
>>
>> Please, could you do the same profiling as I did? I.e. in your 
>> config.xml file please set up:
>>
>> <configuration>
>>    <midpoint>
>>        ...
>>        <profilingEnabled>true</profilingEnabled>
>>    </midpoint>
>> </configuration>
>>
>> And restart midPoint after the change. Then in the system 
>> configuration (GUI: System -> Profiling) set the profiling:
>>
>> [cid:part1.EDEB0394.06F63FFB at evolveum.com]
>>
>> You can also set logging for org.hibernate.SQL to ALL (or TRACE) to 
>> see SQL commands. And then please try opening the user list and send 
>> us the relevant parts of midpoint.log file (i.e. text between Entry 
>> and Exit of "...model.impl.controller.ModelController->searchObjects").
>>
>> 3.7.2 vs. 3.8 vs. 3.9-SNAPSHOT should make no difference in this 
>> request, as far as I know. And I think that the user objects' size 
>> should not perhaps make such a dramatic difference. I will have a 
>> look at it more, however.
>>
>> Best regards,
>>
>> Pavol Mederly
>> Software developer
>> evolveum.com
>>
>>
>> On 28.07.2018 16:51, Pavol Mederly wrote:
>> Andrew,
>>
>> now I see I am going to have a look at it anyhow: 
>> https://jira.evolveum.com/browse/MID-4776. I will let you know.
>>
>> Pavol Mederly
>> Software developer
>> evolveum.com
>>
>> On 28.07.2018 10:39, Pavol Mederly wrote:
>> Hello Andrew,
>>
>> how long does it take to display the user list (in GUI) in your case?
>>
>> I know the query is like you say - and we know it can be optimized - 
>> but in reality the SQL command sent to the database should limit the 
>> results to first 20 records, so it should be faster than processing 
>> SQL query without limits. (Anyway, we can sort this out if needed.)
>>
>> As for the columns, please have a look at "Custom columns 
>> configuration" in 
>> https://wiki.evolveum.com/display/midPoint/Admin+GUI+Configuration.
>>
>> Hope this helps,
>>
>> Pavol Mederly
>> Software developer
>> evolveum.com
>>
>> On 27.07.2018 18:03, Andrew Morgan wrote:
>> We are running midPoint v3.7.2 (we delayed upgrading to v3.8 because 
>> the Google connector is broken).  We recently loaded about 110,000 
>> users in midPoint.  When I click on Users > List Users, midPoint is 
>> waiting for the results of this database query:
>>
>> MariaDB [midpointdev]> explain select ruser0_.oid as col_0_0_, 
>> ruser0_2_.fullObject as col_1_0_, ruser0_2_.stringsCount as col_2_0_, 
>> ruser0_2_.longsCount as col_3_0_, ruser0_2_.datesCount as col_4_0_, 
>> ruser0_2_.referencesCount as col_5_0_, ruser0_2_.polysCount as 
>> col_6_0_, ruser0_2_.booleansCount as col_7_0_ from m_user ruser0_ 
>> inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid inner join 
>> m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid order by 
>> ruser0_2_.name_orig asc;
>> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+ 
>>
>> | id   | select_type | table     | type   | possible_keys | 
>> key          | key_len | ref                     | rows   | 
>> Extra                                        |
>> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+ 
>>
>> |    1 | SIMPLE      | ruser0_   | index  | PRIMARY       | 
>> uc_user_name | 768     | NULL                    | 100839 | Using 
>> index; Using temporary; Using filesort |
>> |    1 | SIMPLE      | ruser0_1_ | eq_ref | PRIMARY       | 
>> PRIMARY      | 110     | midpointdev.ruser0_.oid |      1 | Using 
>> index                                  |
>> |    1 | SIMPLE      | ruser0_2_ | eq_ref | PRIMARY       | 
>> PRIMARY      | 110     | midpointdev.ruser0_.oid |      1 
>> |                                              |
>> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+ 
>>
>>
>> Without the aliases, it is a little easier to read:
>>
>> MariaDB [midpointdev]> explain select m_user.oid, 
>> m_object.fullObject, m_object.stringsCount, m_object.longsCount, 
>> m_object.datesCount, m_object.referencesCount, m_object.polysCount, 
>> m_object.booleansCount from m_user inner join m_focus on 
>> m_user.oid=m_focus.oid inner join m_object on m_user.oid=m_object.oid 
>> order by m_object.name_orig asc;
>> +------+-------------+----------+--------+---------------+--------------+---------+------------------------+--------+----------------------------------------------+ 
>>
>> | id   | select_type | table    | type   | possible_keys | 
>> key          | key_len | ref                    | rows   | 
>> Extra                                        |
>> +------+-------------+----------+--------+---------------+--------------+---------+------------------------+--------+----------------------------------------------+ 
>>
>> |    1 | SIMPLE      | m_user   | index  | PRIMARY       | 
>> uc_user_name | 768     | NULL                   | 100839 | Using 
>> index; Using temporary; Using filesort |
>> |    1 | SIMPLE      | m_focus  | eq_ref | PRIMARY       | 
>> PRIMARY      | 110     | midpointdev.m_user.oid |      1 | Using 
>> index                                  |
>> |    1 | SIMPLE      | m_object | eq_ref | PRIMARY       | 
>> PRIMARY      | 110     | midpointdev.m_user.oid |      1 
>> |                                              |
>> +------+-------------+----------+--------+---------------+--------------+---------+------------------------+--------+----------------------------------------------+ 
>>
>>
>> The result of this query is about 440MB of data.  "111690 rows in set 
>> (22.84 sec)"  (BTW, join on m_focus isn't needed in this query, right?)
>>
>> That is a LOT of data to push around anytime the Users list is 
>> loaded.  We haven't even loaded all of our users yet!  Our total user 
>> population is around 350,000.
>>
>> Is there a way to set some defaults for the User list?  For example, 
>> could I change it to not display anything until a search is 
>> performed?  Ideally, it wouldn't try to fetch m_object.fullObject, at 
>> least until some filter is added.
>>
>> On a somewhat related note, I'd like to change which columns are 
>> displayed in the User list.  For example, I would remove Fullname and 
>> Email and add one of our custom attributes (extension/username).  Is 
>> that possible?
>>
>> Thanks,
>>
>> Andy Morgan
>> Systems Administrator, Identity & Access Management
>> Information Services | Oregon State University
>> 541-737-8877 | is.oregonstate.edu
>> _______________________________________________
>> midPoint mailing list
>> midPoint at lists.evolveum.com<mailto:midPoint at lists.evolveum.com>
>> http://lists.evolveum.com/mailman/listinfo/midpoint
>>
>> _______________________________________________
>> midPoint mailing list
>> midPoint at lists.evolveum.com<mailto:midPoint at lists.evolveum.com>
>> http://lists.evolveum.com/mailman/listinfo/midpoint
>>
>> _______________________________________________
>> midPoint mailing list
>> midPoint at lists.evolveum.com<mailto:midPoint at lists.evolveum.com>
>> http://lists.evolveum.com/mailman/listinfo/midpoint
>>
>>
>
>
> _______________________________________________
> midPoint mailing list
> midPoint at lists.evolveum.com
> http://lists.evolveum.com/mailman/listinfo/midpoint

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.evolveum.com/pipermail/midpoint/attachments/20180730/3416ad27/attachment.htm>


More information about the midPoint mailing list