[midPoint] Users list is slow with large numbers of users
Pavol Mederly
mederly at evolveum.com
Mon Jul 30 19:24:18 CEST 2018
Andrew,
I think I got it. It seems to be a difference between 3.7 and 3.8, indeed.
The query in 3.7 is like this:
select
ruser0_.oid as col_0_0_,
ruser0_2_.fullObject as col_1_0_,
ruser0_2_.stringsCount as col_2_0_,
ruser0_2_.longsCount as col_3_0_,
ruser0_2_.datesCount as col_4_0_,
ruser0_2_.referencesCount as col_5_0_,
ruser0_2_.polysCount as col_6_0_,
ruser0_2_.booleansCount as col_7_0_
from
m_user ruser0_
inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid
inner join m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid
*order by **
** ruser0_2_.name_orig asc **
*limit ?
and in 3.8 it is:
select
ruser0_.oid as col_0_0_,
ruser0_2_.fullObject as col_1_0_,
ruser0_2_.stringsCount as col_2_0_,
ruser0_2_.longsCount as col_3_0_,
ruser0_2_.datesCount as col_4_0_,
ruser0_2_.referencesCount as col_5_0_,
ruser0_2_.polysCount as col_6_0_,
ruser0_2_.booleansCount as col_7_0_
from
m_user ruser0_
inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid
inner join m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid
*order by**
** ruser0_.name_orig asc**
*limit 20
Note the difference. The 3.7 version orders on a row in doubly joined
table, whereas the 3.8 version orders on a row in the "main" table. When
executing against my test database (10K rows, each having ~100 KB of XML
data), the former takes 25 seconds, whereas the latter takes 46
milliseconds.
Having seen this I remember that our colleague Viliam really implemented
querying over "local" name column in 3.8, instead of the topmost one. :)
Maybe this was the reason. Anyway, tomorrow I will try to find the least
painful way how to bring this change into 3.6/3.7; and will let you know.
Best regards,
Pavol Mederly
Software developer
evolveum.com
On 30.07.2018 17:58, Andrew Morgan wrote:
> Pavol,
>
> I have attached the section of midpoint.log as requested. As you can
> see, it takes about 20 seconds to display the results in the GUI.
>
> When I run the same query in the mysql client, is says, "20 rows in
> set (13.93 sec)".
>
> Here is the EXPLAIN for it again:
>
> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
>
> | id | select_type | table | type | possible_keys |
> key | key_len | ref | rows |
> Extra |
> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
>
> | 1 | SIMPLE | ruser0_ | index | PRIMARY |
> uc_user_name | 768 | NULL | 100970 | Using
> index; Using temporary; Using filesort |
> | 1 | SIMPLE | ruser0_1_ | eq_ref | PRIMARY |
> PRIMARY | 110 | midpointdev.ruser0_.oid | 1 | Using
> index |
> | 1 | SIMPLE | ruser0_2_ | eq_ref | PRIMARY |
> PRIMARY | 110 | midpointdev.ruser0_.oid | 1
> | |
> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
>
>
> Our database server is CentOS 7.5 running
> MariaDB-server-10.2.16-1.el7.centos.x86_64. The VM has 4GB of RAM and
> 2 CPUs (Xeon Gold 5118 CPU @ 2.30GHz). We haven't done much tuning to
> it yet:
>
> [mysqld]
> innodb_buffer_pool_size = 2G
> max_allowed_packet = 256M
>
>
> Thanks for helping us investigate!
>
> Andy Morgan
> Systems Administrator, Identity & Access Management
> Information Services | Oregon State University
> 541-737-8877 | is.oregonstate.edu
>
> On Mon, 30 Jul 2018, Pavol Mederly wrote:
>
>> Andrew, Petr,
>>
>> my observations on my developer machine running midPoint (master) and
>> MySQL 5.7 locally are here:
>>
>> I have imported 100K users (almost empty objects, only a name,
>> nothing more). After listing them via User List page the response is
>> almost immediate. Profiling says:
>>
>> 2018-07-30 16:56:08,973 [MODEL] [http-nio-8080-exec-9] DEBUG
>> (PROFILING): #### Entry: 602
>> ...model.impl.controller.ModelController->searchObjects
>> 2018-07-30 16:56:08,973 [MODEL] [http-nio-8080-exec-9] TRACE
>> (PROFILING): ###### args: (UserType, Q{null filterPAGING: O: 0,M:
>> 20,ORD: [name ASCENDING], ,,
>> [ObjectOperationOptions(/:distinct=true)], Task(id:1532962568973-0-1,
>> name:null, oid:null),
>> R(com.evolveum.midpoint.web.component.data.SelectableBeanObjectDataProvider.searchObjects
>> UNKNOWN null))
>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] DEBUG
>> (PROFILING): #### Entry: 604
>> ...repo.sql.SqlRepositoryServiceImpl->searchObjects
>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] TRACE
>> (PROFILING): ###### args: (UserType, Q{null filterPAGING: O: 0,M:
>> 20,ORD: [name ASCENDING], ,,
>> [ObjectOperationOptions(/:distinct=true)],
>> R(com.evolveum.midpoint.model.api.ModelService.searchObjects UNKNOWN
>> null))
>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] DEBUG
>> (org.hibernate.SQL): select ruser0_.oid as col_0_0_,
>> ruser0_2_.fullObject as col_1_0_, ruser0_2_.stringsCount as col_2_0_,
>> ruser0_2_.longsCount as col_3_0_, ruser0_2_.datesCount as col_4_0_,
>> ruser0_2_.referencesCount as col_5_0_, ruser0_2_.polysCount as
>> col_6_0_, ruser0_2_.booleansCount as col_7_0_ from m_user ruser0_
>> inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid inner join
>> m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid order by
>> ruser0_.name_orig asc limit ?
>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] DEBUG
>> (PROFILING): ##### Exit: 604
>> ...repo.sql.SqlRepositoryServiceImpl->searchObjects etime: 6.079 ms
>> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] TRACE
>> (PROFILING): ###### retval:
>> [user:00000000-0000-0000-0000-000000000002(administrator),user:63e9574b-6fa8-4f72-ab47-d7bc4a9b81b6(user
>> 100000),...]
>> 2018-07-30 16:56:08,993 [MODEL] [http-nio-8080-exec-9] DEBUG
>> (PROFILING): ##### Exit: 602
>> ...model.impl.controller.ModelController->searchObjects etime: 21.325 ms
>> 2018-07-30 16:56:08,993 [MODEL] [http-nio-8080-exec-9] TRACE
>> (PROFILING): ###### retval:
>> [user:00000000-0000-0000-0000-000000000002(administrator),user:63e9574b-6fa8-4f72-ab47-d7bc4a9b81b6(user
>> 100000),...]
>>
>> The time needed to get first 20 user records was 21 milliseconds. The
>> repository processing itself took about 6 milliseconds.
>>
>> The SQL command itself was:
>>
>> select
>> ruser0_.oid as col_0_0_,
>> ruser0_2_.fullObject as col_1_0_,
>> ruser0_2_.stringsCount as col_2_0_,
>> ruser0_2_.longsCount as col_3_0_,
>> ruser0_2_.datesCount as col_4_0_,
>> ruser0_2_.referencesCount as col_5_0_,
>> ruser0_2_.polysCount as col_6_0_,
>> ruser0_2_.booleansCount as col_7_0_
>> from
>> m_user ruser0_
>> inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid
>> inner join m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid
>> order by
>> ruser0_.name_orig asc
>> limit ?
>>
>> So, yes, the command is the same as in your case.
>>
>> Please, could you do the same profiling as I did? I.e. in your
>> config.xml file please set up:
>>
>> <configuration>
>> <midpoint>
>> ...
>> <profilingEnabled>true</profilingEnabled>
>> </midpoint>
>> </configuration>
>>
>> And restart midPoint after the change. Then in the system
>> configuration (GUI: System -> Profiling) set the profiling:
>>
>> [cid:part1.EDEB0394.06F63FFB at evolveum.com]
>>
>> You can also set logging for org.hibernate.SQL to ALL (or TRACE) to
>> see SQL commands. And then please try opening the user list and send
>> us the relevant parts of midpoint.log file (i.e. text between Entry
>> and Exit of "...model.impl.controller.ModelController->searchObjects").
>>
>> 3.7.2 vs. 3.8 vs. 3.9-SNAPSHOT should make no difference in this
>> request, as far as I know. And I think that the user objects' size
>> should not perhaps make such a dramatic difference. I will have a
>> look at it more, however.
>>
>> Best regards,
>>
>> Pavol Mederly
>> Software developer
>> evolveum.com
>>
>>
>> On 28.07.2018 16:51, Pavol Mederly wrote:
>> Andrew,
>>
>> now I see I am going to have a look at it anyhow:
>> https://jira.evolveum.com/browse/MID-4776. I will let you know.
>>
>> Pavol Mederly
>> Software developer
>> evolveum.com
>>
>> On 28.07.2018 10:39, Pavol Mederly wrote:
>> Hello Andrew,
>>
>> how long does it take to display the user list (in GUI) in your case?
>>
>> I know the query is like you say - and we know it can be optimized -
>> but in reality the SQL command sent to the database should limit the
>> results to first 20 records, so it should be faster than processing
>> SQL query without limits. (Anyway, we can sort this out if needed.)
>>
>> As for the columns, please have a look at "Custom columns
>> configuration" in
>> https://wiki.evolveum.com/display/midPoint/Admin+GUI+Configuration.
>>
>> Hope this helps,
>>
>> Pavol Mederly
>> Software developer
>> evolveum.com
>>
>> On 27.07.2018 18:03, Andrew Morgan wrote:
>> We are running midPoint v3.7.2 (we delayed upgrading to v3.8 because
>> the Google connector is broken). We recently loaded about 110,000
>> users in midPoint. When I click on Users > List Users, midPoint is
>> waiting for the results of this database query:
>>
>> MariaDB [midpointdev]> explain select ruser0_.oid as col_0_0_,
>> ruser0_2_.fullObject as col_1_0_, ruser0_2_.stringsCount as col_2_0_,
>> ruser0_2_.longsCount as col_3_0_, ruser0_2_.datesCount as col_4_0_,
>> ruser0_2_.referencesCount as col_5_0_, ruser0_2_.polysCount as
>> col_6_0_, ruser0_2_.booleansCount as col_7_0_ from m_user ruser0_
>> inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid inner join
>> m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid order by
>> ruser0_2_.name_orig asc;
>> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
>>
>> | id | select_type | table | type | possible_keys |
>> key | key_len | ref | rows |
>> Extra |
>> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
>>
>> | 1 | SIMPLE | ruser0_ | index | PRIMARY |
>> uc_user_name | 768 | NULL | 100839 | Using
>> index; Using temporary; Using filesort |
>> | 1 | SIMPLE | ruser0_1_ | eq_ref | PRIMARY |
>> PRIMARY | 110 | midpointdev.ruser0_.oid | 1 | Using
>> index |
>> | 1 | SIMPLE | ruser0_2_ | eq_ref | PRIMARY |
>> PRIMARY | 110 | midpointdev.ruser0_.oid | 1
>> | |
>> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
>>
>>
>> Without the aliases, it is a little easier to read:
>>
>> MariaDB [midpointdev]> explain select m_user.oid,
>> m_object.fullObject, m_object.stringsCount, m_object.longsCount,
>> m_object.datesCount, m_object.referencesCount, m_object.polysCount,
>> m_object.booleansCount from m_user inner join m_focus on
>> m_user.oid=m_focus.oid inner join m_object on m_user.oid=m_object.oid
>> order by m_object.name_orig asc;
>> +------+-------------+----------+--------+---------------+--------------+---------+------------------------+--------+----------------------------------------------+
>>
>> | id | select_type | table | type | possible_keys |
>> key | key_len | ref | rows |
>> Extra |
>> +------+-------------+----------+--------+---------------+--------------+---------+------------------------+--------+----------------------------------------------+
>>
>> | 1 | SIMPLE | m_user | index | PRIMARY |
>> uc_user_name | 768 | NULL | 100839 | Using
>> index; Using temporary; Using filesort |
>> | 1 | SIMPLE | m_focus | eq_ref | PRIMARY |
>> PRIMARY | 110 | midpointdev.m_user.oid | 1 | Using
>> index |
>> | 1 | SIMPLE | m_object | eq_ref | PRIMARY |
>> PRIMARY | 110 | midpointdev.m_user.oid | 1
>> | |
>> +------+-------------+----------+--------+---------------+--------------+---------+------------------------+--------+----------------------------------------------+
>>
>>
>> The result of this query is about 440MB of data. "111690 rows in set
>> (22.84 sec)" (BTW, join on m_focus isn't needed in this query, right?)
>>
>> That is a LOT of data to push around anytime the Users list is
>> loaded. We haven't even loaded all of our users yet! Our total user
>> population is around 350,000.
>>
>> Is there a way to set some defaults for the User list? For example,
>> could I change it to not display anything until a search is
>> performed? Ideally, it wouldn't try to fetch m_object.fullObject, at
>> least until some filter is added.
>>
>> On a somewhat related note, I'd like to change which columns are
>> displayed in the User list. For example, I would remove Fullname and
>> Email and add one of our custom attributes (extension/username). Is
>> that possible?
>>
>> Thanks,
>>
>> Andy Morgan
>> Systems Administrator, Identity & Access Management
>> Information Services | Oregon State University
>> 541-737-8877 | is.oregonstate.edu
>> _______________________________________________
>> midPoint mailing list
>> midPoint at lists.evolveum.com<mailto:midPoint at lists.evolveum.com>
>> http://lists.evolveum.com/mailman/listinfo/midpoint
>>
>> _______________________________________________
>> midPoint mailing list
>> midPoint at lists.evolveum.com<mailto:midPoint at lists.evolveum.com>
>> http://lists.evolveum.com/mailman/listinfo/midpoint
>>
>> _______________________________________________
>> midPoint mailing list
>> midPoint at lists.evolveum.com<mailto:midPoint at lists.evolveum.com>
>> http://lists.evolveum.com/mailman/listinfo/midpoint
>>
>>
>
>
> _______________________________________________
> midPoint mailing list
> midPoint at lists.evolveum.com
> http://lists.evolveum.com/mailman/listinfo/midpoint
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.evolveum.com/pipermail/midpoint/attachments/20180730/3416ad27/attachment.htm>
More information about the midPoint
mailing list