[midPoint] Users list is slow with large numbers of users

Andrew Morgan morgan at oregonstate.edu
Tue Jul 31 19:19:36 CEST 2018


Wow, thanks for the quick work.  I'm not familiar with the support-3.7 
branch.  Is there a way for me to get this?  Will there be a release 
(3.7.3?) of this?

Thanks,

Andy Morgan
Systems Administrator, Identity & Access Management
Information Services | Oregon State University
541-737-8877 | is.oregonstate.edu

On Tue, 31 Jul 2018, Pavol Mederly wrote:

> Hello,
>
> the fix is implemented on support-3.7 branch. Please do not forget to apply DB migration scripts.
>
> Best regards,
>
> Pavol Mederly
> Software developer
> evolveum.com
>
>
> On 30.07.2018 19:24, Pavol Mederly wrote:
>
> Andrew,
>
> I think I got it. It seems to be a difference between 3.7 and 3.8, indeed.
>
> The query in 3.7 is like this:
>
> select
>  ruser0_.oid as col_0_0_,
>  ruser0_2_.fullObject as col_1_0_,
>  ruser0_2_.stringsCount as col_2_0_,
>  ruser0_2_.longsCount as col_3_0_,
>  ruser0_2_.datesCount as col_4_0_,
>  ruser0_2_.referencesCount as col_5_0_,
>  ruser0_2_.polysCount as col_6_0_,
>  ruser0_2_.booleansCount as col_7_0_
> from
>  m_user ruser0_
>  inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid
>  inner join m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid
> order by
>  ruser0_2_.name_orig asc
> limit ?
>
> and in 3.8 it is:
>
> select
>  ruser0_.oid as col_0_0_,
>  ruser0_2_.fullObject as col_1_0_,
>  ruser0_2_.stringsCount as col_2_0_,
>  ruser0_2_.longsCount as col_3_0_,
>  ruser0_2_.datesCount as col_4_0_,
>  ruser0_2_.referencesCount as col_5_0_,
>  ruser0_2_.polysCount as col_6_0_,
>  ruser0_2_.booleansCount as col_7_0_
> from
>  m_user ruser0_
>  inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid
>  inner join m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid
> order by
>  ruser0_.name_orig asc
> limit 20
>
> Note the difference. The 3.7 version orders on a row in doubly joined table, whereas the 3.8 version orders on a row in the "main" table. When executing against my test database (10K rows, each having ~100 KB of XML data), the former takes 25 seconds, whereas the latter takes 46 milliseconds.
>
> Having seen this I remember that our colleague Viliam really implemented querying over "local" name column in 3.8, instead of the topmost one. :) Maybe this was the reason. Anyway, tomorrow I will try to find the least painful way how to bring this change into 3.6/3.7; and will let you know.
>
> Best regards,
>
> Pavol Mederly
> Software developer
> evolveum.com
>
>
> On 30.07.2018 17:58, Andrew Morgan wrote:
> Pavol,
>
> I have attached the section of midpoint.log as requested.  As you can see, it takes about 20 seconds to display the results in the GUI.
>
> When I run the same query in the mysql client, is says, "20 rows in set (13.93 sec)".
>
> Here is the EXPLAIN for it again:
>
> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
> | id   | select_type | table     | type   | possible_keys | key          | key_len | ref                     | rows   | Extra                                        |
> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
> |    1 | SIMPLE      | ruser0_   | index  | PRIMARY       | uc_user_name | 768     | NULL                    | 100970 | Using index; Using temporary; Using filesort |
> |    1 | SIMPLE      | ruser0_1_ | eq_ref | PRIMARY       | PRIMARY      | 110     | midpointdev.ruser0_.oid |      1 | Using index                                  |
> |    1 | SIMPLE      | ruser0_2_ | eq_ref | PRIMARY       | PRIMARY      | 110     | midpointdev.ruser0_.oid |      1 |                                              |
> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
>
> Our database server is CentOS 7.5 running MariaDB-server-10.2.16-1.el7.centos.x86_64.  The VM has 4GB of RAM and 2 CPUs (Xeon Gold 5118 CPU @ 2.30GHz).  We haven't done much tuning to it yet:
>
> [mysqld]
> innodb_buffer_pool_size = 2G
> max_allowed_packet = 256M
>
>
> Thanks for helping us investigate!
>
> Andy Morgan
> Systems Administrator, Identity & Access Management
> Information Services | Oregon State University
> 541-737-8877 | is.oregonstate.edu
>
> On Mon, 30 Jul 2018, Pavol Mederly wrote:
>
> Andrew, Petr,
>
> my observations on my developer machine running midPoint (master) and MySQL 5.7 locally are here:
>
> I have imported 100K users (almost empty objects, only a name, nothing more). After listing them via User List page the response is almost immediate. Profiling says:
>
> 2018-07-30 16:56:08,973 [MODEL] [http-nio-8080-exec-9] DEBUG (PROFILING): #### Entry: 602 ...model.impl.controller.ModelController->searchObjects
> 2018-07-30 16:56:08,973 [MODEL] [http-nio-8080-exec-9] TRACE (PROFILING): ###### args: (UserType, Q{null filterPAGING: O: 0,M: 20,ORD: [name ASCENDING], ,, [ObjectOperationOptions(/:distinct=true)], Task(id:1532962568973-0-1, name:null, oid:null), R(com.evolveum.midpoint.web.component.data.SelectableBeanObjectDataProvider.searchObjects UNKNOWN null))
> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] DEBUG (PROFILING): #### Entry: 604   ...repo.sql.SqlRepositoryServiceImpl->searchObjects
> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] TRACE (PROFILING): ###### args: (UserType, Q{null filterPAGING: O: 0,M: 20,ORD: [name ASCENDING], ,, [ObjectOperationOptions(/:distinct=true)], R(com.evolveum.midpoint.model.api.ModelService.searchObjects UNKNOWN null))
> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] DEBUG (org.hibernate.SQL): select ruser0_.oid as col_0_0_, ruser0_2_.fullObject as col_1_0_, ruser0_2_.stringsCount as col_2_0_, ruser0_2_.longsCount as col_3_0_, ruser0_2_.datesCount as col_4_0_, ruser0_2_.referencesCount as col_5_0_, ruser0_2_.polysCount as col_6_0_, ruser0_2_.booleansCount as col_7_0_ from m_user ruser0_ inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid inner join m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid order by ruser0_.name_orig asc limit ?
> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] DEBUG (PROFILING): ##### Exit: 604    ...repo.sql.SqlRepositoryServiceImpl->searchObjects etime: 6.079 ms
> 2018-07-30 16:56:08,973 [REPOSITORY] [http-nio-8080-exec-9] TRACE (PROFILING): ###### retval: [user:00000000-0000-0000-0000-000000000002(administrator),user:63e9574b-6fa8-4f72-ab47-d7bc4a9b81b6(user 100000),...]
> 2018-07-30 16:56:08,993 [MODEL] [http-nio-8080-exec-9] DEBUG (PROFILING): ##### Exit: 602  ...model.impl.controller.ModelController->searchObjects etime: 21.325 ms
> 2018-07-30 16:56:08,993 [MODEL] [http-nio-8080-exec-9] TRACE (PROFILING): ###### retval: [user:00000000-0000-0000-0000-000000000002(administrator),user:63e9574b-6fa8-4f72-ab47-d7bc4a9b81b6(user 100000),...]
>
> The time needed to get first 20 user records was 21 milliseconds. The repository processing itself took about 6 milliseconds.
>
> The SQL command itself was:
>
> select
> ruser0_.oid as col_0_0_,
> ruser0_2_.fullObject as col_1_0_,
> ruser0_2_.stringsCount as col_2_0_,
> ruser0_2_.longsCount as col_3_0_,
> ruser0_2_.datesCount as col_4_0_,
> ruser0_2_.referencesCount as col_5_0_,
> ruser0_2_.polysCount as col_6_0_,
> ruser0_2_.booleansCount as col_7_0_
> from
> m_user ruser0_
> inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid
> inner join m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid
> order by
> ruser0_.name_orig asc
> limit ?
>
> So, yes, the command is the same as in your case.
>
> Please, could you do the same profiling as I did? I.e. in your config.xml file please set up:
>
> <configuration>
>   <midpoint>
>       ...
>       <profilingEnabled>true</profilingEnabled>
>   </midpoint>
> </configuration>
>
> And restart midPoint after the change. Then in the system configuration (GUI: System -> Profiling) set the profiling:
>
> [cid:part1.EDEB0394.06F63FFB at evolveum.com]
>
> You can also set logging for org.hibernate.SQL to ALL (or TRACE) to see SQL commands. And then please try opening the user list and send us the relevant parts of midpoint.log file (i.e. text between Entry and Exit of "...model.impl.controller.ModelController->searchObjects").
>
> 3.7.2 vs. 3.8 vs. 3.9-SNAPSHOT should make no difference in this request, as far as I know. And I think that the user objects' size should not perhaps make such a dramatic difference. I will have a look at it more, however.
>
> Best regards,
>
> Pavol Mederly
> Software developer
> evolveum.com
>
>
> On 28.07.2018 16:51, Pavol Mederly wrote:
> Andrew,
>
> now I see I am going to have a look at it anyhow: https://jira.evolveum.com/browse/MID-4776. I will let you know.
>
> Pavol Mederly
> Software developer
> evolveum.com
>
> On 28.07.2018 10:39, Pavol Mederly wrote:
> Hello Andrew,
>
> how long does it take to display the user list (in GUI) in your case?
>
> I know the query is like you say - and we know it can be optimized - but in reality the SQL command sent to the database should limit the results to first 20 records, so it should be faster than processing SQL query without limits. (Anyway, we can sort this out if needed.)
>
> As for the columns, please have a look at "Custom columns configuration" in https://wiki.evolveum.com/display/midPoint/Admin+GUI+Configuration.
>
> Hope this helps,
>
> Pavol Mederly
> Software developer
> evolveum.com
>
> On 27.07.2018 18:03, Andrew Morgan wrote:
> We are running midPoint v3.7.2 (we delayed upgrading to v3.8 because the Google connector is broken).  We recently loaded about 110,000 users in midPoint.  When I click on Users > List Users, midPoint is waiting for the results of this database query:
>
> MariaDB [midpointdev]> explain select ruser0_.oid as col_0_0_, ruser0_2_.fullObject as col_1_0_, ruser0_2_.stringsCount as col_2_0_, ruser0_2_.longsCount as col_3_0_, ruser0_2_.datesCount as col_4_0_, ruser0_2_.referencesCount as col_5_0_, ruser0_2_.polysCount as col_6_0_, ruser0_2_.booleansCount as col_7_0_ from m_user ruser0_ inner join m_focus ruser0_1_ on ruser0_.oid=ruser0_1_.oid inner join m_object ruser0_2_ on ruser0_.oid=ruser0_2_.oid order by ruser0_2_.name_orig asc;
> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
> | id   | select_type | table     | type   | possible_keys | key          | key_len | ref                     | rows   | Extra                                        |
> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
> |    1 | SIMPLE      | ruser0_   | index  | PRIMARY       | uc_user_name | 768     | NULL                    | 100839 | Using index; Using temporary; Using filesort |
> |    1 | SIMPLE      | ruser0_1_ | eq_ref | PRIMARY       | PRIMARY      | 110     | midpointdev.ruser0_.oid |      1 | Using index                                  |
> |    1 | SIMPLE      | ruser0_2_ | eq_ref | PRIMARY       | PRIMARY      | 110     | midpointdev.ruser0_.oid |      1 |                                              |
> +------+-------------+-----------+--------+---------------+--------------+---------+-------------------------+--------+----------------------------------------------+
>
> Without the aliases, it is a little easier to read:
>
> MariaDB [midpointdev]> explain select m_user.oid, m_object.fullObject, m_object.stringsCount, m_object.longsCount, m_object.datesCount, m_object.referencesCount, m_object.polysCount, m_object.booleansCount from m_user inner join m_focus on m_user.oid=m_focus.oid inner join m_object on m_user.oid=m_object.oid order by m_object.name_orig asc;
> +------+-------------+----------+--------+---------------+--------------+---------+------------------------+--------+----------------------------------------------+
> | id   | select_type | table    | type   | possible_keys | key          | key_len | ref                    | rows   | Extra                                        |
> +------+-------------+----------+--------+---------------+--------------+---------+------------------------+--------+----------------------------------------------+
> |    1 | SIMPLE      | m_user   | index  | PRIMARY       | uc_user_name | 768     | NULL                   | 100839 | Using index; Using temporary; Using filesort |
> |    1 | SIMPLE      | m_focus  | eq_ref | PRIMARY       | PRIMARY      | 110     | midpointdev.m_user.oid |      1 | Using index                                  |
> |    1 | SIMPLE      | m_object | eq_ref | PRIMARY       | PRIMARY      | 110     | midpointdev.m_user.oid |      1 |                                              |
> +------+-------------+----------+--------+---------------+--------------+---------+------------------------+--------+----------------------------------------------+
>
> The result of this query is about 440MB of data.  "111690 rows in set (22.84 sec)"  (BTW, join on m_focus isn't needed in this query, right?)
>
> That is a LOT of data to push around anytime the Users list is loaded.  We haven't even loaded all of our users yet!  Our total user population is around 350,000.
>
> Is there a way to set some defaults for the User list?  For example, could I change it to not display anything until a search is performed?  Ideally, it wouldn't try to fetch m_object.fullObject, at least until some filter is added.
>
> On a somewhat related note, I'd like to change which columns are displayed in the User list.  For example, I would remove Fullname and Email and add one of our custom attributes (extension/username).  Is that possible?
>
> Thanks,
>
> Andy Morgan
> Systems Administrator, Identity & Access Management
> Information Services | Oregon State University
> 541-737-8877 | is.oregonstate.edu
> _______________________________________________
> midPoint mailing list
> midPoint at lists.evolveum.com<mailto:midPoint at lists.evolveum.com><mailto:midPoint at lists.evolveum.com><mailto:midPoint at lists.evolveum.com>
> http://lists.evolveum.com/mailman/listinfo/midpoint
>
> _______________________________________________
> midPoint mailing list
> midPoint at lists.evolveum.com<mailto:midPoint at lists.evolveum.com><mailto:midPoint at lists.evolveum.com><mailto:midPoint at lists.evolveum.com>
> http://lists.evolveum.com/mailman/listinfo/midpoint
>
> _______________________________________________
> midPoint mailing list
> midPoint at lists.evolveum.com<mailto:midPoint at lists.evolveum.com><mailto:midPoint at lists.evolveum.com><mailto:midPoint at lists.evolveum.com>
> http://lists.evolveum.com/mailman/listinfo/midpoint
>
>
>
>
>
> _______________________________________________
> midPoint mailing list
> midPoint at lists.evolveum.com<mailto:midPoint at lists.evolveum.com>
> http://lists.evolveum.com/mailman/listinfo/midpoint
>
>
>
>
>
> _______________________________________________
> midPoint mailing list
> midPoint at lists.evolveum.com<mailto:midPoint at lists.evolveum.com>
> http://lists.evolveum.com/mailman/listinfo/midpoint
>
>
>



More information about the midPoint mailing list