Even with a moderately small data set - 26 judges, 8 categories, 10 regions, 706 nominations - we end up with an intermediate recordset of over 1.4m records, which we are running 4 DISTINCT counts against. This can take upwards of 10 seconds to run the query:
select
count( distinct judges.id ) as judge_count,
count( distinct categories.id ) as category_count,
count( distinct regions.id ) as region_count,
count( distinct nominations.id ) as nomination_count
from
pobj_award_set award_set
left join pobj_award_judge judges on (judges.award_set = award_set.id)
left join pobj_award_category categories on (categories.award_set = award_set.id)
left join pobj_award_region regions on (regions.award_set = award_set.id)
left join pobj_award_nomination nominations on (nominations.award_set = award_set.id)
where
award_set.id = '3FC2C389-77E6-4DBF-A8D742267334141F';
By refactoring this to join to 4 individual grouped count tables, the same results can be returned in 30ms!
select
coalesce( _judge_counts.judge_count , 0 ) as judge_count,
coalesce( _category_counts.category_count , 0 ) as category_count,
coalesce( _region_counts.region_count , 0 ) as region_count,
coalesce( _nomination_counts.nomination_count, 0 ) as nomination_count
from
pobj_award_set award_set
left join ( select award_set, count( id ) as judge_count from pobj_award_judge group by award_set ) _judge_counts ON _judge_counts.award_set = award_set.id
left join ( select award_set, count( id ) as category_count from pobj_award_category group by award_set ) _category_counts ON _category_counts.award_set = award_set.id
left join ( select award_set, count( id ) as region_count from pobj_award_region group by award_set ) _region_counts ON _region_counts.award_set = award_set.id
left join ( select award_set, count( id ) as nomination_count from pobj_award_nomination group by award_set ) _nomination_counts ON _nomination_counts.award_set = award_set.id
where
award_set.id = '3FC2C389-77E6-4DBF-A8D742267334141F';
By specifying a "special" format for the formula field; for example:
...where nominations is an x-to-many relationship on the same object, we could create a special case select field and join SQL to produce the more performant SQL.
The same approach could then be taken on other commonly-used formula patterns, if doing so would give a performance increase...
The common pattern of this formula to get a count of x-to-many relationships can have an exponentially detrimental effect on performance.
Take this, for example:
property name="category_count" adminRenderer="none" type="numeric" dbtype="int" formula="count( distinct ${prefix}categories.id )"; property name="region_count" adminRenderer="none" type="numeric" dbtype="int" formula="count( distinct ${prefix}regions.id )"; property name="nomination_count" adminRenderer="none" type="numeric" dbtype="int" formula="count( distinct ${prefix}nominations.id )"; property name="judge_count" adminRenderer="none" type="numeric" dbtype="int" formula="count( distinct ${prefix}judges.id )";
Even with a moderately small data set - 26 judges, 8 categories, 10 regions, 706 nominations - we end up with an intermediate recordset of over 1.4m records, which we are running 4 DISTINCT counts against. This can take upwards of 10 seconds to run the query:
select count( distinct judges.id ) as judge_count, count( distinct categories.id ) as category_count, count( distinct regions.id ) as region_count, count( distinct nominations.id ) as nomination_count from pobj_award_set award_set left join pobj_award_judge judges on (judges.award_set = award_set.id) left join pobj_award_category categories on (categories.award_set = award_set.id) left join pobj_award_region regions on (regions.award_set = award_set.id) left join pobj_award_nomination nominations on (nominations.award_set = award_set.id) where award_set.id = '3FC2C389-77E6-4DBF-A8D742267334141F';
By refactoring this to join to 4 individual grouped count tables, the same results can be returned in 30ms!
select coalesce( _judge_counts.judge_count , 0 ) as judge_count, coalesce( _category_counts.category_count , 0 ) as category_count, coalesce( _region_counts.region_count , 0 ) as region_count, coalesce( _nomination_counts.nomination_count, 0 ) as nomination_count from pobj_award_set award_set left join ( select award_set, count( id ) as judge_count from pobj_award_judge group by award_set ) _judge_counts ON _judge_counts.award_set = award_set.id left join ( select award_set, count( id ) as category_count from pobj_award_category group by award_set ) _category_counts ON _category_counts.award_set = award_set.id left join ( select award_set, count( id ) as region_count from pobj_award_region group by award_set ) _region_counts ON _region_counts.award_set = award_set.id left join ( select award_set, count( id ) as nomination_count from pobj_award_nomination group by award_set ) _nomination_counts ON _nomination_counts.award_set = award_set.id where award_set.id = '3FC2C389-77E6-4DBF-A8D742267334141F';
By specifying a "special" format for the formula field; for example:
property name="nomination_count" formula="count{nominations}";
...where
nominations
is an x-to-many relationship on the same object, we could create a special case select field and join SQL to produce the more performant SQL.The same approach could then be taken on other commonly-used formula patterns, if doing so would give a performance increase...