WordPress.org

Make WordPress Core

Opened 4 years ago

Closed 3 years ago

#37038 closed enhancement (fixed)

WP_Tax_Query needs caching

Reported by: spacedmonkey Owned by: boonebgorges
Milestone: 4.9 Priority: normal
Severity: normal Version: 3.1
Component: Query Keywords: has-patch
Focuses: performance Cc:

Description

The WP_Tax_Query is a class used to extend the WP_query class (and others) to query the terms and return SQL statement to join object tables. However how it gets the term taxonomy id, performs raw sqls queries that are not cached.

There are number of ways in which caching could be added to method. It could use get_terms to get the wp_term objects and pluck out the require attribute. Or a new cache could be added.

Attachments (6)

37038.patch (2.9 KB) - added by spacedmonkey 4 years ago.
37038.2.patch (5.6 KB) - added by spacedmonkey 4 years ago.
37038.3.patch (3.2 KB) - added by spacedmonkey 4 years ago.
37038.4.patch (5.5 KB) - added by spacedmonkey 3 years ago.
37038.5.patch (4.2 KB) - added by spacedmonkey 3 years ago.
37038.6.patch (4.4 KB) - added by boonebgorges 3 years ago.

Download all attachments as: .zip

Change History (18)

@spacedmonkey
4 years ago

#1 @spacedmonkey
4 years ago

  • Keywords has-patch needs-unit-tests added

Added a first patch. This adds a new cache to this method. This save using get_terms, which would add an processing time overhead.

#2 @spacedmonkey
4 years ago

This new patch, goes a new redirection and removes all sql and uses the get_terms function. This will add overhead to the tax query, but also add caching and improve filterability.

#3 @boonebgorges
4 years ago

  • Keywords needs-patch added; has-patch removed
  • Milestone changed from Awaiting Review to Future Release

I added term_taxonomy_id support to WP_Term_Query in [37683]. See #37074.

I opted *not* to include fields support for term_taxonomy_id. I'm generally against fine-grained SELECT x parameters in our query classes, because (appropriately for this ticket) they mess with our caching strategy. In the case of term queries, it doesn't currently matter because we (wastefully) keep separate caches for each value of fields. But when we move to caching only term IDs and pulling additional data from the cache, the presence of 'fields' is going to mess things up. See #34239. For the purposes of this ticket, this means fetching the entire term objects and plucking out the values that are needed.

Implementation question: Why the sanitize_term_field() juggling for name? This kind of sanitization is already handled in WP_Term_Query, isn't it?

Worth noting that this ticket is a duplicate of a number of others. A quick search brings up #16621 and #14983, though there are probably others. The strategy in the current ticket seems like the right one to me, but I want to be sure we go through and reference relevant tickets as necessary.

#4 @spacedmonkey
4 years ago

I updated the patch after the commit in 37683. This patch now get the whole wp_term object and pluck out the field it needs. It is nicer way of doing it and it primes the WP_Term object in cache / memory. It is likely that the object will be used down the line.

I think we should use this ticket to continue the work on this caching and close the others. Sorry I didn't spot those tickets before :(

This ticket was mentioned in Slack in #core-multisite by spacedmonkey. View the logs.


4 years ago

#6 @spacedmonkey
3 years ago

  • Keywords needs-refresh added

This is another good ticket for 4.7. The patch will need updated I think as @boonebgorges merged the taxonomy id code in another commit.

#7 @spacedmonkey
3 years ago

  • Keywords has-patch added; needs-patch removed

#8 @spacedmonkey
3 years ago

37038.5.patch core tests now pass.

Need feedback on for unit tests.

#9 @boonebgorges
3 years ago

  • Keywords needs-unit-tests needs-refresh removed
  • Milestone changed from Future Release to 4.9

Thanks for the updated patch, @spacedmonkey. I think we are close on this one. 37038.6.patch makes the following changes to 5.patch:

  • I've removed the $resulting_fields -> $args['fields'] switch statement. This means that all calls to WP_Term_Query will fetch fields=all. This seems better to me for a couple reasons. First, it primes the cache for future use (as you note in your comment above). Second, it means we can simply wp_list_pluck() at the end of the method in all cases. Third, it helps us move toward a future where the 'fields' parameter in WP_Term_Query doesn't result in fragmented caches.
  • On 576 of 37038.5.patch, you added an array_filter(). It looks like this was meant to break some broken tests that resulted from some edge cases related to empty values being passed to the terms parameter of tax_query clauses. Your fix worked, but it felt opaque to me, so I put a more explicit fix into transform_query() (bailing if 'terms' is empty), which is where the problem really lives anyway.
  • I made a few changes to tests that expected to get string IDs instead of true integers - eg '123' instead of 123. This is technically a compatibility break, which we should take a note of and try to document in a dev note for 4.9, but it's highly unlikely that anyone is doing a strict string-check on the result of a transform_query() transformation.
  • 'suppress_filters' is a parameter for get_terms(), but since we're using WP_Term_Query directly, we don't need it.

I've reviewed the existing transform_query() tests and they look fairly comprehensive, so I'm not sure that we need new ones.

@spacedmonkey Mind having a look through the updated patch to see if anything jumps out at you?

#10 @spacedmonkey
3 years ago

  • I like that the tax query using fields=>'all'. I think it adds a little overhead, but it always primes term caches. I feel like it highly likely that you would using a post's terms in a loop. Like displaying a post categories. That could be a performance win down stream.
  • Allow filters is great, it adds an array of filters in play for tax query, which currently has no hooks in it. A massive benefit.
  • Might be worth testing this patch alongside #37189 as they are somewhat related.
  • On line 579 (in class-wp-tax-query.php) hierarchical taxonomies query by term_id. Correct me if I am but if term splitting is complete, then tt_id and term_id should be the same. Could we do something like this.
    $field = (get_option('finished_splitting_shared_terms', false) ) ? 'term_taxonomy_id' : 'term_id';
    $this->transform_query( $query, $field );
    

This may save another query or hit existing caches.

Otherwise I think we are there.

#11 @boonebgorges
3 years ago

@spacedmonkey Thanks for the review.

On line 579 (in class-wp-tax-query.php) hierarchical taxonomies query by term_id. Correct me if I am but if term splitting is complete, then tt_id and term_id should be the same.

No: term_id and tt_id are only guaranteed to be correct if there were *never* any shared terms. If terms were previously shared and were then split, they won't match. (wp_terms and wp_term_taxonomy will have the same number of rows in these cases, but the items will be in a different order.)

#12 @boonebgorges
3 years ago

  • Owner set to boonebgorges
  • Resolution set to fixed
  • Status changed from new to closed

In 40918:

Use WP_Term_Query when transforming tax queries.

This change allows tax query transformations to be cached.

Props spacedmonkey.
Fixes #37038.

Note: See TracTickets for help on using tickets.