Ask your WordPress questions! Pay money and get answers fast! Comodo Trusted Site Seal
Official PayPal Seal

Need a non-expensive query to get popular terms over time period WordPress

  • REFUNDED

I am looking to get the X# most frequently used taxonomy_terms for posts that do not have a parent and are published over the past 2 weeks from VARIABLE_ARRAY taxonomies and have the ability to exclude specific term ids (VARIABLE_ARRAY) from the process as resource-cheaply as possible it doesn't have to follow the above query - that was just my attempt.

--

I have written SQL for a query that does this- but at the cost of server processing. The query pulls the most used taxonomy terms for selected taxonomies that are published, do not have a parent post are not in a comma separated list over the past two weeks -- grouped by the term itself sorted by most posts descending.

The issue I am running into is that it is running very slowly and was wondering if you guys had any suggestions for either converting it to wp_query arguments or the right indexes to put on the tables. Following the optimizers path it is running through 4630 rows on wp_term_taxonomy 1 row on wp_terms, 9 rows on wp_term_relationships and 1 row on wp_posts

The query I wrote is basically:

# ** NOTE THIS IS NOT THE EXACT SQL, WRITTEN TO BETTER ARTICULATE MY QUESTION
SELECT wp_terms.name, wp_terms.slug, wp_term_taxonomy.taxonomy, count(*) WHERE wp.term_taxonomy_id in ('$comma', '$separated', '$taxonomies') AND wp_posts.post_status = 'publish' AND wp_posts.post_parent = 0 AND wp_posts.post_date > '$two_weeks_ago' AND wp_terms.term_id NOT IN ($list, $of, $ids, $comma, $separated) group by wp_terms.term_id order by total_posts desc limit $limit_variable




Hope this makes sense. Thanks.

If I need to add specific indexes on tables that's fine - let me know.

Basically I need the MySQL optimizer to tell me that it is scanning the FEWEST rows possible... ideally less than 100. If this is impossible - then any way to break this down let me know.

Answers (1)

2012-07-27

John Cotton answers:

Try this:


SELECT COUNT(t.term_id) AS total, MAX(t.name) AS name, MAX(t.slug) AS slug, MAX(tt.taxonomy) AS taxonomy
FROM (SELECT id FROM wp_posts WHERE post_status = 'publish' AND post_parent = 0 ) p
INNER JOIN wp_term_relationships tr ON p.id = tr.object_id
INNER JOIN wp_term_taxonomy tt ON tr.term_taxonomy_id = tt.term_taxonomy_id
INNER JOIN wp_terms t ON t.term_id = tt.term_id

GROUP BY t.term_id
ORDER BY COUNT(t.term_id) DESC


For now, I've removed your date and term_id restrictions as those are easily added back in if you find this effective.


mackrider comments:

That makes it even more expensive!

According to the optimizer the derived post_ids table has to look through 17455 rows!


John Cotton comments:

Bear in mind that the list of post ids for my query will be longer than your list since it doesn't have the date limitation on it.

If your posts table has 18K rows, then any query is going to have to look through them all to pull out the ones you need.

I've just ran that query on a WP install with 22K posts and 18K terms and it took < 0.1 seconds. No extra indexes.


mackrider comments:

What about the explain query rows?


John Cotton comments:

<blockquote>What about the explain query rows?</blockquote>
Do you mean "why is it less than the actual number of rows in the table"?

Because the indexes that WP creates mean that when scanning rows it can skip some but you'd need to ask a SQL expert about how it does that.

You should find that WP has already created indexes on the fields you're interested in.


mackrider comments:

No, no.. what I mean is in relation to what you said "I've just ran that query on a WP install with 22K posts and 18K terms and it took < 0.1 seconds. No extra indexes."

Run that query with EXPLAIN in front and how many rows does the optimizer say it scans?

Thanks.


John Cotton comments:

In wp_posts? 9572


John Cotton comments:

And it's using post_parent as it's key.


John Cotton comments:

And it's using post_parent as it's key.