[wp-trac] [WordPress Trac] #56598: Performance optimization in _update_post_term_count()
WordPress Trac
noreply at wordpress.org
Fri Sep 1 18:24:05 UTC 2023
#56598: Performance optimization in _update_post_term_count()
-------------------------------------------+--------------------------
Reporter: dimitrisv | Owner: (none)
Type: enhancement | Status: new
Priority: normal | Milestone: 6.4
Component: Taxonomy | Version: 6.0.2
Severity: normal | Resolution:
Keywords: needs-patch reporter-feedback | Focuses: performance
-------------------------------------------+--------------------------
Comment (by dimitrisv):
Hello,
I am sorry for the delay - my apologies.
When I started this thread I was dealing with a WP installation of a news
site, with over 70million rows in their post-meta table. Since then I have
delivered the project I have only a backup of a "mere" 20million rows in
postmeta, 350k in posts.
It did make a lot of difference especially, given that the default indexes
were inadequate. Theoretically speaking it is the right thing to do.
Here is what chat GPT said about the comment by @peterwilsoncc
{{{
Based on your description, it seems you're discussing a database-related
query optimization topic.
In SQL, `count(*)` and `count(field)` are two different ways to count rows
based on certain criteria:
1. `count(*)`: This counts all rows irrespective of whether they have null
values or not. It's used when you just want to know the number of rows in
a result set.
2. `count(field)`: This counts only the rows where the specified `field`
has a non-null value.
Factors that might influence the cost between `count(*)` and
`count(field)`:
1. **Table Structure**: If a table doesn't have any nullable columns, then
`count(*)` and `count(any_field)` would theoretically have similar costs.
2. **Indexes**: If a field/column you're using in `count(field)` is
indexed, the database might retrieve the count more efficiently compared
to counting all rows with `count(*)`.
3. **Database Optimization**: Modern databases might have optimizations in
place for certain operations. For instance, some databases might have a
quick way to determine the row count for a table without scanning all
rows.
4. **Storage Engine**: Some database storage engines might have
optimizations that make one type of count faster than the other.
How to analyze the costs:
For example, in MySQL, you can prefix your query with `EXPLAIN` to see an
execution plan:
```sql
EXPLAIN SELECT count(*) FROM your_table WHERE some_condition;
EXPLAIN SELECT count(field) FROM your_table WHERE some_condition;
```
...
It's worth noting that the actual difference in execution cost between
`count(*)` and `count(field)` may vary depending on the specific database,
the data distribution, and other factors. So it's always a good idea to
test in your own environment and use tools like EXPLAIN to understand
what's going on under the hood.
}}}
What might have made the difference so noticeable in this case is the
usage of many Advanced Custom Fields (with a lot of postmeta entries -
most are empty or null).
Since then I have migrated them to pods (using "real" table storage
outside postmeta) so the difference is not pronounced.
I hope this helps.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/56598#comment:12>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform
More information about the wp-trac
mailing list