[wp-trac] [WordPress Trac] #17210: Massive duplication of oEmbed postmeta
WordPress Trac
wp-trac at lists.automattic.com
Thu Apr 21 23:38:36 UTC 2011
#17210: Massive duplication of oEmbed postmeta
--------------------------+-----------------------------
Reporter: archon810 | Owner:
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Embeds | Version: 3.1
Severity: normal | Keywords:
--------------------------+-----------------------------
Hey guys,
Ever since my blog grew to a considerable size (a few million PVs a month)
and started slowing down and exploding my server, I've been looking and
implementing various optimizations. During one such passes through the
data, I noticed this really weird oEmbed related behavior, which I've been
observing for a number of WP version upgrades.
I use [embed] shortcodes a lot, and every new post after a few minutes
ends up with a ton of oembed caches that don't belong to it at all -
they're all from other posts. Posts that don't even have [embed]s at all
still have over 100 oembed entries in wp_postmeta.
Here's an example of just a small subset of data residing in the table:
[[Image(http://farm6.static.flickr.com/5230/5641419581_0610c9e267_b.jpg)]]
There are now about 150,000 entries in the wp_postmeta table due to this,
half of which are duplicated _oembed entries, which I think has heavy
impact on server load. Not only that but I'm sure WP is filling the table
up with values by redoing oEmbed queries, which may explain that load
shoots up very high at times when publishing.
{{{
mysql> select count(*) from wp_postmeta where meta_key like '_oembed%';
+----------+
| count(*) |
+----------+
| 81499 |
+----------+
1 row in set
mysql> select count(*) from wp_postmeta;
+----------+
| count(*) |
+----------+
| 148451 |
+----------+
1 row in set
}}}
Just look at how many times this random video embed value shows up in the
table. I'm sure it was used in only one actual post:
{{{
mysql> select count(*) from wp_postmeta where meta_value like '%p2oWELcd-
lI%';
+----------+
| count(*) |
+----------+
| 815 |
+----------+
1 row in set
}}}
Just to clarify - I don't have 815 updates to a single post that may have
explained this - these are completely unrelated, separate, published
posts.
To put things in perspective, here are the top 20 offenders:
{{{
mysql> select distinct meta_key, count(*) as cnt from wp_postmeta where
meta_key like '_oembed%' group by meta_key order by cnt desc limit 20;
+------------------------------------------+-----+
| meta_key | cnt |
+------------------------------------------+-----+
| _oembed_5607e41abb700707540a854ae76182cf | 864 |
| _oembed_984bc07d3bc0f61b6b35230cd2fa7ced | 859 |
| _oembed_da8ae36275b4576cfcd92c0ed455be96 | 859 |
| _oembed_71dd4068a9a6911f50dbe57b3ff477c5 | 858 |
| _oembed_9f817e820c23ccbfac9b22b3474e5dd3 | 858 |
| _oembed_f3c1c03a81bc301b5f1a063f65119328 | 857 |
| _oembed_31bf10d95cb7c8e9f646d9d6e5728da0 | 857 |
| _oembed_25d0ebf59c994050cb604900cf04f53f | 856 |
| _oembed_6265dae657e38579c0a8ddb66132d526 | 852 |
| _oembed_562dd8c13888905cbd15dbd74e8699cc | 849 |
| _oembed_30ea17d1cc73acd925a74373d2be32ec | 848 |
| _oembed_87f16916b4da6571f454266bfbfaebe0 | 847 |
| _oembed_9f1f038d43e973bd60929201eee24f57 | 843 |
| _oembed_d46317d44fe11c0d90ef2cc3b45bce57 | 843 |
| _oembed_b1f8685ba405feee46baf9408eb632f7 | 841 |
| _oembed_1b56f492eba4c4ea698d816d0ecf2d51 | 840 |
| _oembed_fe597714de4081e6e7e78a88256c7db4 | 840 |
| _oembed_fb843e7b604cbc4e1ffa144d4eb300c8 | 839 |
| _oembed_97b12f2f1e59ee6eff95c61095aa5bef | 838 |
| _oembed_2b94d9f7c28ee37bfbead0a622c8be85 | 838 |
+------------------------------------------+-----+
20 rows in set
}}}
I'm quite at a loss here and would appreciate the next debugging steps. I
haven't been able to determine where things are going wrong on my own.
Thank you.
P.S. The site in question is AndroidPolice.com
--
Ticket URL: <http://core.trac.wordpress.org/ticket/17210>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software
More information about the wp-trac
mailing list