[wp-trac] [WordPress Trac] #21914: Improve pingback page block parsing

WordPress Trac wp-trac at lists.automattic.com
Wed Sep 26 21:44:34 UTC 2012


#21914: Improve pingback page block parsing
----------------------------------------+--------------------
 Reporter:  Otto42                      |       Owner:
     Type:  enhancement                 |      Status:  new
 Priority:  normal                      |   Milestone:  3.5
Component:  Pings/Trackbacks            |     Version:  3.4.2
 Severity:  normal                      |  Resolution:
 Keywords:  has-patch needs-unit-tests  |
----------------------------------------+--------------------

Comment (by Otto42):

 I know this is hard to test, so I attached a demo file. I pulled the
 relevant bits of the filtering process out into a function that
 implements both methods (1 = before, 2 = after). I also included a big
 chunk of content from one of my latest blog posts, with a link added at
 the end to simulate the link in a pingback. Run it on the command line to
 see the before and after results.

 The result looks like this:

 {{{
 Before:
 [...]to comment about anything I missed, or what you see most often,
 especially if you’re doing translations yourself. Related
 posts:Internationalization: You’re probably doing it[...]

 After:
 [...]to comment about anything I missed, or what you see most often,
 especially if you’re doing translations[...]
 }}}

 Essentially, the first method (what we have now) is excessively sensitive
 to HTML spacing. If the HTML is compressed, without a lot of newlines,
 it tends to bunch separate blocks together. When the link is at the end
 of the content (where it's likely to be for things like credit links),
 the sharing HTML and similar markup that plugins append to the content
 can be treated as part of the same paragraph and end up in the resulting
 excerpt.

 By eliminating the space requirement, and allowing closing </p> tags and
 other block tags to delimit paragraphs as well, the result is sometimes
 shorter, but definitely cleaner. When the link is higher in the body, the
 results are basically the same most of the time. This mainly helps when
 the link is down towards the end of the content.
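
 To make the difference concrete, here is a minimal sketch of the two
 splitting rules. It is in Python rather than PHP and is not the attached
 demo file or the actual patch: the tag list, the regexes, the function
 name paragraph_with_link() and the sample HTML are all assumptions made
 up for illustration.

```python
import re

# Assumed list of block-level tags; the real list in core may differ.
BLOCK_TAGS = "h1|h2|h3|h4|h5|h6|p|th|td|li|dt|dd|pre|caption|body"

# "Before" rule (assumed): only an opening block tag preceded by a space
# starts a new paragraph.
BEFORE = re.compile(r" <(?:%s)\b[^>]*>" % BLOCK_TAGS)

# "After" rule (assumed): no space required, and closing tags such as </p>
# delimit paragraphs too.
AFTER = re.compile(r"</?(?:%s)\b[^>]*>" % BLOCK_TAGS)


def paragraph_with_link(html, url, block_pattern):
    """Return the tag-stripped paragraph of `html` that contains `url`."""
    text = re.sub(r"\s+", " ", html)          # normalize whitespace
    text = block_pattern.sub("\n\n", text)    # mark paragraph boundaries
    for chunk in text.split("\n\n"):
        if url in chunk:
            return re.sub(r"<[^>]+>", "", chunk).strip()
    return ""


# Hypothetical compressed HTML: the link is in the last paragraph and a
# plugin has appended a "related posts" block right after it.
SAMPLE = (
    '<div><p>Feel free to comment. See '
    '<a href="http://example.com/post">this post</a>.</p>'
    '<div class="related">Related posts:<ul><li>Other</li></ul></div></div>'
)
URL = "http://example.com/post"

if __name__ == "__main__":
    print("Before:", paragraph_with_link(SAMPLE, URL, BEFORE))
    print("After: ", paragraph_with_link(SAMPLE, URL, AFTER))
```

 With the sample above, the "before" rule finds no paragraph boundary at
 all, so the related-posts text is glued onto the excerpt, while the
 "after" rule stops at the closing </p>.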

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/21914#comment:3>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software

