[wp-trac] [WordPress Trac] #25387: Autoembeds don't work with paragraphs

WordPress Trac noreply at wordpress.org
Thu Oct 3 17:56:29 UTC 2013


#25387: Autoembeds don't work with paragraphs
--------------------------+------------------------------
 Reporter:  Looimaster    |       Owner:
     Type:  defect (bug)  |      Status:  new
 Priority:  normal        |   Milestone:  Awaiting Review
Component:  General       |     Version:  3.6.1
 Severity:  normal        |  Resolution:
 Keywords:                |
--------------------------+------------------------------

Comment (by redsweater):

 I understand the current matching is very conservative on purpose, as that
 was the tradeoff when turning on autoembed by default. But for the
 examples listed above and many others, I think the current pattern could
 be loosened up a bit to match the many cases where autoembedding would
 clearly be preferable to leaving the URL raw.

 For the sake of argument here is one proposed alternative regex to the one
 currently used in class-wp-embed.php's autoembed() function:

 {{{
 (?:(?:^[^\s"']*\s*>?)|(?:\s))(https?://[^\s<]+)
 }}}

 I am not a regex expert, but I cobbled this together and ran it against
 several test cases, where it appears to produce the desired behavior. In
 each of these example cases, the pattern matches and its capture group
 extracts the desired URL and nothing surrounding it:

 {{{
 <p>
 http://www.wordpress.org/
 </p>
 }}}

 {{{
 <p>http://www.wordpress.org/</p>
 }}}

 {{{
 Some text
 http://www.wordpress.org/
 Some more text
 }}}

 {{{
 <p> http://www.wordpress.org/</p>
 <p>http://www.apple.com/ </p>
 http://www.red-sweater.com/
 }}}

 {{{
 http://www.wordpress.org/
 }}}
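
 The checks above can be reproduced with a short script. This is only a
 sketch: Python's re module stands in for PHP's PCRE, the MULTILINE flag is
 an assumption (the pattern is given here without flags), and the names
 PROPOSED and extract_urls are made up for illustration.

```python
import re

# Proposed pattern from this comment. MULTILINE is an assumption so that
# "^" can anchor at the start of every line, not only the start of the text.
PROPOSED = re.compile(
    r"""(?:(?:^[^\s"']*\s*>?)|(?:\s))(https?://[^\s<]+)""",
    re.MULTILINE,
)


def extract_urls(content):
    """Return the URLs captured by group 1, in order of appearance."""
    return [m.group(1) for m in PROPOSED.finditer(content)]


# The five sample inputs listed above, in the same order.
samples = [
    "<p>\nhttp://www.wordpress.org/\n</p>",
    "<p>http://www.wordpress.org/</p>",
    "Some text\nhttp://www.wordpress.org/\nSome more text",
    "<p> http://www.wordpress.org/</p>\n"
    "<p>http://www.apple.com/ </p>\n"
    "http://www.red-sweater.com/",
    "http://www.wordpress.org/",
]

for sample in samples:
    print(extract_urls(sample))
```

 Under these assumptions each sample yields only its bare URL(s), with no
 surrounding tags or whitespace in the captured group.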

 Note that the pattern does NOT match obvious examples I could think of
 where a literal URL would be expected to be left alone, for example:

 {{{
 <a href="http://www.wordpress.org">WordPress</a>
 }}}

 On the other hand, there are some cases where a false positive will still
 occur, for example if the HTML above were slightly malformed and had
 spaces around the URL:

 {{{
 <a href=" http://www.wordpress.org ">WordPress</a>
 }}}
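
 The difference between the well-formed and the malformed anchor can be
 seen with a small script. Again this is a sketch under assumptions:
 Python's re module instead of PCRE, the MULTILINE flag (not stated in this
 comment), and an illustrative helper named captured.

```python
import re

# Same proposed pattern as above; the MULTILINE flag is an assumption.
PATTERN = re.compile(
    r"""(?:(?:^[^\s"']*\s*>?)|(?:\s))(https?://[^\s<]+)""",
    re.MULTILINE,
)


def captured(content):
    """Return every URL the pattern would hand to the embed handler."""
    return [m.group(1) for m in PATTERN.finditer(content)]


well_formed = '<a href="http://www.wordpress.org">WordPress</a>'
malformed = '<a href=" http://www.wordpress.org ">WordPress</a>'

# The quote directly before the URL blocks both alternatives of the prefix.
print(captured(well_formed))

# The stray space after the quote satisfies the "\s" alternative, so the
# URL inside the attribute is captured: the false positive described above.
print(captured(malformed))
```

 The first call prints an empty list and the second prints the URL from
 inside the href attribute, which is exactly the unwanted match.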

 In the end I think we should weigh the current frustration over the
 malfunctioning behavior against the possible frustration of falsely
 converting URLs. It seems to me that so long as the major cases where the
 URL is clearly intended to be part of markup are covered, there will be
 very few frustratingly unwanted substitutions. That said, I get that
 without an option to disable the feature, one frustrating substitution
 could be one too many.

 In short it feels to me that the current substitution behavior is not
 "magical" enough to justify its own existence. Requiring an author to meet
 all the caveats of the auto-embed regex match as it stands today seems
 about as onerous as requiring them to put the URL into literal [embed]
 shortcodes. If WordPress is going to support this laudable feature then I
 think it should be expanded to affect many more obvious substitution
 scenarios.

--
Ticket URL: <http://core.trac.wordpress.org/ticket/25387#comment:2>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software