[wp-trac] [WordPress Trac] #49627: oEmbedding - Privacy by design - Inconsistencies & Legal concerns

WordPress Trac noreply at wordpress.org
Thu Mar 12 11:41:19 UTC 2020


#49627: oEmbedding - Privacy by design - Inconsistencies & Legal concerns
-----------------------------------------------+---------------------------
 Reporter:  arena                              |      Owner:  (none)
     Type:  defect (bug)                       |     Status:  new
 Priority:  normal                             |  Milestone:  Awaiting
                                               |  Review
Component:  General                            |    Version:
 Severity:  normal                             |   Keywords:
  Focuses:  ui, javascript, rest-api, privacy  |
-----------------------------------------------+---------------------------
 oEmbedding is now driven by two files:

 * {{{wp-includes/class-wp-oembed.php}}} and its list of "popular" oEmbed
 providers: array( pattern => array( oembed_url, regex [boolean] ) )

 This list is loaded in {{{wp-settings.php}}} by this statement:
 {{{$GLOBALS['wp_embed'] = new WP_Embed();}}}
 The list of providers can be filtered with the {{{'oembed_providers'}}}
 hook, but that hook appears to fire before the {{{'plugins_loaded'}}} and
 {{{'init'}}} hooks.
 The procedure for altering the "popular" providers list is described here:
 [https://wordpress.org/support/article/embeds/].

 * {{{wp-includes/js/dist/block-library.js}}}

 Here the data structure is different because of visual requirements
 (Gutenberg).
 Several variables with names like {{{'embedXxxxxIcon'}}} are objects
 containing svg icons,
 and there is another list of oEmbed providers, held in a variable called
 'common', which is a set of
  -- name => settings ( title, name svg icon, keywords, description )
          => patterns

 1) the lists of oEmbed providers are not identical in php and javascript.

 2) for a given oEmbed provider, the patterns can differ between the two
 lists.
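
 Points 1) and 2) could be checked mechanically. The following is a minimal
 sketch, assuming the php list has been serialized to JSON in the shape
 described above and that the javascript list exposes its patterns as
 RegExp objects. The sample data, the {{{example.org}}} entry and the
 function names are hypothetical, not taken from core.

```javascript
// Hypothetical sample data; the real lists are much longer.
// PHP side, as it could look once serialized to JSON:
// pattern => [ oembed_url, isRegex ]
const phpProviders = {
  '#https?://(www\\.)?tactic\\.com/.*/video/.*#i': ['https://tactic.com/oembed', true],
  'https://example.org/*': ['https://example.org/oembed', false],
};

// JavaScript side: name, settings, patterns
const jsProviders = [
  {
    name: 'tactic',
    settings: { title: 'Tactic' },
    patterns: [/^https?:\/\/(www\.)?tactic\.com\/.+/i],
  },
];

// Turn a PHP-side pattern into a RegExp. Assumes '#...#flags' delimiters
// for regex entries and '*' as the only wildcard for the other entries.
function phpPatternToRegExp(pattern, isRegex) {
  if (isRegex) {
    const m = /^#(.*)#([a-z]*)$/.exec(pattern);
    if (!m) throw new Error('Unsupported pattern: ' + pattern);
    return new RegExp(m[1], m[2]);
  }
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('(.+)');
  return new RegExp('^' + escaped, 'i');
}

// Report which side would recognize a URL as embeddable.
function handledBy(url) {
  const php = Object.entries(phpProviders).some(([pattern, [, isRegex]]) =>
    phpPatternToRegExp(pattern, isRegex).test(url)
  );
  const js = jsProviders.some((p) => p.patterns.some((re) => re.test(url)));
  return { php, js };
}

for (const url of [
  'https://tactic.com/@user/video/123',
  'https://tactic.com/@user',
  'https://example.org/post/1',
]) {
  console.log(url, handledBy(url));
}
```

 With this sample data, the second URL is recognized only on the javascript
 side (different patterns for the same provider) and the third only on the
 php side (provider missing from the javascript list).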

 3) the oEmbedding process was once driven by the php array ("popular"
 oEmbed providers), which could be filtered or altered. This is not
 possible anymore.
 Depending on whether you use the Classic editor or Gutenberg, the process
 for retrieving the data and the structure of the data are different, as
 shown in screenshot1 (Classic trunk) and screenshot2 (Gutenberg 5.3).
 The Classic editor uses {{{admin-ajax.php}}} and retrieves a specific
 json format.
 Gutenberg uses the rest-api, calls a url such as
 {{{'index.php?rest_route=/oembed/1.0/proxy&url....&_locale=user'}}} and
 retrieves the full oembed json from the provider.
 In class-wp-oembed-controller, the code executed seems similar to that of
 the Classic editor (data2html), but the result is obviously different.
 Both cache the html code in the post_meta table; Gutenberg additionally
 caches the json result of oembed in the options table.
 This means that the 'oembed_result' filter is bypassed when oembed
 content is displayed in Gutenberg.

 4) if a site decides to remove one or all of the "popular" or "trusted"
 (vocabulary differs in code comments) oEmbed providers, or to add its
 own, this is not possible anymore when using Gutenberg, because the
 javascript providers list is hard-coded and cannot be altered.
 Because of local regulations, it is important that each site can apply
 its own oembedding policy choices consistently and coherently.
 This is a regression and an infringement of the "Privacy by design" policy
 adopted by the WordPress open source project.

 5) i am not going to discuss how the "popular" providers got elected by a
 "popular" vote or an electoral college and if they had a superpac to get
 elected (or more accurately "be referenced").
 On some websites, these arbitrary choices, which cannot be altered, may
 contravene website policy (e.g. educational blogs for kids) or local
 laws.
 In some countries, this can be considered a political [etymological
 definition: "life in the city"] choice and can put WordPress users at
 risk by loading references to forbidden sites in their browsers.
 Surveillance is everywhere and not all countries are democracies.

 6) it is important that, in WordPress, every "popular" oEmbed provider
 removed or added by filter be removed or added in every form of code
 (Gutenberg specific blocks included) (privacy by design).
 This implies a single reference list of providers (filterable/alterable in
 php) with some extra information for visual aspects under Gutenberg.
 The javascript variables must be generated (or not, if the providers list
 is an empty array) from the php list (with default values where the extra
 information is not provided).
 To do this simply and easily:
         a) add a new provider on both sides and specific svg icons.
         b) remove a "popular" provider with an easy
                 {{{wp_oembed_remove_provider( 'tactic' ); }}}
            rather than
                 {{{wp_oembed_remove_provider(
 '#https?://(www\.)?tactic\.com/.*/video/.*#i' );}}}
         c) {{{wp_oembed_remove_all_providers()}}} is also welcome.
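
 The javascript side of point 6) could be sketched as follows. This is only
 an illustration, assuming php hands over one filtered list keyed by
 provider name. The list shape, the default values and the names
 {{{removeProvider}}} and {{{buildEmbedBlocks}}} are hypothetical and do
 not exist in core.

```javascript
// Hypothetical single reference list, as php could serialize it after
// all filters have run. Keys are provider names.
const referencedProviders = {
  tactic: {
    endpoint: 'https://tactic.com/oembed',
    patterns: ['^https?://(www\\.)?tactic\\.com/.*/video/.*'],
    title: 'Tactic',
  },
  example: {
    endpoint: 'https://example.org/oembed',
    patterns: ['^https?://example\\.org/.+'],
    // No title, icon, keywords or description: defaults apply.
  },
};

// Assumed defaults for the visual information Gutenberg needs.
const DEFAULTS = { icon: 'embed-generic', keywords: [], description: '' };

// Remove a provider by name and return a new list (input is not mutated).
function removeProvider(providers, name) {
  const { [name]: removed, ...rest } = providers;
  return rest;
}

// Generate the block-side definitions from the reference list.
// An empty list produces no definitions at all.
function buildEmbedBlocks(providers) {
  return Object.entries(providers).map(([name, p]) => ({
    name,
    settings: {
      title: p.title || name,
      icon: p.icon || DEFAULTS.icon,
      keywords: p.keywords || DEFAULTS.keywords,
      description: p.description || DEFAULTS.description,
    },
    patterns: p.patterns.map((source) => new RegExp(source, 'i')),
  }));
}

console.log(buildEmbedBlocks(referencedProviders).map((b) => b.name));
console.log(buildEmbedBlocks(removeProvider(referencedProviders, 'tactic')).map((b) => b.name));
console.log(buildEmbedBlocks({}).length);
```

 Because the definitions are derived from one list, removing a provider by
 name, or emptying the list, takes effect on both sides at once.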

 7) it could be interesting to offer optional, intermediate alternatives to
 oEmbedding as it is processed today (privacy by design):
         a) insert specific html (usually an iframe: fully intrusive)
         b) insert link to url with thumbnail when available {{{<a
 href=...><img src=thumbnail_url ... /></a>}}} (less intrusive)
         c) insert link as text {{{<a href=url ...>(some title or text if
 available or url)</a>}}} (no oEmbedding : not intrusive)
 This requires caching the complete json answer in both cases ('Classic'
 and 'Gutenberg').
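
 The three levels in point 7) could be rendered from the cached json
 roughly as below. This is a sketch under assumptions: {{{html}}},
 {{{thumbnail_url}}} and {{{title}}} are fields of the oEmbed response
 format, while the level names and the function names are hypothetical.
 Provider html is assumed to have been sanitized before it reaches this
 function.

```javascript
// Escape text for safe use in html content and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// level: 'full' (provider html), 'thumbnail' (linked image), 'link' (text).
// Each level falls back to the next, less intrusive one when the data it
// needs is missing from the cached response.
function renderEmbed(url, data, level) {
  const href = escapeHtml(url);
  const label = escapeHtml(data.title || url);
  if (level === 'full' && data.html) {
    return data.html; // assumed already sanitized upstream
  }
  if ((level === 'full' || level === 'thumbnail') && data.thumbnail_url) {
    const src = escapeHtml(data.thumbnail_url);
    return `<a href="${href}"><img src="${src}" alt="${label}" /></a>`;
  }
  return `<a href="${href}">${label}</a>`;
}

// Hypothetical cached response.
const cached = {
  title: 'A & B',
  thumbnail_url: 'https://example.org/t.jpg',
  html: '<iframe src="https://example.org/embed/1"></iframe>',
};
for (const level of ['full', 'thumbnail', 'link']) {
  console.log(level, renderEmbed('https://example.org/v/1', cached, level));
}
```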

 8) it could be interesting to encapsulate the oembedding result in a
 custom html5 element (privacy by design),
 such as {{{<wp-oembed> ... </wp-oembed>}}}, to allow theme developers to
 add specific js on embedded content.
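
 A rough sketch of point 8), assuming the wrapper is added when the markup
 is generated and that a theme registers the element in the browser. The
 {{{provider}}} and {{{data-ready}}} attributes and the {{{wrapEmbed}}}
 function are hypothetical. {{{customElements.define}}} is the standard
 browser API for custom elements.

```javascript
// Wrap embed markup in the proposed element. The provider name is
// restricted to a safe character set so it can go into an attribute.
function wrapEmbed(html, providerName) {
  if (!/^[a-z0-9-]+$/.test(providerName)) {
    throw new Error('Invalid provider name: ' + providerName);
  }
  return `<wp-oembed provider="${providerName}">${html}</wp-oembed>`;
}

// Register the element only in a browser, and only once.
if (typeof customElements !== 'undefined' && !customElements.get('wp-oembed')) {
  customElements.define(
    'wp-oembed',
    class extends HTMLElement {
      connectedCallback() {
        // Theme-specific behavior would go here; this sketch only marks
        // the element as processed.
        this.setAttribute('data-ready', 'true');
      }
    }
  );
}

console.log(wrapEmbed('<iframe src="https://example.org/embed/1"></iframe>', 'tactic'));
```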

 9) The oEmbed providers listed in WordPress are referenced by WordPress.
 The vocabulary used in the code comments should definitely change.
 Do not use "popular" or "trusted" providers but 'referenced providers',
 because this is the most accurate description.


 Regards

 related #43713,

 related https://wordpress.org/support/article/embeds/


 ps: as a reminder, the title of GDPR chapter IV is "Controller and
 Processor".

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/49627>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

