[wp-trac] [WordPress Trac] #51092: Create a JSON schema for Privacy and Other Related Disclosures

WordPress Trac noreply at wordpress.org
Sun Sep 6 12:45:26 UTC 2020


#51092: Create a JSON schema for Privacy and Other Related Disclosures
----------------------------------------------+-----------------------
 Reporter:  carike                            |       Owner:  (none)
     Type:  feature request                   |      Status:  new
 Priority:  normal                            |   Milestone:  5.6
Component:  Privacy                           |     Version:
 Severity:  normal                            |  Resolution:
 Keywords:  needs-privacy-review 2nd-opinion  |     Focuses:  rest-api
----------------------------------------------+-----------------------

Comment (by TimothyBlynJacobs):

 > Wrong. Privacy policies cannot be compiled by "(other) tools". They have
 to be written by people or businesses/companies who will be legally
 responsible for the content.

 There are services like this that already exist, for instance iubenda. A
 tool like that could consume the data over the REST API and provide much
 more accurate data as to what the site does with the data.

 > However the example "schema" above has a lot of things that don't seem
 "privacy related", needs more work.

 I agree.

 > So we will need to maintain/sync two different "schemas", one in core
 and another on wp.org.

 I imagine .org would use the schema in its WordPress install, and we
 wouldn't be breaking BC, just adding new features or changing their format
 in BC ways, so it being on trunk wouldn't be an issue.

 > Then plugins will be "forced" to include a (json formatted) file that
 will have to contain all the "required fields" or will be marked as
 "failed", even when they do not contain any user privacy related stuff?
 That seems... not ideal.

 Something for @carike. I imagine they'd say "Privacy information not
 available", which doesn't seem too bad. The minimal set of fields you'd
 want to provide for a plugin that has zero privacy impact is probably
 something like:


 {{{
 {
   "ppiExport": true,
   "ppiErasure": true,
   "consentAPI": true,
   "disclosuresTab": true,
   "permissionsTab": true
 }
 }}}

 I don't think it is too much of a burden for plugin authors to explicitly
 declare that they don't need to implement those features, and I don't
 really see how else we could do it short of code analysis, which would get
 us a lot less accurate data.
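
 To illustrate how light such a declaration could be to check, here is a
 minimal sketch in Python (chosen only for illustration; it is not
 WordPress code). The function name and status strings are hypothetical,
 the field names follow the example above with the spelling `ppiErasure`
 assumed, and a plugin that ships no file is reported as "Privacy
 information not available" rather than as a failure.

```python
import json

# Field names follow the example in this comment (spelling "ppiErasure" is
# assumed). The checker itself is hypothetical, not part of WordPress.
KNOWN_FLAGS = (
    "ppiExport",
    "ppiErasure",
    "consentAPI",
    "disclosuresTab",
    "permissionsTab",
)


def summarize_declaration(raw):
    """Return (status, problems) for a plugin's privacy declaration.

    `raw` is the JSON text of the plugin's privacy file, or None when the
    plugin ships no such file.
    """
    if raw is None:
        # No file is not a failure, just an absence of information.
        return "Privacy information not available", []

    data = json.loads(raw)
    if not isinstance(data, dict):
        return "invalid", ["declaration must be a JSON object"]

    problems = []
    for name in KNOWN_FLAGS:
        # No flag is required; only flags that are present are type-checked.
        if name in data and not isinstance(data[name], bool):
            problems.append(f"{name} must be true or false")

    return ("invalid" if problems else "declared"), problems
```

 Under these assumptions a plugin author only has to write the five
 booleans once, and a plugin without a file is simply shown as having no
 privacy information.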

 > The majority of plugins have nothing to do with user privacy.

 Definitely, but I think there are still quite a number. Particularly of
 the most popular plugins.

 - WP Http 10k: https://wpdirectory.net/search/01EHHMBF85N0WSNBAVWNBG32P0
 - Cookies 3k: https://wpdirectory.net/search/01EHHMCYPW7MXY0HS07S4XNXCG
 - User Meta 4.5k:
 https://wpdirectory.net/search/01EHHMN2EV0QYPW0RXQV26AKNA

 And I think the ones that couldn't possibly have any privacy impact will
 be evident from the description. For the ones where it isn't so clear, the
 ability to say "no, this plugin doesn't contact any external APIs", etc.
 would be a good thing for those plugins, I think.

 > Even if not cached, the validation will (likely) fail every time the
 schema is updated. Then all existing plugins will "fail"...

 Why?

 If the concern is the technical implementation, I imagine a function
 signature like this: `wp_get_plugin_privacy_data( $plugin,
 $force_revalidate = false ): array|WP_Error`. If the plugin's privacy data
 has changed or the version of the schema is newer, we'd revalidate before
 returning that data.
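
 The revalidation rule described here can be sketched roughly as follows.
 This is an illustration in Python, not the proposed PHP function: the
 in-memory cache, the content hash used to detect changed data, and the
 `schema_version` and `validate` parameters are all assumptions made for
 the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class CachedResult:
    data_hash: str       # hash of the privacy data that was validated
    schema_version: int  # schema version it was validated against
    result: dict         # outcome of that validation


# Hypothetical in-memory cache keyed by plugin identifier.
_cache = {}


def get_plugin_privacy_data(plugin, raw, validate, schema_version,
                            force_revalidate=False):
    """Return validated privacy data, revalidating only when needed.

    Revalidation happens when the plugin's data changed, when the schema
    is newer than the cached validation, or when the caller forces it.
    """
    digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    cached = _cache.get(plugin)

    fresh = (
        cached is not None
        and cached.data_hash == digest
        and cached.schema_version >= schema_version
    )
    if fresh and not force_revalidate:
        return cached.result

    result = validate(raw)
    _cache[plugin] = CachedResult(digest, schema_version, result)
    return result
```

 With a rule like this, a schema update triggers one revalidation per
 plugin the next time its data is requested, rather than making every
 existing plugin fail.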

 If the concern is about making changes to WordPress' schema, we'd make
 any changes backward compatible, the same way we currently do. I don't
 think it would be acceptable for there to be BC breaks there, nor can I
 imagine why we'd need them. Fields aren't currently marked as `required`
 and if a new format is necessary, this can be accommodated in the schema
 definition.
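
 As a rough sketch of what "accommodated in the schema definition" could
 mean, the Python snippet below accepts a field either in the current
 boolean form or in a newer object form. The object form, with its
 `supported` and `notes` keys, is purely hypothetical and only stands in
 for whatever new format might be introduced later.

```python
def normalize_field(value):
    """Accept the boolean form or a hypothetical newer object form."""
    if isinstance(value, bool):
        # Existing declarations keep working unchanged.
        return {"supported": value, "notes": ""}
    if isinstance(value, dict) and isinstance(value.get("supported"), bool):
        # Hypothetical newer format carrying extra detail.
        return {
            "supported": value["supported"],
            "notes": str(value.get("notes", "")),
        }
    raise ValueError("unsupported field format")


def normalize_declaration(data):
    """Normalize every field that is present; nothing is required."""
    return {name: normalize_field(value) for name, value in data.items()}
```

 Because no field is required and the old form stays valid, declarations
 written against an earlier schema would continue to validate.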

 > Making the data supplied by plugins "public" on a specific site will at
 least disclose which plugins that site is using. This in itself can be
 seen as a "privacy breach", can be used for "fingerprinting", the plugin's
 versions will probably be "guessable" from the data, etc. :)

 Where would that be disclosed? There would be a machine-readable .json
 file in the plugin's directory, but you'd need to know beforehand that the
 site is running that plugin. That is also already trivial to detect
 because of readme files, version history, etc., and is already possible
 using sites like Built With.

 > Right, so the data contained in the plugin's json files would be
 "private" (on a per site basis) and only site owners will be able to see
 it? (Only the site owners will need to see it anyways as it is intended
 for creating a Privacy Policy). Or am I reading this wrong?

 Yep! That matches my understanding.

 > It's not that it is not a part of it but... Would you add an end point
 to output /readme.txt or /license.txt? Does it make sense from "restful"
 point of view?

 I'd like to, yeah. You can use `api.wordpress.org` for .org hosted
 plugins, but for non-.org plugins retrieving that data is impossible. We
 now have a plugins endpoint that returns the plugin header information,
 but that is limited.

 > What's the point of having that in the REST API (considering that this
 data would be very rarely accessed and used only by site owners/users with
 the highest permissions).

 We have a settings endpoint and a plugins endpoint that are only
 accessible to administrators. I'd also wager for most WordPress sites the
 admin is the only user on the whole install :)

 > As far as I understand it the (compiled) data from all the plugins json
 files can be outputted by the REST API, in case a plugin might want to
 replace the (proposed) page in wp-admin (instead of extending it), but...
 At the end this is the same like outputting all the data for the Comments
 page for example, just because a plugin might eventually decide to replace
 it? Seems WP may get there one day but...?

 I don't really get the resistance to making versioned, structured data
 that is at least in part dynamic available over a tool that is designed
 for doing that.

 As a whole, IMO we should be thinking about how new features can integrate
 with the REST API from the outset of how that feature is being designed.
 It makes implementation a lot simpler that way and, as everything in
 WP-Admin is moving to a React-powered interface, it is necessary at the
 moment.

 In terms of use cases for Core, if we made this available in Gutenberg
 when editing the Privacy Policy page, similar to some of the initial
 mockups for how that page could work in the Classic Editor, then exposing
 the data over REST would vastly simplify the implementation.

 The same is true for plugin authors who are building tools. And as I
 mentioned earlier, I think this would be great functionality for external
 systems like iubenda.

 I also do think there is privacy data that would make sense to make
 public, for instance this could serve as the source of truth for cookies.
 That would be useful to access on the front-end to build a cookie consent
 screen.
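
 As a sketch of that cookie use case, the Python snippet below merges
 cookie declarations from several plugins into groups that a consent
 screen could render. The `cookies` field and its `name` and `purpose`
 keys are hypothetical and are not part of any schema proposed on this
 ticket.

```python
from collections import defaultdict


def cookies_by_purpose(declarations):
    """Group declared cookies by purpose across all plugins.

    `declarations` maps a plugin identifier to its parsed privacy data.
    The "cookies" list and its keys are assumptions for this example.
    """
    grouped = defaultdict(list)
    seen = set()
    for plugin, data in declarations.items():
        for cookie in data.get("cookies", []):
            name = cookie["name"]
            if name in seen:
                # Keep only the first declaration of a given cookie name.
                continue
            seen.add(name)
            purpose = cookie.get("purpose", "unspecified")
            grouped[purpose].append({"name": name, "plugin": plugin})
    return dict(grouped)
```

 A front-end consent screen could then list one section per purpose
 without having to know which plugins are installed.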

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/51092#comment:34>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

