[wp-trac] [WordPress Trac] #60375: Site Transfer protocol

WordPress Trac noreply at wordpress.org
Tue Jan 30 10:29:08 UTC 2024


#60375: Site Transfer protocol
-------------------------+------------------------------
 Reporter:  zieladam     |       Owner:  (none)
     Type:  enhancement  |      Status:  new
 Priority:  normal       |   Milestone:  Awaiting Review
Component:  Import       |     Version:
 Severity:  normal       |  Resolution:
 Keywords:               |     Focuses:
-------------------------+------------------------------
Description changed by zieladam:

Old description:

> Migrating WordPress sites involves custom, error-prone logic. There are
> no canonical tools and the guidelines seem lacking.
>
> Let's:
>
> 1. Formalize a list of steps involved in transferring a WordPress site
> between hosts
> 2. Build a canonical plugin that implements those steps and enables easy
> site migrations
> 3. Merge it into WordPress core once its stable
>
> This is relevant for:
>
> * Site migrations
> * Creating and restoring site backups
> * Staging and development environments
> * WordPress Playground imports and exports
> * Moving live sites into Playground and vice versa
>
> ... probably a lot more.
>

> == ZIP bundle as the export format
>
> The [https://github.com/WordPress/data-liberation/discussions/53 Data
> Liberation proposal] makes a great argument for a ".zip" bundle as the
> export format. I would love to leverage it here. A wordpress.zip file
> with all the site files and the data in an .sqlite (or plaintext .sql)
> format sounds like the most natural and convenient way of moving
> WordPress sites around.
>
> Moving around a huge archive may be problematic for larger sites, but the
> ZIP format was built with streaming, compression, chunking, checksums,
> and seeking in mind. It is a good fit for handling imports that are many
> GBs large on a host with 64 MB of ram allocated, and not enough hard
> drive space to hold the import file itself.
>
> To support that last point – I’ve built a [https://github.com/WordPress
> /wordpress-playground/pull/880 streaming zip encoder and decoder in
> JavaScript] for Playground. It can cherry-pick a single file from
> https://downloads.wordpress.org/plugin/gutenberg.17.5.2.zip by
> transferring only a few kilobytes and without downloading the entire
> 10+MB archive. It works with zip files, and it would work with a
> Synchronization API endpoint where the zipped fragments are generated on
> demand.
>
> === Differences with WXR
>
> Unlike WXR imports this is looking to transfer a site in its entirely
> with the Transfer Protocol. The export bundle should include every
> database table, every installed plugin, every asset and file in the wp-
> content directory. It must also include meta information such as the
> domain from which the site is being exported and all custom wp-config.php
> settings. This will be necessary in order to automate the transfer.
>
> == Tasks involved in site transfer
>
> * Set IMPORTING constant so things shut down:
>     * Stop sending emails
>     * Database replication
>     * Cleanup jobs/CRON jobs that might filter on post creation
> * Communicate source and destination site domains/base URLs
> * Rewrite URLs in the database to match new site URL
> * Rewrite URLs in all files including wp-config.php, wp-content,
> sunrise.php, mu-plugins, etc.
> * Communicate wp-config.php settings, including things like WP_SITEURL
> and plugins directory, theme directory, content directory, memory limits,
> and other settings.
> * Let the target site set the database credentials.
> * Copy all content from source to destination site, including users, site
> options, database * tables.
> * Bonus if there's no post-processing via tools like `wp search-replace`.
> The transferred data would rewritten as the transfer happens (e.g. to
> adjust the site URL).
> * Bonus if we can cryptographically secure the conduit through which the
> transfer takes place to prevent someone intercepting a transfer (e.g.
> create a private/public keypair, only allow a single transfer at a time,
> use that certificate to authenticate the transfer.
> * Bonus to track transfer state, communicate progress on it, and allow
> for pausing and resuming a transfer.
> * Bonus if we can start a database transaction log via $wpdb or similar
> system when starting a transfer so that the source site can continue to
> serve requests and ensure that the destination site gets a full
> concurrent update to its data.
>
> == Challenges
>
> * This assumes a blank slate on the target site otherwise we risk
> overwriting ids or mismatching ids.
> * The right design could become a foundation for live synchronization
> between WordPress sites.
>
> == Related efforts
>
> * https://github.com/WordPress/playground-tools/pull/124
> * https://github.com/WordPress/data-liberation/discussions/53
> * https://github.com/WordPress/wordpress-playground/pull/880
>
> Co-authored with @dmsnell
>
> cc @dufresnesteven @berislavgrgicak @tellyworth @dd32 @barry @payton
> @peterwilsoncc @swissspidy @miyarakira @matt @youknowriad @mamaduka
> @aristath

New description:

 Migrating WordPress sites involves custom, error-prone logic. There are no
 canonical tools and the guidelines seem lacking.

 Let's:

 1. Formalize a list of steps involved in transferring a WordPress site
 between hosts
 2. Build a canonical plugin that implements those steps and enables easy
 site migrations
 3. Merge it into WordPress core once it's stable

 This is relevant for:

 * Site migrations
 * Creating and restoring site backups
 * Staging and development environments
 * WordPress Playground imports and exports
 * Moving live sites into Playground and vice versa

 ... probably a lot more.


 == ZIP bundle as the export format

 The [https://github.com/WordPress/data-liberation/discussions/53 Data
 Liberation proposal] makes a great argument for a ".zip" bundle as the
 export format. I would love to leverage it here. A wordpress.zip file with
 all the site files and the data in an .sqlite (or plaintext .sql) format
 sounds like the most natural and convenient way of moving WordPress sites
 around.

 Large sites may seem problematic at first glance, as 300GB zip archives
 are difficult to manage. However, the ZIP format was built with
 streaming, compression, chunking, checksums, and seeking in mind. It is a
 good fit for handling imports that are many GBs large, even on a host with
 64 MB of RAM allocated and not enough hard drive space to hold the import
 file itself.

 To support that last point, I’ve built a
 [https://github.com/WordPress/wordpress-playground/pull/880 streaming zip
 encoder and decoder in JavaScript] for Playground. It can cherry-pick a
 single file from
 https://downloads.wordpress.org/plugin/gutenberg.17.5.2.zip by
 transferring only a few kilobytes, without downloading the entire 10+MB
 archive. It works with zip files, and it would also work with a
 Synchronization API endpoint where the zipped fragments are generated on
 demand.
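 To illustrate why only a few kilobytes need to be transferred, here is a
 minimal Python sketch of the general technique. It is not the Playground
 implementation. It assumes the server supports byte ranges, which the
 hypothetical `RangeSource` class simulates over an in-memory archive, and
 it assumes no Zip64, no encryption, and an archive comment short enough to
 fit in the `TAIL_GUESS` bytes read from the end of the file.

```python
import io
import os
import struct
import zipfile
import zlib

EOCD_SIG = b"PK\x05\x06"  # end-of-central-directory record
CDH_SIG = b"PK\x01\x02"   # central directory file header
TAIL_GUESS = 1024         # assumption: the EOCD record sits in the last 1 KB


class RangeSource:
    """Hypothetical stand-in for HTTP Range requests; counts bytes read."""

    def __init__(self, data: bytes):
        self._data = data
        self.size = len(data)
        self.bytes_fetched = 0

    def read(self, start: int, length: int) -> bytes:
        chunk = self._data[start:start + length]
        self.bytes_fetched += len(chunk)
        return chunk


def fetch_one_file(src: RangeSource, wanted: str) -> bytes:
    # 1. Read the archive tail and locate the end-of-central-directory record.
    tail_len = min(src.size, TAIL_GUESS)
    tail = src.read(src.size - tail_len, tail_len)
    eocd = tail.rfind(EOCD_SIG)
    if eocd < 0:
        raise ValueError("end-of-central-directory record not found")
    cd_size, cd_offset = struct.unpack_from("<II", tail, eocd + 12)

    # 2. Read the central directory, which lists every entry and its offset.
    cd = src.read(cd_offset, cd_size)
    pos = 0
    while pos + 46 <= len(cd):
        if cd[pos:pos + 4] != CDH_SIG:
            raise ValueError("corrupt central directory")
        (method,) = struct.unpack_from("<H", cd, pos + 10)
        crc, comp_size = struct.unpack_from("<II", cd, pos + 16)
        name_len, extra_len, comment_len = struct.unpack_from("<HHH", cd, pos + 28)
        (lfh_offset,) = struct.unpack_from("<I", cd, pos + 42)
        name = cd[pos + 46:pos + 46 + name_len].decode("utf-8")
        if name == wanted:
            # 3. Read the 30-byte local header to learn where the data starts.
            lfh = src.read(lfh_offset, 30)
            lfh_name_len, lfh_extra_len = struct.unpack_from("<HH", lfh, 26)
            data_start = lfh_offset + 30 + lfh_name_len + lfh_extra_len
            # 4. Fetch only this entry's compressed bytes and decode them.
            raw = src.read(data_start, comp_size)
            if method == 8:
                data = zlib.decompress(raw, -15)  # raw deflate stream
            elif method == 0:
                data = raw                        # stored, no compression
            else:
                raise ValueError(f"unsupported compression method {method}")
            if zlib.crc32(data) != crc:
                raise ValueError("checksum mismatch")
            return data
        pos += 46 + name_len + extra_len + comment_len
    raise KeyError(wanted)


if __name__ == "__main__":
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("wp-content/uploads/big.bin", os.urandom(200_000))
        zf.writestr("readme.txt", "Hello from the bundle\n")
    src = RangeSource(buf.getvalue())
    print(fetch_one_file(src, "readme.txt"))
    print(f"fetched {src.bytes_fetched} of {src.size} bytes")
```

 Over HTTP, each `read()` call would become a request with a `Range`
 header, so the cost is a handful of small requests rather than one full
 download.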

 === Differences with WXR

 Unlike WXR imports, the Transfer Protocol aims to transfer a site in its
 entirety. The export bundle should include every database table, every
 installed plugin, and every asset and file in the wp-content directory. It
 must also include meta information such as the domain from which the site
 is being exported and all custom wp-config.php settings. This will be
 necessary in order to automate the transfer.
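 As a rough illustration of such a bundle, the Python sketch below writes
 the meta information as the first entry of the archive so that a
 streaming importer could read it before any data arrives. The entry names
 `manifest.json` and `database.sql`, the manifest keys, and both helper
 functions are assumptions made for this example; the ticket does not
 define a bundle layout.

```python
import io
import json
import zipfile


def build_bundle(source_url: str, wp_config: dict, files: dict, sql_dump: str) -> bytes:
    """Pack site files, a database dump, and a manifest into one zip."""
    manifest = {
        "source_url": source_url,   # domain the site is exported from
        "wp_config": wp_config,     # custom wp-config.php settings
        "files": sorted(files),     # paths included in the bundle
    }
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # The manifest goes first so an importer sees it before the payload.
        zf.writestr("manifest.json", json.dumps(manifest, indent=2))
        zf.writestr("database.sql", sql_dump)
        for path, data in files.items():
            zf.writestr(path, data)
    return buf.getvalue()


def read_manifest(bundle: bytes) -> dict:
    """Read the manifest back without extracting the rest of the bundle."""
    with zipfile.ZipFile(io.BytesIO(bundle)) as zf:
        return json.loads(zf.read("manifest.json"))


if __name__ == "__main__":
    bundle = build_bundle(
        "https://old.example.com",
        {"WP_DEBUG": False, "WP_MEMORY_LIMIT": "256M"},
        {"wp-content/plugins/hello.php": b"<?php // Hello\n"},
        "CREATE TABLE wp_options (option_id INT);\n",
    )
    print(read_manifest(bundle)["source_url"])
```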

 == Tasks involved in site transfer

 * Set IMPORTING constant so things shut down:
     * Stop sending emails
     * Database replication
     * Cleanup jobs/CRON jobs that might filter on post creation
 * Communicate source and destination site domains/base URLs
 * Rewrite URLs in the database to match new site URL
 * Rewrite URLs in all files including wp-config.php, wp-content,
 sunrise.php, mu-plugins, etc.
 * Communicate wp-config.php settings, including things like WP_SITEURL and
 plugins directory, theme directory, content directory, memory limits, and
 other settings.
 * Let the target site set the database credentials.
 * Copy all content from source to destination site, including users, site
 options, and database tables.
 * Bonus if there's no post-processing via tools like `wp search-replace`.
 The transferred data would be rewritten as the transfer happens (e.g. to
 adjust the site URL).
 * Bonus if we can cryptographically secure the conduit through which the
 transfer takes place to prevent someone intercepting a transfer (e.g.
 create a private/public keypair, only allow a single transfer at a time,
 use that keypair to authenticate the transfer).
 * Bonus to track transfer state, communicate progress on it, and allow for
 pausing and resuming a transfer.
 * Bonus if we can start a database transaction log via $wpdb or similar
 system when starting a transfer so that the source site can continue to
 serve requests and ensure that the destination site gets a full concurrent
 update to its data.
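
 Rewriting data as the transfer happens means a URL can be split across two
 chunks. The Python sketch below shows one way to handle that: it holds
 back the last few unmatched bytes of each chunk until the next one
 arrives. The function name `rewrite_stream` and the example domains are
 hypothetical. The sketch does a plain byte replacement only; PHP-serialized
 values store string lengths (as in `s:5:"hello"`), so a real implementation
 would also need to update those lengths.

```python
from typing import Iterable, Iterator


def rewrite_stream(chunks: Iterable[bytes], old: bytes, new: bytes) -> Iterator[bytes]:
    """Replace every occurrence of `old` with `new` in a chunked byte stream."""
    if not old:
        raise ValueError("old must not be empty")
    keep = len(old) - 1  # longest prefix of `old` that may end a chunk
    pending = b""
    for chunk in chunks:
        pending += chunk
        out = []
        start = 0
        # Replace every complete match currently in the buffer.
        while True:
            hit = pending.find(old, start)
            if hit < 0:
                break
            out.append(pending[start:hit])
            out.append(new)
            start = hit + len(old)
        # Hold back only unmatched input bytes that could begin a match
        # completed by the next chunk; already-replaced output is never
        # re-scanned.
        rest = pending[start:]
        cut = max(len(rest) - keep, 0)
        out.append(rest[:cut])
        pending = rest[cut:]
        data = b"".join(out)
        if data:
            yield data
    if pending:
        yield pending


if __name__ == "__main__":
    dump = b"INSERT INTO wp_options VALUES ('siteurl', 'https://old.example.com');"
    pieces = [dump[i:i + 7] for i in range(0, len(dump), 7)]  # tiny chunks
    result = b"".join(
        rewrite_stream(pieces, b"https://old.example.com", b"https://new.example.org")
    )
    print(result.decode())
```

 Memory use stays bounded by the chunk size plus the length of the search
 string, regardless of how large the transferred data is.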

 == Challenges

 * This assumes a blank slate on the target site; otherwise, we risk
 overwriting or mismatching IDs.
 * The right design could become a foundation for live synchronization
 between WordPress sites.

 == Related efforts

 * https://github.com/WordPress/playground-tools/pull/124
 * https://github.com/WordPress/data-liberation/discussions/53
 * https://github.com/WordPress/wordpress-playground/pull/880

 Co-authored with @dmsnell

 cc @dufresnesteven @berislavgrgicak @tellyworth @dd32 @barry @payton
 @peterwilsoncc @swissspidy @miyarakira @matt @youknowriad @mamaduka
 @aristath

--

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/60375#comment:5>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

