[wp-trac] [WordPress Trac] #60375: Site Transfer Protocol
WordPress Trac
noreply at wordpress.org
Thu Feb 1 00:54:22 UTC 2024
#60375: Site Transfer Protocol
-------------------------+------------------------------
Reporter: zieladam | Owner: (none)
Type: enhancement | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Import | Version:
Severity: normal | Resolution:
Keywords: | Focuses:
-------------------------+------------------------------
Comment (by dmsnell):
> Is Site Transfer a direct Host <-> Host operation with optional support
for .zip uploads? Or is it an export&download -> upload&import operation
built with future Host<->Host exchange in mind?
The more I consider it the more I see these as the same thing, whereby the
ZIP format is the means through which we normalize the transfer. I could
be totally overlooking obvious things here though, so I would like to know
where this idea makes no sense.
Given the VFS-like interface we have with ZIPs, I imagine that if a site
only wants to import posts and not media then it will skip the part of the
ZIP containing the `wp-content` directory.
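As a rough sketch of what "skipping part of the ZIP" could look like on the importing side, here is a small Python example using the standard `zipfile` module. The entry names (`export.wxr`, `wp-content/uploads/photo.jpg`) and the helper names are made up for illustration and are not part of any proposed format. It also assumes the whole archive is available locally; a streamed transfer would have to apply the same filter while walking the local file headers.

```python
import io
import zipfile


def import_entries(zip_bytes, skip_prefixes=("wp-content/",)):
    """Return {entry name: bytes} for every file not under a skipped prefix."""
    imported = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            # Directory entries carry no data; skipped prefixes are never read.
            if info.is_dir() or info.filename.startswith(tuple(skip_prefixes)):
                continue
            imported[info.filename] = archive.read(info)
    return imported


def build_demo_zip():
    """Build a tiny in-memory archive with hypothetical entry names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("export.wxr", "<rss></rss>")
        archive.writestr("wp-content/uploads/photo.jpg", b"\xff\xd8")
    return buffer.getvalue()


result = import_entries(build_demo_zip())
```

Here the media entry is never decompressed, which is the cheap case. Asking the remote site to leave it out of the stream entirely is the harder question raised next.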
Maybe this is asking too much of the remote site, to regenerate a ZIP on
the fly for specific parts. The challenges I'm capable of seeing at the
moment are all more related to whether we ship `wp-content` assets the
same way we ship database and config values. It's all about bulk data and
less about the destination of the transfer.
> WXR. However, when would WXR hold both content AND metadata? On site
export the content would be in the database so the WXR file would only
carry the metadata – at which point it would have almost nothing in
common with WXR as we know it. On the upside, WXR can be streamed with the
upcoming XML API and you can also edit them with a text editor.
I'm still completely on the fence about this too. Of course there'd be
duplication of content in the WXR vs. the tables, but I see the database
as the authoritative source for non-asset content while the WXR could be a
reasonable signaling protocol to guide the import.
Some part of me wants to remove the content from the WXR, but if we do
that we potentially lose a lot for older systems and for our ability to
easily inspect the export. 🤔
Even if we have post content in the WXR it will lack the meta information
unless we also export it there as well, which I guess we could do, and
even remove those rows from the SQLite database 🤷‍♂️
Something big still seems to be missing that I haven't seen yet on all
this, but I think we're starting to get a better handle on the space by
asking all these questions and figuring out how it could all go wrong.
> In direct Host <-> Host transfer we need an entire world of error
handling logic.
Yes, but also if "the Playground ZIP" is the transfer format then it's
indistinguishable from importing a ZIP from a local disk, other than the
bytes are arriving over the network. Yes, we'll need another layer of
error handling, but we should be able to restart the ZIP mid-sequence on
the source site.
This makes me think that one preliminary step we'd need for this, to make
it reentrant, is to create a relatively small manifest on the source site
to start the process. This could do a number of things:
- Generate content hashes for all relevant media or database tables. This
might involve some way of snapshotting the data.
- Generate a list of media files and their content hashes.
- Sequence the files for the ZIP stream.
After this the source site can reference that manifest to virtually
deliver the ZIP stream mid-sequence without having to scan all the data on
its own disk. The manifest's size would grow roughly with the number
of files and database objects, but it could itself be a kind of journaling
snapshot of a site. Maybe there's a tie-in with other
snapshotting/concurrent work on this.
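To make the manifest idea a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the field names (`sequence`, `path`, `size`, `sha256`), the choice of SHA-256, sorting by path as the sequencing rule, and treating database tables as already-exported opaque bytes. A real source site would hash files from disk in chunks instead of holding them in memory.

```python
import hashlib


def build_manifest(files):
    """Build a manifest from a mapping of path -> bytes.

    Entries are sequenced by sorted path so the order is reproducible.
    """
    entries = []
    for sequence, path in enumerate(sorted(files)):
        data = files[path]
        entries.append({
            "sequence": sequence,
            "path": path,
            "size": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),
        })
    return {"version": 1, "entries": entries}


def entries_from(manifest, resume_after=None):
    """Return the entries still to be sent.

    resume_after is the last sequence number the receiver confirmed,
    or None to start from the beginning.
    """
    start = 0 if resume_after is None else resume_after + 1
    return manifest["entries"][start:]


# Hypothetical site snapshot: one exported table and two media files.
snapshot = {
    "wp-content/uploads/b.jpg": b"bbb",
    "db/wp_posts.sql": b"INSERT ...",
    "wp-content/uploads/a.jpg": b"aaa",
}
manifest = build_manifest(snapshot)
remaining = entries_from(manifest, resume_after=0)
```

With a fixed sequence and per-entry hashes, a restarted transfer only needs the last confirmed sequence number to pick up where it left off, and the receiver can verify each entry as it arrives.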
--
Ticket URL: <https://core.trac.wordpress.org/ticket/60375#comment:9>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform