[wp-trac] [WordPress Trac] #27896: wordpress-importer's lack of understanding of XML Namespaces causing compatibility issues

WordPress Trac noreply at wordpress.org
Tue Jun 6 21:13:30 UTC 2017


#27896: wordpress-importer's lack of understanding of XML Namespaces causing
compatibility issues
--------------------------+------------------------------
 Reporter:  tomdxw        |       Owner:
     Type:  defect (bug)  |      Status:  new
 Priority:  normal        |   Milestone:  Awaiting Review
Component:  Import        |     Version:  3.9
 Severity:  normal        |  Resolution:
 Keywords:  has-patch     |     Focuses:
--------------------------+------------------------------
Changes (by pbiron):

 * keywords:  needs-patch => has-patch


Comment:

 I just uploaded 2 patches against `0.6.3`:

 1.
 [[https://core.trac.wordpress.org/attachment/ticket/27896/27896.diff|27896.diff]]
 is unrelated to making `wordpress-importer` namespace aware, but while
 writing the namespace-awareness patch I discovered that termmeta is
 **not** imported when the `WXR_Parser_XML` parser is used.  This might
 justify its own ticket.  Just let me know and I'll create one.
 1.
 [[https://core.trac.wordpress.org/attachment/ticket/27896/27896.1.diff|27896.1.diff]]
 makes both `WXR_Parser_SimpleXML` & `WXR_Parser_XML` fully namespace
 aware.  It assumes that
 [[https://core.trac.wordpress.org/attachment/ticket/27896/27896.diff|27896.diff]]
 has been applied.  As noted in a comment I added to `WXR_Parser_Regex`,
 fully namespace-aware XML parsing with regexes is not worth attempting
 (and is probably not even possible).

 The mods in
 [[https://core.trac.wordpress.org/attachment/ticket/27896/27896.1.diff|27896.1.diff]]
 that apply to `WXR_Parser_SimpleXML` are fairly simple (pun intended) to
 understand, and I don't think they need any further explanation.
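
 For anyone who has not opened the diff, here is a minimal sketch of what
 namespace-aware lookup with SimpleXML can look like.  It is not taken from
 the patch: the sample document, the `x` prefix and the variable names are
 assumptions made for illustration.  The point is that elements are selected
 by namespace URI, so the prefix used in the export file does not matter.

```php
<?php
// Hypothetical WXR fragment that binds the WXR namespace to "x" instead of "wp".
$wxr = '<rss xmlns:x="http://wordpress.org/export/1.2/">'
     . '<channel><x:term><x:term_id>7</x:term_id></x:term></channel></rss>';

// Namespace URI as quoted elsewhere in this comment.
$wp_ns = 'http://wordpress.org/export/1.2/';

$doc = simplexml_load_string( $wxr );

// children( $uri ) selects child elements by namespace URI, not by prefix.
$ids = array();
foreach ( $doc->channel->children( $wp_ns )->term as $term ) {
    $ids[] = (int) $term->children( $wp_ns )->term_id;
}

print_r( $ids );
```

 A lookup keyed on the literal prefix `wp` would find nothing in this
 document, while the URI-based lookup still finds the term.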

 The mods in
 [[https://core.trac.wordpress.org/attachment/ticket/27896/27896.1.diff|27896.1.diff]]
 that apply to `WXR_Parser_XML` deserve a little explanation.

 1. The parser is created in namespace-aware mode by calling
 [[http://php.net/manual/en/function.xml-parser-create-ns.php|xml_parser_create_ns()]]
 instead of
 [[http://php.net/manual/en/function.xml-parser-create.php|xml_parser_create()]].
 1. When parsing in namespace-aware mode,
 [[http://php.net/manual/en/book.xml.php|XML Parser]] passes a
 "namespace-qualified" tag name to the callables registered with
 [[http://php.net/manual/en/function.xml-set-element-handler.php|xml_set_element_handler()]]
 (i.e., `WXR_Parser_XML::open_tag()` and `WXR_Parser_XML::close_tag()`).
    1. That is, the tag name is of the form `URI:tag`, e.g.
 `http://wordpress.org/export/1.2/:term`, instead of the `prefix:tag` form,
 e.g. `wp:term`, that is passed in non-namespace-aware mode.
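
 To make the difference concrete, here is a small standalone sketch (not
 code from the patch) that records the tag names passed to the start-element
 handler in both modes.  The helper names `collect_open_tags()` and
 `split_qualified_name()` and the sample document are hypothetical; only the
 `xml_*` functions are real PHP APIs.

```php
<?php
// Hypothetical WXR fragment; the namespace URI is the one quoted above.
$xml = '<rss xmlns:wp="http://wordpress.org/export/1.2/">'
     . '<wp:term><wp:term_id>7</wp:term_id></wp:term></rss>';

// Parse $xml and return the tag names seen by the start-element handler.
function collect_open_tags( $xml, $namespace_aware ) {
    $tags   = array();
    $parser = $namespace_aware
        ? xml_parser_create_ns( 'UTF-8', ':' ) // ':' is also the default separator
        : xml_parser_create( 'UTF-8' );

    // Keep tag names as written instead of upper-casing them.
    xml_parser_set_option( $parser, XML_OPTION_CASE_FOLDING, 0 );
    xml_set_element_handler(
        $parser,
        function ( $p, $tag, $attrs ) use ( &$tags ) {
            $tags[] = $tag;
        },
        function ( $p, $tag ) {
            // Closing tags are ignored in this sketch.
        }
    );
    xml_parse( $parser, $xml, true );

    return $tags;
}

// Split "URI:tag" at the LAST separator, because the URI itself contains ':'.
function split_qualified_name( $name, $separator = ':' ) {
    $pos = strrpos( $name, $separator );
    if ( false === $pos ) {
        return array( '', $name ); // element is in no namespace
    }
    return array( substr( $name, 0, $pos ), substr( $name, $pos + 1 ) );
}

print_r( collect_open_tags( $xml, false ) ); // rss, wp:term, wp:term_id
print_r( collect_open_tags( $xml, true ) );  // rss, http://wordpress.org/export/1.2/:term, ...
```

 Because the qualified name embeds the namespace URI, a handler can compare
 against the URI and local name and no longer depends on the export file
 using the `wp` prefix.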

 It might also be useful to write an
 [[http://php.net/manual/en/book.xmlreader.php|XMLReader]]-based parser.
 I can work on that (though probably not for a couple of weeks) if others
 think it would be a good thing.

--
Ticket URL: <https://core.trac.wordpress.org/ticket/27896#comment:4>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

