[wp-hackers] Importing HTML files as pages -- been done?

Mike Little mike at zed1.com
Fri Feb 6 20:39:49 GMT 2009


2009/2/6 Stephanie Leary <steph at sillybean.net>:
> I'm thinking of working on a plugin to import HTML files as pages,
> ideally preserving the directory structure in the page hierarchy.
> ...
> I've Googled a bit and searched Trac, and haven't found any other
> projects like this, only users asking for it. :) If you know of anyone
> already working on the problem, please let me know.
> Stephanie


Hi Stephanie and others on this thread,

I will probably release a WordPress importer in the next few hours
which takes an XHTML hierarchy of files and imports it into WordPress
as pages.
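
As an illustration only (this is not the importer's code, which is PHP),
here is a minimal Python sketch of one way a directory tree could be
mapped onto a page hierarchy. The function name plan_pages and the
convention that dir/index.html stands for the page "dir" are assumptions
made for this example; in WordPress the parent would end up as the
post_parent of the child page.

```python
from pathlib import PurePosixPath


def plan_pages(relative_paths):
    """Map relative file paths to (page_path, parent_path) pairs, parents first.

    Assumed convention (hypothetical): "guide/index.html" is the page "guide",
    and "guide/install.html" is the page "guide/install" whose parent is "guide".
    A top-level "index.html" is treated as the site root and produces no page.
    """
    pages = {}
    for rel in relative_paths:
        p = PurePosixPath(rel)
        if p.stem == "index":
            parts = p.parent.parts
        else:
            parts = p.with_suffix("").parts
        # Register the page and each ancestor directory, so that a directory
        # without an index file still gets a placeholder parent page.
        for depth in range(1, len(parts) + 1):
            key = "/".join(parts[:depth])
            parent = "/".join(parts[:depth - 1]) or None
            pages[key] = parent
    # Shallow pages first, so a parent can be created before its children.
    return sorted(pages.items(), key=lambda kv: (kv[0].count("/"), kv[0]))
```

Sorting by depth means each parent page can be created, and its ID
recorded, before any child that refers to it.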

It was written to import the XHTML output of the DITA[1] Open
Toolkit[2], a tool which takes XML topics in DITA format and converts
them to a number of formats, including PDF, Win Help, and XHTML. It
takes what it needs from each file's body tag.
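
To make the body-tag idea concrete, here is a small Python sketch (again
an illustration, not the importer itself). It assumes each file is
well-formed XHTML in the standard XHTML namespace with no undefined
entities; the function name extract_title_and_body is made up for this
example, and using the title element as the page title is an assumption.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML elements with a default namespace instead of an ns0: prefix.
ET.register_namespace("", XHTML_NS)


def extract_title_and_body(xhtml_text):
    """Return (title, markup inside <body>) for a well-formed XHTML string."""
    root = ET.fromstring(xhtml_text)
    ns = {"x": XHTML_NS}
    title_el = root.find("x:head/x:title", ns)
    body_el = root.find("x:body", ns)
    if body_el is None:
        raise ValueError("no <body> element found")
    title = ""
    if title_el is not None and title_el.text:
        title = title_el.text.strip()
    # Keep the children of <body> but not the <body> wrapper itself.
    parts = [body_el.text or ""]
    for child in body_el:
        parts.append(ET.tostring(child, encoding="unicode"))
    return title, "".join(parts).strip()
```

A strict XML parser like this rejects tag soup, which is fine for
generated XHTML but would need a more forgiving parser for hand-written
HTML.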

It is very rough and specific to the use it was put to in my last
company. It also works on WPmu (in fact it might only work on mu now).

It uses PHP 5's XML manipulation and has some quirky stuff in it. For
instance, importing 1200 files in one go on Windows always used to time
out (PHP's timeout calculation on Windows uses wall-clock time, not CPU
time), so it can be restarted and it will process from where it left off.
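
The restart behaviour can be sketched in Python as follows. This is a
guess at the general technique, not the importer's actual mechanism: the
JSON state file, the import_batch name, and the max_files cap are all
assumptions for the example. The idea is to record each file as soon as
it has been imported, and skip recorded files on the next run.

```python
import json
import os


def load_done(state_path):
    """Return the set of file paths already imported (empty on first run)."""
    if not os.path.exists(state_path):
        return set()
    with open(state_path, encoding="utf-8") as fh:
        return set(json.load(fh))


def import_batch(files, state_path, import_one, max_files=None):
    """Import files not yet recorded in state_path; return how many were done.

    import_one is a caller-supplied function that imports a single file.
    max_files optionally caps the work done in one run.
    """
    done = load_done(state_path)
    processed = 0
    for path in sorted(files):
        if path in done:
            continue  # imported by an earlier run
        if max_files is not None and processed >= max_files:
            break
        import_one(path)
        done.add(path)
        # Save after every file so an abrupt timeout loses at most one item.
        with open(state_path, "w", encoding="utf-8") as fh:
            json.dump(sorted(done), fh)
        processed += 1
    return processed
```

Calling import_batch repeatedly with the same file list and state file
eventually returns 0, at which point everything has been imported.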

I'll tidy it a little, put together some documentation, and try to get
it released this evening. Otherwise this weekend for sure.

I mentioned it and DITA over in this post [3] on WP-Docs as part of
this conversation [4].


Mike
-- 
Mike Little
http://zed1.com/

[1] http://dita.xml.org/book/getting-started
[2] http://dita.xml.org/wiki/the-dita-open-toolkit
[3] http://comox.textdrive.com/pipermail/wp-docs/2009-January/001890.html
[4] http://comox.textdrive.com/pipermail/wp-docs/2009-January/001862.html

