[wp-hackers] pulling a massive HTML site into Wordpress

Alex Andrews awgandrews at gmail.com
Mon Jun 6 11:03:04 UTC 2011

Sorry for the confusion. Ruby is a completely different programming
language from PHP, so please ignore that remark. If you don't know PHP,
it wasn't helpful of me to even mention it.

Sadly not, because HTML pages follow no standard format that a generic
plugin could guess. The only generic option would be to extract the raw
text from the HTML, but that already has to be done in a script, and
you would still need to write something to add the results to WordPress
automatically. So you might as well cut out the middleman and parse the
HTML directly in your own script.
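
To give a rough idea of what "parsing the HTML directly" involves, here
is a minimal sketch. It is written in Python, not PHP or Ruby, purely for
illustration, and uses only the standard library. The names PageExtractor
and html_to_post are made up for this example. The dictionary keys are
chosen to mirror the array keys that wp_insert_post() accepts, but the
actual insert into WordPress would still have to happen in a PHP script.
Real pages will almost certainly need rules specific to the old site's
markup.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collects the <title> text and the visible text inside <body>.

    Hypothetical helper for illustration only; not part of WordPress.
    """

    SKIPPED = {"script", "style"}  # tags whose contents are not page text

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.body_parts = []
        self._in_title = False
        self._in_body = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "body":
            self._in_body = True
        elif tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "body":
            self._in_body = False
        elif tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif self._in_body and self._skip_depth == 0:
            text = data.strip()
            if text:
                self.body_parts.append(text)


def html_to_post(html):
    """Map one HTML page to a dict shaped like wp_insert_post() input."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return {
        # Collapse runs of whitespace in the title.
        "post_title": " ".join("".join(parser.title_parts).split()),
        # One blank line between text chunks; all markup is dropped.
        "post_content": "\n\n".join(parser.body_parts),
        "post_type": "post",     # assumption: import as posts, not pages
        "post_status": "draft",  # assumption: review before publishing
    }
```

Calling html_to_post() on the contents of each old page would give one
dictionary per page. The result could then be written out (as JSON, say)
for a small PHP script to read and pass to wp_insert_post().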

Remember, even counting the time spent learning how to write a script
and then writing one, you'll still save time compared to copying and
pasting every page by hand.

All the best,


On 6 June 2011 10:50, John Black <immanence7 at gmail.com> wrote:
> Posts is more what I need, so no problem there.
> On the PHP hacking, I'm more than willing to try. But starting from zero is a big climb. In reality, I can barely get my head around HTML, never mind PHP! I wish there were step by step instructions out there! I'm a good soldier! I can follow orders!
> I've heard of Ruby (*red face*). Would it help here?
> Before getting down to the nitty gritty of the PHP hack, is there any other path, say from HTML to X and then X to WordPress, that would be more doable?
> JB
> On 6 Jun 2011, at 13:30, Alex Andrews wrote:
>> Yup sadly there is no way of doing this without hacking some PHP. But
>> Baki's instructions are entirely correct - you could do it as a
>> command line tool. I'm not sure about the latter instruction - posts
>> and pages, as far as the database is concerned, are basically the same
>> thing.
>> I did something similar not long ago, using Ruby to do it, for fun.
>> Alex
>> On 6 June 2011 11:23, Baki Goxhaj <banago at gmail.com> wrote:
>>> Well, if you don't know some PHP I don't know how you are going to do it,
>>> but here is my advice.
>>> 1. Use http://simplehtmldom.sourceforge.net/ to pull content from the
>>> old site and map it accordingly to WordPress fields
>>> 2. Use a custom script to insert posts - like the one you quoted above that
>>> makes use of the wp_insert_post() function
>>> 3. Import content as posts rather than pages, as so many pages don't scale and
>>> will kill your site
>>> Good luck.
>>> Kindly,
>>> Baki Goxhaj
>>> www.wplancer.com | proverbhunter.com | www.banago.info