[wp-hackers] Importing HTML files as pages -- been done?
Mike Schinkel
mikeschinkel at newclarity.net
Sun Feb 8 23:23:48 GMT 2009
Dougal:
BTW, I just checked your link and found phpQuery & DomQuery, which both look very cool. Thanks for the link, I'll definitely be reviewing those in the future.
-Mike Schinkel
http://mikeschinkel.com/
----- Original Message -----
From: "Mike Schinkel" <mikeschinkel at newclarity.net>
To: wp-hackers at lists.automattic.com
Sent: Sunday, February 8, 2009 6:11:53 PM GMT -05:00 US/Canada Eastern
Subject: Re: [wp-hackers] Importing HTML files as pages -- been done?
"Dougal Campbell" <dougal at gunters.org> wrote:
> No, a DOM-based approach is definitely better than regex.
> Regexes for parsing HTML can get *extremely* complicated,
> and if you start trying to write a regex-based parser from
> scratch, you'll almost certainly miss some things.
I agree, in general. In her specific case she said that she'd have enclosing <div>s with unique IDs identifying the content to select. That <div> would be easy to find even with strpos(), and from there a simple loop to find the matching closing </div> would work. Yes, there are potential issues with that approach, but they would be rare. For a general-purpose tool those limitations wouldn't be acceptable, but for a quick & dirty tool to accomplish a specific conversion it would be sufficient and easy.
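To make that concrete, here is a rough sketch of the find-then-loop idea. It is written in Python rather than PHP, with str.find() playing the role of strpos(), and the function name extract_div is made up for illustration. It assumes lowercase tags, a double-quoted id attribute, and no "<div" text hiding inside comments or scripts, which are exactly the kinds of rare cases mentioned above.

```python
def extract_div(html, div_id):
    """Return the inner HTML of <div id="div_id">, or None if not found.

    Quick & dirty: plain substring searches, no real HTML parsing.
    """
    # Find the id attribute (assumes the exact form id="...").
    marker = 'id="%s"' % div_id
    attr_pos = html.find(marker)
    if attr_pos == -1:
        return None

    # Back up to the '<' that opens this tag and check that it is a div.
    tag_start = html.rfind('<', 0, attr_pos)
    if tag_start == -1 or not html.startswith('<div', tag_start):
        return None

    # The content starts right after the '>' that ends the opening tag.
    tag_end = html.find('>', attr_pos)
    if tag_end == -1:
        return None
    content_start = tag_end + 1

    # Walk forward, counting nested <div>s until the matching </div>.
    depth = 1
    pos = content_start
    while depth > 0:
        next_open = html.find('<div', pos)
        next_close = html.find('</div>', pos)
        if next_close == -1:
            return None  # unbalanced markup, give up
        if next_open != -1 and next_open < next_close:
            depth += 1
            pos = next_open + len('<div')
        else:
            depth -= 1
            pos = next_close + len('</div>')

    # pos now sits just past the matching </div>.
    return html[content_start:pos - len('</div>')]


sample = ('<body><div id="nav">x</div>'
          '<div id="content"><p>Hi</p><div class="a">inner</div> tail</div>'
          '<div id="footer">f</div></body>')
print(extract_div(sample, "content"))
```

The depth counter is what keeps a nested <div> inside the target from ending the match early; everything else is just substring searches.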
Still, for those who feel that only a DOM approach will do, I'll not stand in the way by debating it further. :-)
-Mike Schinkel
http://mikeschinkel.com/