[wp-hackers] Making WP more secure the evolutionary way
Jacob Santos
wordpress at santosj.name
Wed Jan 28 01:27:29 GMT 2009
Most of the time in WordPress is not spent in the DB layer, which is the
way it should be (source: past cachegrind benchmarks for WP 2.5 / WP
2.6[1]). My predictions are based on the fact that when you do anything
like looping with code branches, it will be slower to build the string
to pass to the DB query, than just passing the SQL string to the the DB
query. If you want benchmarks, then I'll have to say we'll have to wait
until your suggestion has an implementation to test before and after.
The problem with saying that this is for 'security' is that security is
already being handled in the API, if there is a place that is missing
input or output validation / sanitization, then create a ticket, so it
can be fixed.
The problem with caching techniques is that while WordPress utilizes
caching for the results, the results are, in most cases, not saved
unless there is a plugin installed that saves the cached results to
memcache, file or another persistent memory caching. Therefore, I mean,
if you did use it, it really wouldn't make that much of a difference.
Faster is a relative term, because in some systems the file system is
actually slower than the database and saving to a file does not benefit
you any.
The comparisons of Zend, CodeIgniter, etc are valid, because they exist
now. Not that WordPress can use them (it would be interesting to use the
WordPress API in CodeIgniter) or will use them. Just that basically,
that abstracting something usually does not prove acceptable in the end.
I will use Smarty as an example. The purpose of Smarty was to move the
nasty PHP code away from the Designers, but in turn created another
meta-language that both designers and programmers had to understand. It
also created a situation where extending Smarty also had a learning
curve. In trying to solve a small problem with separating business logic
and presentation, they created a bigger problem of having to learn the
new system. Not to mention the multitudes of other systems for
"templating" that popped up around the same time that worked differently
in many cases.
There is also no guarantee that programmers outside of WordPress core
will use the select, insert, update, and delete methods. Most novices
can't wrap their heads around prepare(), sanitize, and validation (curse
you regex!). So the developers and contributors of WordPress have worked
hard to use prepare in all places in the core code that it needs to be
(again, if there is a place that isn't using it and should be, then file
a ticket). Converting the code over to using these methods will not be a
quick task and most likely will confuse those looking over the code as
to what it is trying to do.
When you have a language such as SQL, err, MySQL SQL at least, it has a
standard that (MySQL) programmers understand. When you create an method,
or set of methods to abstract it, then you create a situation where the
person who understands SQL now has to learn your system for how you
abstract SQL. Trust me, it took me several days to fully grasp WTF was
going on in CodeIgniter (You have the WTF where the method where_or()
actually has to come after the where(), so basically it prepends
insteads of appends the 'OR'). I'll rather not have bugs that wouldn't
exist otherwise be caused from abstracting.
For security, you have to put care in it and watch over it.
For me, the insert(), update(), and delete() methods is something I
believe would be nice to have as it will simplify my life. Usually, I
end building the SQL based on an array anyway, in which case insert or
update would be nice to have as it would build the SQL for me. Delete
SQL is often extremely simple with a simple: table, and where clause
with limits in a lot of cases. If you support those, then you support at
least 90% of the SQL out there.
I'm just a little wary of abstracting select SQL statements, because
there is a lot of deviation.
2) Why the hell are you creating a God Object? Do you want to create a
smite method to bring down the fire on the asses of jackasses?
I liked the:
$objPost = new WP_Posts();
$posts = $objPost->fetchAll('arguments'); // -> $wpdb->query('GET Posts
based on WP_Query');
foreach( $posts as $post ): endforeach;
What I hate more than anything is the Blob anti-pattern that many
object-oriented plugins use (curse you Subscribe2).
The point is that, you want to keep functionality separated from other
separate functionality. So the posts queries are in their own object
(right now in their own functions) and comment queries are in their own
object (right now, in functions). I think most of the functionality in
WordPress follows the CRUD (anti-)pattern in many ways.
Jacob Santos
1. Disclaimer: The purposes of the benchmarking had nothing to do with
the DB, just basically areas that had the highest overhead and quite a
few were done to test the performance of the plugin API. None of the
instances were specifically targeted for the DB API, if something did
come up, there is basically nothing that can be optimized, so the API
was ignored and forgotten.
Florian Thiel wrote:
> On Tue, Jan 27, 2009 at 8:14 AM, Otto <otto at ottodestruct.com> wrote:
>
>> On Mon, Jan 26, 2009 at 8:45 PM, Mike Schinkel
>> <mikeschinkel at newclarity.net> wrote:
>>
>>> P.S. If you advocates persist in moving in this direction, please do us a favor and at least write an query client that can understand it's syntax and allow a user to query MySQL interactively directly from your code.
>>>
>> That's the beauty of overloading. Remember that Zend DB stuff I was
>> talking about before? Using that, you can just as easily query with
>> SQL directly. $db->fetchAll('select whatever') works just as well
>> there too. The API functionality is meant to add to the base, not take
>> away from it. Sometimes it's better to build a query dynamically, in
>> parts. Sometimes, it's not.
>>
>
> Since this comes up again and again: Does anyone have benchmarking
> results for WordPress and can tell me where most of the time is spent
> when generating a page? Performance is not that straightforward, and
> it is really hard to predict how some change impacts the execution
> speed unless you really measure it. Also, most DB API have their own
> prepare() method that precompiles the SQL statement. When you reuse a
> statement (with different parameters like WHERE conditions and such)
> it actually becomes FASTER.I think Zend or CakePHP are a bit out of reach for WordPress right now
> because it would need some fundamental changes to the code. Zend and
> CakePHP heavily use objects and a data model, which is not explicit in
> WordPress.
>
> I think there are two ways to go about it which don't straitjacket the
> developers and provide a reasonable set of advantages:
>
> 1) just package the raw SQL queries in function calls like this:
>
> $wp_db->select($columns,$conditions)
> e.g.
> $wp_db->select(array('column1','column2'),array('id' => $someid))
>
> This is already being used for insert and update. Moving all the
> INSERTs, UPDATEs and the simple cases of SELECT, DELETE etc. to the
> abstraction layer already rids us of 200 uses of raw SQL. And it's
> really not much work. (these are the 'method_exists' and
> 'trivial_implementation' cases). For the rest, we could see where we
> would go from there. If it works out well for the simple cases, we
> might also consider the ones that actually need some code to be
> written...
>
> Matt, Jacob is this something you would approve?
>
> I'll send proof-of-concept patches for this types later today...
>
> 2) Create "intentional" abstractions (This is an idea that goes much
> further and moves SQL out of the picture completely; I see that there
> are people that won't like this approach; it's just an idea).
>
> The basic point of these abstractions is that the caller just says
> what he wants and does not really care how the callee gets the job
> done.
>
> $wp_db->getPosts()
> $wp_db->getCommentsForPost($id)
> $wp_db->getPostsByTag($tag)
>
> The problem with this approach is that there could be too many
> functions needed because you would need one for every particular type
> of query. You could unify some query types using keyword arguments
> (like adding 'limit' => 10) but that could hurt readability.
>
> Florian
> _______________________________________________
> wp-hackers mailing list
> wp-hackers at lists.automattic.com
> http://lists.automattic.com/mailman/listinfo/wp-hackers
>
More information about the wp-hackers
mailing list