[wp-hackers] Making WP more secure the evolutionary way

Jacob Santos wordpress at santosj.name
Wed Jan 28 01:27:29 GMT 2009


Most of the time in WordPress is not spent in the DB layer, which is the 
way it should be (source: past cachegrind benchmarks for WP 2.5 / WP 
2.6[1]). My predictions are based on the fact that when you do anything 
like looping with code branches, it will be slower to build the string 
to pass to the DB query, than just passing the SQL string to the the DB 
query. If you want benchmarks, then I'll have to say we'll have to wait 
until your suggestion has an implementation to test before and after.

The problem with saying that this is for 'security' is that security is 
already being handled in the API, if there is a place that is missing 
input or output validation / sanitization, then create a ticket, so it 
can be fixed.

The problem with caching techniques is that while WordPress utilizes 
caching for the results, the results are, in most cases, not saved 
unless there is a plugin installed that saves the cached results to 
memcache, file or another persistent memory caching. Therefore, I mean, 
if you did use it, it really wouldn't make that much of a difference. 
Faster is a relative term, because in some systems the file system is 
actually slower than the database and saving to a file does not benefit 
you any.

The comparisons of Zend, CodeIgniter, etc are valid, because they exist 
now. Not that WordPress can use them (it would be interesting to use the 
WordPress API in CodeIgniter) or will use them. Just that basically, 
that abstracting something usually does not prove acceptable in the end. 
I will use Smarty as an example. The purpose of Smarty was to move the 
nasty PHP code away from the Designers, but in turn created another 
meta-language that both designers and programmers had to understand. It 
also created a situation where extending Smarty also had a learning 
curve. In trying to solve a small problem with separating business logic 
and presentation, they created a bigger problem of having to learn the 
new system. Not to mention the multitudes of other systems for 
"templating" that popped up around the same time that worked differently 
in many cases.

There is also no guarantee that programmers outside of WordPress core 
will use the select, insert, update, and delete methods. Most novices 
can't wrap their heads around prepare(), sanitize, and validation (curse 
you regex!). So the developers and contributors of WordPress have worked 
hard to use prepare in all places in the core code that it needs to be 
(again, if there is a place that isn't using it and should be, then file 
a ticket). Converting the code over to using these methods will not be a 
quick task and most likely will confuse those looking over the code as 
to what it is trying to do.

When you have a language such as SQL, err, MySQL SQL at least, it has a 
standard that (MySQL) programmers understand. When you create an method, 
or set of methods to abstract it, then you create a situation where the 
person who understands SQL now has to learn your system for how you 
abstract SQL. Trust me, it took me several days to fully grasp WTF was 
going on in CodeIgniter (You have the WTF where the method where_or() 
actually has to come after the where(), so basically it prepends 
insteads of appends the 'OR'). I'll rather not have bugs that wouldn't 
exist otherwise be caused from abstracting.

For security, you have to put care in it and watch over it.

For me, the insert(), update(), and delete() methods is something I 
believe would be nice to have as it will simplify my life. Usually, I 
end building the SQL based on an array anyway, in which case insert or 
update would be nice to have as it would build the SQL for me. Delete 
SQL is often extremely simple with a simple: table, and where clause 
with limits in a lot of cases. If you support those, then you support at 
least 90% of the SQL out there.

I'm just a little wary of abstracting select SQL statements, because 
there is a lot of deviation.

2) Why the hell are you creating a God Object? Do you want to create a 
smite method to bring down the fire on the asses of jackasses?

I liked the:

$objPost = new WP_Posts();

$posts = $objPost->fetchAll('arguments'); // -> $wpdb->query('GET Posts 
based on WP_Query');

foreach( $posts as $post ):  endforeach;


What I hate more than anything is the Blob anti-pattern that many 
object-oriented plugins use (curse you Subscribe2).

The point is that, you want to keep functionality separated from other 
separate functionality. So the posts queries are in their own object 
(right now in their own functions) and comment queries are in their own 
object (right now, in functions). I think most of the functionality in 
WordPress follows the CRUD (anti-)pattern in many ways.

Jacob Santos

1. Disclaimer: The purposes of the benchmarking had nothing to do with 
the DB, just basically areas that had the highest overhead and quite a 
few were done to test the performance of the plugin API. None of the 
instances were specifically targeted for the DB API, if something did 
come up, there is basically nothing that can be optimized, so the API 
was ignored and forgotten.



Florian Thiel wrote:
> On Tue, Jan 27, 2009 at 8:14 AM, Otto <otto at ottodestruct.com> wrote:
>   
>> On Mon, Jan 26, 2009 at 8:45 PM, Mike Schinkel
>> <mikeschinkel at newclarity.net> wrote:
>>     
>>> P.S. If you advocates persist in moving in this direction, please do us a favor and at least write an query client that can understand it's syntax and allow a user to query MySQL interactively directly from your code.
>>>       
>> That's the beauty of overloading. Remember that Zend DB stuff I was
>> talking about before? Using that, you can just as easily query with
>> SQL directly. $db->fetchAll('select whatever') works just as well
>> there too. The API functionality is meant to add to the base, not take
>> away from it. Sometimes it's better to build a query dynamically, in
>> parts. Sometimes, it's not.
>>     
>
> Since this comes up again and again: Does anyone have benchmarking
> results for WordPress and can tell me where most of the time is spent
> when generating a page? Performance is not that straightforward, and
> it is really hard to predict how some change impacts the execution
> speed unless you really measure it. Also, most DB API have their own
> prepare() method that precompiles the SQL statement. When you reuse a
> statement (with different parameters like WHERE conditions and such)
> it actually becomes FASTER.I think Zend or CakePHP are a bit out of reach for WordPress right now
> because it would need some fundamental changes to the code. Zend and
> CakePHP heavily use objects and a data model, which is not explicit in
> WordPress.
>
> I think there are two ways to go about it which don't straitjacket the
> developers and provide a reasonable set of advantages:
>
> 1) just package the raw SQL queries in function calls like this:
>
> $wp_db->select($columns,$conditions)
> e.g.
> $wp_db->select(array('column1','column2'),array('id' => $someid))
>
> This is already being used for insert and update. Moving all the
> INSERTs, UPDATEs and the simple cases of SELECT, DELETE etc. to the
> abstraction layer already rids us of 200 uses of raw SQL. And it's
> really not much work. (these are the 'method_exists' and
> 'trivial_implementation' cases). For the rest, we could see where we
> would go from there. If it works out well for the simple cases, we
> might also consider the ones that actually need some code to be
> written...
>
> Matt, Jacob is this something you would approve?
>
> I'll send proof-of-concept patches for this types later today...
>
> 2) Create "intentional" abstractions (This is an idea that goes much
> further and moves SQL out of the picture completely; I see that there
> are people that won't like this approach; it's just an idea).
>
> The basic point of these abstractions is that the caller just says
> what he wants and does not really care how the callee gets the job
> done.
>
> $wp_db->getPosts()
> $wp_db->getCommentsForPost($id)
> $wp_db->getPostsByTag($tag)
>
> The problem with this approach is that there could be too many
> functions needed because you would need one for every particular type
> of query. You could unify some query types using keyword arguments
> (like adding 'limit' => 10) but that could hurt readability.
>
> Florian
> _______________________________________________
> wp-hackers mailing list
> wp-hackers at lists.automattic.com
> http://lists.automattic.com/mailman/listinfo/wp-hackers
>   



More information about the wp-hackers mailing list