<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[32388] branches/4.0: WPDB: When checking that a string can be sent to MySQL, we shouldn't use `mb_convert_encoding()`, as it behaves differently to MySQL's character encoding conversion.</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta" style="font-size: 105%">
<dt style="float: left; width: 6em; font-weight: bold">Revision</dt> <dd><a style="font-weight: bold" href="https://core.trac.wordpress.org/changeset/32388">32388</a></dd>
<dt style="float: left; width: 6em; font-weight: bold">Author</dt> <dd>mdawaffe</dd>
<dt style="float: left; width: 6em; font-weight: bold">Date</dt> <dd>2015-05-06 19:08:42 +0000 (Wed, 06 May 2015)</dd>
</dl>

<pre style='padding-left: 1em; margin: 2em 0; border-left: 2px solid #ccc; line-height: 1.25; font-size: 105%; font-family: sans-serif'>WPDB: When checking that a string can be sent to MySQL, we shouldn't use `mb_convert_encoding()`, as it behaves differently to MySQL's character encoding conversion.

Merge of <a href="https://core.trac.wordpress.org/changeset/32364">[32364]</a> to the 4.0 branch.

Props mdawaffe, pento, nbachiyski, jorbin, johnjamesjacoby, jeremyfelt.

See <a href="https://core.trac.wordpress.org/ticket/32165">#32165</a>.</pre>

<h3>Modified Paths</h3>
<ul>
<li><a href="#branches40srcwpadminincludesupgradephp">branches/4.0/src/wp-admin/includes/upgrade.php</a></li>
<li><a href="#branches40srcwpincludescompatphp">branches/4.0/src/wp-includes/compat.php</a></li>
<li><a href="#branches40srcwpincludesversionphp">branches/4.0/src/wp-includes/version.php</a></li>
<li><a href="#branches40srcwpincludeswpdbphp">branches/4.0/src/wp-includes/wp-db.php</a></li>
<li><a href="#branches40testsphpunittestscommentphp">branches/4.0/tests/phpunit/tests/comment.php</a></li>
<li><a href="#branches40testsphpunittestscompatphp">branches/4.0/tests/phpunit/tests/compat.php</a></li>
<li><a href="#branches40testsphpunittestsdbcharsetphp">branches/4.0/tests/phpunit/tests/db/charset.php</a></li>
<li><a href="#branches40testsphpunittestsdbphp">branches/4.0/tests/phpunit/tests/db.php</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="branches40srcwpadminincludesupgradephp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: branches/4.0/src/wp-admin/includes/upgrade.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- branches/4.0/src/wp-admin/includes/upgrade.php    2015-05-06 19:06:02 UTC (rev 32387)
+++ branches/4.0/src/wp-admin/includes/upgrade.php      2015-05-06 19:08:42 UTC (rev 32388)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -440,8 +440,8 @@
</span><span class="cx" style="display: block; padding: 0 10px">        if ( $wp_current_db_version < 29630 )
</span><span class="cx" style="display: block; padding: 0 10px">                upgrade_400();
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-        if ( $wp_current_db_version < 29631 )
-               upgrade_404();
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+        if ( $wp_current_db_version < 29632 )
+               upgrade_405();
</ins><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">        maybe_disable_link_manager();
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1335,19 +1335,43 @@
</span><span class="cx" style="display: block; padding: 0 10px">  * @since 4.0.4
</span><span class="cx" style="display: block; padding: 0 10px">  */
</span><span class="cx" style="display: block; padding: 0 10px"> function upgrade_404() {
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+}
+
+/**
+ * Execute changes made in WordPress 4.0.5.
+ *
+ * @since 4.0.5
+ */
+function upgrade_405() {
</ins><span class="cx" style="display: block; padding: 0 10px">         global $wp_current_db_version, $wpdb;
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-        if ( $wp_current_db_version < 29631 ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+        if ( $wp_current_db_version < 29632 ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                 $content_length = $wpdb->get_col_length( $wpdb->comments, 'comment_content' );
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                if ( ! $content_length ) {
-                       $content_length = 65535;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                if ( false === $content_length ) {
+                       $content_length = array(
+                               'type'   => 'byte',
+                               'length' => 65535,
+                       );
+               } elseif ( ! is_array( $content_length ) ) {
+                       $length = (int) $content_length > 0 ? (int) $content_length : 65535;
+                       $content_length = array(
+                               'type'   => 'byte',
+                               'length' => $length
+                       );
</ins><span class="cx" style="display: block; padding: 0 10px">                 }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                if ( 'byte' !== $content_length['type'] ) {
+                       // Sites with malformed DB schemas are on their own.
+                       return;
+               }
+
+               $allowed_length = intval( $content_length['length'] ) - 10;
+
</ins><span class="cx" style="display: block; padding: 0 10px">                 $comments = $wpdb->get_results(
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        "SELECT comment_ID FROM $wpdb->comments
-                       WHERE comment_date_gmt > '2015-04-26'
-                       AND CHAR_LENGTH( comment_content ) >= $content_length
-                       AND ( comment_content LIKE '%<%' OR comment_content LIKE '%>%' )"
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        "SELECT `comment_ID` FROM `{$wpdb->comments}`
+                               WHERE `comment_date_gmt` > '2015-04-26'
+                               AND LENGTH( `comment_content` ) >= {$allowed_length}
+                               AND ( `comment_content` LIKE '%<%' OR `comment_content` LIKE '%>%' )"
</ins><span class="cx" style="display: block; padding: 0 10px">                 );
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                foreach ( $comments as $comment ) {
</span></span></pre></div>
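<p>The <code>upgrade_405()</code> hunk above has to cope with both the old scalar return value of <code>get_col_length()</code> and the new <code>array( 'type', 'length' )</code> shape. A minimal sketch of that normalisation follows, assuming standalone helpers; <code>example_normalize_col_length()</code> and <code>example_allowed_bytes()</code> are hypothetical names used only for illustration and are not part of WordPress.</p>

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): coerce a column length that may
 * be false, a bare number, or an array into array( 'type' => ..., 'length' => ... ).
 */
function example_normalize_col_length( $col_length, $default = 65535 ) {
	if ( is_array( $col_length ) ) {
		// Already in the new shape; assumed to carry 'type' and 'length' keys.
		return $col_length;
	}

	// false, non-numeric values, zero and negatives all fall back to the default.
	$length = ( is_numeric( $col_length ) && (int) $col_length > 0 ) ? (int) $col_length : $default;

	return array(
		'type'   => 'byte',
		'length' => $length,
	);
}

/**
 * Hypothetical helper: byte threshold used to look for possibly truncated rows.
 * Returns null when the column length is not measured in bytes.
 */
function example_allowed_bytes( $col_length, $margin = 10 ) {
	$normalized = example_normalize_col_length( $col_length );

	if ( 'byte' !== $normalized['type'] ) {
		return null;
	}

	return (int) $normalized['length'] - $margin;
}
```

<p>Because the resulting threshold is a byte count, it pairs with MySQL's <code>LENGTH()</code>, which counts bytes, rather than <code>CHAR_LENGTH()</code>, which counts characters.</p>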
<a id="branches40srcwpincludescompatphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: branches/4.0/src/wp-includes/compat.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- branches/4.0/src/wp-includes/compat.php   2015-05-06 19:06:02 UTC (rev 32387)
+++ branches/4.0/src/wp-includes/compat.php     2015-05-06 19:08:42 UTC (rev 32388)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -13,25 +13,143 @@
</span><span class="cx" style="display: block; padding: 0 10px">        }
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-if ( !function_exists('mb_substr') ):
-       function mb_substr( $str, $start, $length=null, $encoding=null ) {
-               return _mb_substr($str, $start, $length, $encoding);
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+/**
+ * Returns whether PCRE/u (PCRE_UTF8 modifier) is available for use.
+ *
+ * @ignore
+ * @since 4.2.2
+ * @access private
+ *
+ * @param bool $set - Used for testing only
+ *             null   : default - get PCRE/u capability
+ *             false  : Used for testing - return false for future calls to this function
+ *             'reset': Used for testing - restore default behavior of this function
+ */
+function _wp_can_use_pcre_u( $set = null ) {
+       static $utf8_pcre = 'reset';
+
+       if ( null !== $set ) {
+               $utf8_pcre = $set;
</ins><span class="cx" style="display: block; padding: 0 10px">         }
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+
+       if ( 'reset' === $utf8_pcre ) {
+               $utf8_pcre = @preg_match( '/^./u', 'a' );
+       }
+
+       return $utf8_pcre;
+}
+
+if ( ! function_exists( 'mb_substr' ) ) :
+       function mb_substr( $str, $start, $length = null, $encoding = null ) {
+               return _mb_substr( $str, $start, $length, $encoding );
+       }
</ins><span class="cx" style="display: block; padding: 0 10px"> endif;
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-function _mb_substr( $str, $start, $length=null, $encoding=null ) {
-       // the solution below, works only for utf-8, so in case of a different
-       // charset, just use built-in substr
-       $charset = get_option( 'blog_charset' );
-       if ( !in_array( $charset, array('utf8', 'utf-8', 'UTF8', 'UTF-8') ) ) {
-               return is_null( $length )? substr( $str, $start ) : substr( $str, $start, $length);
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+/*
+ * Only understands UTF-8 and 8bit.  All other character sets will be treated as 8bit.
+ * For $encoding === UTF-8, the $str input is expected to be a valid UTF-8 byte sequence.
+ * The behavior of this function for invalid inputs is undefined.
+ */
+function _mb_substr( $str, $start, $length = null, $encoding = null ) {
+       if ( null === $encoding ) {
+               $encoding = get_option( 'blog_charset' );
</ins><span class="cx" style="display: block; padding: 0 10px">         }
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-        // use the regex unicode support to separate the UTF-8 characters into an array
-       preg_match_all( '/./us', $str, $match );
-       $chars = is_null( $length )? array_slice( $match[0], $start ) : array_slice( $match[0], $start, $length );
-       return implode( '', $chars );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+
+       // The solution below works only for UTF-8,
+       // so in case of a different charset just use built-in substr()
+       if ( ! in_array( $encoding, array( 'utf8', 'utf-8', 'UTF8', 'UTF-8' ) ) ) {
+               return is_null( $length ) ? substr( $str, $start ) : substr( $str, $start, $length );
+       }
+
+       if ( _wp_can_use_pcre_u() ) {
+               // Use the regex unicode support to separate the UTF-8 characters into an array
+               preg_match_all( '/./us', $str, $match );
+               $chars = is_null( $length ) ? array_slice( $match[0], $start ) : array_slice( $match[0], $start, $length );
+               return implode( '', $chars );
+       }
+
+       $regex = '/(
+                 [\x00-\x7F]                  # single-byte sequences   0xxxxxxx
+               | [\xC2-\xDF][\x80-\xBF]       # double-byte sequences   110xxxxx 10xxxxxx
+               | \xE0[\xA0-\xBF][\x80-\xBF]   # triple-byte sequences   1110xxxx 10xxxxxx * 2
+               | [\xE1-\xEC][\x80-\xBF]{2}
+               | \xED[\x80-\x9F][\x80-\xBF]
+               | [\xEE-\xEF][\x80-\xBF]{2}
+               | \xF0[\x90-\xBF][\x80-\xBF]{2} # four-byte sequences   11110xxx 10xxxxxx * 3
+               | [\xF1-\xF3][\x80-\xBF]{3}
+               | \xF4[\x80-\x8F][\x80-\xBF]{2}
+       )/x';
+
+       $chars = array( '' ); // Start with 1 element instead of 0 since the first thing we do is pop
+       do {
+               // We had some string left over from the last round, but we counted it in that last round.
+               array_pop( $chars );
+
+               // Split by UTF-8 character, limit to 1000 characters (last array element will contain the rest of the string)
+               $pieces = preg_split( $regex, $str, 1000, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY );
+
+               $chars = array_merge( $chars, $pieces );
+       } while ( count( $pieces ) > 1 && $str = array_pop( $pieces ) ); // If there's anything left over, repeat the loop.
+
+       return join( '', array_slice( $chars, $start, $length ) );
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+if ( ! function_exists( 'mb_strlen' ) ) :
+       function mb_strlen( $str, $encoding = null ) {
+               return _mb_strlen( $str, $encoding );
+       }
+endif;
+
+/*
+ * Only understands UTF-8 and 8bit.  All other character sets will be treated as 8bit.
+ * For $encoding === UTF-8, the $str input is expected to be a valid UTF-8 byte sequence.
+ * The behavior of this function for invalid inputs is undefined.
+ */
+function _mb_strlen( $str, $encoding = null ) {
+       if ( null === $encoding ) {
+               $encoding = get_option( 'blog_charset' );
+       }
+
+       // The solution below works only for UTF-8,
+       // so in case of a different charset just use built-in strlen()
+       if ( ! in_array( $encoding, array( 'utf8', 'utf-8', 'UTF8', 'UTF-8' ) ) ) {
+               return strlen( $str );
+       }
+
+       if ( _wp_can_use_pcre_u() ) {
+               // Use the regex unicode support to separate the UTF-8 characters into an array
+               preg_match_all( '/./us', $str, $match );
+               return count( $match[0] );
+       }
+
+       $regex = '/(?:
+                 [\x00-\x7F]                  # single-byte sequences   0xxxxxxx
+               | [\xC2-\xDF][\x80-\xBF]       # double-byte sequences   110xxxxx 10xxxxxx
+               | \xE0[\xA0-\xBF][\x80-\xBF]   # triple-byte sequences   1110xxxx 10xxxxxx * 2
+               | [\xE1-\xEC][\x80-\xBF]{2}
+               | \xED[\x80-\x9F][\x80-\xBF]
+               | [\xEE-\xEF][\x80-\xBF]{2}
+               | \xF0[\x90-\xBF][\x80-\xBF]{2} # four-byte sequences   11110xxx 10xxxxxx * 3
+               | [\xF1-\xF3][\x80-\xBF]{3}
+               | \xF4[\x80-\x8F][\x80-\xBF]{2}
+       )/x';
+
+       $count = 1; // Start at 1 instead of 0 since the first thing we do is decrement
+       do {
+               // We had some string left over from the last round, but we counted it in that last round.
+               $count--;
+
+               // Split by UTF-8 character, limit to 1000 characters (last array element will contain the rest of the string)
+               $pieces = preg_split( $regex, $str, 1000 );
+
+               // Increment
+               $count += count( $pieces );
+       } while ( $str = array_pop( $pieces ) ); // If there's anything left over, repeat the loop.
+
+       // Fencepost: preg_split() always returns one extra item in the array
+       return --$count;
+}
+
</ins><span class="cx" style="display: block; padding: 0 10px"> if ( !function_exists('hash_hmac') ):
</span><span class="cx" style="display: block; padding: 0 10px"> function hash_hmac($algo, $data, $key, $raw_output = false) {
</span><span class="cx" style="display: block; padding: 0 10px">        return _hash_hmac($algo, $data, $key, $raw_output);
</span></span></pre></div>
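<p>The fallback path in <code>_mb_strlen()</code> above counts UTF-8 characters with a byte-level pattern when the PCRE <code>/u</code> modifier is unavailable. The sketch below illustrates the same idea in a simplified form: it uses a single <code>preg_match_all()</code> call instead of the chunked <code>preg_split()</code> loop from the diff, and <code>example_utf8_strlen()</code> is a hypothetical name, not a WordPress function. As in the diff, the input is assumed to be valid UTF-8.</p>

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): count the characters in a valid
 * UTF-8 string without relying on mbstring or the PCRE /u modifier.
 */
function example_utf8_strlen( $str ) {
	// Same byte ranges as the pattern in the diff: one alternative per
	// well-formed UTF-8 sequence shape, matched byte by byte (no /u flag).
	$regex = '/(?:
		  [\x00-\x7F]
		| [\xC2-\xDF][\x80-\xBF]
		| \xE0[\xA0-\xBF][\x80-\xBF]
		| [\xE1-\xEC][\x80-\xBF]{2}
		| \xED[\x80-\x9F][\x80-\xBF]
		| [\xEE-\xEF][\x80-\xBF]{2}
		| \xF0[\x90-\xBF][\x80-\xBF]{2}
		| [\xF1-\xF3][\x80-\xBF]{3}
		| \xF4[\x80-\x8F][\x80-\xBF]{2}
	)/x';

	// preg_match_all() returns the number of full pattern matches,
	// which here is the number of UTF-8 characters found.
	return preg_match_all( $regex, $str, $matches );
}

// "a", U+00E9 (2 bytes) and U+20AC (3 bytes): three characters, six bytes.
$sample = "a\xC3\xA9\xE2\x82\xAC";
$chars  = example_utf8_strlen( $sample );
$bytes  = strlen( $sample );
```

<p>The gap between <code>$chars</code> and <code>$bytes</code> is the reason the changeset distinguishes <code>'char'</code> from <code>'byte'</code> column lengths.</p>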
<a id="branches40srcwpincludesversionphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: branches/4.0/src/wp-includes/version.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- branches/4.0/src/wp-includes/version.php  2015-05-06 19:06:02 UTC (rev 32387)
+++ branches/4.0/src/wp-includes/version.php    2015-05-06 19:08:42 UTC (rev 32388)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -11,7 +11,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">  *
</span><span class="cx" style="display: block; padding: 0 10px">  * @global int $wp_db_version
</span><span class="cx" style="display: block; padding: 0 10px">  */
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-$wp_db_version = 29631;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+$wp_db_version = 29632;
</ins><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><span class="cx" style="display: block; padding: 0 10px">  * Holds the TinyMCE version
</span></span></pre></div>
<a id="branches40srcwpincludeswpdbphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: branches/4.0/src/wp-includes/wp-db.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- branches/4.0/src/wp-includes/wp-db.php    2015-05-06 19:06:02 UTC (rev 32387)
+++ branches/4.0/src/wp-includes/wp-db.php      2015-05-06 19:08:42 UTC (rev 32388)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1787,6 +1787,8 @@
</span><span class="cx" style="display: block; padding: 0 10px">         * @return int|false The number of rows affected, or false on error.
</span><span class="cx" style="display: block; padding: 0 10px">         */
</span><span class="cx" style="display: block; padding: 0 10px">        function _insert_replace_helper( $table, $data, $format = null, $type = 'INSERT' ) {
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                $this->insert_id = 0;
+
</ins><span class="cx" style="display: block; padding: 0 10px">                 if ( ! in_array( strtoupper( $type ), array( 'REPLACE', 'INSERT' ) ) ) {
</span><span class="cx" style="display: block; padding: 0 10px">                        return false;
</span><span class="cx" style="display: block; padding: 0 10px">                }
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1807,7 +1809,6 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                $sql = "$type INTO `$table` ($fields) VALUES ($formats)";
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                $this->insert_id = 0;
</del><span class="cx" style="display: block; padding: 0 10px">                 $this->check_current_query = false;
</span><span class="cx" style="display: block; padding: 0 10px">                return $this->query( $this->prepare( $sql, $values ) );
</span><span class="cx" style="display: block; padding: 0 10px">        }
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2003,17 +2004,11 @@
</span><span class="cx" style="display: block; padding: 0 10px">                                // We can skip this field if we know it isn't a string.
</span><span class="cx" style="display: block; padding: 0 10px">                                // This checks %d/%f versus ! %s because it's sprintf() could take more.
</span><span class="cx" style="display: block; padding: 0 10px">                                $value['charset'] = false;
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        } elseif ( $this->check_ascii( $value['value'] ) ) {
-                               // If it's ASCII, then we don't need the charset. We can skip this field.
-                               $value['charset'] = false;
</del><span class="cx" style="display: block; padding: 0 10px">                         } else {
</span><span class="cx" style="display: block; padding: 0 10px">                                $value['charset'] = $this->get_col_charset( $table, $field );
</span><span class="cx" style="display: block; padding: 0 10px">                                if ( is_wp_error( $value['charset'] ) ) {
</span><span class="cx" style="display: block; padding: 0 10px">                                        return false;
</span><span class="cx" style="display: block; padding: 0 10px">                                }
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-
-                               // This isn't ASCII. Don't have strip_invalid_text() re-check.
-                               $value['ascii'] = false;
</del><span class="cx" style="display: block; padding: 0 10px">                         }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                        $data[ $field ] = $value;
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2046,10 +2041,6 @@
</span><span class="cx" style="display: block; padding: 0 10px">                                }
</span><span class="cx" style="display: block; padding: 0 10px">                        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        if ( false !== $value['length'] && strlen( $value['value'] ) > $value['length'] ) {
-                               return false;
-                       }
-
</del><span class="cx" style="display: block; padding: 0 10px">                         $data[ $field ] = $value;
</span><span class="cx" style="display: block; padding: 0 10px">                }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2379,14 +2370,16 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">        /**
</span><span class="cx" style="display: block; padding: 0 10px">         * Retrieve the maximum string length allowed in a given column.
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         * The length may either be specified as a byte length or a character length.
</ins><span class="cx" style="display: block; padding: 0 10px">          *
</span><span class="cx" style="display: block; padding: 0 10px">         * @since 4.2.1
</span><span class="cx" style="display: block; padding: 0 10px">         * @access public
</span><span class="cx" style="display: block; padding: 0 10px">         *
</span><span class="cx" style="display: block; padding: 0 10px">         * @param string $table  Table name.
</span><span class="cx" style="display: block; padding: 0 10px">         * @param string $column Column name.
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-         * @return mixed Max column length as an int. False if the column has no
-        *               length. WP_Error object if there was an error.
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         * @return mixed array( 'length' => (int), 'type' => 'byte' | 'char' )
+        *               false if the column has no length (for example, numeric column)
+        *               WP_Error object if there was an error.
</ins><span class="cx" style="display: block; padding: 0 10px">          */
</span><span class="cx" style="display: block; padding: 0 10px">        public function get_col_length( $table, $column ) {
</span><span class="cx" style="display: block; padding: 0 10px">                $tablekey = strtolower( $table );
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2419,27 +2412,47 @@
</span><span class="cx" style="display: block; padding: 0 10px">                }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                switch( $type ) {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        case 'binary':
</del><span class="cx" style="display: block; padding: 0 10px">                         case 'char':
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        case 'varbinary':
</del><span class="cx" style="display: block; padding: 0 10px">                         case 'varchar':
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                return $length;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                                return array(
+                                       'type'   => 'char',
+                                       'length' => (int) $length,
+                               );
</ins><span class="cx" style="display: block; padding: 0 10px">                                 break;
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        case 'binary':
+                       case 'varbinary':
+                               return array(
+                                       'type'   => 'byte',
+                                       'length' => (int) $length,
+                               );
+                               break;
</ins><span class="cx" style="display: block; padding: 0 10px">                         case 'tinyblob':
</span><span class="cx" style="display: block; padding: 0 10px">                        case 'tinytext':
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                return 255; // 2^8 - 1
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                                return array(
+                                       'type'   => 'byte',
+                                       'length' => 255,        // 2^8 - 1
+                               );
</ins><span class="cx" style="display: block; padding: 0 10px">                                 break;
</span><span class="cx" style="display: block; padding: 0 10px">                        case 'blob':
</span><span class="cx" style="display: block; padding: 0 10px">                        case 'text':
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                return 65535; // 2^16 - 1
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                                return array(
+                                       'type'   => 'byte',
+                                       'length' => 65535,      // 2^16 - 1
+                               );
</ins><span class="cx" style="display: block; padding: 0 10px">                                 break;
</span><span class="cx" style="display: block; padding: 0 10px">                        case 'mediumblob':
</span><span class="cx" style="display: block; padding: 0 10px">                        case 'mediumtext':
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                return 16777215; // 2^24 - 1
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         return array(
+                                       'type'   => 'byte',
+                                       'length' => 16777215,   // 2^24 - 1
+                               );
</ins><span class="cx" style="display: block; padding: 0 10px">                                 break;
</span><span class="cx" style="display: block; padding: 0 10px">                        case 'longblob':
</span><span class="cx" style="display: block; padding: 0 10px">                        case 'longtext':
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                return 4294967295; // 2^32 - 1
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         return array(
+                                       'type'   => 'byte',
+                                       'length' => 4294967295, // 2^32 - 1
+                               );
</ins><span class="cx" style="display: block; padding: 0 10px">                                 break;
</span><span class="cx" style="display: block; padding: 0 10px">                        default:
</span><span class="cx" style="display: block; padding: 0 10px">                                return false;
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2546,50 +2559,55 @@
</span><span class="cx" style="display: block; padding: 0 10px">         */
</span><span class="cx" style="display: block; padding: 0 10px">                // If any of the columns don't have one of these collations, it needs more sanity checking.
</span><span class="cx" style="display: block; padding: 0 10px">        protected function strip_invalid_text( $data ) {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                // Some multibyte character sets that we can check in PHP.
-               $mb_charsets = array(
-                       'ascii'   => 'ASCII',
-                       'big5'    => 'BIG-5',
-                       'eucjpms' => 'eucJP-win',
-                       'gb2312'  => 'EUC-CN',
-                       'ujis'    => 'EUC-JP',
-                       'utf32'   => 'UTF-32',
-               );
-
-               $supported_charsets = array();
-               if ( function_exists( 'mb_list_encodings' ) ) {
-                       $supported_charsets = mb_list_encodings();
-               }
-
</del><span class="cx" style="display: block; padding: 0 10px">                 $db_check_string = false;
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                foreach ( $data as &$value ) {
</span><span class="cx" style="display: block; padding: 0 10px">                        $charset = $value['charset'];
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        // Column isn't a string, or is latin1, which will happily store anything.
-                       if ( false === $charset || 'latin1' === $charset ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 if ( is_array( $value['length'] ) ) {
+                               $length = $value['length']['length'];
+                       } else {
+                               $length = false;
+                       }
+
+                       // There's no charset to work with.
+                       if ( false === $charset ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                                 continue;
</span><span class="cx" style="display: block; padding: 0 10px">                        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        // Column isn't a string.
</ins><span class="cx" style="display: block; padding: 0 10px">                         if ( ! is_string( $value['value'] ) ) {
</span><span class="cx" style="display: block; padding: 0 10px">                                continue;
</span><span class="cx" style="display: block; padding: 0 10px">                        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        // ASCII is always OK.
-                       if ( ! isset( $value['ascii'] ) && $this->check_ascii( $value['value'] ) ) {
-                               continue;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 $truncate_by_byte_length = 'byte' === $value['length']['type'];
+
+                       $needs_validation = true;
+                       if (
+                               // latin1 can store any byte sequence
+                               'latin1' === $charset
+                       ||
+                               // ASCII is always OK.
+                               ( ! isset( $value['ascii'] ) && $this->check_ascii( $value['value'] ) )
+                       ) {
+                               $truncate_by_byte_length = true;
+                               $needs_validation = false;
</ins><span class="cx" style="display: block; padding: 0 10px">                         }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        // Convert the text locally.
-                       if ( $supported_charsets ) {
-                               if ( isset( $mb_charsets[ $charset ] ) && in_array( $mb_charsets[ $charset ], $supported_charsets ) ) {
-                                       $value['value'] = mb_convert_encoding( $value['value'], $mb_charsets[ $charset ], $mb_charsets[ $charset ] );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 if ( $truncate_by_byte_length ) {
+                               mbstring_binary_safe_encoding();
+                               if ( false !== $length && strlen( $value['value'] ) > $length ) {
+                                       $value['value'] = substr( $value['value'], 0, $length );
+                               }
+                               reset_mbstring_encoding();
+
+                               if ( ! $needs_validation ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                                         continue;
</span><span class="cx" style="display: block; padding: 0 10px">                                }
</span><span class="cx" style="display: block; padding: 0 10px">                        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                        // utf8 can be handled by regex, which is a bunch faster than a DB lookup.
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        if ( 'utf8' === $charset || 'utf8mb3' === $charset || 'utf8mb4' === $charset ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 if ( ( 'utf8' === $charset || 'utf8mb3' === $charset || 'utf8mb4' === $charset ) && function_exists( 'mb_strlen' ) ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                                 $regex = '/
</span><span class="cx" style="display: block; padding: 0 10px">                                        (
</span><span class="cx" style="display: block; padding: 0 10px">                                                (?: [\x00-\x7F]                  # single-byte sequences   0xxxxxxx
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2599,7 +2617,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">                                                |   \xED[\x80-\x9F][\x80-\xBF]
</span><span class="cx" style="display: block; padding: 0 10px">                                                |   [\xEE-\xEF][\x80-\xBF]{2}';
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                if ( 'utf8mb4' === $charset) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         if ( 'utf8mb4' === $charset ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                                         $regex .= '
</span><span class="cx" style="display: block; padding: 0 10px">                                                |    \xF0[\x90-\xBF][\x80-\xBF]{2} # four-byte sequences   11110xxx 10xxxxxx * 3
</span><span class="cx" style="display: block; padding: 0 10px">                                                |    [\xF1-\xF3][\x80-\xBF]{3}
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2612,6 +2630,11 @@
</span><span class="cx" style="display: block; padding: 0 10px">                                        | .                                  # anything else
</span><span class="cx" style="display: block; padding: 0 10px">                                        /x';
</span><span class="cx" style="display: block; padding: 0 10px">                                $value['value'] = preg_replace( $regex, '$1', $value['value'] );
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+
+
+                               if ( false !== $length && mb_strlen( $value['value'], 'UTF-8' ) > $length ) {
+                                       $value['value'] = mb_substr( $value['value'], 0, $length, 'UTF-8' );
+                               }
</ins><span class="cx" style="display: block; padding: 0 10px">                                 continue;
</span><span class="cx" style="display: block; padding: 0 10px">                        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2628,8 +2651,14 @@
</span><span class="cx" style="display: block; padding: 0 10px">                                                $queries[ $value['charset'] ] = array();
</span><span class="cx" style="display: block; padding: 0 10px">                                        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                        // Split the CONVERT() calls by charset, so we can make sure the connection is right
-                                       $queries[ $value['charset'] ][ $col ] = $this->prepare( "CONVERT( %s USING {$value['charset']} )", $value['value'] );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                                 // We're going to need to truncate by characters or bytes, depending on the length value we have.
+                                       if ( 'byte' === $value['length']['type'] ) {
+                                               // Split the CONVERT() calls by charset, so we can make sure the connection is right
+                                               $queries[ $value['charset'] ][ $col ] = $this->prepare( "CONVERT( LEFT( CONVERT( %s USING binary ), %d ) USING {$value['charset']} )", $value['value'], $value['length']['length'] );
+                                       } else {
+                                               $queries[ $value['charset'] ][ $col ] = $this->prepare( "LEFT( CONVERT( %s USING {$value['charset']} ), %d )", $value['value'], $value['length']['length'] );
+                                       }
+
</ins><span class="cx" style="display: block; padding: 0 10px">                                         unset( $data[ $col ]['db'] );
</span><span class="cx" style="display: block; padding: 0 10px">                                }
</span><span class="cx" style="display: block; padding: 0 10px">                        }
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2648,16 +2677,19 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                                $this->check_current_query = false;
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                $row = $this->get_row( "SELECT " . implode( ', ', $query ), ARRAY_N );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         $sql = array();
+                               foreach ( $query as $column => $column_query ) {
+                                       $sql[] = $column_query . " AS x_$column";
+                               }
+
+                               $row = $this->get_row( "SELECT " . implode( ', ', $sql ), ARRAY_A );
</ins><span class="cx" style="display: block; padding: 0 10px">                                 if ( ! $row ) {
</span><span class="cx" style="display: block; padding: 0 10px">                                        $this->set_charset( $this->dbh, $connection_charset );
</span><span class="cx" style="display: block; padding: 0 10px">                                        return new WP_Error( 'wpdb_strip_invalid_text_failure' );
</span><span class="cx" style="display: block; padding: 0 10px">                                }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                $cols = array_keys( $query );
-                               $col_count = count( $cols );
-                               for ( $ii = 0; $ii < $col_count; $ii++ ) {
-                                       $data[ $cols[ $ii ] ]['value'] = $row[ $ii ];
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         foreach ( array_keys( $query ) as $column ) {
+                                       $data[ $column ]['value'] = $row["x_$column"];
</ins><span class="cx" style="display: block; padding: 0 10px">                                 }
</span><span class="cx" style="display: block; padding: 0 10px">                        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2699,6 +2731,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">                        'value'   => $query,
</span><span class="cx" style="display: block; padding: 0 10px">                        'charset' => $charset,
</span><span class="cx" style="display: block; padding: 0 10px">                        'ascii'   => false,
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        'length'  => false,
</ins><span class="cx" style="display: block; padding: 0 10px">                 );
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                $data = $this->strip_invalid_text( array( $data ) );
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2721,7 +2754,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">         * @return string|WP_Error The converted string, or a `WP_Error` object if the conversion fails.
</span><span class="cx" style="display: block; padding: 0 10px">         */
</span><span class="cx" style="display: block; padding: 0 10px">        public function strip_invalid_text_for_column( $table, $column, $value ) {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                if ( ! is_string( $value ) || $this->check_ascii( $value ) ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         if ( ! is_string( $value ) ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                         return $value;
</span><span class="cx" style="display: block; padding: 0 10px">                }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2738,7 +2771,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">                        $column => array(
</span><span class="cx" style="display: block; padding: 0 10px">                                'value'   => $value,
</span><span class="cx" style="display: block; padding: 0 10px">                                'charset' => $charset,
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                'ascii'   => false,
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         'length'  => $this->get_col_length( $table, $column ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         )
</span><span class="cx" style="display: block; padding: 0 10px">                );
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span></span></pre></div>
<a id="branches40testsphpunittestscommentphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: branches/4.0/tests/phpunit/tests/comment.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- branches/4.0/tests/phpunit/tests/comment.php      2015-05-06 19:06:02 UTC (rev 32387)
+++ branches/4.0/tests/phpunit/tests/comment.php        2015-05-06 19:08:42 UTC (rev 32388)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -23,7 +23,8 @@
</span><span class="cx" style="display: block; padding: 0 10px">                        $_SERVER['REMOTE_ADDR'] = '';
</span><span class="cx" style="display: block; padding: 0 10px">                }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                $post_id = $this->factory->post->create();
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         $u = $this->factory->user->create();
+               $post_id = $this->factory->post->create( array( 'post_author' => $u ) );
</ins><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                $data = array(
</span><span class="cx" style="display: block; padding: 0 10px">                        'comment_post_ID' => $post_id,
</span></span></pre></div>
<a id="branches40testsphpunittestscompatphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: branches/4.0/tests/phpunit/tests/compat.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- branches/4.0/tests/phpunit/tests/compat.php       2015-05-06 19:06:02 UTC (rev 32387)
+++ branches/4.0/tests/phpunit/tests/compat.php 2015-05-06 19:08:42 UTC (rev 32388)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2,15 +2,168 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><span class="cx" style="display: block; padding: 0 10px">  * @group compat
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * @group security-153
</ins><span class="cx" style="display: block; padding: 0 10px">  */
</span><span class="cx" style="display: block; padding: 0 10px"> class Tests_Compat extends WP_UnitTestCase {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-        function test_mb_substr() {
-               $this->assertEquals('баб', _mb_substr('баба', 0, 3));
-               $this->assertEquals('баб', _mb_substr('баба', 0, -1));
-               $this->assertEquals('баб', _mb_substr('баба', 0, -1));
-               $this->assertEquals('I am your б', _mb_substr('I am your баба', 0, 11));
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ function utf8_string_lengths() {
+               return array(
+                       //                     string, character_length, byte_length
+                       array(                 'баба',                4,           8 ),
+                       array(                  'баб',                3,           6 ),
+                       array(          'I am your б',               11,          12 ),
+                       array(           '1111111111',               10,          10 ),
+                       array(           '²²²²²²²²²²',               10,          20 ),
+                       array( '３３３３３３３３３３',               10,          30 ),
+                       array(           '𝟜𝟜𝟜𝟜𝟜𝟜𝟜𝟜𝟜𝟜',               10,          40 ),
+                       array(      '1²３𝟜1²３𝟜1²３𝟜',               12,          30 ),
+               );
</ins><span class="cx" style="display: block; padding: 0 10px">         }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+        function utf8_substrings() {
+               return array(
+                       //               string, start, length, character_substring,   byte_substring
+                       array(           'баба',     0,      3,               'баб',          "б\xD0" ),
+                       array(           'баба',     0,     -1,               'баб',        "баб\xD0" ),
+                       array(           'баба',     1,   null,               'аба',        "\xB1аба" ),
+                       array(           'баба',    -3,   null,               'аба',          "\xB1а" ),
+                       array(           'баба',    -3,      2,                'аб',       "\xB1\xD0" ),
+                       array(           'баба',    -1,      2,                 'а',           "\xB0" ),
+                       array( 'I am your баба',     0,     11,       'I am your б', "I am your \xD0" ),
+               );
+       }
+
+       /**
+        * @dataProvider utf8_string_lengths
+        */
+       function test_mb_strlen( $string, $expected_character_length ) {
+               $this->assertEquals( $expected_character_length, _mb_strlen( $string, 'UTF-8' ) );
+       }
+
+       /**
+        * @dataProvider utf8_string_lengths
+        */
+       function test_mb_strlen_via_regex( $string, $expected_character_length ) {
+               _wp_can_use_pcre_u( false );
+               $this->assertEquals( $expected_character_length, _mb_strlen( $string, 'UTF-8' ) );
+               _wp_can_use_pcre_u( 'reset' );
+       }
+
+       /**
+        * @dataProvider utf8_string_lengths
+        */
+       function test_8bit_mb_strlen( $string, $expected_character_length, $expected_byte_length ) {
+               $this->assertEquals( $expected_byte_length, _mb_strlen( $string, '8bit' ) );
+       }
+
+       /**
+        * @dataProvider utf8_substrings
+        */
+       function test_mb_substr( $string, $start, $length, $expected_character_substring ) {
+               $this->assertEquals( $expected_character_substring, _mb_substr( $string, $start, $length, 'UTF-8' ) );
+       }
+
+       /**
+        * @dataProvider utf8_substrings
+        */
+       function test_mb_substr_via_regex( $string, $start, $length, $expected_character_substring ) {
+               _wp_can_use_pcre_u( false );
+               $this->assertEquals( $expected_character_substring, _mb_substr( $string, $start, $length, 'UTF-8' ) );
+               _wp_can_use_pcre_u( 'reset' );
+       }
+
+       /**
+        * @dataProvider utf8_substrings
+        */
+       function test_8bit_mb_substr( $string, $start, $length, $expected_character_substring, $expected_byte_substring ) {
+               $this->assertEquals( $expected_byte_substring, _mb_substr( $string, $start, $length, '8bit' ) );
+       }
+
+       function test_mb_substr_phpcore(){
+               /* https://github.com/php/php-src/blob/php-5.6.8/ext/mbstring/tests/mb_substr_basic.phpt */
+               $string_ascii = 'ABCDEF';
+               $string_mb = base64_decode('5pel5pys6Kqe44OG44Kt44K544OI44Gn44GZ44CCMDEyMzTvvJXvvJbvvJfvvJjvvJnjgII=');
+
+               $this->assertEquals( 'DEF', _mb_substr($string_ascii, 3) );
+               $this->assertEquals( 'DEF', _mb_substr($string_ascii, 3, 5, 'ISO-8859-1') );
+
+               // Specify latin-1, as that is the default the core PHP test operates under.
+               $this->assertEquals( 'peacrOiqng==' , base64_encode( _mb_substr($string_mb, 2, 7, 'latin-1' ) ) );
+               $this->assertEquals( '6Kqe44OG44Kt44K544OI44Gn44GZ', base64_encode( _mb_substr($string_mb, 2, 7, 'utf-8') ) );
+
+               /* https://github.com/php/php-src/blob/php-5.6.8/ext/mbstring/tests/mb_substr_variation1.phpt */
+               $start = 0;
+               $length = 5;
+               $unset_var = 10;
+               unset ($unset_var);
+               $heredoc = <<<EOT
+hello world
+EOT;
+               $inputs = array( 
+               /*1*/  0,
+                          1,
+                          12345,
+                          -2345,
+                          // float data
+               /*5*/  10.5,
+                          -10.5,
+                          12.3456789000e10,
+                          12.3456789000E-10,
+                          .5,
+                          // null data
+               /*10*/ NULL,
+                          null,
+                          // boolean data
+               /*12*/ true,
+                          false,
+                          TRUE,
+                          FALSE,
+                          // empty data
+               /*16*/ "",
+                          '',
+                          // string data
+               /*18*/ "string",
+                          'string',
+                          $heredoc,
+                          // object data
+               /*21*/ new classA(),
+                          // undefined data
+               /*22*/ @$undefined_var,
+                          // unset data
+               /*23*/ @$unset_var,
+               );
+               $outputs = array(
+                       "0",
+                       "1",
+                       "12345",
+                       "-2345",
+                       "10.5",
+                       "-10.5",
+                       "12345",
+                       "1.234",
+                       "0.5",
+                       "",
+                       "",
+                       "1",
+                       "",
+                       "1",
+                       "",
+                       "",
+                       "",
+                       "strin",
+                       "strin",
+                       "hello",
+                       "Class",
+                       "",
+                       "",
+               );
+               $iterator = 0;
+               foreach($inputs as $input) {
+                       $this->assertEquals( $outputs[$iterator] ,  _mb_substr($input, $start, $length) );
+                       $iterator++;
+               }
+
+       }
+
</ins><span class="cx" style="display: block; padding: 0 10px">         function test_hash_hmac_simple() {
</span><span class="cx" style="display: block; padding: 0 10px">                $this->assertEquals('140d1cb79fa12e2a31f32d35ad0a2723', _hash_hmac('md5', 'simple', 'key'));
</span><span class="cx" style="display: block; padding: 0 10px">                $this->assertEquals('993003b95758e0ac2eba451a4c5877eb1bb7b92a', _hash_hmac('sha1', 'simple', 'key'));
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -34,3 +187,10 @@
</span><span class="cx" style="display: block; padding: 0 10px">                $this->assertEquals( array( 'foo' ), $json->decode( '["foo"]' ) );
</span><span class="cx" style="display: block; padding: 0 10px">        }
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+
+/* used in test_mb_substr_phpcore */ 
+class classA {
+       public function __toString() {
+               return "Class A object";
+       }
+}
</ins></span></pre></div>
<a id="branches40testsphpunittestsdbcharsetphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: branches/4.0/tests/phpunit/tests/db/charset.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- branches/4.0/tests/phpunit/tests/db/charset.php   2015-05-06 19:06:02 UTC (rev 32387)
+++ branches/4.0/tests/phpunit/tests/db/charset.php     2015-05-06 19:08:42 UTC (rev 32388)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -6,6 +6,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">  * Test WPDB methods
</span><span class="cx" style="display: block; padding: 0 10px">  *
</span><span class="cx" style="display: block; padding: 0 10px">  * @group wpdb
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * @group security-153
</ins><span class="cx" style="display: block; padding: 0 10px">  */
</span><span class="cx" style="display: block; padding: 0 10px"> class Tests_DB_Charset extends WP_UnitTestCase {
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -28,57 +29,227 @@
</span><span class="cx" style="display: block; padding: 0 10px">                                // latin1. latin1 never changes.
</span><span class="cx" style="display: block; padding: 0 10px">                                'charset'  => 'latin1',
</span><span class="cx" style="display: block; padding: 0 10px">                                'value'    => "\xf0\x9f\x8e\xb7",
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                'expected' => "\xf0\x9f\x8e\xb7"
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         'expected' => "\xf0\x9f\x8e\xb7",
+                               'length'   => array( 'type' => 'char', 'length' => 100 ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         ),
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        'latin1_char_length' => array(
+                               // latin1. latin1 never changes.
+                               'charset'  => 'latin1',
+                               'value'    => str_repeat( 'A', 11 ),
+                               'expected' => str_repeat( 'A', 10 ),
+                               'length'   => array( 'type' => 'char', 'length' => 10 ),
+                       ),
+                       'latin1_byte_length' => array(
+                               // latin1. latin1 never changes.
+                               'charset'  => 'latin1',
+                               'value'    => str_repeat( 'A', 11 ),
+                               'expected' => str_repeat( 'A', 10 ),
+                               'length'   => array( 'type' => 'byte', 'length' => 10 ),
+                       ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         'ascii' => array(
</span><span class="cx" style="display: block; padding: 0 10px">                                // ascii gets special treatment, make sure it's covered
</span><span class="cx" style="display: block; padding: 0 10px">                                'charset'  => 'ascii',
</span><span class="cx" style="display: block; padding: 0 10px">                                'value'    => 'Hello World',
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                'expected' => 'Hello World'
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         'expected' => 'Hello World',
+                               'length'   => array( 'type' => 'char', 'length' => 100 ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         ),
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        'ascii_char_length' => array(
+                               // ascii gets special treatment, make sure it's covered
+                               'charset'  => 'ascii',
+                               'value'    => str_repeat( 'A', 11 ),
+                               'expected' => str_repeat( 'A', 10 ),
+                               'length'   => array( 'type' => 'char', 'length' => 10 ),
+                       ),
+                       'ascii_byte_length' => array(
+                               // ascii gets special treatment, make sure it's covered
+                               'charset'  => 'ascii',
+                               'value'    => str_repeat( 'A', 11 ),
+                               'expected' => str_repeat( 'A', 10 ),
+                               'length'   => array( 'type' => 'byte', 'length' => 10 ),
+                       ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         'utf8' => array(
</span><span class="cx" style="display: block; padding: 0 10px">                                // utf8 only allows <= 3-byte chars
</span><span class="cx" style="display: block; padding: 0 10px">                                'charset'  => 'utf8',
</span><span class="cx" style="display: block; padding: 0 10px">                                'value'    => "H€llo\xf0\x9f\x98\x88World¢",
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                'expected' => 'H€lloWorld¢'
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         'expected' => 'H€lloWorld¢',
+                               'length'   => array( 'type' => 'char', 'length' => 100 ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         ),
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        'utf8_23char_length' => array(
+                               // utf8 only allows <= 3-byte chars
+                               'charset'  => 'utf8',
+                               'value'    => str_repeat( "²３", 10 ),
+                               'expected' => str_repeat( "²３", 5 ),
+                               'length'   => array( 'type' => 'char', 'length' => 10 ),
+                       ),
+                       'utf8_23byte_length' => array(
+                               // utf8 only allows <= 3-byte chars
+                               'charset'  => 'utf8',
+                               'value'    => str_repeat( "²3", 10 ),
+                               'expected' => "²３²３",
+                               'length'   => array( 'type' => 'byte', 'length' => 10 ),
+                       ),
+                       'utf8_3char_length' => array(
+                               // utf8 only allows <= 3-byte chars
+                               'charset'  => 'utf8',
+                               'value'    => str_repeat( "３", 11 ),
+                               'expected' => str_repeat( "３", 10 ),
+                               'length'   => array( 'type' => 'char', 'length' => 10 ),
+                       ),
+                       'utf8_3byte_length' => array(
+                               // utf8 only allows <= 3-byte chars
+                               'charset'  => 'utf8',
+                               'value'    => str_repeat( "3", 11 ),
+                               'expected' => "３３３",
+                               'length'   => array( 'type' => 'byte', 'length' => 10 ),
+                       ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         'utf8mb3' => array(
</span><span class="cx" style="display: block; padding: 0 10px">                                // utf8mb3 should behave the same an utf8
</span><span class="cx" style="display: block; padding: 0 10px">                                'charset'  => 'utf8mb3',
</span><span class="cx" style="display: block; padding: 0 10px">                                'value'    => "H€llo\xf0\x9f\x98\x88World¢",
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                'expected' => 'H€lloWorld¢'
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         'expected' => 'H€lloWorld¢',
+                               'length'   => array( 'type' => 'char', 'length' => 100 ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         ),
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        'utf8mb3_23char_length' => array(
+                               // utf8mb3 should behave the same an utf8
+                               'charset'  => 'utf8mb3',
+                               'value'    => str_repeat( "²３", 10 ),
+                               'expected' => str_repeat( "²３", 5 ),
+                               'length'   => array( 'type' => 'char', 'length' => 10 ),
+                       ),
+                       'utf8mb3_23byte_length' => array(
+                               // utf8mb3 should behave the same an utf8
+                               'charset'  => 'utf8mb3',
+                               'value'    => str_repeat( "²3", 10 ),
+                               'expected' => "²３²３",
+                               'length'   => array( 'type' => 'byte', 'length' => 10 ),
+                       ),
+                       'utf8mb3_3char_length' => array(
+                               // utf8mb3 should behave the same an utf8
+                               'charset'  => 'utf8mb3',
+                               'value'    => str_repeat( "３", 11 ),
+                               'expected' => str_repeat( "３", 10 ),
+                               'length'   => array( 'type' => 'char', 'length' => 10 ),
+                       ),
+                       'utf8mb3_3byte_length' => array(
+                               // utf8mb3 should behave the same an utf8
+                               'charset'  => 'utf8mb3',
+                               'value'    => str_repeat( "３", 10 ),
+                               'expected' => "３３３",
+                               'length'   => array( 'type' => 'byte', 'length' => 10 ),
+                       ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         'utf8mb4' => array(
</span><span class="cx" style="display: block; padding: 0 10px">                                // utf8mb4 allows 4-byte characters, too
</span><span class="cx" style="display: block; padding: 0 10px">                                'charset'  => 'utf8mb4',
</span><span class="cx" style="display: block; padding: 0 10px">                                'value'    => "H€llo\xf0\x9f\x98\x88World¢",
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                'expected' => "H€llo\xf0\x9f\x98\x88World¢"
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         'expected' => "H€llo\xf0\x9f\x98\x88World¢",
+                               'length'   => array( 'type' => 'char', 'length' => 100 ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         ),
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        'utf8mb4_234char_length' => array(
+                               // utf8mb4 allows 4-byte characters, too
+                               'charset'  => 'utf8mb4',
+                               'value'    => str_repeat( "²３𝟜", 10 ),
+                               'expected' => "²３𝟜²３𝟜²３𝟜²",
+                               'length'   => array( 'type' => 'char', 'length' => 10 ),
+                       ),
+                       'utf8mb4_234byte_length' => array(
+                               // utf8mb4 allows 4-byte characters, too
+                               'charset'  => 'utf8mb4',
+                               'value'    => str_repeat( "²３𝟜", 10 ),
+                               'expected' => "²３𝟜",
+                               'length'   => array( 'type' => 'byte', 'length' => 10 ),
+                       ),
+                       'utf8mb4_4char_length' => array(
+                               // utf8mb4 allows 4-byte characters, too
+                               'charset'  => 'utf8mb4',
+                               'value'    => str_repeat( "𝟜", 11 ),
+                               'expected' => str_repeat( "𝟜", 10 ),
+                               'length'   => array( 'type' => 'char', 'length' => 10 ),
+                       ),
+                       'utf8mb4_4byte_length' => array(
+                               // utf8mb4 allows 4-byte characters, too
+                               'charset'  => 'utf8mb4',
+                               'value'    => str_repeat( "𝟜", 10 ),
+                               'expected' => "𝟜𝟜",
+                               'length'   => array( 'type' => 'byte', 'length' => 10 ),
+                       ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         'koi8r' => array(
</span><span class="cx" style="display: block; padding: 0 10px">                                'charset'  => 'koi8r',
</span><span class="cx" style="display: block; padding: 0 10px">                                'value'    => "\xfdord\xf2ress",
</span><span class="cx" style="display: block; padding: 0 10px">                                'expected' => "\xfdord\xf2ress",
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                                'length'   => array( 'type' => 'char', 'length' => 100 ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         ),
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        'koi8r_char_length' => array(
+                               'charset'  => 'koi8r',
+                               'value'    => str_repeat( "\xfd\xf2", 10 ),
+                               'expected' => str_repeat( "\xfd\xf2", 5 ),
+                               'length'   => array( 'type' => 'char', 'length' => 10 ),
+                       ),
+                       'koi8r_byte_length' => array(
+                               'charset'  => 'koi8r',
+                               'value'    => str_repeat( "\xfd\xf2", 10 ),
+                               'expected' => str_repeat( "\xfd\xf2", 5 ),
+                               'length'   => array( 'type' => 'byte', 'length' => 10 ),
+                       ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         'hebrew' => array(
</span><span class="cx" style="display: block; padding: 0 10px">                                'charset'  => 'hebrew',
</span><span class="cx" style="display: block; padding: 0 10px">                                'value'    => "\xf9ord\xf7ress",
</span><span class="cx" style="display: block; padding: 0 10px">                                'expected' => "\xf9ord\xf7ress",
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                                'length'   => array( 'type' => 'char', 'length' => 100 ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         ),
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        'hebrew_char_length' => array(
+                               'charset'  => 'hebrew',
+                               'value'    => str_repeat( "\xf9\xf7", 10 ),
+                               'expected' => str_repeat( "\xf9\xf7", 5 ),
+                               'length'   => array( 'type' => 'char', 'length' => 10 ),
+                       ),
+                       'hebrew_byte_length' => array(
+                               'charset'  => 'hebrew',
+                               'value'    => str_repeat( "\xf9\xf7", 10 ),
+                               'expected' => str_repeat( "\xf9\xf7", 5 ),
+                               'length'   => array( 'type' => 'byte', 'length' => 10 ),
+                       ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         'cp1251' => array(
</span><span class="cx" style="display: block; padding: 0 10px">                                'charset'  => 'cp1251',
</span><span class="cx" style="display: block; padding: 0 10px">                                'value'    => "\xd8ord\xd0ress",
</span><span class="cx" style="display: block; padding: 0 10px">                                'expected' => "\xd8ord\xd0ress",
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                                'length'   => array( 'type' => 'char', 'length' => 100 ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         ),
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        'cp1251_char_length' => array(
+                               'charset'  => 'cp1251',
+                               'value'    => str_repeat( "\xd8\xd0", 10 ),
+                               'expected' => str_repeat( "\xd8\xd0", 5 ),
+                               'length'   => array( 'type' => 'char', 'length' => 10 ),
+                       ),
+                       'cp1251_byte_length' => array(
+                               'charset'  => 'cp1251',
+                               'value'    => str_repeat( "\xd8\xd0", 10 ),
+                               'expected' => str_repeat( "\xd8\xd0", 5 ),
+                               'length'   => array( 'type' => 'byte', 'length' => 10 ),
+                       ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         'tis620' => array(
</span><span class="cx" style="display: block; padding: 0 10px">                                'charset'  => 'tis620',
</span><span class="cx" style="display: block; padding: 0 10px">                                'value'    => "\xccord\xe3ress",
</span><span class="cx" style="display: block; padding: 0 10px">                                'expected' => "\xccord\xe3ress",
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                                'length'   => array( 'type' => 'char', 'length' => 100 ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         ),
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        'tis620_char_length' => array(
+                               'charset'  => 'tis620',
+                               'value'    => str_repeat( "\xcc\xe3", 10 ),
+                               'expected' => str_repeat( "\xcc\xe3", 5 ),
+                               'length'   => array( 'type' => 'char', 'length' => 10 ),
+                       ),
+                       'tis620_byte_length' => array(
+                               'charset'  => 'tis620',
+                               'value'    => str_repeat( "\xcc\xe3", 10 ),
+                               'expected' => str_repeat( "\xcc\xe3", 5 ),
+                               'length'   => array( 'type' => 'byte', 'length' => 10 ),
+                       ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         'false' => array(
</span><span class="cx" style="display: block; padding: 0 10px">                                // false is a column with no character set (ie, a number column)
</span><span class="cx" style="display: block; padding: 0 10px">                                'charset'  => false,
</span><span class="cx" style="display: block; padding: 0 10px">                                'value'    => 100,
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                'expected' => 100
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         'expected' => 100,
+                               'length'   => false,
</ins><span class="cx" style="display: block; padding: 0 10px">                         ),
</span><span class="cx" style="display: block; padding: 0 10px">                );
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -94,8 +265,23 @@
</span><span class="cx" style="display: block; padding: 0 10px">                        $fields['big5'] = array(
</span><span class="cx" style="display: block; padding: 0 10px">                                'charset'  => 'big5',
</span><span class="cx" style="display: block; padding: 0 10px">                                'value'    => $big5,
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                'expected' => $big5
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         'expected' => $big5,
+                               'length'   => array( 'type' => 'char', 'length' => 100 ),
</ins><span class="cx" style="display: block; padding: 0 10px">                         );
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+
+                       $fields['big5_char_length'] = array(
+                               'charset'  => 'big5',
+                               'value'    => str_repeat( $big5, 10 ),
+                               'expected' => str_repeat( $big5, 3 ) . 'a',
+                               'length'   => array( 'type' => 'char', 'length' => 10 ),
+                       );
+
+                       $fields['big5_byte_length'] = array(
+                               'charset'  => 'big5',
+                               'value'    => str_repeat( $big5, 10 ),
+                               'expected' => str_repeat( $big5, 2 ) . 'a',
+                               'length'   => array( 'type' => 'byte', 'length' => 10 ),
+                       );
</ins><span class="cx" style="display: block; padding: 0 10px">                 }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                // The data above is easy to edit. Now, prepare it for the data provider.
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -166,14 +352,14 @@
</span><span class="cx" style="display: block; padding: 0 10px">                );
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                $all_ascii_fields = array(
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        'post_content' => array( 'value' => 'foo foo foo!', 'format' => '%s', 'charset' => false ),
-                       'post_excerpt' => array( 'value' => 'bar bar bar!', 'format' => '%s', 'charset' => false ),
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 'post_content' => array( 'value' => 'foo foo foo!', 'format' => '%s', 'charset' => $charset ),
+                       'post_excerpt' => array( 'value' => 'bar bar bar!', 'format' => '%s', 'charset' => $charset ),
</ins><span class="cx" style="display: block; padding: 0 10px">                 );
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                // This is the same data used in process_field_charsets_for_nonexistent_table()
</span><span class="cx" style="display: block; padding: 0 10px">                $non_ascii_string_fields = array(
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        'post_content' => array( 'value' => '¡foo foo foo!', 'format' => '%s', 'charset' => $charset, 'ascii' => false ),
-                       'post_excerpt' => array( 'value' => '¡bar bar bar!', 'format' => '%s', 'charset' => $charset, 'ascii' => false ),
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 'post_content' => array( 'value' => '¡foo foo foo!', 'format' => '%s', 'charset' => $charset ),
+                       'post_excerpt' => array( 'value' => '¡bar bar bar!', 'format' => '%s', 'charset' => $charset ),
</ins><span class="cx" style="display: block; padding: 0 10px">                 );
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                $vars = get_defined_vars();
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -540,4 +726,16 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                self::$_wpdb->query( $drop );
</span><span class="cx" style="display: block; padding: 0 10px">        }
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+
+       function test_strip_invalid_test_for_column_bails_if_ascii_input_too_long() {
+               global $wpdb;
+
+               // TEXT column
+               $stripped = $wpdb->strip_invalid_text_for_column( $wpdb->comments, 'comment_content', str_repeat( 'A', 65536 ) );
+               $this->assertEquals( 65535, strlen( $stripped ) );
+
+               // VARCHAR column
+               $stripped = $wpdb->strip_invalid_text_for_column( $wpdb->comments, 'comment_agent', str_repeat( 'A', 256 ) );
+               $this->assertEquals( 255, strlen( $stripped ) );
+       }
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span></span></pre></div>
<a id="branches40testsphpunittestsdbphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: branches/4.0/tests/phpunit/tests/db.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- branches/4.0/tests/phpunit/tests/db.php   2015-05-06 19:06:02 UTC (rev 32387)
+++ branches/4.0/tests/phpunit/tests/db.php     2015-05-06 19:08:42 UTC (rev 32388)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -711,7 +711,6 @@
</span><span class="cx" style="display: block; padding: 0 10px">                                'value' => '¡foo foo foo!',
</span><span class="cx" style="display: block; padding: 0 10px">                                'format' => '%s',
</span><span class="cx" style="display: block; padding: 0 10px">                                'charset' => $expected_charset,
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                'ascii' => false,
</del><span class="cx" style="display: block; padding: 0 10px">                                 'length' => $wpdb->get_col_length( $wpdb->posts, 'post_content' ),
</span><span class="cx" style="display: block; padding: 0 10px">                        )
</span><span class="cx" style="display: block; padding: 0 10px">                );
</span></span></pre>
</div>
</div>

</body>
</html>