[wp-trac] [WordPress Trac] #11738: sanitize_text_field() issue with UTF-8 characters
WordPress Trac
wp-trac at lists.automattic.com
Wed Jan 6 12:59:10 UTC 2010
#11738: sanitize_text_field() issue with UTF-8 characters
--------------------------+-------------------------------------------------
Reporter: hakre | Owner:
Type: defect (bug) | Status: new
Priority: normal | Milestone: 2.9.2
Component: General | Version: 2.9.1
Severity: normal | Keywords: needs-patch
--------------------------+-------------------------------------------------
Comment(by hakre):
Replying to [comment:1 azaozz]:
> PCRE UTF-8 (the "u" modifier) is not supported everywhere. For reference
> see wp_check_invalid_utf8().
> We probably could set a global instead of the static there and use that
> for the three functions since they are usually called multiple times.
''wp_check_invalid_utf8()'' has its own problems, I know. Please see the
PHP documentation that I pointed to in this ticket's description, and that
Denis pointed to in a comment on this ticket, for a clear statement of the
version in which the u-modifier became available. That version matches
WordPress's current system requirements, so that function can benefit from
a refactoring anyway.
Setting a static and/or global does not help, since on each function call
the input might have a different encoding. We have functions that work
independently of PHP extensions, ''seems_utf8()'' for example. In another
patch I offer a safe fallback implementation, ''is_valid_utf8()'', that
does the job in any case, even if the preg functions do not support the
u-modifier. That is something the current code is missing. Please
[http://core.trac.wordpress.org/attachment/ticket/5998/5998.2.patch#L275
see that code, look for is_valid_utf8()]. You can find additional
documentation on [http://codex.wordpress.org/User:Hakre/UTF8 my codex page
regarding utf8 and php].
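
To illustrate the distinction drawn above, here is a minimal sketch, not the
code from the linked patch. It assumes three hypothetical helper names
(example_pcre_utf8_supported(), example_is_valid_utf8_bytes() and
example_is_valid_utf8()), none of which exist in WordPress core. Whether the
u-modifier works is a property of the installation and can be cached in a
static; whether a given string is valid UTF-8 has to be checked on every
call, and the byte-wise fallback does that without any extension.

```php
<?php
// Hypothetical helper names for illustration only; not WordPress core
// and not the is_valid_utf8() from the linked patch.

/**
 * Reports whether this PCRE build accepts the "u" modifier.
 * This depends on the installation, not on the input, so caching is safe.
 */
function example_pcre_utf8_supported() {
	static $supported = null;
	if ( null === $supported ) {
		// Returns 1 when /u works; the @ hides the warning when it does not.
		$supported = ( 1 === @preg_match( '/^./u', 'a' ) );
	}
	return $supported;
}

/**
 * Byte-wise UTF-8 validation following the RFC 3629 byte ranges.
 * Needs neither PCRE nor mbstring.
 */
function example_is_valid_utf8_bytes( $str ) {
	$len = strlen( $str );
	for ( $i = 0; $i < $len; $i++ ) {
		$c = ord( $str[ $i ] );
		if ( $c < 0x80 ) {
			continue; // Plain ASCII byte.
		} elseif ( $c >= 0xC2 && $c <= 0xDF ) {
			$need = 1;
			$min  = 0x80;
			$max  = 0xBF;
		} elseif ( $c >= 0xE0 && $c <= 0xEF ) {
			$need = 2;
			$min  = ( 0xE0 === $c ) ? 0xA0 : 0x80; // Rejects overlong forms.
			$max  = ( 0xED === $c ) ? 0x9F : 0xBF; // Rejects surrogates.
		} elseif ( $c >= 0xF0 && $c <= 0xF4 ) {
			$need = 3;
			$min  = ( 0xF0 === $c ) ? 0x90 : 0x80; // Rejects overlong forms.
			$max  = ( 0xF4 === $c ) ? 0x8F : 0xBF; // Rejects > U+10FFFF.
		} else {
			return false; // 0x80-0xC1 and 0xF5-0xFF never start a sequence.
		}
		if ( $i + $need >= $len ) {
			return false; // Sequence is truncated.
		}
		// The first continuation byte has the restricted range.
		$b = ord( $str[ $i + 1 ] );
		if ( $b < $min || $b > $max ) {
			return false;
		}
		// Remaining continuation bytes must be 0x80-0xBF.
		for ( $k = 2; $k <= $need; $k++ ) {
			$b = ord( $str[ $i + $k ] );
			if ( $b < 0x80 || $b > 0xBF ) {
				return false;
			}
		}
		$i += $need; // Skip the continuation bytes just checked.
	}
	return true;
}

/**
 * Checks the input on every call; only the capability probe is cached.
 */
function example_is_valid_utf8( $str ) {
	if ( example_pcre_utf8_supported() ) {
		// An empty pattern with /u fails to match only on invalid UTF-8.
		return 1 === preg_match( '//u', $str );
	}
	return example_is_valid_utf8_bytes( $str );
}
```

With this split, a caller can pass strings of varying encodings to
example_is_valid_utf8() and still get a per-input answer, whether or not the
preg functions support the u-modifier.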
>
> In any case this will need testing in an affected locale/installation.
AFAIK the current implementation fails with shift-spaces. The other
ticket has the test case this function needs to cope with: those
Russian letters in UTF-8. Prior to the commit of the last patch, that was
the only thing "tested" against. There was no further review of the patch
and no further testing.
--
Ticket URL: <http://core.trac.wordpress.org/ticket/11738#comment:5>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software