[wp-trac] [WordPress Trac] #58637: HTML API: Fatal error processing document with unclosed attribute
WordPress Trac
noreply at wordpress.org
Tue Jun 27 02:27:27 UTC 2023
#58637: HTML API: Fatal error processing document with unclosed attribute
--------------------------+-----------------------------
Reporter: dlh | Owner: (none)
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: HTML API | Version: 6.2
Severity: normal | Keywords:
Focuses: |
--------------------------+-----------------------------
The HTML tag processor triggers a fatal error (in PHP 8+) when attempting
to process an HTML string that is malformed because it ends in an unclosed
attribute.
To replicate:
{{{
$html = '<iframe width="640" height="400"
src="https://www.example.com/embed/abcdef';
$proc = new \WP_HTML_Tag_Processor( $html );
$proc->next_tag( 'iframe' );
}}}
Leads to:
`ValueError: strpos(): Argument #3 ($offset) must be contained in argument
#1 ($haystack)`
I've added a test case in the linked Pull Request. I think I can see that
the error occurs because `WP_HTML_Tag_Processor::parse_next_attribute()`
sets `$bytes_already_parsed` to one byte after the end of the document,
representing the missing closing quote of the attribute. But I'm less sure
about where in the processor a fix for the problem might go, so I've left
that open for comment for now.
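The out-of-range offset can be reproduced in plain PHP 8+ without loading WordPress. This is only a sketch of the mechanism described above, not the processor's actual code: the `$offset` variable stands in for `$bytes_already_parsed`, and searching for `'>'` is an assumption about what the processor looks for next.
{{{
<?php
// Illustrative document ending in an unclosed attribute value.
$html = '<iframe src="https://www.example.com/embed/abcdef';

// Hypothetical cursor: one byte past the end of the document,
// as if the missing closing quote had been consumed.
$offset = strlen( $html ) + 1;

try {
    strpos( $html, '>', $offset );
    $result = 'no error';
} catch ( \ValueError $e ) {
    // PHP 8+ throws here; PHP 7 only emitted a warning and returned false.
    $result = $e->getMessage();
}
echo $result, "\n";

// A bounds check of this shape avoids the out-of-range call.
$safe = $offset > strlen( $html ) ? false : strpos( $html, '>', $offset );
var_dump( $safe ); // bool(false)
}}}
An offset equal to `strlen( $html )` is still accepted by `strpos()`; only an offset past that point throws.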
I encountered a string like this during a content migration, among other
rows of well-formed HTML. In this scenario, I wouldn't expect the tag
processor to be able to tell me anything about the string, but it would be
helpful to migration scripts like mine for the processor to handle the bad
string gracefully.
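Until the processor handles this case itself, a migration script could guard each row along these lines. This is a stopgap sketch, not a proposed fix: `inspect_html_or_null()` and `$find_iframe` are hypothetical names, and the callback assumes it runs inside WordPress 6.2+ on PHP 8+.
{{{
<?php
/**
 * Hypothetical helper: runs $inspect on $html and returns null
 * instead of letting a ValueError escape to the caller.
 */
function inspect_html_or_null( string $html, callable $inspect ) {
    try {
        return $inspect( $html );
    } catch ( \ValueError $e ) {
        // Treat the row as unprocessable and let the caller skip it.
        return null;
    }
}

// Inside WordPress, the callback could wrap the tag processor.
// The class is only needed when the closure is actually called.
$find_iframe = static function ( string $html ) {
    $proc = new \WP_HTML_Tag_Processor( $html );
    return $proc->next_tag( 'iframe' );
};

// Usage (requires WordPress to be loaded):
// $found = inspect_html_or_null( $row_html, $find_iframe );
}}}
A `null` result here means the row could not be processed at all, which is distinct from `false`, meaning no matching tag was found.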
--
Ticket URL: <https://core.trac.wordpress.org/ticket/58637>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform