[wp-trac] [WordPress Trac] #61545: HTML API: Performance Improvements for 6.7
WordPress Trac
noreply at wordpress.org
Mon Jul 1 21:41:33 UTC 2024
#61545: HTML API: Performance Improvements for 6.7
--------------------------------------+--------------------------
Reporter: dmsnell | Owner: (none)
Type: enhancement | Status: new
Priority: normal | Milestone: 6.7
Component: HTML API | Version: trunk
Severity: normal | Resolution:
Keywords: has-patch has-unit-tests | Focuses: performance
--------------------------------------+--------------------------
Description changed by dmsnell:
Old description:
> The HTML API is already efficient, but it can (possibly) be better.
>
> There are multiple ways to potentially improve the performance of the Tag
> Processor, HTML Processor, and HTML Decoder. This ticket is a tracking
> ticket for those experiments and changes.
New description:
The HTML API is already efficient, but it can (possibly) be better.
There are multiple ways to potentially improve the performance of the Tag
Processor, HTML Processor, and HTML Decoder. This ticket is a tracking
ticket for those experiments and changes.
== Experiments
=== Optimize low-level details:
[https://github.com/WordPress/wordpress-develop/pull/6890 #6890]
**Hypothesis**: by auditing various low-level functions and adding a few
micro-optimizations, the HTML Tag Processor will scan faster.
**Testing results**: This change appears to markedly improve scanning
times, but because the patch incorporates several changes, it's unclear
whether any specific one was dominant. Of interest are a few places in the
hot path where a branch was removed.
Token scanning improves by between 3.5% and 7.5% on average. A small tail
of documents is slower, while a long tail is much faster, in some cases
more than 15% faster. It's unclear what exactly drives the performance
behavior, but it's complicated and document-dependent.
**Conclusion**:
- Merge this patch.
- Continue trying to build a model for what directs the
performance behaviors.
=== Replace the attribute associative array with a simple list.
[https://github.com/WordPress/wordpress-develop/pull/5774 #5774]
**Hypothesis**: by replacing the associative array and hash lookup of
parsed attributes with a numeric array, the Tag Processor will perform
less work and run faster. This should be accomplished by skipping the hash
lookup into the associative array of known attributes and by skipping the
comparable lookup for detecting duplicates. In HTML, when an attribute
name is repeated, the first occurrence is the real one, so it's okay to
track a location for every duplicate attribute and simply find the
//first// parsed one as the real attribute.
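The idea in this hypothesis can be sketched in a language-agnostic way.
The following Python is only an illustration and is not the code from the
PR: the function names, the `(name, span)` tuple shape, and the sample
offsets are all assumptions made for the example. It contrasts a parse
step that performs a hash lookup per attribute with one that only appends
to a list and defers the "first one wins" rule to lookup time.

```python
# Illustrative sketch only; not the Tag Processor's implementation.
# All names below are hypothetical.

def parse_attributes_assoc(pairs):
    """Baseline shape: one hash lookup per parsed attribute to detect duplicates."""
    attributes = {}
    duplicates = []
    for name, span in pairs:
        key = name.lower()  # HTML attribute names are ASCII case-insensitive
        if key in attributes:
            duplicates.append(span)  # later duplicates are not the real attribute
        else:
            attributes[key] = span
    return attributes, duplicates


def parse_attributes_list(pairs):
    """Proposed shape: append every attribute; no lookup while scanning."""
    return [(name.lower(), span) for name, span in pairs]


def find_attribute(attribute_list, name):
    """Return the span of the first parsed match, which is the real attribute."""
    wanted = name.lower()
    for attr_name, span in attribute_list:
        if attr_name == wanted:
            return span
    return None


# (name, (start, length)) pairs as a scanner might record them for
# the tag: <div id="a" class="x" ID="b">
parsed = [("id", (5, 6)), ("class", (12, 9)), ("ID", (22, 6))]

assoc, dups = parse_attributes_assoc(parsed)
as_list = parse_attributes_list(parsed)
first_id = find_attribute(as_list, "ID")  # resolves to the first "id"
```

In this sketch the list-based parse does no per-attribute lookup, and the
cost moves to a linear search when an attribute is actually requested.
Whether that trade pays off in the Tag Processor is exactly what the
benchmarks for this experiment would need to show.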
**Testing results**: TBD.
**Conclusion**:
- Need to update the PR to rebase against the current `trunk`.
- Need to run benchmarks against one of the large datasets, like
the `top100` list.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/61545#comment:3>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform