HTML Optimizer corrupts output
I just encountered an issue (via support topics) where Aruba HiSpeed Cache corrupts a site's markup when the HTML Optimizer feature is enabled. For example, consider the following block markup in the editor:
<!-- wp:paragraph -->
<p>This is an XPath: <code>/HTML/BODY/*[1][self::DIV]</code></p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p>This is a Script:</p>
<!-- /wp:paragraph -->
<!-- wp:code -->
<pre class="wp-block-code"><code><script>
/* example script */
</script></code></pre>
<!-- /wp:code -->
<!-- wp:html -->
<p data-od-xpath="/HTML/BODY//*[4][self::P]">This has an XPath in a data attribute.</p>
<!-- /wp:html -->
<!-- wp:code -->
<pre class="wp-block-code"><code><style>
/* example style */
</style></code></pre>
<!-- /wp:code -->
Without the HTML Optimizer enabled, this is rendered on the frontend as:
<p>This is an XPath: <code>/HTML/BODY/*[1][self::DIV]</code></p>
<p>This is a Script:</p>
<pre class="wp-block-code"><code><script>
/* example script */
</script></code></pre>
<p data-od-xpath="/HTML/BODY//*[4][self::P]">This has an XPath in a data attribute.</p>
<pre class="wp-block-code"><code><style>
/* example style */
</style></code></pre>
However, when the HTML Optimizer is enabled, this is the resulting output:
<p>This is an XPath: <code>/HTML/BODY </script></code></pre> <p data-od-xpath="/HTML/BODY/ </style></code></pre>
In the following markup, everything from each /* through the next */ is getting erroneously stripped out:
<p>This is an XPath: <code>/HTML/BODY/*[1][self::DIV]</code></p>
<p>This is a Script:</p>
<pre class="wp-block-code"><code><script>
/* example script */
</script></code></pre>
<p data-od-xpath="/HTML/BODY//*[4][self::P]">This has an XPath in a data attribute.</p>
<pre class="wp-block-code"><code><style>
/* example style */
</style></code></pre>
The problematic code is this line:
$buffer = preg_replace( '@\/\*(.*?)\*\/@s', ' ', $buffer); //remove comment
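To make the failure concrete, here is a minimal standalone reproduction of that pattern against the first part of the frontend markup above. The $optimized variable and the MARKUP nowdoc label are names chosen for this sketch; only the preg_replace() call is taken from the plugin.

```php
<?php
// First part of the frontend markup from the example above.
$buffer = <<<'MARKUP'
<p>This is an XPath: <code>/HTML/BODY/*[1][self::DIV]</code></p>
<p>This is a Script:</p>
<pre class="wp-block-code"><code><script>
/* example script */
</script></code></pre>
MARKUP;

// Same pattern as the plugin: replace anything between "/*" and the next "*/"
// with a space. The "s" flag lets "." match newlines, so the match can span
// several elements.
$optimized = preg_replace( '@\/\*(.*?)\*\/@s', ' ', $buffer );

echo $optimized, PHP_EOL;
```

The match starts at the /* inside the XPath and runs lazily to the */ that closes the script comment, so two paragraphs and the opening of the code block are replaced by a single space. The pattern has no way to tell a CSS or JS comment from any other text that happens to contain /*.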
Using regular expressions to manipulate HTML is highly dangerous. I strongly recommend switching to something more robust, such as the HTML API, which is available as of WordPress 6.2. In particular, WordPress 6.7 introduced the ability to safely manipulate the text nodes in tags via the ::set_modifiable_text() method.
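As a rough sketch of what that could look like, the function below limits comment removal to the contents of STYLE and SCRIPT elements, so an XPath in a paragraph or an attribute is never touched. It assumes WordPress 6.7 or later is loaded (so WP_HTML_Tag_Processor and its set_modifiable_text() method are available); the function name hypothetical_strip_css_js_comments() is made up for illustration and is not part of the plugin or of WordPress.

```php
<?php
/**
 * Hypothetical helper: strips block comments only inside STYLE and SCRIPT elements.
 * Assumes WordPress 6.7+ is loaded so that WP_HTML_Tag_Processor exists.
 */
function hypothetical_strip_css_js_comments( string $html ): string {
	$processor = new WP_HTML_Tag_Processor( $html );

	// Visit every opening tag; the processor handles the HTML parsing.
	while ( $processor->next_tag() ) {
		$tag = $processor->get_tag();
		if ( 'STYLE' !== $tag && 'SCRIPT' !== $tag ) {
			continue;
		}

		// Raw text content of the STYLE or SCRIPT element.
		$text = $processor->get_modifiable_text();

		// Still a naive pattern (it would also match "/*" inside a JS string
		// literal), but it can no longer reach outside this one element.
		$stripped = preg_replace( '@/\*.*?\*/@s', ' ', $text );

		if ( null !== $stripped && $stripped !== $text ) {
			$processor->set_modifiable_text( $stripped );
		}
	}

	return $processor->get_updated_html();
}
```

In the example markup above, the script and style samples inside the code blocks are ordinary text within CODE elements rather than real SCRIPT or STYLE elements, so a pass like this would leave the whole document unchanged. A proper CSS or JS minifier applied to the modifiable text would be safer still than the regular expression kept here.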