WP_HTML_Tag_Processor: Inject dynamic data to block HTML markup in PHP #42485

adamziel · 2022-07-17T16:34:57Z

Description

Introduces WP_HTML_Tag_Processor: A PHP class to update HTML markup (overview issue).

Dynamic blocks often need to inject a CSS class name or set <img src /> in the rendered block HTML markup but lack the means to do so:

At the moment, every block comes up with a custom-tailored regular expression. It is limiting, error-prone, and demands code duplication. It also leads to PRs such as strip whitespaces in render_block_core_cover.

Here's just a few instances where this would be useful:

Not having to strip HTML whitespaces in the cover block
Removing the link in the site-logo block
Injecting classes in the Cover block, the gallery block, the heading block, all static blocks, Duotone
Having a reliable way of targeting the first tag opener and injecting class names in the Block Supports API.
Potentially in another iteration, injecting the featured image in the cover block.

This PR reuses a lot of the great @dmsnell's explorations from #42507.

Architecture explained

WP_HTML_Tag_Processor provides a standard and highly-restricted HTML modification API. It tokenizes an HTML document to find specific tags. Then, it transforms those tags by adding, removing, or updating the values of the HTML attributes within that tag (opener). Importantly, it does not fully parse HTML or recurse into the HTML structure. Instead, WP_HTML_Tag_Processor moves forward linearly through a document and only parses the HTML tag openers.

In practical terms this API purposefully tries to address only the subset of HTML modifications that are commonly needed when modifying stored HTML from server-side filters and code, such as when adding class names to block markup.

The most significant benefits of the approach taken:

WP_HTML_Tag_Processor works fast and has a tiny memory footprint (for a PHP program). Adding a class and an attribute to 300k tags in a 12MB HTML document takes ~800ms and 24 MB of memory on my Mac Pro.
API is simple and open for further extension.
It covers the most common repeating use cases that we see in the Gutenberg project.

The following constraints were applied to satisfy all of the above:

WP_HTML_Tag_Processor doesn't try to understand the relationship between tags. This prevents us from extrapolating into more complex queries on the HTML structure. This is a perk because it allows us to avoid handling all the complexity and overhead of semantic HTML parsing (as needed when building a DOM tree).
Tags can be matched by tag name, class name, and/or the match offset.
It is possible to set new attributes and update or remove existing attributes. You cannot change the content/markup inside the HTML tag. Recursing into the markup (such as the HTML inside a <p> tag) would require more overhead in the parser and would require handling the array of situations where we find malformed HTML, such as with unmatched or overlapping tags).

Feedback request

Please reply to this PR with your thoughts about:

Adding WP_HTML_Tag_Processor to core
The public API proposed in this PR

Please ignore the failing CI checks, missing docstrings, and implementation details for now.

Can this really parse HTML like a browser?

This PR includes an HTML tokenizer and an HTML tag parser. It doesn't build a DOM tree, reason about entities, or handle mismatched, overlapping, or malformed tags.

The walker linearly scans the input HTML and visits every tag opener (e.g. <p> or <div class="wp-block-group">). By implementing the WHATWG HTML parsing spec we maintain parity with how browsers will parse the HTML, both in how they decode normal HTML as well as how they handle abnormalities, such as with unexpected HTML attributes like <div =5 title="<br />" a="yes"b=no>

The public API proposed in this PR

Here's what an interface would look like:

interface WP_HTML_Tag_Processor {

	/**
	 * @param $html HTML to process.
	 */
	public function __construct( string $html );

	/**
	 * Finds the next tag matching the $query.
	 * @return Whether a tag was matched.
	 */
	public function next_tag( ?array $query = null ) : bool;

	/**
	 * Updates or creates a new attribute on the currently matched tag.
	 *
	 * @param $name The attribute name to target.
	 * @param $value The new attribute value.
	 */
	public function set_attribute( string $name, string|boolean $value ) : void;

	/**
	 * Removes an attribute of the currently matched tag.
	 *
	 * @param $name The attribute name to remove.
	 */
	public function remove_attribute( string $name ) : void;

	/**
	 * Adds a new class name to the currently matched tag.
	 *
	 * @param $class_name The class name to add.
	 */
	public function add_class( string $class_name )  : void;

	/**
	 * Removes a class name from the currently matched tag.
	 *
	 * @param $class_name The class name to remove.
	 */
	public function remove_class( string $class_name ) : void;

	/**
	 * Returns the updated HTML document.
	 */
	public function __toString() : string;

}

Usage examples

Setting an attribute:

$w = new WP_HTML_Tag_Processor('<div id="first"><img /></div>');
$w->next_tag('img');
$w->set_attribute('src', '/wp-content/logo.png');
echo $w;
// <div id="first"><img src="/wp-content/logo.png" /></div>

Updating multiple tags:

$html = <<<HTML
<div class="main with-border" id="first">
    <span class="not-main bold with-border" id="second">
        Text
    </span>
</div>
HTML;
$w = new WP_HTML_Tag_Processor( $html );
$w->next_tag();
$w->remove_class( 'main' );
$w->remove_class( 'with-border' );
$w->set_attribute( 'id', 'second' );
$w->set_attribute( 'data-foo', 'bar' );
$w->next_tag();
$w->remove_attribute( 'class' );
echo $w;
/*
<div data-foo="bar"  id="second">
    <span  id="second">
        Text
    </span>
</div>
*/

Finding a div with a specific class name:

$w = new WP_HTML_Tag_Processor('<div id="first"><img /><div class="BtnGroup"></div></div>');
$w->next_tag(
	array(
		'tag_name'   => 'div',
		'class_name' => 'BtnGroup',
	)
);

No match found:

$w = new WP_HTML_Tag_Processor('<div id="first"><img /></div>');
$w->next_tag('span'); // There's no span
$w->set_attribute('id', 'first');
$w->next_tag('img');
$w->set_attribute('src', '/wp-content/logo.png');
echo $w;

// Nothing has changed – the parser only moves forward and it reached
// the end after trying to match span:
// <div id="first"><img /></div>

Updating every tag:

while ( $w->next_tag() ) {
    $w->set_attribute( 'data-foo', 'bar' );
}

Alternative ideas considered

Using DOMDocument was extensively discussed – it's not installed on every host so you'd have to provide a polyfill. Even if it was available everywhere, it's based on libxml2 designed to parse XML. Libxml2 does not implement the WHATWG HTML parsing spec and does not support HTML5.

Ad-hoc per-use-case regular expressions were also discussed and explored. It is hard to come up with a correct one and it's easy to run into problems like worrying about the specific sequence of whitespace used in the markup. They're also a maintenance and duplication burden and prone to errors.

Alternative API shapes considered

There was an attempt at a fluid iterface:

$w
   ->next_tag('img')
   ->set_attribute('src', get_thumbnail())
   ->next_tag('span')
   ->add_class('has_thumbnail');

But the streaming nature of this code made it inconvenient. This class must communicate whether it matched a tag and there are two ways to do it:

// We could return false from the `next_tag`, breaking the fluid interface
$w->next_tag() === false;
$w->next_tag()->set_attribute(); // Fatal error

// Or we could add a has_match() method, making it harder to update
// all matches:
while( $w->next_tag() && $w->has_match() {
   // ...
}

In the end, @dmsnell, @gziolo, and I realized that a fluid interface is about as verbose as the lack of it:

$w->next_tag('img');
$w->set_attribute('src', get_thumbnail());
$w->next_tag('span');
$w->add_class('has_thumbnail');

Since the latter makes for a simpler code, it became the version included in this proposal.

Security implications

Outputting a malformed HTML would create an XSS attack vector so we better test for all kinds of corner cases, e.g. using the same test examples as KSES.

To stay safe, the API is restricted to adding, removing, and replacing HTML attributes in tag openers. We know where they start and end by following the HTML parsing specification to the letter. This API processes the data in a linear fashion and does not recurse or reason about the tree structure at all. It can inject a string like id="$escaped_id_value" after a tag start, but it cannot do anything else like change the content or create new tags.

In my testing, this PR handles weird inputs in the same way a browser would. Still, we need an extensive test suite to be extra sure about that.

It is also worth noting that a typical input to WP_HTML_Tag_Processor will have been already filtered through KSES.

Related resources

Testing

Run unit tests with:

npm run test-unit-php -- --group html

# Or without docker:
 ./vendor/bin/phpunit --no-configuration ./phpunit/html/wp-html-tag-processor-test.php

Next steps

Consider disallowing on* attributes such as onClick. As @peterwilsoncc mentioned:

A nagging concern I have is that such a feature would make it easier for code to be added without regard for the unfiltered_html permission. A third party developer could add a JavaScriptEvent property allowing users to add script tags without kses recognizing them as such.

dmsnell · 2022-07-18T06:33:30Z

do you think we really need the updater function for set_attribute? I suspect providing a string value should be quite useful on its own, and adding the function updater will certainly encourage behaviors we may not want, such as the example here where we create a new closure just to return a static string value (though I'm also thinking more about cases like people doing ->set_attribute('class', function( $class ) { return $class . ' another-class'; } )

adamziel · 2022-07-18T20:53:17Z

do you think we really need the updater function for set_attribute?

It's either this or a more sophisticated behavior for remove_class/add_class to support multiple class updates for the same element. I think you have a point so I'll play with the latter.

lib/experimental/class-wp-html-tokenizer-2.php

adamziel · 2022-07-20T10:26:02Z

@dmsnell @gziolo I updated this quite a bit! Test, docs, code standards are still missing. On the up side I made it more internally consistent, simplified a bunch of code paths, limited the available public methods, and added a fresh usage example with a more involved HTML snippet at the bottom of the file. This one is ready for more in-depth comments now.

gziolo · 2022-07-21T14:11:27Z

I have some general feedback to share. I checked existing issues and tickets related to HTML scanning in PHP for block attributes stored as block content – aka sourced attributes. In the past, it was primarily discussed in relation to REST API so you could get the block data as part of the post content object. The original issue is #4763 which I moved to WordPress Trac at some point ticket#53603. Two existing implementations try a different approach to scanning the HTML of the block, which you might find interesting:

Prototype: Server-side block attributes sourcing #18414 (by Andrew Duthie)

the parsing relies on DOMDocument, querying using DOMXPath by first converting the block type attribute selector to an equivalent XPath selector using a bundled, modified version of a third-party library PHP Selector.
wp-rest-blocks plugin (by Jonny Harris)

In my plugin, I used a library called pQuery, as a way to use CSS selectors in PHP. Whatever goes into core, may want to use a different approach.

The most complex attributes sourcing in core blocks is probably used with the Table block:

gutenberg/packages/block-library/src/table/block.json

Lines 14 to 124 in 4ed1c11

    
           "caption": { 
        
           	"type": "string", 
        
           	"source": "html", 
        
           	"selector": "figcaption", 
        
           	"default": "" 
        
           }, 
        
           "head": { 
        
           	"type": "array", 
        
           	"default": [], 
        
           	"source": "query", 
        
           	"selector": "thead tr", 
        
           	"query": { 
        
           		"cells": { 
        
           			"type": "array", 
        
           			"default": [], 
        
           			"source": "query", 
        
           			"selector": "td,th", 
        
           			"query": { 
        
           				"content": { 
        
           					"type": "string", 
        
           					"source": "html" 
        
           				}, 
        
           				"tag": { 
        
           					"type": "string", 
        
           					"default": "td", 
        
           					"source": "tag" 
        
           				}, 
        
           				"scope": { 
        
           					"type": "string", 
        
           					"source": "attribute", 
        
           					"attribute": "scope" 
        
           				}, 
        
           				"align": { 
        
           					"type": "string", 
        
           					"source": "attribute", 
        
           					"attribute": "data-align" 
        
           				} 
        
           			} 
        
           		} 
        
           	} 
        
           }, 
        
           "body": { 
        
           	"type": "array", 
        
           	"default": [], 
        
           	"source": "query", 
        
           	"selector": "tbody tr", 
        
           	"query": { 
        
           		"cells": { 
        
           			"type": "array", 
        
           			"default": [], 
        
           			"source": "query", 
        
           			"selector": "td,th", 
        
           			"query": { 
        
           				"content": { 
        
           					"type": "string", 
        
           					"source": "html" 
        
           				}, 
        
           				"tag": { 
        
           					"type": "string", 
        
           					"default": "td", 
        
           					"source": "tag" 
        
           				}, 
        
           				"scope": { 
        
           					"type": "string", 
        
           					"source": "attribute", 
        
           					"attribute": "scope" 
        
           				}, 
        
           				"align": { 
        
           					"type": "string", 
        
           					"source": "attribute", 
        
           					"attribute": "data-align" 
        
           				} 
        
           			} 
        
           		} 
        
           	} 
        
           }, 
        
           "foot": { 
        
           	"type": "array", 
        
           	"default": [], 
        
           	"source": "query", 
        
           	"selector": "tfoot tr", 
        
           	"query": { 
        
           		"cells": { 
        
           			"type": "array", 
        
           			"default": [], 
        
           			"source": "query", 
        
           			"selector": "td,th", 
        
           			"query": { 
        
           				"content": { 
        
           					"type": "string", 
        
           					"source": "html" 
        
           				}, 
        
           				"tag": { 
        
           					"type": "string", 
        
           					"default": "td", 
        
           					"source": "tag" 
        
           				}, 
        
           				"scope": { 
        
           					"type": "string", 
        
           					"source": "attribute", 
        
           					"attribute": "scope" 
        
           				}, 
        
           				"align": { 
        
           					"type": "string", 
        
           					"source": "attribute", 
        
           					"attribute": "data-align" 
        
           				} 
        
           			} 
        
           		} 
        
           	} 
        
           }

I don’t know if that’s something you considered when discussing HTML processing on the server, but it feels very close despite being on a completely different level of abstraction.

I checked the current implementation, and it looks like the WP_HTML_Updater is way much simpler than the solutions I mentioned above. I don't think it would be possible to reuse it in the current form for attribute sourcing because it only finds HTML tags as a flat list, and therefore it doesn't do any analysis of the relations between nodes. Is the goal to keep the API very limited in the long run? I'm genuinely curious how you see this API, so we have a mutual understanding of where exactly this class would be applicable.

lib/experimental/class-wp-html-updater.php

adamziel · 2022-07-26T13:03:17Z

The most complex attributes sourcing in core blocks is probably used with the Table block:
I checked the current implementation, and it looks like the WP_HTML_Updater is way much simpler than the solutions I mentioned above. I don't think it would be possible to reuse it in the current form for attribute sourcing because it only finds HTML tags as a flat list, and therefore it doesn't do any analysis of the relations between nodes. Is the goal to keep the API very limited in the long run? I'm genuinely curious how you see this API, so we have a mutual understanding of where exactly this class would be applicable.

@gziolo Two pieces are missing for us to be able to do that:

Tracking the current tree path
Parsing CSS selectors

Perhaps it wouldn't be too hard to add both in one of the future iterations.

lib/experimental/class-wp-html-walker.php

lib/experimental/class-wp-html-walker-test.php

adamziel · 2022-07-28T11:50:44Z

This PR needs your feedback to progress!

It proposes a lean HTML processor for PHP:

$w = new WP_HTML_Walker('<div id="first"><img /></div>');
$w->next_tag('img');
$w->set_attribute('src', '/wp-content/logo.png');
echo $w;
// <div id="first"><img src="/wp-content/logo.png" /></div>

It will be a tremendous help for working with blocks in PHP. See the description for more details and use-cases.

The only way to move forward now is to discuss the proposed public interface and the idea itself. Please comment with your thoughts on both!

@dmsnell @spacedmonkey @aristath @schlessera @fabiankaegy @carolinan @ntsekouras @gziolo @scruffian @draganescu @getdave @swissspidy @azaozz @hellofromtonya @peterwilsoncc @youknowriad @mcsf @ellatrix @Mamaduka @noisysocks @mtias @youknowriad @andronocean @ciampo @aaronrobertshaw @c4rl0sbr4v0 @jorgefilipecosta @paaljoachim @talldan @oandregal @ntsekouras @ndiego @MaggieCabrera @sarayourfriend @ajlende

swissspidy · 2022-07-28T13:16:23Z

Is there more background about the design of this API and why it is needed? What other alternatives have been considered? What about the security implications of writing your own HTML modifier like this?

If the primary use case of this is HTML modifications in blocks, why does (a tiny wrapper around) DOMDocument / DOMDocumentFragment not suffice?

Aside: Instead of pinging dozens of people I'd suggest bringing this up in dev chat and writing a make/core post to get more feedback :)

adamziel · 2022-07-28T13:37:16Z

Is there more background about the design of this API and why it is needed?

@swissspidy Here's just a few instances where this would be useful:

Not having to strip HTML whitespaces in the cover block
Removing the link in the site-logo block
Injecting classes in the Cover block, the gallery block, the heading block, all static blocks, Duotone
Having a reliable way of targeting the first tag opener and injecting class names in the Block Supports API.
Potentially in another iteration, injecting the featured image in the cover block.

What other alternatives have been considered?

Using DOMDocument was extensively discussed – it's not installed on every host so you'd have to provide a polyfill. Even if it was available everywhere, it can't match and modify imperfect HTML that well.

Ad-hoc per-use-case regular expressions were also discussed and explored. It is hard to come up with a correct one and it's easy to run into problems like worrying about the specific sequence of whitespace used in the markup. They're also a maintenance and duplication burden and prone to errors.

What about the security implications of writing your own HTML modifier like this?

Outputting a malformed HTML would create an XSS attack vector so we better test for all kinds of corner cases, e.g. using the same test examples as KSES.

To stay safe, the API is restricted to adding, removing, and replacing HTML attributes. We know where they start and end by following the HTML parsing specification to the letter. This API can inject a string like id="$escaped_id_value" after a tag start, but it cannot change tag’s text or html content or create new tags.

In my testing, this PR handles weird inputs in the same way a browser would. Still, we need an extensive test suite to be extra sure about that.

It is also worth noting that a typical input to WP_HTML_Walker will have been already filtered through KSES.

… into add/html-tokenizer-2

gziolo · 2022-09-23T06:01:44Z

@adamziel opened #44410 to track follow-up issues as we plan to land this PR in its current shape as soon as CI becomes green. We have been working on this task for over two months and got a ton of great feedback here and in https://make.wordpress.org/core/2022/08/19/a-new-system-for-simply-and-reliably-updating-html-attributes/ that helped to shape the current solution. The plan now is to include this code in the Gutenberg plugin that's going to be included in 14.3 release (in 3 weeks) with the intent to ship it in WordPress 6.2 (Q1 of the next year).

getdave · 2022-09-23T08:33:31Z

Props to everyone who worked so hard to land this one 👏

azaozz · 2022-09-23T18:37:16Z

As I commented a month ago (in a comment that GH very unhelpfully hides) this is not ready for "prime time" yet. It's lacking clarity and documentation about what security features this new API has or doesn't have. Imho this is a blocker.

There are several @TODO that still await a decision and implementation. Most are for security features. Some are mentioned in #44410 others (like valid attribute names) are not.

From what I understand it looks like this new API is going to require extenders to properly escape all new and replacement strings. If that's the case, there at least should be a sufficient section in the docblock(s) explaining what and how to do that, and perhaps mentioning the "danger zone" / edge cases. These sections will also need examples of how exceptions should e handled, strings like onmouseover="document.write('<script src=\'https://bad-code.com/1.js\'></script>')" and perhaps stuff like href="javascript://alert(1)".

This commit pulls in the HTML Tag Processor from the Gutenbeg repository. The Tag Processor attempts to be an HTML5-spec-compliant parser that provides the ability in PHP to find specific HTML tags and then add, remove, or update attributes on that tag. It provides a safe and reliable way to modify the attribute on HTML tags. ```php // Add missing `rel` attribute to links. $p = new WP_HTML_Tag_Processor( $block_content ); if ( $p->next_tag( 'A' ) && empty( $p->get_attribute( 'rel' ) ) ) { $p->set_attribute( 'noopener nofollow' ); } return $p->get_updated_html(); ``` Introduced originally in WordPress/gutenberg#42485 and developed within the Gutenberg repository, this HTML parsing system was built in order to address a persistent need (properly modifying HTML tag attributes) and was motivated after a sequence of block editor defects which stemmed from mismatches between actual HTML code and expectectations for HTML input running through existing naive string-search-based solutions. The Tag Processor is intended to operate fast enough to avoid being an obstacle on page render while using as little memory overhead as possible. It is practically a zero-memory-overhead system, and only allocates memory as changes to the input HTML document are enqueued, releasing that memory when flushing those changes to the document, moving on to find the next tag, or flushing its entire output via `get_updated_html()`. Rigor has been taken to ensure that the Tag Processor will not be consfused by unexpected or non-normative HTML input, including issues arising from quoting, from different syntax rules within `<title>`, `<textarea>`, and `<script>` tags, from the appearance of rare but legitimate comment and XML-like regions, and from a variety of syntax abnormalities such as unbalanced tags, incomplete syntax, and overlapping tags. The Tag Processor is constrained to parsing an HTML document as a stream of tokens. It will not build an HTML tree or generate a DOM representation of a document. It is designed to start at the beginning of an HTML document and linearly scan through it, potentially modifying that document as it scans. It has no access to the markup inside or around tags and it has no ability to determine which tag openers and tag closers belong to each other, or determine the nesting depth of a given tag. It includes a primitive bookmarking system to remember tags it has previously visited. These bookmarks refer to specific tags, not to string offsets, and continue to point to the same place in the document as edits are applied. By asking the Tag Processor to seek to a given bookmark it's possible to back up and continue processsing again content that has already been traversed. Attribute values are sanitized with `esc_attr()` and rendered as double-quoted attributes. On read they are unescaped and unquoted. Authors wishing to rely on the Tag Processor therefore are free to pass around data as normal strings. Convenience methods for adding and removing CSS class names exist in order to remove the need to process the `class` attribute. ```php // Update heading block class names $p = new WP_HTML_Tag_Processor( $html ); while ( $p->next_tag() ) { switch ( $p->get_tag() ) { case 'H1': case 'H2': case 'H3': case 'H4': case 'H5': case 'H6': $p->remove_class( 'wp-heading' ); $p->add_class( 'wp-block-heading' ); break; } return $p->get_updated_html(); ``` The Tag Processor is intended to be a reliable low-level library for traversing HTML documents and higher-level APIs are to be built upon it. Immediately, and in Core Gutenberg blocks it is meant to replace HTML modification that currently relies on RegExp patterns and simpler string replacements. See the following for examples of such replacement: WordPress/gutenberg@1315784 https://github.com/WordPress/gutenberg/pull/45469/files#diff-dcd9e1f9b87ca63efe9f1e834b4d3048778d3eca41aa39c636f8b16a5bb452d2L46 WordPress/gutenberg#46625 Co-Authored-By: Adam Zielinski <adam@adamziel.com> Co-Authored-By: Bernie Reiter <ockham@raz.or.at> Co-Authored-By: Grzegorz Ziolkowski <grzegorz@gziolo.pl>

dmsnell mentioned this pull request Jul 18, 2022

WIP: Introduce HTML processor for server-side block modification on render #42507

Closed

dmsnell reviewed Jul 19, 2022

View reviewed changes

lib/experimental/class-wp-html-tokenizer-2.php Outdated Show resolved Hide resolved

dmsnell reviewed Jul 19, 2022

View reviewed changes

lib/experimental/class-wp-html-tokenizer-2.php Outdated Show resolved Hide resolved

dmsnell mentioned this pull request Jul 19, 2022

[Heading block] Add a wp-block-heading CSS class #42122

Merged

gziolo added the [Type] Technical Prototype Offers a technical exploration into an idea as an example of what's possible label Jul 21, 2022

dmsnell reviewed Jul 26, 2022

View reviewed changes

adamziel changed the title ~~HTML updater~~ WP_HTML_Walker: Dynamically update block markup HTML attributes in PHP Jul 26, 2022

adamziel changed the title ~~WP_HTML_Walker: Dynamically update block markup HTML attributes in PHP~~ WP_HTML_Walker: Inject dynamic data to block HTML markup in PHP Jul 26, 2022

adamziel self-assigned this Jul 26, 2022

adamziel commented Jul 27, 2022

View reviewed changes

lib/experimental/class-wp-html-walker.php Outdated Show resolved Hide resolved

gziolo reviewed Jul 27, 2022

View reviewed changes

adamziel force-pushed the add/html-tokenizer-2 branch 3 times, most recently from f6ef968 to 5bf0c28 Compare July 27, 2022 13:43

adamziel assigned gziolo and dmsnell Jul 27, 2022

adamziel added the [Feature] Blocks Overall functionality of blocks label Jul 27, 2022

adamziel marked this pull request as ready for review July 28, 2022 10:42

adamziel requested a review from spacedmonkey as a code owner July 28, 2022 10:42

gziolo mentioned this pull request Jul 28, 2022

Block API #41236

Open

58 tasks

adamziel mentioned this pull request Jul 28, 2022

Fix link element hover bleeding into button element default styles #42072

Merged

adamziel mentioned this pull request Jul 28, 2022

Add a CSS class to all static blocks with className supports #42269

Open

adamziel mentioned this pull request Sep 23, 2022

WP_HTML_Tag_Processor overview issue #44410

Closed

26 tasks

adamziel added 3 commits September 23, 2022 15:41

Rename $w to $p

5c2d080

Merge branch 'add/html-tokenizer-2' of github.com:WordPress/gutenberg…

509684c

… into add/html-tokenizer-2

Rename phpdoc references to $w to $p

d9d3bab

gziolo approved these changes Sep 23, 2022

View reviewed changes

gziolo requested a review from azaozz September 23, 2022 06:05

adamziel merged commit c6c5dbb into trunk Sep 23, 2022

adamziel deleted the add/html-tokenizer-2 branch September 23, 2022 06:36

github-actions bot added this to the Gutenberg 14.3 milestone Sep 23, 2022

adamziel mentioned this pull request Sep 23, 2022

Blocks: Process HTML With a Parser (Instead of Regular Expressions) #42128

Closed

adamziel mentioned this pull request Sep 30, 2022

Tag Processor: Add get_updated_html as a non-toString method of stringifying the markup #44597

Merged

bph mentioned this pull request Oct 3, 2022

6.1 Dev Notes Tracking #42961

Closed

89 tasks

andrewserong mentioned this pull request Oct 4, 2022

Try adding layout classnames to inner block wrapper #44600

Merged

pagelab mentioned this pull request Oct 10, 2022

Add a way to filter CSS classes for blocks using PHP #23223

Closed

SantosGuillamot mentioned this pull request Oct 17, 2022

🎨 Add wp-link attribute to all links WordPress/block-interactivity-experiments#81

Merged

t-hamano mentioned this pull request Oct 22, 2022

Image Block: Don't render if there is no URL set #45220

Merged

luisherranz mentioned this pull request Nov 14, 2022

🎨 Server-side rendering of wp-show in PHP WordPress/block-interactivity-experiments#100

Closed

luisherranz mentioned this pull request Nov 24, 2022

Block's sourced attributes missing with parse_blocks #46029

Closed

hellofromtonya mentioned this pull request Jan 25, 2023

Plugin: Backport PHP changes for WordPress 6.2 release #47187

Closed

85 tasks

dmsnell mentioned this pull request Jan 26, 2023

Editor: Introduce HTML Tag Processor WordPress/wordpress-develop#3920

Closed

This was referenced Feb 15, 2023

Remove the fallback implementations when WP_HTML_Tag_Processor is available in core (6.2) WordPress/create-block-theme#225

Closed

Bundle template images into theme assets and make their urls relative WordPress/create-block-theme#213

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WP_HTML_Tag_Processor: Inject dynamic data to block HTML markup in PHP #42485

WP_HTML_Tag_Processor: Inject dynamic data to block HTML markup in PHP #42485

adamziel commented Jul 17, 2022 •

edited

Loading

dmsnell commented Jul 18, 2022

adamziel commented Jul 18, 2022

adamziel commented Jul 20, 2022 •

edited

Loading

gziolo commented Jul 21, 2022

adamziel commented Jul 26, 2022

adamziel commented Jul 28, 2022 •

edited

Loading

swissspidy commented Jul 28, 2022

adamziel commented Jul 28, 2022 •

edited

Loading

gziolo commented Sep 23, 2022

getdave commented Sep 23, 2022

azaozz commented Sep 23, 2022 •

edited

Loading

WP_HTML_Tag_Processor: Inject dynamic data to block HTML markup in PHP #42485

WP_HTML_Tag_Processor: Inject dynamic data to block HTML markup in PHP #42485

Conversation

adamziel commented Jul 17, 2022 • edited Loading

Description

Architecture explained

Feedback request

Can this really parse HTML like a browser?

The public API proposed in this PR

Usage examples

Alternative ideas considered

Alternative API shapes considered

Security implications

Related resources

Testing

Next steps

dmsnell commented Jul 18, 2022

adamziel commented Jul 18, 2022

adamziel commented Jul 20, 2022 • edited Loading

gziolo commented Jul 21, 2022

adamziel commented Jul 26, 2022

adamziel commented Jul 28, 2022 • edited Loading

This PR needs your feedback to progress!

swissspidy commented Jul 28, 2022

adamziel commented Jul 28, 2022 • edited Loading

gziolo commented Sep 23, 2022

getdave commented Sep 23, 2022

azaozz commented Sep 23, 2022 • edited Loading

adamziel commented Jul 17, 2022 •

edited

Loading

adamziel commented Jul 20, 2022 •

edited

Loading

adamziel commented Jul 28, 2022 •

edited

Loading

adamziel commented Jul 28, 2022 •

edited

Loading

azaozz commented Sep 23, 2022 •

edited

Loading