WordPress Gutenberg’s block editing system represents a fundamental shift in how content is created and stored in WordPress. At the heart of this system is the block parser, which translates between HTML content (what’s stored in the database) and JavaScript objects (what the editor works with). In this article, we’ll dive deep into how wp.blocks.parse() works and why it matters.
The Basics of Block Content in WordPress
When you create content in the WordPress block editor, what you see visually is represented behind the scenes as HTML with special comments. These comments serve as delimiters that mark the boundaries between different blocks.
Examining Block Content
Let’s explore this firsthand:
- Create a new post in your WordPress admin area
- Add a few blocks (paragraph, heading, image, etc.)
- Open your browser’s developer console (right-click → Inspect → Console)
- Enter the following command:
wp.data.select('core/editor').getEditedPostContent();
What you’ll see is HTML content that looks something like this:
<!-- wp:heading -->
<h2>Hello World</h2>
<!-- /wp:heading -->
<!-- wp:paragraph -->
<p>This is a paragraph of text.</p>
<!-- /wp:paragraph -->
<!-- wp:image {"id":9,"sizeSlug":"large"} -->
<figure class="wp-block-image size-large"><img src="https://example.com/wp-content/uploads/2025/05/image.jpg" alt="" class="wp-image-9"/></figure>
<!-- /wp:image -->
Notice that each block is wrapped in HTML comments. These comments mark the beginning and end of each block and contain essential information about the block type and its attributes.
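To make the delimiter format concrete, here is a minimal sketch of reading those comments with a regular expression. It is an illustration only, not the parser WordPress ships: the function name parseFlatBlocks is hypothetical, it handles only flat (non-nested) blocks, and it ignores self-closing blocks. The keys it returns (blockName, attrs, innerHTML) follow the naming used by WordPress’s raw parser output.

```javascript
// Simplified illustration only: NOT the parser WordPress ships.
// Handles flat (non-nested) blocks; self-closing blocks are ignored.
function parseFlatBlocks(content) {
  // Opening comment with optional JSON, block body, matching closing comment.
  const pattern =
    /<!--\s+wp:([a-z][a-z0-9_-]*(?:\/[a-z][a-z0-9_-]*)?)\s+(\{[\s\S]*?\}\s+)?-->([\s\S]*?)<!--\s+\/wp:\1\s+-->/g;
  const blocks = [];
  for (const match of content.matchAll(pattern)) {
    const [, rawName, rawAttrs, innerHTML] = match;
    blocks.push({
      // Names without a namespace belong to core: "heading" -> "core/heading"
      blockName: rawName.includes('/') ? rawName : `core/${rawName}`,
      // Attributes are plain JSON inside the opening comment
      attrs: rawAttrs ? JSON.parse(rawAttrs) : {},
      innerHTML: innerHTML.trim(),
    });
  }
  return blocks;
}

const sample = [
  '<!-- wp:heading -->',
  '<h2>Hello World</h2>',
  '<!-- /wp:heading -->',
  '<!-- wp:image {"id":9,"sizeSlug":"large"} -->',
  '<figure class="wp-block-image size-large"></figure>',
  '<!-- /wp:image -->',
].join('\n');

console.log(parseFlatBlocks(sample));
```

Note how the block name in the comment omits the core/ namespace, and how the JSON after the name is the only place the image’s id appears as structured data.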
How WordPress Parses Block Content
The WordPress block parser translates this HTML with comments into JavaScript objects that the editor can work with. You can see this in action by running:
wp.blocks.parse(wp.data.select('core/editor').getEditedPostContent());
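This command returns an array of JavaScript objects representing each block in your content. Each object contains:
- clientId: A unique identifier generated for this block instance in the editor
- name: The name of the block (e.g., “core/paragraph”)
- isValid: Whether the stored HTML matches what the block’s save function would generate
- attributes: Block attributes, merged from the comment JSON and the HTML
- innerBlocks: Any nested blocks
- originalContent: The HTML content inside the block, as it was stored
The lower-level raw parser that wp.blocks.parse() builds on uses different keys for similar information: blockName, attrs, innerBlocks, and innerHTML. You will see those names in the output of PHP’s parse_blocks() as well.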
The Role of Block Attributes
Block attributes are essential properties that define how a block looks and behaves. These attributes can come from two sources:
- HTML Comment Attributes: Stored in the JSON object within the opening HTML comment
- Source Attributes: Extracted from the HTML content itself
Example: Image Block Attributes
Let’s analyze an image block:
<!-- wp:image {"id":9} -->
<figure class="wp-block-image size-large"><img src="https://example.com/image.jpg" alt="" class="wp-image-9"/></figure>
<!-- /wp:image -->
In this example:
- The id attribute is stored in the HTML comment
- The url attribute is extracted from the img tag’s src attribute
- The alt attribute is extracted from the img tag’s alt attribute
After parsing, the JS object would have attributes like:
attributes: {
  id: 9,
  url: "https://example.com/image.jpg",
  alt: "",
  caption: ""
}
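The merging of the two attribute sources can be sketched as follows. This is a rough approximation under stated assumptions: resolveImageAttributes is a hypothetical helper, and it uses regular expressions where the real editor uses the selectors declared in the block’s attribute definitions and a proper HTML parser.

```javascript
// Hypothetical helper: combines comment attributes with values read from
// the block's HTML. A regex stands in for real HTML parsing here.
function resolveImageAttributes(commentAttrs, innerHTML) {
  // Read one attribute value from the first <img> tag, or undefined.
  const fromImg = (name) => {
    const img = innerHTML.match(/<img\b[^>]*>/i);
    if (!img) return undefined;
    const attr = img[0].match(new RegExp(`\\s${name}="([^"]*)"`, 'i'));
    return attr ? attr[1] : undefined;
  };
  return {
    ...commentAttrs,     // e.g. id, stored as JSON in the opening comment
    url: fromImg('src'), // sourced from the markup
    alt: fromImg('alt'), // sourced from the markup
  };
}

const imageHTML =
  '<figure class="wp-block-image size-large"><img src="https://example.com/image.jpg" alt="" class="wp-image-9"/></figure>';

console.log(resolveImageAttributes({ id: 9 }, imageHTML));
```

Because id comes from JSON, it keeps its numeric type, while values sourced from markup arrive as strings.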
Why Some Attributes Are Stored in HTML Comments
Why store some attributes in the HTML comment rather than directly in the HTML? There are several reasons:
- Non-visual attributes: Some attributes don’t have a direct visual representation in HTML
- Database references: IDs that reference database entries (like image IDs)
- Complex data: Data structures that don’t map easily to HTML attributes
- Editor settings: Attributes that control editor behavior but don’t affect the rendered output
The image ID is a good example. It does also appear in the markup as the wp-image-9 class, but keeping it in the comment’s JSON means the editor can read it directly as structured data instead of reverse-engineering a class name to find out which media library item the block refers to.
The Serialization Process
Just as WordPress can parse HTML into blocks, it can also serialize blocks back into HTML. This is what happens when your content is saved to the database.
Try running this command:
wp.blocks.serialize(wp.blocks.parse(wp.data.select('core/editor').getEditedPostContent()));
This command:
- Gets the current post content
- Parses it into block objects
- Serializes those objects back into HTML
The result should look very similar to your original post content, with the block comments and HTML structure preserved.
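The serialization direction can be sketched in a few lines as well. This is a simplified illustration, not WordPress’s implementation: serializeBlock and serializeBlocks are hypothetical names, the input objects use the raw-parser style keys blockName, attrs, and innerHTML, nested blocks are not handled, and the sketch skips the character escaping that the real serializer applies to the comment JSON.

```javascript
// Simplified illustration of writing a block back out as delimited HTML.
// No nesting and no escaping of the JSON payload: sketch only.
function serializeBlock({ blockName, attrs = {}, innerHTML = '' }) {
  // Core blocks drop the "core/" namespace in the delimiter.
  const name = blockName.startsWith('core/') ? blockName.slice(5) : blockName;
  // The JSON payload is omitted entirely when there are no attributes.
  const json = Object.keys(attrs).length ? `${JSON.stringify(attrs)} ` : '';
  return `<!-- wp:${name} ${json}-->\n${innerHTML}\n<!-- /wp:${name} -->`;
}

function serializeBlocks(blocks) {
  // Blocks are separated by a blank line in the stored content.
  return blocks.map(serializeBlock).join('\n\n');
}

console.log(
  serializeBlocks([
    { blockName: 'core/paragraph', innerHTML: '<p>This is a paragraph of text.</p>' },
    { blockName: 'core/image', attrs: { id: 9 }, innerHTML: '<figure class="wp-block-image"></figure>' },
  ])
);
```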
The Complete Block Lifecycle
To summarize the full lifecycle of block content in WordPress:
- Editor Interface: You interact with visual blocks in the editor
- Block Objects: The editor works with JavaScript objects representing blocks
- Serialization: When saving, blocks are converted to HTML with comment delimiters
- Database Storage: This HTML is stored in the WordPress database
- Parsing: When loading the editor, HTML is parsed back into block objects
- Rendering: The blocks are rendered in the editor interface
Practical Applications
Understanding how wp.blocks.parse() works has practical applications:
- Custom Block Development: Properly define attribute sources
- Content Migration: Convert legacy content to blocks
- Custom Integrations: Programmatically interact with block content
- Troubleshooting: Debug issues with block serialization/parsing
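As one troubleshooting example, the sketch below walks a parsed block tree and lists the blocks flagged as invalid. The helper name findInvalidBlocks is hypothetical; it assumes each block object exposes name, isValid, and innerBlocks, as the objects returned by wp.blocks.parse() do. The sample data is made up so the snippet also runs outside the editor.

```javascript
// Hypothetical helper: lists blocks whose isValid flag is false,
// descending into nested blocks and recording each block's position.
function findInvalidBlocks(blocks, path = []) {
  const invalid = [];
  blocks.forEach((block, index) => {
    const here = [...path, index];
    if (block.isValid === false) {
      invalid.push({ name: block.name, path: here.join('.') });
    }
    invalid.push(...findInvalidBlocks(block.innerBlocks || [], here));
  });
  return invalid;
}

// Made-up sample standing in for real parse output.
const parsedSample = [
  { name: 'core/paragraph', isValid: true, innerBlocks: [] },
  {
    name: 'core/group',
    isValid: true,
    innerBlocks: [{ name: 'core/image', isValid: false, innerBlocks: [] }],
  },
];

console.log(findInvalidBlocks(parsedSample));

// In the editor console you could instead pass real parse output:
// findInvalidBlocks(wp.blocks.parse(wp.data.select('core/editor').getEditedPostContent()));
```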
The block parsing system is fundamental to how WordPress manages content in the Gutenberg era. The wp.blocks.parse() function translates HTML with special comments into JavaScript objects that the editor can work with, while carefully preserving all the attributes and structure needed for the blocks to function properly.
This separation between storage format (HTML) and editing format (JavaScript objects) gives WordPress the flexibility to maintain backward compatibility while offering a modern editing experience.
By understanding this process, you gain deeper insight into how WordPress handles blocks, which is invaluable for development, troubleshooting, and creating custom solutions with the block editor.