Understanding WordPress Block Parsing: How wp.blocks.parse() Works

The WordPress block editor (Gutenberg) represents a fundamental shift in how content is created and stored in WordPress. At the heart of this system is the block parser, which translates the HTML stored in the database into the JavaScript objects the editor works with. In this article, we’ll dive deep into how wp.blocks.parse() works and why it matters.

The Basics of Block Content in WordPress

When you create content in the WordPress block editor, what you see visually is represented behind the scenes as HTML with special comments. These comments act as delimiters that mark where each block begins and ends.

Examining Block Content

Let’s explore this firsthand:

Create a new post in your WordPress admin area
Add a few blocks (paragraph, heading, image, etc.)
Open your browser’s developer console (right-click → Inspect → Console)
Enter the following command:

wp.data.select('core/editor').getEditedPostContent();

What you’ll see is HTML content that looks something like this:

<!-- wp:heading -->
<h2>Hello World</h2>
<!-- /wp:heading -->

<!-- wp:paragraph -->
<p>This is a paragraph of text.</p>
<!-- /wp:paragraph -->

<!-- wp:image {"id":9,"sizeSlug":"large"} -->
<figure class="wp-block-image size-large"><img src="https://example.com/wp-content/uploads/2025/05/image.jpg" alt="" class="wp-image-9"/></figure>
<!-- /wp:image -->

Notice that each block is wrapped in HTML comments. These comments mark the beginning and end of each block and contain essential information about the block type and its attributes.
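To make the delimiter format concrete, here is a minimal sketch that reads the block name and JSON attributes out of a single opening comment. The readDelimiter function and its regular expression are hypothetical illustrations written for this article, not part of WordPress, and the real tokenizer handles more cases than this pattern does.

```javascript
// Simplified pattern for an opening (or self-closing) block delimiter.
// Hypothetical sketch: the real WordPress tokenizer is more thorough.
const OPENING_DELIMITER =
  /<!--\s+wp:([a-z][a-z0-9_-]*(?:\/[a-z][a-z0-9_-]*)?)\s+(\{[\s\S]*?\}\s+)?(\/)?-->/;

function readDelimiter(comment) {
  const match = OPENING_DELIMITER.exec(comment);
  if (!match) {
    return null; // not a block delimiter
  }
  const [, rawName, rawAttrs, selfClosing] = match;
  return {
    // Names without a namespace belong to core: "image" means "core/image".
    name: rawName.includes('/') ? rawName : `core/${rawName}`,
    // The JSON object is optional; fall back to no attributes.
    attrs: rawAttrs ? JSON.parse(rawAttrs) : {},
    selfClosing: Boolean(selfClosing),
  };
}

const delimiter = readDelimiter('<!-- wp:image {"id":9,"sizeSlug":"large"} -->');
console.log(delimiter.name, delimiter.attrs);
```

Because the attributes are plain JSON, numbers stay numbers: the id above comes back as the number 9, not the string "9".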

How WordPress Parses Block Content

The WordPress block parser translates this HTML with comments into JavaScript objects that the editor can work with. You can see this in action by running:

wp.blocks.parse(wp.data.select('core/editor').getEditedPostContent());

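This command returns an array of block objects, one for each top-level block in your content. Each object contains:

name: The name of the block (e.g., “core/paragraph”)
attributes: The block’s attributes, combined from the comment JSON and the HTML
innerBlocks: Any nested blocks
clientId: A unique ID the editor assigns to this block instance
isValid: Whether the saved HTML passed block validation

Under the hood, a lower-level parser first splits the content into raw objects with the fields blockName, attrs, innerBlocks, and innerHTML. wp.blocks.parse() then converts each of those into the editor-ready objects described above.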
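The comment-level structure can be sketched with a small stand-in parser. This is a simplified, hypothetical helper (parseFlatBlocks is not a WordPress function) that only handles flat, non-nested blocks and emits raw objects with blockName, attrs, innerBlocks, and innerHTML fields.

```javascript
// Hypothetical helper: parses flat (non-nested) blocks only.
function parseFlatBlocks(content) {
  // Opening delimiter, optional JSON attributes, inner HTML, matching closer.
  const pattern =
    /<!--\s+wp:([a-z0-9\/-]+)\s+(\{[\s\S]*?\}\s+)?-->([\s\S]*?)<!--\s+\/wp:\1\s+-->/g;
  const blocks = [];
  for (const [, rawName, rawAttrs, innerHTML] of content.matchAll(pattern)) {
    blocks.push({
      blockName: rawName.includes('/') ? rawName : `core/${rawName}`,
      attrs: rawAttrs ? JSON.parse(rawAttrs) : {},
      innerBlocks: [], // nesting is out of scope for this sketch
      innerHTML,
    });
  }
  return blocks;
}

const sample = [
  '<!-- wp:heading -->',
  '<h2>Hello World</h2>',
  '<!-- /wp:heading -->',
  '',
  '<!-- wp:image {"id":9,"sizeSlug":"large"} -->',
  '<figure class="wp-block-image size-large"></figure>',
  '<!-- /wp:image -->',
].join('\n');

console.log(parseFlatBlocks(sample).map((block) => block.blockName));
```

A regular expression is enough here only because the sample has no nested blocks. Handling nesting requires tracking a stack of open blocks, which is why this sketch leaves innerBlocks empty.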

The Role of Block Attributes

Block attributes are essential properties that define how a block looks and behaves. These attributes can come from two sources:

HTML Comment Attributes: Stored in the JSON object within the opening HTML comment
Source Attributes: Extracted from the HTML content itself

Example: Image Block Attributes

Let’s analyze an image block:

<!-- wp:image {"id":9} -->
<figure class="wp-block-image size-large"><img src="https://example.com/image.jpg" alt="" class="wp-image-9"/></figure>
<!-- /wp:image -->

In this example:

The id attribute is stored in the HTML comment
The url attribute is extracted from the img tag’s src attribute
The alt attribute is extracted from the img tag’s alt attribute

After parsing, the JS object would have attributes like:

attributes: {
  id: 9,
  url: "https://example.com/image.jpg",
  alt: "",
  caption: ""
}
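The way the two sources combine can be sketched as follows. The attribute definitions and helper functions below are hypothetical and only loosely modelled on how a block declares where each attribute comes from. The sketch reads HTML with regular expressions so it can run anywhere, whereas a real parser would query a DOM.

```javascript
// Hypothetical attribute definitions: where each value should come from.
const imageAttributeSources = {
  id: { from: 'comment' },
  url: { from: 'html', tag: 'img', attribute: 'src' },
  alt: { from: 'html', tag: 'img', attribute: 'alt' },
};

// Reads one HTML attribute from the first matching tag.
// Regex-based for portability; a real parser would use DOM queries.
function readHtmlAttribute(html, tag, attribute) {
  const tagMatch = new RegExp(`<${tag}\\b[^>]*>`, 'i').exec(html);
  if (!tagMatch) {
    return undefined;
  }
  const attrMatch = new RegExp(`\\s${attribute}="([^"]*)"`, 'i').exec(tagMatch[0]);
  return attrMatch ? attrMatch[1] : undefined;
}

// Builds the final attributes object from both sources.
function resolveAttributes(sources, commentAttrs, html) {
  const resolved = {};
  for (const [key, source] of Object.entries(sources)) {
    resolved[key] =
      source.from === 'comment'
        ? commentAttrs[key]
        : readHtmlAttribute(html, source.tag, source.attribute);
  }
  return resolved;
}

const imageHtml =
  '<figure class="wp-block-image size-large"><img src="https://example.com/image.jpg" alt="" class="wp-image-9"/></figure>';

console.log(resolveAttributes(imageAttributeSources, { id: 9 }, imageHtml));
```

The id keeps its JSON type from the comment, while url and alt are always strings because they are read out of the markup.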

Why Some Attributes Are Stored in HTML Comments

Why store some attributes in the HTML comment rather than directly in the HTML? There are several reasons:

Non-visual attributes: Some attributes don’t have a direct visual representation in HTML
Database references: IDs that reference database entries (like image IDs)
Complex data: Data structures that don’t map easily to HTML attributes
Editor settings: Attributes that control editor behavior but don’t affect the rendered output

The image ID is a perfect example. While it also appears in the markup as part of a class name (wp-image-9), keeping it in the comment gives the editor a direct, typed reference to the exact media library item, without having to infer it from the HTML.

The Serialization Process

Just as WordPress can parse HTML into blocks, it can also serialize blocks back into HTML. This is what happens when your content is saved to the database.

Try running this command:

wp.blocks.serialize(wp.blocks.parse(wp.data.select('core/editor').getEditedPostContent()));

This command:

Gets the current post content
Parses it into block objects
Serializes those objects back into HTML

The result should look very similar to your original post content, with the block comments and HTML structure preserved.
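The reverse direction can be sketched in a few lines. The serializeFlatBlock function below is a hypothetical illustration for a single block without nested blocks, not the WordPress serializer, which among other things also escapes certain characters inside the JSON.

```javascript
// Hypothetical serializer for a single flat block.
function serializeFlatBlock({ blockName, attrs, innerHTML }) {
  // Core blocks drop the "core/" namespace in the delimiter.
  const name = blockName.startsWith('core/') ? blockName.slice(5) : blockName;
  // The JSON object is only written when there is something to store.
  const json = Object.keys(attrs).length > 0 ? `${JSON.stringify(attrs)} ` : '';
  return `<!-- wp:${name} ${json}-->\n${innerHTML}\n<!-- /wp:${name} -->`;
}

const imageBlock = {
  blockName: 'core/image',
  attrs: { id: 9, sizeSlug: 'large' },
  innerHTML:
    '<figure class="wp-block-image size-large"><img src="https://example.com/image.jpg" alt="" class="wp-image-9"/></figure>',
};

console.log(serializeFlatBlock(imageBlock));
```

Only the attributes passed in attrs end up in the comment. Values such as the image URL live in innerHTML and are recovered from the markup the next time the content is parsed.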

The Complete Block Lifecycle

To summarize the full lifecycle of block content in WordPress:

Editor Interface: You interact with visual blocks in the editor
Block Objects: The editor works with JavaScript objects representing blocks
Serialization: When saving, blocks are converted to HTML with comment delimiters
Database Storage: This HTML is stored in the WordPress database
Parsing: When loading the editor, HTML is parsed back into block objects
Rendering: The blocks are rendered in the editor interface

Practical Applications

Understanding how wp.blocks.parse() works has practical applications:

Custom Block Development: Properly define attribute sources
Content Migration: Convert legacy content to blocks
Custom Integrations: Programmatically interact with block content
Troubleshooting: Debug issues with block serialization/parsing

The block parsing system is fundamental to how WordPress manages content in the Gutenberg era. The wp.blocks.parse() function translates HTML with special comments into JavaScript objects that the editor can work with, while carefully preserving all the attributes and structure needed for the blocks to function properly.

This separation between storage format (HTML) and editing format (JavaScript objects) gives WordPress the flexibility to maintain backward compatibility while offering a modern editing experience.

By understanding this process, you gain deeper insight into how WordPress handles blocks, which is invaluable for development, troubleshooting, and creating custom solutions with the block editor.

Kishan Jasani