
Node Inputs

Required Fields

HTML Content:
The HTML content to parse and extract tags from.
Example:
<div class="container">
  <h1 id="title">Welcome</h1>
  <p class="text">Content here</p>
  <a href="https://example.com">Link</a>
</div>

Optional Fields

Tag Names:
Specific HTML tags to extract. If empty, all tags are extracted.
Example: ["h1", "h2", "h3", "a", "img"]
Extract Attributes:
Whether to extract tag attributes.
Example: true
Default: true
Extract Text Content:
Whether to extract text content within tags.
Example: true
Default: true
Remove Duplicates:
Whether to remove duplicate tag occurrences from the results.
Example: true
Default: false
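
The sketch below illustrates one way the optional fields could act on a list of tag records. It is not the node's actual implementation: the function name `apply_options`, its parameter names, and the record shape (`tag`, `attributes`, `text_content`, `position`) are assumptions chosen for illustration. In particular, this page does not define what counts as a duplicate, so the sketch assumes two records are duplicates when everything except `position` matches.

```python
import json


def apply_options(tags, tag_names=None, extract_attributes=True,
                  extract_text=True, remove_duplicates=False):
    """Hypothetical helper: filter and trim tag records per the optional fields."""
    # An empty or missing tag list means "keep every tag".
    wanted = {name.lower() for name in tag_names} if tag_names else None
    seen = set()
    result = []
    for record in tags:
        if wanted is not None and record["tag"].lower() not in wanted:
            continue
        item = dict(record)  # copy so the input records are left untouched
        if not extract_attributes:
            item.pop("attributes", None)
        if not extract_text:
            item.pop("text_content", None)
        if remove_duplicates:
            # Assumption: records are duplicates when all fields except
            # "position" are equal.
            key = json.dumps(
                {k: v for k, v in item.items() if k != "position"},
                sort_keys=True,
            )
            if key in seen:
                continue
            seen.add(key)
        result.append(item)
    return result
```

With the defaults, every record passes through unchanged; passing `tag_names=["a"]` keeps only anchor tags, and `remove_duplicates=True` keeps the first of any identical records.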

Node Output

Extracted Tags:
A structured object containing the total tag count, the list of extracted HTML tags with their attributes and text content, and a per-tag occurrence summary.
Example Output:
{
  "total_tags": 4,
  "tags": [
    {
      "tag": "div",
      "attributes": {
        "class": "container"
      },
      "position": 0,
      "has_children": true
    },
    {
      "tag": "h1",
      "attributes": {
        "id": "title"
      },
      "text_content": "Welcome",
      "position": 1
    },
    {
      "tag": "p",
      "attributes": {
        "class": "text"
      },
      "text_content": "Content here",
      "position": 2
    },
    {
      "tag": "a",
      "attributes": {
        "href": "https://example.com"
      },
      "text_content": "Link",
      "position": 3
    }
  ],
  "summary": {
    "div": 1,
    "h1": 1,
    "p": 1,
    "a": 1
  }
}
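
For readers who want to reproduce this output shape outside the node, here is a minimal sketch built on Python's standard `html.parser` module. It is an approximation under stated assumptions, not the node's implementation: the names `TagCollector` and `extract_tags` are hypothetical, `position` is assumed to be document order, `text_content` is assumed to hold only the text directly inside a tag, and `has_children` is assumed to appear only on tags that contain other tags.

```python
from collections import Counter
from html.parser import HTMLParser

# Void elements have no closing tag, so they never become a parent.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TagCollector(HTMLParser):
    """Hypothetical collector that records every start tag in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self._stack = []  # indices into self.tags for currently open elements

    def handle_starttag(self, tag, attrs):
        if self._stack:
            # The innermost open element now has at least one child tag.
            self.tags[self._stack[-1]]["has_children"] = True
        record = {"tag": tag, "attributes": dict(attrs),
                  "position": len(self.tags), "_text": []}
        self.tags.append(record)
        if tag not in VOID_TAGS:
            self._stack.append(record["position"])

    def handle_endtag(self, tag):
        # Close the nearest matching open element (tolerates sloppy markup).
        for i in range(len(self._stack) - 1, -1, -1):
            if self.tags[self._stack[i]]["tag"] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        # Assumption: text belongs to the innermost open element only.
        if self._stack and data.strip():
            self.tags[self._stack[-1]]["_text"].append(data.strip())


def extract_tags(html):
    parser = TagCollector()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    for record in parser.tags:
        text = " ".join(record.pop("_text"))
        if text:
            record["text_content"] = text
    return {
        "total_tags": len(parser.tags),
        "tags": parser.tags,
        "summary": dict(Counter(r["tag"] for r in parser.tags)),
    }
```

Calling `extract_tags` on the HTML from the Required Fields example yields a dictionary with the same four records and summary counts as the example output above.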

Node Functionality

The Tag Extractor node:
  • Parses HTML content and extracts the tags it contains.
  • Captures tag attributes (class, id, href, src, etc.).
  • Extracts text content within tags.
  • Supports filtering by specific tag names.
  • Provides tag occurrence statistics.
  • Useful for HTML analysis, SEO audits, and content extraction.
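
As an illustration of the SEO-audit use case, the sketch below runs a few simple checks over a result shaped like the example output (`summary`, `tags`, `attributes`, `position`). The function name `seo_findings` and the specific checks (exactly one `h1`, `alt` text on images, `href` on links) are assumptions chosen for illustration, not features of the node.

```python
def seo_findings(result):
    """Hypothetical audit: return a list of human-readable findings."""
    findings = []

    # The per-tag summary makes heading counts a dictionary lookup.
    h1_count = result.get("summary", {}).get("h1", 0)
    if h1_count != 1:
        findings.append(f"expected exactly one h1, found {h1_count}")

    for record in result.get("tags", []):
        # "attributes" may be absent if attribute extraction was disabled.
        attrs = record.get("attributes", {})
        if record["tag"] == "img" and not attrs.get("alt"):
            findings.append(f"img at position {record['position']} has no alt text")
        if record["tag"] == "a" and not attrs.get("href"):
            findings.append(f"a at position {record['position']} has no href")
    return findings
```

An empty list means none of these checks failed. Because the checks read attributes, they are only meaningful when Extract Attributes is enabled.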