Tagparser
A lightweight Rust library for parsing HTML tags with powerful filtering capabilities.
Features
- Extract any HTML tags from HTML content
- Filter tags by attribute name (e.g., find all links with
href
attribute) - Filter tags by attribute value (e.g., find all links to a specific URL)
- Extract text content from inside tags (e.g., get link text without HTML)
- Extract attribute values from tags (e.g., get all URLs from links)
- Simple and intuitive API
- Command-line interface for quick parsing
Extract any html tags from html page
Installation
You can install Tagparser using cargo:
cargo add tagparser
Usage
Here's an example of how to use Tagparser lib:
use Parser;
As a result, all "a" and "p" tags will be displayed.
["<a href='https://github.com/tenqz'>Test link</a>"]
["<p>test p tag</p>"]
Filtering by Attributes
You can also filter tags by their attributes:
use parse_tags_with_attr;
Output:
All links: ["<a href='https://github.com/tenqz'>Link 1</a>", "<a class='button' href='https://example.com'>Link 2</a>"]
Button links: ["<a class='button' href='https://example.com'>Link 2</a>"]
GitHub links: ["<a href='https://github.com/tenqz'>Link 1</a>"]
Extracting Content from Tags
You can extract just the text content from inside tags:
use extract_tag_content;
Output:
Link texts: ["GitHub", "Rust Language"]
Paragraph texts: ["This is a <strong>paragraph</strong> with text."]
Extracting Attribute Values
You can extract values of specific attributes from tags:
use extract_attribute_values;
Command Line Usage
You can also use Tagparser as a command-line tool:
# Basic usage - extract all tags of a specific type
# Filter by attribute - extract all tags with a specific attribute
# Filter by attribute value - extract tags with a specific attribute value
# Extract content - extract only the text content inside tags
# Extract attribute values - extract values of a specific attribute
Development
Running Tests
The project includes a comprehensive test suite. To run the tests:
The tests are organized into:
- Unit Tests - Testing individual functions and methods
- Integration Tests - Testing the CLI interface
- Documentation Tests - Ensuring examples in documentation work correctly
Project Structure
tagparser/
├── src/
│ ├── parser.rs # Core parsing functionality
│ ├── lib.rs # Library API
│ └── main.rs # CLI implementation
├── tests/
│ ├── parser_tests.rs # Tests for parsing functionality
│ └── cli_tests.rs # Tests for CLI interface
└── README.md