Crate table_extract
Utility for extracting data from HTML tables.
This library allows you to parse tables from HTML documents and iterate over their rows. There are three entry points:
Table::find_first finds the first table.
Table::find_by_id finds a table by its HTML id.
Table::find_by_headers finds a table that has certain headers.
Each of these returns an Option<Table>, since there might not be any
matching table in the HTML. Once you have a table, you can iterate over it
and access the contents of each Row.
Examples
Here is a simple example that uses Table::find_first to print the cells
in each row of a table:
let html = r#"
    <table>
        <tr><th>Name</th><th>Age</th></tr>
        <tr><td>John</td><td>20</td></tr>
    </table>
"#;
let table = table_extract::Table::find_first(html).unwrap();
for row in &table {
    println!(
        "{} is {} years old",
        row.get("Name").unwrap_or("<name missing>"),
        row.get("Age").unwrap_or("<age missing>")
    )
}
If the document has multiple tables, we can use Table::find_by_headers
to identify the one we want:
let html = r#"
    <table></table>
    <table>
        <tr><th>Name</th><th>Age</th></tr>
        <tr><td>John</td><td>20</td></tr>
    </table>
"#;
let table = table_extract::Table::find_by_headers(html, &["Age"]).unwrap();
for row in &table {
    for cell in row {
        println!("Table cell: {}", cell);
    }
}
Structs
| Iter | An iterator over the rows in a Table. |
| Row | A row in a Table. |
| Table | A parsed HTML table. |
Type Definitions
| Headers | A map from <th> table headers to their zero-based positions. |
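To make header-based lookups such as row.get("Name") more concrete, here is a minimal standalone sketch of the idea. It does not use this crate and is not its implementation. It assumes that a header map associates each header's text with its zero-based column position. The names HeaderMap, build_headers, and get_cell are hypothetical and exist only for this illustration.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a header map (an assumption, not the crate's code):
// header text -> zero-based column position.
type HeaderMap = HashMap<String, usize>;

/// Builds the header map from the header cells of a table, in document order.
/// If a header name repeats, the last occurrence wins.
fn build_headers(header_cells: &[&str]) -> HeaderMap {
    header_cells
        .iter()
        .enumerate()
        .map(|(index, name)| (name.to_string(), index))
        .collect()
}

/// Looks up a cell by header name.
/// Returns `None` if the header is unknown or the row has too few cells.
fn get_cell<'a>(headers: &HeaderMap, row: &[&'a str], name: &str) -> Option<&'a str> {
    let index = *headers.get(name)?;
    row.get(index).copied()
}
```

With headers built from ["Name", "Age"], get_cell(&headers, &["John", "20"], "Age") yields Some("20"), while an unknown header such as "Email" yields None. Returning an Option here is what lets the caller supply a fallback with unwrap_or, as the first example above does.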