# Capricorn
### Parse HTML according to configuration.
### Capricorn is an HTML parsing library that supports recursion and custom execution order.
[![Version info](https://img.shields.io/crates/v/capricorn.svg)](https://crates.io/crates/capricorn)
[![Downloads](https://img.shields.io/crates/d/capricorn.svg?style=flat-square)](https://crates.io/crates/capricorn)
[![docs](https://img.shields.io/badge/docs-latest-blue.svg?style=flat-square)](https://docs.rs/capricorn)
[![example branch parameter](https://github.com/ptechen/capricorn/workflows/CI/badge.svg?branch=main)]()
### Default execution order
    vec![String::from("selects"),
         String::from("each"),
         String::from("select_params"),
         String::from("nodes"),
         String::from("has"),
         String::from("contains")];

Typical chains:

- `selects > each > (one or all or fields) > ... text_attr_html > (text or attr or html)`
- `selects > select_params > selects > ... text_attr_html > (text or attr or html)`
- `selects > nodes > has > contains > text_attr_html > (text or attr or html)`
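
The fallback from a custom `exec_order` to the default order can be sketched as follows. This is a minimal illustration using only the standard library; `resolve_exec_order` and `DEFAULT_ORDER` are hypothetical names, not part of Capricorn's API, and the assumption is that a missing or empty `exec_order` means the default order applies.

```rust
/// Default step order, copied from the list in this section.
const DEFAULT_ORDER: [&str; 6] = [
    "selects",
    "each",
    "select_params",
    "nodes",
    "has",
    "contains",
];

/// Hypothetical helper (not Capricorn's API): use the custom `exec_order`
/// when one is given and non-empty, otherwise fall back to the default.
fn resolve_exec_order(custom: Option<&[&str]>) -> Vec<String> {
    match custom {
        Some(order) if !order.is_empty() => order.iter().map(|s| s.to_string()).collect(),
        _ => DEFAULT_ORDER.iter().map(|s| s.to_string()).collect(),
    }
}

fn main() {
    // No exec_order in the configuration: the default order is used.
    println!("{:?}", resolve_exec_order(None));

    // exec_order: [selects, has, nodes], as in the "exec order" table row.
    let custom: &[&str] = &["selects", "has", "nodes"];
    println!("{:?}", resolve_exec_order(Some(custom)));
}
```

With a custom order, `has` runs before `nodes`, so the class filter is applied before `first: true` narrows the selection.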
### Support:
| Feature | Supported | Example | Value type |
| :----: | :----: | :----- |:----:|
| selects element | ✔ | field_name:<br> selects: <br> - element_name | String |
| selects class | ✔ | field_name:<br> selects: <br> - .class_name | String |
| selects class element | ✔ | field_name: <br> selects: <br> - .class_name <br> - element_name | String |
| first | ✔ | field_name: <br> selects: <br> - element_name <br> nodes: <br> first: true | String |
| last | ✔ | field_name: <br> selects: <br> - element_name <br> nodes: <br> last: true | String |
| eq | ✔ | field_name: <br> selects: <br> - element_name <br> nodes: <br> eq: 0 | String |
| parent | ✔ | field_name: <br> selects: <br> - element_name <br> nodes: <br> parent: true | String |
| children | ✔ | field_name: <br> selects: <br> - element_name <br> nodes: <br> children: true | String |
| prev_sibling | ✔ | field_name: <br> selects: <br> - element_name <br> nodes: <br> prev_sibling: true | String |
| next_sibling | ✔ | field_name: <br> selects: <br> - element_name <br> nodes: <br> next_sibling: true | String |
| has_class | ✔ | field_name: <br> selects: <br> - element_name <br> has: <br> class: class_name | String |
| has_attr | ✔ | field_name: <br> selects: <br> - element_name <br> has: <br> attr: attr_name | String |
| each one | ✔ | field_name: <br> selects: <br> - element_name <br> each: <br> one: <br> selects:<br> - .class_name<br> ... | String |
| each all | ✔ | field_name: <br> selects: <br> - element_name <br> each: <br> all: <br> selects:<br> - .class_name<br> ... | Array |
| each fields | ✔ | field_name: <br> selects: <br> - element_name <br> each: <br> fields: <br> field_name: <br> selects:<br> - .class_name<br> ... <br> field_name1: <br> selects:<br> - .class_name<br> ... | Map |
| select_params | ✔ | field_name: <br> selects: <br> - element_name <br> select_params: <br> selects:<br> - .class_name<br> ... | ... |
| text | ✔ | field_name:<br> selects: <br> - element_name <br> text_attr_html: <br> text: true | String |
| attr | ✔ | field_name:<br> selects: <br> - element_name <br> text_attr_html: <br> attr: true | String |
| html | ✔ | field_name:<br> selects: <br> - element_name <br> text_attr_html: <br> html: true | String |
| text contains | ✔ | field_name:<br> selects: <br> - element_name <br> contains: <br> contains: <br> text: <br> - test | String |
| text not contains | ✔ | field_name:<br> selects: <br> - element_name <br> contains: <br> not_contains: <br> text: <br> - test | String |
| html contains | ✔ | field_name:<br> selects: <br> - element_name <br> contains: <br> contains: <br> html: <br> - test | String |
| html not contains | ✔ | field_name:<br> selects: <br> - element_name <br> contains: <br> not_contains: <br> html: <br> - test | String |
| exec order | ✔ | field_name:<br> exec_order: <br> - selects <br> - has <br> - nodes <br> selects: <br> - element_name <br> has: <br> class: class_name <br> nodes: <br> first: true | String |
| data format splits | ✔ | field_name:<br> selects: <br> - element_name <br> data_format: <br> splits: <br> - { key: str } | Array |
| data format splits (with index) | ✔ | field_name:<br> selects: <br> - element_name <br> data_format: <br> splits: <br> - { key: str, index: 0 } | String |
| data format replaces | ✔ | field_name:<br> selects: <br> - element_name <br> data_format: <br> replaces: <br> - str | String |
| data format deletes | ✔ | field_name:<br> selects: <br> - element_name <br> data_format: <br> deletes: <br> - str | String |
| data format find | ✔ | field_name:<br> selects: <br> - element_name <br> data_format: <br> find: regex | String |
| data format find_iter | ✔ | field_name:<br> selects: <br> - element_name <br> data_format: <br> find_iter: regex | Array |
| Multi-version regex matching err | ✔ |regexes_match_parse_html: <br> - regex: regex <br> version: 1 <br> err: err_msg | Err |
| Multi-version regex matching fields | ✔ |regexes_match_parse_html: <br> - regex: regex <br> version: 1 <br> fields: <br> field_name: <br> selects: <br> ... <br> field_name: <br> selects: <br> ... | Map |
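
The two `data format splits` rows differ only in the `index` key: without it the value type is an Array, with it a String. The sketch below illustrates that behaviour under assumed semantics (split on `key`, then optionally pick one part). `Formatted` and `apply_split` are hypothetical names for illustration and do not reflect Capricorn's internal implementation.

```rust
/// Result shape: the table lists `Array` without `index` and `String` with it.
#[derive(Debug, PartialEq)]
enum Formatted {
    Text(String),
    List(Vec<String>),
}

/// Hypothetical stand-in for one `splits` entry: split `input` on `key`,
/// then optionally keep only the part at `index`.
/// Returns `None` when `index` is out of range (an assumption of this sketch).
fn apply_split(input: &str, key: &str, index: Option<usize>) -> Option<Formatted> {
    let parts: Vec<String> = input.split(key).map(str::to_string).collect();
    match index {
        None => Some(Formatted::List(parts)),
        Some(i) => parts.get(i).cloned().map(Formatted::Text),
    }
}

fn main() {
    // splits: - { key: "-" }
    println!("{:?}", apply_split("2021-03-01", "-", None));
    // splits: - { key: "-", index: 0 }
    println!("{:?}", apply_split("2021-03-01", "-", Some(0)));
}
```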
#### [Parse HTML code, more...](https://github.com/ptechen/Capricorn/blob/main/src/lib.rs)
    let yml = read_file("./test_html/test.yml").unwrap();
    let params: parse::HashMapSelectParams = serde_yaml::from_str(&yml).unwrap();
    let html = read_file("./test_html/test.html").unwrap();
    let r = parse::parse_html(&params, &html);
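
The snippet above calls `read_file` without showing its definition. A minimal stand-in, assuming it simply loads a UTF-8 file into a `String`, could look like this; the real helper in the repository may differ.

```rust
use std::fs;
use std::io;

/// Hypothetical stand-in for the `read_file` helper used in the snippet:
/// reads the whole file at `path` into a `String`.
fn read_file(path: &str) -> io::Result<String> {
    fs::read_to_string(path)
}

fn main() {
    // Handle the error instead of calling `unwrap()`, so a missing file
    // produces a readable message rather than a panic.
    match read_file("./test_html/test.yml") {
        Ok(yml) => println!("read {} bytes of configuration", yml.len()),
        Err(e) => eprintln!("could not read configuration: {e}"),
    }
}
```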
#### [Multi-version regex matching for parsing HTML code, more...](https://github.com/ptechen/Capricorn/blob/main/src/lib.rs)
    let yml = read_file("./test_html/regexes_match_parse_html.yml").unwrap();
    let v: match_html::MatchHtmlVec = serde_yaml::from_str(&yml).unwrap();
    let html = read_file("./test_html/test.html").unwrap();
    let r = v.regexes_match_parse_html(html)?;
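
The idea behind multi-version matching can be sketched as below. This is an illustration under stated assumptions, not Capricorn's implementation: it assumes the first matching entry wins, uses plain substring matching as a stand-in for regexes to stay dependency-free, and returns only the matched version number instead of parsing `fields`. `Version` and `pick_version` are hypothetical names.

```rust
/// One configured version. When `err` is set, a match is reported as an error.
struct Version {
    /// Plain substring in this sketch; the real configuration uses a regex.
    pattern: &'static str,
    version: u32,
    err: Option<&'static str>,
}

/// Return the version of the first entry whose pattern occurs in `html`,
/// or the configured error message when that entry carries `err`.
fn pick_version(versions: &[Version], html: &str) -> Result<u32, String> {
    for v in versions {
        if html.contains(v.pattern) {
            return match v.err {
                Some(msg) => Err(msg.to_string()),
                None => Ok(v.version),
            };
        }
    }
    Err("no version matched".to_string())
}

fn main() {
    let versions = [
        Version { pattern: "captcha", version: 1, err: Some("blocked by captcha") },
        Version { pattern: "<article", version: 2, err: None },
    ];
    println!("{:?}", pick_version(&versions, "<article>hello</article>"));
    println!("{:?}", pick_version(&versions, "<div>captcha</div>"));
}
```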