Struct mlscraper_rust::search::TrainingResult
pub struct TrainingResult { /* private fields */ }
The result of “training” the fuzzer on a set of web pages. Contains the selectors for each attribute, as well as the original settings used. If training for a particular attribute failed, the attribute/selector pair will not be present in this object.
This result can also be used to extract data from previously unseen documents, for example:
let mut dom = result.parse(&new_page).expect("parse");
let attribute_1 = result.get_value(&dom, "attribute_1_name").expect("get_value");
let attribute_2 = result.get_value(&dom, "attribute_2_name").expect("get_value");
// ...
Enable the “serde” feature to support serialization/deserialization using serde. This can be useful for reusing previously computed training results.
Implementations
impl TrainingResult
pub fn selectors(&self) -> &HashMap<String, Selector>
pub fn attributes<'a>(&'a self) -> Box<dyn Iterator<Item = &'a str> + 'a>
pub fn parse<'s>(&self, document: &'s str) -> Result<VDom<'s>>
Parse a document and return the DOM object.
Calling this and reusing the DOM object is more efficient than calling TrainingResult::parse_and_get_value multiple times.
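The difference between the two calling patterns can be sketched with the standard library alone. Everything in the block below is a hypothetical stand-in, not the crate's implementation: `MockResult` and `MockDom` only mimic the shape of `TrainingResult` and `VDom`, the "document" is a toy `key=value;` string rather than HTML, and the counter exists purely to make the number of parse passes visible.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for `TrainingResult`; it only counts parse passes.
struct MockResult {
    parse_calls: Cell<usize>,
}

/// Hypothetical stand-in for `VDom`; it just borrows the document text.
struct MockDom<'s> {
    text: &'s str,
}

impl MockResult {
    fn parse<'s>(&self, document: &'s str) -> MockDom<'s> {
        self.parse_calls.set(self.parse_calls.get() + 1);
        MockDom { text: document }
    }

    /// Toy extraction over "key=value;" pairs (an assumption for this sketch).
    fn get_value(&self, dom: &MockDom<'_>, attribute_name: &str) -> Option<String> {
        dom.text.split(';').find_map(|pair| {
            let (key, value) = pair.split_once('=')?;
            (key.trim() == attribute_name).then(|| value.trim().to_string())
        })
    }

    fn parse_and_get_value(&self, document: &str, attribute_name: &str) -> Option<String> {
        let dom = self.parse(document);
        self.get_value(&dom, attribute_name)
    }
}

/// Parse once, then look up every attribute on the same DOM.
fn extract_reusing_dom(result: &MockResult, document: &str, names: &[&str]) -> Vec<Option<String>> {
    let dom = result.parse(document);
    names.iter().map(|name| result.get_value(&dom, name)).collect()
}

/// Re-parse the document for every attribute.
fn extract_reparsing(result: &MockResult, document: &str, names: &[&str]) -> Vec<Option<String>> {
    names
        .iter()
        .map(|name| result.parse_and_get_value(document, name))
        .collect()
}

fn main() {
    let document = "title=Widget; price=9.99";
    let names = ["title", "price"];

    let reuse = MockResult { parse_calls: Cell::new(0) };
    let values = extract_reusing_dom(&reuse, document, &names);

    let reparse = MockResult { parse_calls: Cell::new(0) };
    let same_values = extract_reparsing(&reparse, document, &names);

    assert_eq!(values, same_values);
    println!("reuse: {} parse(s)", reuse.parse_calls.get());
    println!("reparse: {} parse(s)", reparse.parse_calls.get());
}
```

Both helpers return the same values; the reuse variant parses the document once regardless of how many attributes are requested, while the other parses it once per attribute.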
pub fn parse_and_get_value( &self, document: &str, attribute_name: &str ) -> Result<Option<String>>
Parse a document and return the value of the given attribute.
This is equivalent to calling TrainingResult::parse and then TrainingResult::get_value.
pub fn get_value<'a>( &self, dom: &'a VDom<'a>, attribute_name: &str ) -> Result<Option<String>>
Get the value of the given attribute from the given DOM object.
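The return type `Result<Option<String>>` leaves callers with three outcomes to handle. The sketch below shows one way to match on them; it is illustrative only. `describe` is a hypothetical helper, the error type is simplified to `String` (the crate's actual `Result` alias is not shown on this page), and reading `Ok(None)` as "no value was found" is an assumption rather than documented behavior.

```rust
/// Hypothetical helper: turns the three possible outcomes into a message.
/// The `String` error type is a simplification for this sketch.
fn describe(attribute_name: &str, outcome: Result<Option<String>, String>) -> String {
    match outcome {
        // A value was extracted.
        Ok(Some(value)) => format!("{} = {}", attribute_name, value),
        // Assumed meaning: the lookup ran but produced no value.
        Ok(None) => format!("{}: no value found", attribute_name),
        // The lookup itself failed.
        Err(error) => format!("{}: extraction failed ({})", attribute_name, error),
    }
}

fn main() {
    println!("{}", describe("price", Ok(Some("9.99".to_string()))));
    println!("{}", describe("price", Ok(None)));
    println!("{}", describe("price", Err("bad selector".to_string())));
}
```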
pub fn get_selector<'a>(&'a self, attribute_name: &str) -> Option<&'a str>
Get the best selector for the given attribute.
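Since an attribute whose training failed is left out of the result entirely, callers may want to check which attributes are missing before extracting. A minimal sketch follows, using only the standard library: a plain `HashMap<String, String>` stands in for the trained selectors, and `lookup_selector` and `missing_attributes` are hypothetical helpers that mirror the `Option<&str>` shape of `get_selector`. The selector string and attribute names are made up for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `TrainingResult::get_selector`: returns the
/// selector string for an attribute, or `None` if the attribute is absent.
fn lookup_selector<'a>(
    selectors: &'a HashMap<String, String>,
    attribute_name: &str,
) -> Option<&'a str> {
    selectors.get(attribute_name).map(String::as_str)
}

/// Returns the requested attribute names that have no selector.
fn missing_attributes<'a>(
    selectors: &HashMap<String, String>,
    wanted: &[&'a str],
) -> Vec<&'a str> {
    wanted
        .iter()
        .copied()
        .filter(|name| lookup_selector(selectors, name).is_none())
        .collect()
}

fn main() {
    // Pretend training succeeded for "title" but failed for "price".
    let mut selectors = HashMap::new();
    selectors.insert("title".to_string(), "h1.product-title".to_string());

    for name in missing_attributes(&selectors, &["title", "price"]) {
        eprintln!("no selector was learned for attribute {:?}", name);
    }
}
```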
pub fn highlight_selections_with_red_border(&self, dom: &mut VDom<'_>) -> String
Highlight the selected elements for the given attribute in the given DOM object by adding a red border around them.
This both mutates the input DOM and returns the resulting HTML as a String, which, as I realize writing this, may be a poor design choice. TODO.
Example:
let out_html = training_result.highlight_selections_with_red_border(&mut dom);
fs::write("out.html", out_html).expect("write");