# Requests2
A Rust crate that helps you write scrapers in the style of Python's `requests` library combined with the Python **BS4** parsing library.
- Each `Requests` client is created with a `Cache` instance, which stores the parsed data as key-value pairs.
- Once `connect` gives you a connection, call its `parser` method and pass a closure that parses the data.
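
The crate's internals are not shown in this README, so the following is only a minimal sketch of the key-value idea. It assumes a hypothetical `SketchCache` type and a simplified `SketchValue` enum; neither name comes from the crate. Results are written under a string key through a shared reference and read back later.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Simplified stand-in for the crate's `Value` enum (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum SketchValue {
    Str(String),
    List(Vec<String>),
    Null,
}

/// Hypothetical key-value cache; NOT the crate's actual `Cache` type.
/// A `Mutex` lets callers store results through a shared `&SketchCache`.
#[derive(Default)]
pub struct SketchCache {
    inner: Mutex<HashMap<String, SketchValue>>,
}

impl SketchCache {
    pub fn new() -> Self {
        Self::default()
    }

    /// Store a parsed value under `key`, replacing any previous entry.
    pub fn set(&self, key: &str, value: SketchValue) {
        self.inner.lock().unwrap().insert(key.to_string(), value);
    }

    /// Return a copy of the stored value, or `Null` when the key is absent.
    pub fn get(&self, key: &str) -> SketchValue {
        self.inner
            .lock()
            .unwrap()
            .get(key)
            .cloned()
            .unwrap_or(SketchValue::Null)
    }
}
```

Returning `Null` for a missing key keeps lookups free of `Option` handling, at the cost of not distinguishing "absent" from "stored as null".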
## Version 0.1.3 Update Intro
- The parser's `select` function supports many CSS selector forms (listed below), which simplifies parsing the DOM.
- The `free_parse` and `free_select` functions also parse the DOM, but they do not return a `Value`, so you do not need to build any `Value` data.

| Supported selector |
| --- |
| `.class.class` |
| `.class` |
| `#id` |
| `element.class` |
| `element element` |
| `[attr=value]` |
| `[attr~value]` |
| `element` |

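
As a rough illustration of what handling these forms involves, the sketch below breaks a selector such as `li.nav-item a` into element names, ids, and classes. The `Compound` struct and both functions are hypothetical and are not the crate's parser; attribute selectors are left out to keep it short.

```rust
/// One compound selector such as `li.nav-item` (hypothetical type).
#[derive(Debug, Default, PartialEq)]
pub struct Compound {
    pub tag: Option<String>,
    pub id: Option<String>,
    pub classes: Vec<String>,
}

/// Parse a single compound selector: an optional element name followed by
/// any number of `.class` and `#id` parts. Attribute selectors are not handled.
pub fn parse_compound(input: &str) -> Compound {
    let mut out = Compound::default();
    // `kind` remembers which delimiter introduced the token being read.
    let mut kind = ' ';
    let mut token = String::new();
    // A trailing '.' acts as a sentinel so the last token is flushed too.
    for ch in input.chars().chain(std::iter::once('.')) {
        if ch == '.' || ch == '#' {
            if !token.is_empty() {
                let part = std::mem::take(&mut token);
                match kind {
                    '.' => out.classes.push(part),
                    '#' => out.id = Some(part),
                    _ => out.tag = Some(part),
                }
            }
            kind = ch;
        } else {
            token.push(ch);
        }
    }
    out
}

/// Split a descendant selector such as `li.nav-item a` on whitespace.
pub fn parse_selector(input: &str) -> Vec<Compound> {
    input.split_whitespace().map(parse_compound).collect()
}
```

For example, `parse_selector("li.nav-item a")` yields two compounds: an `li` element with class `nav-item`, followed by a bare `a` element.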
Add `#[derive(DBfile)]` to a struct, and you can use the `DBStore::to_csv` function to write a collection of that struct to a CSV file.
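
The code generated by the derive macro is not shown here. As a rough sketch of the idea, assume a hypothetical hand-written `CsvRecord` trait in place of `#[derive(DBfile)]`: each struct supplies its column names and field values, and a helper (`to_csv_string`, also hypothetical) joins them into CSV text.

```rust
/// Hypothetical trait standing in for what a derive macro could generate.
pub trait CsvRecord {
    /// Column names, in field order.
    fn header() -> Vec<&'static str>;
    /// Field values for one row, in the same order as `header`.
    fn row(&self) -> Vec<String>;
}

pub struct Link {
    pub href: String,
    pub link_name: String,
}

impl CsvRecord for Link {
    fn header() -> Vec<&'static str> {
        vec!["href", "link_name"]
    }
    fn row(&self) -> Vec<String> {
        vec![self.href.clone(), self.link_name.clone()]
    }
}

/// Quote a field only when it contains a comma, quote, or line break.
fn escape(field: &str) -> String {
    if field.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", field.replace('"', "\"\""))
    } else {
        field.to_string()
    }
}

/// Render records as CSV text, optionally preceded by a header line.
pub fn to_csv_string<T: CsvRecord>(records: &[T], with_header: bool) -> String {
    let mut out = String::new();
    if with_header {
        out.push_str(&T::header().join(","));
        out.push('\n');
    }
    for record in records {
        let fields: Vec<String> = record.row().iter().map(|f| escape(f)).collect();
        out.push_str(&fields.join(","));
        out.push('\n');
    }
    out
}
```

Writing the returned string to disk with `std::fs::write` would complete the picture; the sketch stops at the text so that it stays free of file paths.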
### Example code
```rust
let data = Cache::new();
let client = Requests::new(&data);
let rq = client.connect("https://www.qq.com/", Headers::Default);

// Deriving `DBfile` lets `DBStore::to_csv` write this struct to a CSV file.
#[derive(DBfile, Debug)]
struct Link<'a> {
    href: &'a str,
    link_name: String,
}

rq.free_parse(|mut p| {
    // Select every link inside the navigation items.
    p.free_select("li.nav-item a", |n| {
        let links = n.iter().map(|x| {
            Link {
                href: x.attr("href").expect("extract href error"),
                link_name: x.text(),
            }
        }).collect::<Vec<Link>>();
        DBStore::to_csv(links, "D:\\links.csv", "a", true);
    });
});
```
Open `links.csv` to view the result:
```csv
href,link_name
http://news.qq.com/,新闻
https://v.qq.com/?isoldly=1,视频
http://gongyi.qq.com/,公益
https://new.qq.com/ch/milite/,军事
https://sports.qq.com/,体育
https://sports.qq.com/nba/,NBA
https://new.qq.com/ch/ent/,娱乐
https://new.qq.com/ch/finance/,财经
https://new.qq.com/ch/tech/,科技
https://new.qq.com/ch/fashion/,时尚
https://new.qq.com/ch/auto/,汽车
http://house.qq.com/,房产
https://new.qq.com/ch/edu/,教育
https://new.qq.com/ch/cul/,文化
https://new.qq.com/ch/astro/,星座
https://new.qq.com/ch/games/,游戏
http://book.qq.com/,文学
https://v.qq.com/tv/,热剧
https://new.qq.com/ch/antip/,抗肺炎
http://new.qq.com/ch/history/,历史
http://sports.qq.com/premierleague/,英超
http://sports.qq.com/cba/,CBA
https://new.qq.com/ch2/star,明星
https://new.qq.com/ch/finance_licai/,理财
https://new.qq.com/ch/kepu/,科普
https://new.qq.com/ch/health/,健康
https://auto.qq.com/car_public/index.shtml,车型
http://www.jia360.com,家居
https://new.qq.com/ch/baby/,育儿
https://new.qq.com/ch/emotion/,情感
https://new.qq.com/ch/comic/,动漫
https://new.qq.com/omv/,享看
http://tianqi.qq.com/index.htm,天气
https://new.qq.com/omn/author/5107513,较真
https://v.qq.com/channel/variety,综艺
https://new.qq.com/ch/cul_ru/,新国风
https://new.qq.com/ch/world/,国际
http://sports.qq.com/csocce/csl/,中超
http://fans.sports.qq.com/#/,社区
http://v.qq.com/movie/,电影
https://new.qq.com/ch/finance_stock/,证券
https://new.qq.com/ch/digi/,数码
https://new.qq.com/ch2/makeup,美容
https://new.qq.com/ch/topic/,话题
https://new.qq.com/ch/life/,生活
http://kid.qq.com/,儿童
http://www.qq.com/map/,全部
```
## Example
```rust
let data = Cache::new();
let client = Requests::new(&data);
let mut rq = client.connect("https://www.qq.com/", Headers::Default);
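rq.parser(|p| {
    // Keep only <a> elements whose href starts with "http".
    p.find_all("a", |x| {
        x.attr("href").map_or(false, |v| v.starts_with("http"))
    }, "href")
}, "href"); // store the collected links under the "href" key
data.print();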
```
Call `data.print()` to view the value stored under the `href` key. The stored value is a `Value`, an enum that covers the most common data types.

Output:
```text
Key -- "href" Value -- LIST(["https://qzone.qq.com", "https://qzone.qq.com", "https://mail.qq.com", "https://mail.qq.com/cgi-bin/loginpage", "http://news.qq.com/", "https://v.qq.com/?isoldly=1", "http://gongyi.qq.com/",...]
```
## Headers
`Headers` defines three kinds of request headers. The default, `Headers::Default`, sends only a `user-agent` field, and `Headers::None` sends no header fields at all. You can also pass a JSON string through `Headers::JSON`; the following code builds a request header containing `user-agent` and `host`:
```rust
let headers = r#"{"user-agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36", "host": "www.qq.com"}"#;
let store = Cache::new();
let client = Requests::new(&store);
let mut p = client.connect("https://www.qq.com", Headers::JSON(headers));
```
If you need more request header fields, add the corresponding fields in [`headers.rs`](https://docs.rs/requests2/0.1.2/requests2/headers/enum.Headers.html).
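
To illustrate how three header modes can resolve into concrete fields, here is a minimal sketch. `SketchHeaders`, `quoted_strings`, and `resolve` are hypothetical names, the default user-agent value is a placeholder, and the JSON handling is deliberately naive: it only copes with a flat object whose keys and values are all strings. A real implementation would use a proper JSON parser.

```rust
/// Hypothetical mirror of the three header modes described above.
pub enum SketchHeaders<'a> {
    /// Only a `user-agent` field (the value used below is a placeholder).
    Default,
    /// No header fields at all.
    None,
    /// A flat JSON object whose keys and values are all strings.
    Json(&'a str),
}

/// Collect every double-quoted string in `json`, in order of appearance.
/// Naive on purpose: a backslash keeps the next character verbatim.
fn quoted_strings(json: &str) -> Vec<String> {
    let mut found = Vec::new();
    let mut current = String::new();
    let mut in_string = false;
    let mut chars = json.chars();
    while let Some(ch) = chars.next() {
        if !in_string {
            // Outside a string only an opening quote matters.
            in_string = ch == '"';
        } else if ch == '\\' {
            if let Some(next) = chars.next() {
                current.push(next);
            }
        } else if ch == '"' {
            // Closing quote: the string is complete.
            found.push(std::mem::take(&mut current));
            in_string = false;
        } else {
            current.push(ch);
        }
    }
    found
}

/// Turn a header mode into `(name, value)` pairs.
pub fn resolve(headers: &SketchHeaders<'_>) -> Vec<(String, String)> {
    match headers {
        SketchHeaders::Default => {
            vec![("user-agent".to_string(), "sketch-agent/0.1".to_string())]
        }
        SketchHeaders::None => Vec::new(),
        // In a flat string-to-string object, quoted strings alternate
        // between key and value, so consecutive pairs form the fields.
        SketchHeaders::Json(text) => quoted_strings(text)
            .chunks_exact(2)
            .map(|pair| (pair[0].clone(), pair[1].clone()))
            .collect(),
    }
}
```

Scanning for quoted strings, rather than splitting on commas and colons, matters here because user-agent values routinely contain both.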
## Parser
After you connect to a URL with the `connect` method, use the `parser` method to do the HTML parsing. The parser currently provides the `find` and `find_all` methods.
The closure passed to `parser` must return a [`Value`](https://docs.rs/requests2/0.1.2/requests2/value/enum.Value.html).
With `find` or `find_all`, the code looks like this:
```rust
rq.parser(|p| {
    p.find_all("a", |x| {
        x.attr("href").map_or(false, |v| v.starts_with("http"))
    }, "href")
}, "href");
```
The closure passed as the first argument of `parser` produces the result; with `find_all`, the matched values are saved automatically as a `Value::LIST`.
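
To make the filter-and-collect behaviour concrete, the sketch below applies the same steps to a hypothetical, simplified `Node` type instead of a real parsed document. The free function `find_all` here is an illustration only and does not share the crate's signature.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified DOM node; the crate works on real parsed HTML.
pub struct Node {
    pub tag: String,
    pub attrs: HashMap<String, String>,
}

impl Node {
    /// Look up an attribute value by name.
    pub fn attr(&self, name: &str) -> Option<&str> {
        self.attrs.get(name).map(String::as_str)
    }
}

/// Keep nodes named `tag` that satisfy `keep`, then collect the value of
/// `attr` from each one. Nodes without that attribute are skipped.
pub fn find_all<F>(nodes: &[Node], tag: &str, keep: F, attr: &str) -> Vec<String>
where
    F: Fn(&Node) -> bool,
{
    nodes
        .iter()
        .filter(|n| n.tag == tag && keep(n))
        .filter_map(|n| n.attr(attr).map(str::to_string))
        .collect()
}
```

The resulting `Vec<String>` has the same shape as the payload of a `LIST` value.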
More often, you will need to handle the parsing manually and build the returned `Value` yourself. Use the `p.select` function for this, as in the following example:
```rust
use std::collections::HashMap;

let data = Cache::new();
let client = Requests::new(&data);
let mut parser = client.connect("https://www.qq.com", Headers::Default);
parser.parser(|p| {
    let mut result = HashMap::new();
    // Navigation bar: one map per item, holding the link text and its href.
    let navs = p.select("li.nav-item", |nodes| {
        let navs = nodes.into_iter().map(|n| {
            let mut item = HashMap::new();
            // Fall back to a single NULL entry when the item has no <a> child.
            n.find(Name("a")).next().map_or(HashMap::from([("".to_string(), Value::NULL)]), |a| {
                let nav_name = a.text();
                let nav_href = a.attr("href").map_or(String::from(""), |x| x.to_string());
                item.insert("nav_name".to_string(), Value::STR(nav_name));
                item.insert("nav_href".to_string(), Value::STR(nav_href));
                item
            })
        }).collect::<Vec<HashMap<String, Value>>>();
        Value::VECMAP(navs)
    });
    // Headlines: the text of every link inside a `news-top` element.
    let news = p.select("ul.yw-list", |nodes| {
        let mut news = Vec::new();
        for node in nodes {
            for n in node.find(Class("news-top")) {
                for a in n.find(Name("a")) {
                    let title = a.text();
                    news.push(title);
                }
            }
        }
        Value::LIST(news)
    });
    result.insert("titles".to_owned(), news);
    result.insert("nav".to_owned(), navs);
    Value::MAP(result)
}, "index");
data.print();
```
## Value
```rust
pub enum Value {
    /// String
    STR(String),
    /// List of strings
    LIST(Vec<String>),
    /// 32-bit signed integer
    INT(i32),
    /// Empty (null) data
    NULL,
    /// Boolean
    BOOL(bool),
    /// List of maps
    VECMAP(Vec<HashMap<String, Value>>),
    /// Map
    MAP(HashMap<String, Value>),
}
```
If you need another data type, add it to the enum in **Value.rs**.
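
Reading nested results back out means matching on the enum. The sketch below repeats the enum locally so that it compiles on its own, and adds two hypothetical convenience methods, `as_list` and `get`; these are assumptions for illustration and are not known to exist in the crate.

```rust
use std::collections::HashMap;

/// Local copy of the enum shown above, so the sketch compiles on its own.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    STR(String),
    LIST(Vec<String>),
    INT(i32),
    NULL,
    BOOL(bool),
    VECMAP(Vec<HashMap<String, Value>>),
    MAP(HashMap<String, Value>),
}

impl Value {
    /// Hypothetical helper: borrow the inner strings if this is a `LIST`.
    pub fn as_list(&self) -> Option<&[String]> {
        match self {
            Value::LIST(items) => Some(items.as_slice()),
            _ => None,
        }
    }

    /// Hypothetical helper: look up `key` if this is a `MAP`.
    pub fn get(&self, key: &str) -> Option<&Value> {
        match self {
            Value::MAP(map) => map.get(key),
            _ => None,
        }
    }
}
```

With these helpers, `root.get("titles").and_then(Value::as_list)` reaches a list stored inside a map without nested `match` blocks.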
## Concurrency support
Concurrency has been tested with the `rayon` library. Here is a simple example:
```rust
use rayon::prelude::*;

let data = Cache::new();
let client = Requests::new(&data);
let urls = ["https://www.baidu.com", "https://www.qq.com", "https://www.163.com"];
// Fetch and parse every URL in parallel; results land in the shared cache.
urls.par_iter().for_each(|url| {
    let mut p = client.connect(url, Headers::Default);
    // Collect links under the key "<url>_link".
    p.parser(|p| {
        p.find_all("a", |f| f.attr("href").map_or(false, |v| v.starts_with("http://")), "href")
    }, format!("{}_link", url).as_str());
    // Store the page title under the key "<url>_title".
    p.parser(|p| {
        p.find("title", |f| f.text() != "", "text")
    }, format!("{}_title", url).as_str());
});
match data.get("https://www.qq.com_title") {
    Value::STR(i) => assert_eq!(i, "腾讯首页"),
    _ => panic!("expected a STR value for the qq.com title"),
};
if let Value::STR(i) = data.get("https://www.163.com_title") {
    assert_eq!(i, "网易");
}
if let Value::STR(i) = data.get("https://www.baidu.com_title") {
    assert_eq!(i, "百度一下,你就知道");
}
```
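
The example above relies on the cache being safe to share between worker threads. The sketch below shows the same pattern using only the standard library: scoped threads take the place of `rayon`, a `Mutex<HashMap>` takes the place of the crate's `Cache`, and `fetch_title` is a hypothetical stand-in for the connect-and-parse step that performs no network I/O.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for "connect and parse the <title>"; no network I/O.
fn fetch_title(url: &str) -> String {
    format!("title of {url}")
}

/// Process every URL on its own scoped thread and store each result under
/// the key `"<url>_title"` in a shared map.
pub fn crawl_titles(urls: &[&str]) -> HashMap<String, String> {
    let store = Mutex::new(HashMap::new());
    thread::scope(|scope| {
        for url in urls {
            // Each thread gets a shared reference to the same store.
            let store = &store;
            scope.spawn(move || {
                let title = fetch_title(url);
                store.lock().unwrap().insert(format!("{url}_title"), title);
            });
        }
    });
    // All threads have joined by the time `thread::scope` returns.
    store.into_inner().unwrap()
}
```

Spawning one thread per URL is acceptable for a handful of pages; a work-stealing pool such as the one `rayon` provides scales better for long URL lists.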