Requests2
A Rust crate that helps you write code in the style of Python's requests library, with HTML parsing similar to Python's BS4 library.
- Each new request initializes a cache instance, which stores the parsed data as key-value pairs.
- Once you have a connect instance, you can call its parser method and describe the parsing as a closure.
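To make the two points above concrete, here is a minimal std-only sketch of the pattern: a cache keyed by string, and a parser method that runs a closure and stores its result. `Cache`, `Connection`, and `collect_lines` are hypothetical stand-ins, not the crate's actual types, and the response body is passed in directly instead of being fetched. Only the `LIST` and `STR` variant names are taken from this README.

```rust
use std::collections::HashMap;

/// Hypothetical value type. The LIST and STR names follow this README;
/// the payload types are assumptions.
#[allow(non_camel_case_types, dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Value {
    STR(String),
    LIST(Vec<String>),
}

/// Hypothetical stand-in for the cache: parsed data stored by key.
#[derive(Debug, Default)]
struct Cache {
    entries: HashMap<String, Value>,
}

impl Cache {
    fn get(&self, key: &str) -> Option<&Value> {
        self.entries.get(key)
    }
}

/// Hypothetical stand-in for a connection that already holds a response body.
struct Connection<'a> {
    body: String,
    cache: &'a mut Cache,
}

impl<'a> Connection<'a> {
    /// Runs the closure on the body and stores its result under `key`.
    fn parser<F>(&mut self, parse: F, key: &str)
    where
        F: FnOnce(&str) -> Value,
    {
        let value = parse(&self.body);
        self.cache.entries.insert(key.to_string(), value);
    }
}

/// Stores every trimmed line of `body` as a LIST under the "href" key.
fn collect_lines(body: &str) -> Cache {
    let mut cache = Cache::default();
    let mut conn = Connection { body: body.to_string(), cache: &mut cache };
    conn.parser(
        |text| Value::LIST(text.lines().map(|l| l.trim().to_string()).collect()),
        "href",
    );
    cache
}

fn main() {
    let cache = collect_lines("http://news.qq.com/\nhttps://v.qq.com/");
    println!("{:?}", cache.get("href"));
}
```

The closure decides what gets stored, and the cache only sees the finished `Value`. That keeps the parsing logic at the call site.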
Version 0.1.3 Update Intro
The parser select function supports a wide range of CSS selector syntax, mainly to simplify DOM parsing.
- Use the free_parser and free_select functions to parse the DOM. These functions do not return a Value, so no Value-typed data is required.
| Supported CSS selectors |
|---|
| .class.class |
| .class |
| #id |
| element.class |
| element element |
| [attr=value] |
| [attr~value] |
| element |
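As a reading aid for the table above, the following std-only sketch shows what the simplest selector forms mean when checked against a single element. It is not the crate's implementation: `Element` and `matches_simple` are hypothetical, and the descendant (`element element`) and attribute forms are deliberately left out.

```rust
/// Hypothetical, flattened view of one DOM element.
struct Element<'a> {
    tag: &'a str,
    id: Option<&'a str>,
    classes: Vec<&'a str>,
}

/// Checks `element`, `.class`, `.class.class`, `element.class`, and `#id`.
fn matches_simple(selector: &str, el: &Element) -> bool {
    if selector.is_empty() {
        return false;
    }
    // `#id` form: compare against the element's id.
    if let Some(id) = selector.strip_prefix('#') {
        return el.id == Some(id);
    }
    // Split "tag.class1.class2" on '.'; the first piece is the tag and is
    // empty when the selector starts with a dot.
    let mut parts = selector.split('.');
    let tag = parts.next().unwrap_or("");
    if !tag.is_empty() && tag != el.tag {
        return false;
    }
    // Every remaining piece must be a class carried by the element.
    parts.all(|class| !class.is_empty() && el.classes.iter().any(|c| *c == class))
}

fn main() {
    let el = Element { tag: "a", id: Some("top"), classes: vec!["nav", "active"] };
    for selector in ["a", ".nav", ".nav.active", "#top", "a.nav", "div.nav"] {
        println!("{selector}: {}", matches_simple(selector, &el));
    }
}
```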
Add #[derive(DBfile)] to a struct, and you can use the DBStore::to_csv function to write its data to a CSV file.
Example code
let data = new;
let client = new;
let rq = client.connect;
rq.free_parse;
Open links.csv to view the result:
href,link_name
http://news.qq.com/,新闻
https://v.qq.com/?isoldly=1,视频
http://gongyi.qq.com/,公益
https://new.qq.com/ch/milite/,军事
https://sports.qq.com/,体育
https://sports.qq.com/nba/,NBA
https://new.qq.com/ch/ent/,娱乐
https://new.qq.com/ch/finance/,财经
https://new.qq.com/ch/tech/,科技
https://new.qq.com/ch/fashion/,时尚
https://new.qq.com/ch/auto/,汽车
http://house.qq.com/,房产
https://new.qq.com/ch/edu/,教育
https://new.qq.com/ch/cul/,文化
https://new.qq.com/ch/astro/,星座
https://new.qq.com/ch/games/,游戏
http://book.qq.com/,文学
https://v.qq.com/tv/,热剧
https://new.qq.com/ch/antip/,抗肺炎
http://new.qq.com/ch/history/,历史
http://sports.qq.com/premierleague/,英超
http://sports.qq.com/cba/,CBA
https://new.qq.com/ch2/star,明星
https://new.qq.com/ch/finance_licai/,理财
https://new.qq.com/ch/kepu/,科普
https://new.qq.com/ch/health/,健康
https://auto.qq.com/car_public/index.shtml,车型
http://www.jia360.com,家居
https://new.qq.com/ch/baby/,育儿
https://new.qq.com/ch/emotion/,情感
https://new.qq.com/ch/comic/,动漫
https://new.qq.com/omv/,享看
http://tianqi.qq.com/index.htm,天气
https://new.qq.com/omn/author/5107513,较真
https://v.qq.com/channel/variety,综艺
https://new.qq.com/ch/cul_ru/,新国风
https://new.qq.com/ch/world/,国际
http://sports.qq.com/csocce/csl/,中超
http://fans.sports.qq.com/#/,社区
http://v.qq.com/movie/,电影
https://new.qq.com/ch/finance_stock/,证券
https://new.qq.com/ch/digi/,数码
https://new.qq.com/ch2/makeup,美容
https://new.qq.com/ch/topic/,话题
https://new.qq.com/ch/life/,生活
http://kid.qq.com/,儿童
http://www.qq.com/map/,全部
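The rows above follow a plain `href,link_name` layout. As a rough illustration of how records like these can be turned into CSV text, here is a std-only sketch. It does not use the DBfile derive or DBStore::to_csv; the `Link` struct and the `csv_field` and `to_csv_string` helpers are hypothetical.

```rust
/// Hypothetical record type matching the two columns shown above.
struct Link {
    href: String,
    link_name: String,
}

/// Quotes a field only when it contains a comma, a quote, or a line break.
fn csv_field(field: &str) -> String {
    if field.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        // Inner quotes are doubled, then the whole field is wrapped in quotes.
        format!("\"{}\"", field.replace('"', "\"\""))
    } else {
        field.to_string()
    }
}

/// Builds the CSV text: one header line, then one line per record.
fn to_csv_string(rows: &[Link]) -> String {
    let mut out = String::from("href,link_name\n");
    for row in rows {
        out.push_str(&csv_field(&row.href));
        out.push(',');
        out.push_str(&csv_field(&row.link_name));
        out.push('\n');
    }
    out
}

fn main() {
    let rows = vec![Link {
        href: "http://news.qq.com/".to_string(),
        link_name: "新闻".to_string(),
    }];
    print!("{}", to_csv_string(&rows));
}
```

The quoting step matters for scraped data, because link text can contain commas or quotes.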
Example
let data = new;
let client = new;
let mut rq = client.connect;
rq.parser; //
data.print
Use data.print to view the value stored under the href key. The stored value is of the Value enum type, which covers most common data types.
- output
Key -- "href" Value -- LIST
Headers
Headers defines three kinds of request headers. The default, Header::default, contains only a user agent. Headers::None sends no headers at all. You can also build the request headers from a JSON string, for example one containing user-agent and host:
let headers = r#"{"user-agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36", "host": "www.qq.com"}"#;
let store = new;
let client = new;
let mut p = client.connect;
If you need more request header fields, add the corresponding fields in headers.rs.
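The three header choices described in this section can be pictured as an enum. The sketch below is an assumption about the general shape, not the contents of headers.rs: `HeaderChoice`, `header_pairs`, and the `DEFAULT_UA` placeholder are hypothetical, and the custom variant takes ready-made pairs instead of parsing a JSON string, to stay dependency-free.

```rust
/// Placeholder user agent; the crate's real default is not shown in this README.
const DEFAULT_UA: &str = "example-agent/0.1";

/// Hypothetical model of the three header choices.
enum HeaderChoice {
    /// Only a user agent.
    Default,
    /// No headers at all.
    None,
    /// Caller-supplied name/value pairs (the crate accepts these as JSON).
    Custom(Vec<(String, String)>),
}

/// Resolves a choice into the header pairs that would be sent.
fn header_pairs(choice: &HeaderChoice) -> Vec<(String, String)> {
    match choice {
        HeaderChoice::Default => {
            vec![("user-agent".to_string(), DEFAULT_UA.to_string())]
        }
        HeaderChoice::None => Vec::new(),
        // Header names are case-insensitive, so they are lowercased here.
        HeaderChoice::Custom(pairs) => pairs
            .iter()
            .map(|(name, value)| (name.to_lowercase(), value.clone()))
            .collect(),
    }
}

fn main() {
    let custom = HeaderChoice::Custom(vec![
        ("Host".to_string(), "www.qq.com".to_string()),
    ]);
    for choice in [HeaderChoice::Default, HeaderChoice::None, custom] {
        println!("{:?}", header_pairs(&choice));
    }
}
```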
Parser
After connecting to a URL with the connect method, use the parser method to describe how the HTML should be parsed. The parser currently provides the find and find_all methods.
The parser closure must return a Value.
If you use find or find_all, the code looks like this:
rq.parser
The closure passed as the first parameter of parser produces the result, which is automatically saved as a Value list.
In most cases you will want to handle the parsing yourself and customize the returned Value. For that, use the p.select function, as in this example:
let data = new;
let client = new;
let mut parser = client.connect;
parser.parser;
data.print;
Value
If you need a data type that is not covered yet, add it in Value.rs.
Concurrency support
Concurrency was tested with the rayon library. Here is a simple example:
let data = new;
let client = new;
let urls = ;
let _ = urls.par_iter.map
.map.;
match data.get ;
if let STR = data.get
if let STR = data.get
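The same fan-out pattern can be sketched without external crates. The example below swaps rayon's par_iter for std scoped threads (Rust 1.63 or later) and shares one map behind a Mutex. `process` and `process_all` are hypothetical: `process` only returns the URL length, so the sketch stays offline and performs no real request.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::thread;

/// Placeholder for "fetch and parse one URL"; returns the URL length.
fn process(url: &str) -> usize {
    url.len()
}

/// Processes every URL on its own thread and collects the results by URL.
fn process_all(urls: &[&str]) -> HashMap<String, usize> {
    let shared = Mutex::new(HashMap::new());
    thread::scope(|scope| {
        for &url in urls {
            let shared = &shared;
            scope.spawn(move || {
                let result = process(url);
                // The lock is held only for the insert, not during the work.
                shared.lock().unwrap().insert(url.to_string(), result);
            });
        }
    });
    // All threads are joined when the scope ends, so the map is complete.
    shared.into_inner().unwrap()
}

fn main() {
    let results = process_all(&["https://www.qq.com", "https://v.qq.com"]);
    println!("{} entries", results.len());
}
```

One thread per URL is fine for a handful of URLs. For long lists a thread pool such as rayon's, as used in this README, is the better fit.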