fav_core is the core library of fav_cli (a CLI tool that downloads remote resources and keeps a local state in protobuf). In short, fav_core is a helper for building a stateful crawler.
§Usage
fav_utils provides the utilities for fav_cli, which currently supports only BiliBili (similar to a Chinese YouTube). You can treat it as an example of how to use this crate.
To save status, this crate uses protobuf instead of JSON because it is faster. You need to define the data structures with protobuf, as in this example (to derive traits for the code generated by protobuf, see example).
A Sets contains multiple Set values, and each Set contains multiple Res (resource) values. The workflow is:
- fetch Sets to refresh the Sets
- fetch Set to refresh the Ress
- fetch and pull Res to download
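The three steps above can be sketched with plain Rust types. This is only an illustration of the data flow, not fav_core's API: the structs `Sets`, `Set`, `Res`, the `FakeRemote` catalog, and the free functions below are all hypothetical stand-ins (in a real app the data structures would be generated from protobuf and the fetches would hit the network).

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a resource (`Res`).
#[derive(Debug)]
struct Res {
    title: String,
    fetched: bool,
    saved: bool,
}

/// Hypothetical stand-in for a `Set`, which owns resources.
#[derive(Debug)]
struct Set {
    title: String,
    ress: Vec<Res>,
}

/// Hypothetical stand-in for `Sets`, which owns sets.
#[derive(Debug, Default)]
struct Sets {
    sets: Vec<Set>,
}

/// Fake remote site (assumption): maps a set title to its resource titles.
struct FakeRemote {
    catalog: BTreeMap<&'static str, Vec<&'static str>>,
}

impl FakeRemote {
    fn new() -> Self {
        let catalog = BTreeMap::from([
            ("English", vec!["Oliver Twist", "Roman Holiday", "Twelve Angry Men"]),
            ("Chinese", vec![]),
            ("Japanese", vec![]),
        ]);
        FakeRemote { catalog }
    }
}

/// Step 1: fetch `Sets` to refresh the list of `Set`s.
fn fetch_sets(remote: &FakeRemote, sets: &mut Sets) {
    for &title in remote.catalog.keys() {
        if !sets.sets.iter().any(|s| s.title == title) {
            sets.sets.push(Set { title: title.to_string(), ress: Vec::new() });
        }
    }
}

/// Step 2: fetch a `Set` to refresh its `Res`s.
fn fetch_set(remote: &FakeRemote, set: &mut Set) {
    if let Some(titles) = remote.catalog.get(set.title.as_str()) {
        for &title in titles {
            if !set.ress.iter().any(|r| r.title == title) {
                set.ress.push(Res { title: title.to_string(), fetched: false, saved: false });
            }
        }
    }
}

/// Step 3a: fetch a `Res` (here it only marks the resource as fetched).
fn fetch_res(res: &mut Res) {
    res.fetched = true;
}

/// Step 3b: pull a `Res`; only fetched resources can be saved.
fn pull_res(res: &mut Res) -> bool {
    if res.fetched {
        res.saved = true;
    }
    res.saved
}

/// Runs the whole workflow once and returns the refreshed local state.
fn run_workflow() -> Sets {
    let remote = FakeRemote::new();
    let mut sets = Sets::default();
    fetch_sets(&remote, &mut sets);
    for set in &mut sets.sets {
        fetch_set(&remote, set);
        for res in &mut set.ress {
            fetch_res(res);
            pull_res(res);
        }
    }
    sets
}
```

Calling `run_workflow()` from a `main` function yields a `Sets` value whose resources are all marked fetched and saved; persisting that value is what keeps the crawler stateful between runs.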
To implement this workflow and maintain a local state, fav_core has many useful traits:
- network helper
  - Api: helps define the APIs
  - ApiProvider: makes the app able to provide an API based on the ApiKind enum
  - Net: makes the app able to use the Internet
- Config
  - Config: HttpConfig + ProtoLocal, marks the app as configurable and persistable
  - HttpConfig: defines the default headers and cookies
- Status and attributes
  - Sets: iterate over and get a subset of sets
  - Set: iterate over and get a subset of resources
  - Res: Meta
  - Meta: the metadata of a resource, Meta: Attr + Status
  - Attr: provides the resource’s id and title
  - Status: the status of a resource, such as saved, fetched, tracked and expired
- Operations
  - Ops: Ops: AuthOps + SetsOps + SetOps + ResOps, means the app can perform all needed operations
  - AuthOps: used to log in and log out
  - SetsOps: used to fetch_sets info, for example, add English, Chinese and Japanese as new movie collections to the Sets defined in protobuf
  - SetOps: used to fetch_set info, for example, add 《Oliver Twist》, 《Roman Holiday》 and 《Twelve Angry Men》 to the English collection
  - ResOps: used to fetch and pull, for example, fetch the id of 《Oliver Twist》 on the target website, then pull the resource to local disk based on the fetched id
- Persistence
  - PathInfo: defines where to store status and config
  - ProtoLocal: ProtoLocal: PathInfo + MessageFull, used to read and write status and config
  - SaveLocal: makes the app able to download Res and modify the local status
- visualize (optional): show status as table
- Ext methods:
  - SetOpsExt: SetOps, batch fetch the sets in Sets
  - ResOpsExt: ResOps, batch fetch the resources in a Set
  - XXStatusExt: batch modify children’s StatusFlags
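The trait relationships listed above (for example Meta: Attr + Status, and the Ext traits that act on many children at once) follow a common Rust pattern of supertraits plus extension traits. The sketch below illustrates that pattern only; the method names, the `u8` bit flags, the blanket impl, and the `Movie` type are assumptions made for the example and are not fav_core's actual signatures.

```rust
// Hypothetical status flags stored as bits (assumption, not fav_core's StatusFlags).
const SAVED: u8 = 1;
const FETCHED: u8 = 1 << 1;
#[allow(dead_code)]
const TRACK: u8 = 1 << 2;
#[allow(dead_code)]
const EXPIRED: u8 = 1 << 3;

/// Provides a resource's id and title.
trait Attr {
    fn id(&self) -> &str;
    fn title(&self) -> &str;
}

/// Provides read/write access to a resource's status flags.
trait Status {
    fn flags(&self) -> u8;
    fn flags_mut(&mut self) -> &mut u8;

    /// Returns true if `flag` is set.
    fn check(&self, flag: u8) -> bool {
        (self.flags() & flag) != 0
    }

    /// Sets `flag`.
    fn on(&mut self, flag: u8) {
        *self.flags_mut() |= flag;
    }
}

/// `Meta` is just the combination of `Attr` and `Status`.
trait Meta: Attr + Status {}

/// Blanket impl (a design choice of this sketch): anything with both is `Meta`.
impl<T: Attr + Status> Meta for T {}

/// Extension trait that modifies the status of many children at once.
trait StatusExt {
    fn on_all(&mut self, flag: u8);
}

impl<T: Status> StatusExt for [T] {
    fn on_all(&mut self, flag: u8) {
        for child in self.iter_mut() {
            child.on(flag);
        }
    }
}

/// Hypothetical resource type used only for this example.
struct Movie {
    id: String,
    title: String,
    flags: u8,
}

impl Movie {
    fn new(id: &str, title: &str) -> Self {
        Movie { id: id.to_string(), title: title.to_string(), flags: 0 }
    }
}

impl Attr for Movie {
    fn id(&self) -> &str {
        &self.id
    }
    fn title(&self) -> &str {
        &self.title
    }
}

impl Status for Movie {
    fn flags(&self) -> u8 {
        self.flags
    }
    fn flags_mut(&mut self) -> &mut u8 {
        &mut self.flags
    }
}

/// Works for any `Meta`, using both attribute and status methods.
fn describe<M: Meta>(m: &M) -> String {
    format!(
        "{}: {} (fetched: {}, saved: {})",
        m.id(),
        m.title(),
        m.check(FETCHED),
        m.check(SAVED)
    )
}
```

With these definitions, calling `on_all(FETCHED)` on a `Vec<Movie>` marks every movie as fetched in one call, and `describe` accepts any type that implements both `Attr` and `Status`.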
In conclusion, this crate contains all the traits you need to build a stateful crawler. You can define data structures with protobuf for fast reads and writes, and make them stateful, configurable, and persistable. Several network helpers are provided, so you can call request_json and request_protobuf directly. Ext traits are also provided so that you can batch fetch and pull data or modify the resources’ StatusFlags.
An example can be found in the fav repo.
§CHANGELOG
- 0.1.1 -> 0.1.2: XXOpsExt needs batch_size passed so that users can define the number of concurrent jobs.
- 0.0.X -> 0.1.X: The methods of Ops related traits need Fut: Future<...>. If the Future is ready, one can clean up, shut down gracefully and return FavCoreError::Cancel. OpsExt methods handle SIGINT based on this, which keeps things reliable.
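The two changelog entries describe bounded concurrency (batch_size) and cooperative cancellation. The following std-only sketch illustrates both ideas, but it deliberately swaps in scoped threads and an `AtomicBool` where the crate uses futures and SIGINT handling. The names `batch_fetch`, `fetch_one`, and `BatchError` are hypothetical and do not come from fav_core.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// Hypothetical error type, standing in for a cancel variant such as
/// the FavCoreError::Cancel mentioned in the changelog.
#[derive(Debug, PartialEq)]
enum BatchError {
    Cancel,
}

/// Placeholder for a real network fetch (assumption: it cannot fail here).
fn fetch_one(id: u32) -> String {
    format!("res-{id}")
}

/// Fetches `ids` in batches of at most `batch_size` concurrent jobs.
/// The `cancel` flag is checked before each batch starts, so a batch that
/// is already running finishes before cancellation is reported.
fn batch_fetch(
    ids: &[u32],
    batch_size: usize,
    cancel: &AtomicBool,
) -> Result<Vec<String>, BatchError> {
    assert!(batch_size > 0, "batch_size must be at least 1");
    let mut done = Vec::with_capacity(ids.len());
    for batch in ids.chunks(batch_size) {
        if cancel.load(Ordering::SeqCst) {
            return Err(BatchError::Cancel);
        }
        // Spawn one scoped thread per id in this batch, then join them in order
        // so the results keep the order of `ids`.
        let results: Vec<String> = thread::scope(|s| {
            let handles: Vec<_> = batch
                .iter()
                .map(|&id| s.spawn(move || fetch_one(id)))
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("worker thread panicked"))
                .collect()
        });
        done.extend(results);
    }
    Ok(done)
}
```

In an async setting the same shape would typically be expressed by racing each batch against a shutdown future, which is closer to what the changelog describes; the thread-based version above is used only to keep the example free of external dependencies.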
Re-exports§
pub use error::*;
Modules§
- api
- API, making the API easy to use
- attr
- Attribute, managing the resources’ attributes
- config
- Config, helping manage the configuration
- error
- Core error
- local
- Local, helping persist protobuf
- meta
- Meta, making resource completely able to be operated
- ops
- The Operations trait, making the app able to perform more operations
- prelude
- Re-export the most important traits and types
- remote
- Remote trait for remote operations
- res
- Relations between resources, resource sets, and uppers
- status
- Status of resource
- visual (feature: visual)
- Data visualization