This crate downloads a web page, then downloads its linked image, JavaScript, and CSS resources and embeds them in the HTML.
Both async and blocking APIs are provided, making use of reqwest’s support for both. The blocking APIs are enabled with the blocking feature.
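To make the idea of "embedding" concrete, here is a minimal, standard-library-only sketch of inlining stylesheets into a page. It is not this crate's implementation: the function names `embed_stylesheets` and `href_of` are hypothetical, the resource map is a plain `HashMap` from URL to already-downloaded CSS text, and the naive string scanning stands in for the real HTML parsing a production tool would need.

```rust
use std::collections::HashMap;

/// Hypothetical helper: replace each `<link ... rel="stylesheet" href="URL">`
/// tag whose URL is a key in `resources` with an inline `<style>` element
/// containing the downloaded CSS. Tags with unknown URLs are left untouched.
fn embed_stylesheets(html: &str, resources: &HashMap<String, String>) -> String {
    let mut out = String::with_capacity(html.len());
    let mut rest = html;

    while let Some(start) = rest.find("<link") {
        // An unterminated tag ends the scan; the remainder is copied verbatim below.
        let Some(len) = rest[start..].find('>') else { break };
        let end = start + len + 1;
        let tag = &rest[start..end];

        // Copy everything that precedes the tag.
        out.push_str(&rest[..start]);

        match href_of(tag).and_then(|url| resources.get(url)) {
            Some(css) if tag.contains("stylesheet") => {
                out.push_str("<style>");
                out.push_str(css);
                out.push_str("</style>");
            }
            // Not a stylesheet, or the resource was never fetched: keep the tag.
            _ => out.push_str(tag),
        }
        rest = &rest[end..];
    }

    out.push_str(rest);
    out
}

/// Hypothetical helper: extract the value of a double-quoted `href` attribute.
fn href_of(tag: &str) -> Option<&str> {
    let start = tag.find("href=\"")? + "href=\"".len();
    let len = tag[start..].find('"')?;
    Some(&tag[start..start + len])
}
```

Calling `embed_stylesheets(page_html, &resources)` with a map that contains the stylesheet's URL returns the page with that `<link>` tag swapped for a `<style>` block. Images and scripts would need analogous handling, for example data URIs for images.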
§Examples
§Async
use web_archive::archive;
// Fetch page and all its resources
let archive = archive("http://example.com", Default::default())
    .await
    .unwrap();
// Embed the resources into the page
let page = archive.embed_resources();
println!("{}", page);
§Blocking
use web_archive::blocking;
// Fetch page and all its resources
let archive =
    blocking::archive("http://example.com", Default::default()).unwrap();
// Embed the resources into the page
let page = archive.embed_resources();
println!("{}", page);
§Ignore certificate errors (dangerous!)
use web_archive::{archive, ArchiveOptions};
// Fetch page and all its resources
let archive_options = ArchiveOptions {
    accept_invalid_certificates: true,
    ..Default::default()
};
let archive = archive("http://example.com", archive_options)
    .await
    .unwrap();
// Embed the resources into the page
let page = archive.embed_resources();
println!("{}", page);
Re-exports§
pub use error::Error;
pub use page_archive::PageArchive;
pub use parsing::ImageResource;
pub use parsing::Resource;
pub use parsing::ResourceMap;
pub use parsing::ResourceUrl;
Modules§
- blocking
- Blocking
- error
- Module for the error parsing functionality
- page_archive
- Module for the core archiving functionality
- parsing
- Module for the core parsing functionality
Structs§
- ArchiveOptions
- Configuration options to control aspects of the archiving behaviour.
Functions§
- archive
- The async archive function.