Google Docs capture module.
Supports API-based capture of Google Docs documents via the export URL pattern:
https://docs.google.com/document/d/{DOCUMENT_ID}/export?format={FORMAT}
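As a rough illustration of the pattern above, the export URL can be assembled by substituting the document ID and format into the template. The helper name `sketch_export_url` is hypothetical and is not the crate's `build_export_url`, whose exact signature is not shown on this page.

```rust
/// Hypothetical helper (an assumption for illustration, not part of
/// `web_capture::gdocs`): fills the documented export URL template.
fn sketch_export_url(document_id: &str, format: &str) -> String {
    // Template from the module docs:
    // https://docs.google.com/document/d/{DOCUMENT_ID}/export?format={FORMAT}
    format!(
        "https://docs.google.com/document/d/{}/export?format={}",
        document_id, format
    )
}
```

For example, `sketch_export_url("abc123", "pdf")` yields `https://docs.google.com/document/d/abc123/export?format=pdf`.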
§Supported Export Formats
- html: HTML document (images as base64 data URIs)
- txt: Plain text
- md: Markdown (native Google Docs export)
- pdf: PDF document
- docx: Microsoft Word document
- epub: EPUB ebook format
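A caller may want to check a format string against the list above before issuing a request. The following is a minimal sketch; `SKETCH_FORMATS` and `sketch_is_supported_format` are hypothetical names, and whether the crate itself validates the format this way is an assumption.

```rust
/// The six format strings listed in the module docs.
/// Hypothetical constant, not an item exported by `web_capture::gdocs`.
const SKETCH_FORMATS: [&str; 6] = ["html", "txt", "md", "pdf", "docx", "epub"];

/// Hypothetical check: true only for an exact, lowercase match
/// against one of the documented format strings.
fn sketch_is_supported_format(format: &str) -> bool {
    SKETCH_FORMATS.contains(&format)
}
```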
§Example
use web_capture::gdocs;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let url = "https://docs.google.com/document/d/abc123/edit";
    if gdocs::is_google_docs_url(url) {
        let result = gdocs::fetch_google_doc(url, "html", None).await?;
        println!("Content length: {}", result.content.len());
    }
    Ok(())
}

Structs§
- ExtractedImage - An image extracted from base64 data URIs in HTML.
- GDocsArchiveResult - Result of fetching a Google Doc as an archive.
- GDocsResult - Result of fetching a Google Docs document.
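To make the idea behind `ExtractedImage` concrete, here is a minimal sketch of pulling base64 `data:` URIs out of HTML using only the standard library. The struct `SketchImage`, its fields, and the function `sketch_extract_data_uris` are hypothetical; the real struct's fields and the crate's parsing strategy are not documented on this page, and this sketch leaves the payload undecoded.

```rust
/// Hypothetical stand-in for an extracted image (fields are assumptions).
#[derive(Debug, PartialEq)]
struct SketchImage {
    mime_type: String,
    base64_data: String,
}

/// Scans `html` for `data:image/...;base64,...` URIs and returns the
/// MIME type and the still-encoded payload of each one found.
fn sketch_extract_data_uris(html: &str) -> Vec<SketchImage> {
    const PREFIX: &str = "data:";
    const MARKER: &str = ";base64,";
    let mut images = Vec::new();
    let mut rest = html;

    while let Some(start) = rest.find(PREFIX) {
        let after_prefix = &rest[start + PREFIX.len()..];
        // Assume the URI ends at a quote, a closing parenthesis, or whitespace.
        let end = after_prefix
            .find(|c: char| c == '"' || c == '\'' || c == ')' || c.is_whitespace())
            .unwrap_or(after_prefix.len());
        let candidate = &after_prefix[..end];

        // Keep only base64-encoded image payloads.
        if let Some((mime, data)) = candidate.split_once(MARKER) {
            if mime.starts_with("image/") {
                images.push(SketchImage {
                    mime_type: mime.to_string(),
                    base64_data: data.to_string(),
                });
            }
        }
        // Continue scanning after this candidate.
        rest = &after_prefix[end..];
    }
    images
}
```

A production implementation would likely use an HTML parser and a base64 decoder rather than string scanning; the sketch only shows the shape of the task.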
Functions§
- build_export_url - Build a Google Docs export URL.
- create_archive_zip - Create a ZIP archive from a GDocsArchiveResult.
- extract_base64_images - Extract base64 data URI images from HTML content.
- extract_bearer_token - Extract a Bearer token from an Authorization header value.
- extract_document_id - Extract the document ID from a Google Docs URL.
- fetch_google_doc - Fetch a Google Docs document via the export URL.
- fetch_google_doc_as_archive - Fetch a Google Docs document as a ZIP archive.
- fetch_google_doc_as_markdown - Fetch a Google Docs document and convert to Markdown.
- is_google_docs_url - Check if a URL is a Google Docs document URL.
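
The two parsing helpers in this list, document ID extraction and Bearer token extraction, can be approximated with plain string handling. The functions below are hypothetical sketches and not the crate's `extract_document_id` or `extract_bearer_token`; their return types, the case-insensitive scheme match, and the edge-case handling are all assumptions.

```rust
/// Hypothetical sketch: returns the ID segment that follows
/// `docs.google.com/document/d/`, or `None` if the URL does not match.
fn sketch_document_id(url: &str) -> Option<&str> {
    const MARKER: &str = "docs.google.com/document/d/";
    let start = url.find(MARKER)? + MARKER.len();
    let tail = &url[start..];
    // The ID runs until the next path separator, query, or fragment.
    let end = tail
        .find(|c: char| c == '/' || c == '?' || c == '#')
        .unwrap_or(tail.len());
    let id = &tail[..end];
    if id.is_empty() {
        None
    } else {
        Some(id)
    }
}

/// Hypothetical sketch: returns the token from a header value such as
/// `Bearer <token>`. The scheme is compared case-insensitively here,
/// which is an assumption about the desired behavior.
fn sketch_bearer_token(header_value: &str) -> Option<&str> {
    let (scheme, token) = header_value.trim().split_once(' ')?;
    let token = token.trim();
    if scheme.eq_ignore_ascii_case("bearer") && !token.is_empty() {
        Some(token)
    } else {
        None
    }
}
```

Under these assumptions, a check like `is_google_docs_url` could be expressed as `sketch_document_id(url).is_some()`.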