Crate gitbook2text


§gitbook2text

A library and CLI tool to download GitBook pages and convert them into markdown and plain text.

§Examples

§Crawling a GitBook

use gitbook2text::{is_gitbook, extract_gitbook_links};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let url = "https://docs.example.com";

    if is_gitbook(url).await? {
        let links = extract_gitbook_links(url).await?;
        println!("Found {} pages", links.len());
    }
    Ok(())
}

§Download and conversion

use gitbook2text::{download_page, markdown_to_text, txt_sanitize};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let url = "https://example.com/page.md";
    let content = download_page(url).await?;
    let text = markdown_to_text(&content);
    let cleaned = txt_sanitize(&text);
    println!("{}", cleaned);
    Ok(())
}
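The example above ends with a sanitizing step. This page does not say which tags `txt_sanitize` removes or how, so the following is only a rough sketch of the general idea, assuming the tags in question are GitBook-style template blocks such as `{% hint style="info" %}` ... `{% endhint %}`. The function `strip_template_tags` is a hypothetical name used for illustration and is not part of this crate.

```rust
/// Hypothetical stand-in for a sanitizing step; NOT the crate's `txt_sanitize`.
/// Assumption: the tags to drop look like `{% ... %}`.
fn strip_template_tags(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;

    // Repeatedly copy the text before the next `{%` and skip past its `%}`.
    while let Some(start) = rest.find("{%") {
        out.push_str(&rest[..start]);
        match rest[start..].find("%}") {
            // Resume right after the closing `%}`.
            Some(end) => rest = &rest[start + end + 2..],
            // Unterminated tag: keep the remaining text unchanged.
            None => {
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }

    out.push_str(rest);
    out
}

fn main() {
    let raw = "{% hint style=\"info\" %}Note{% endhint %} done";
    println!("{}", strip_template_tags(raw)); // prints "Note done"
}
```

The sketch keeps the text between an opening and closing tag and removes only the tags themselves; whether the crate behaves the same way is not stated here.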

Enums§

GitBookError

Functions§

crawl_and_save
Extracts links from a GitBook and saves them to a file
download_page
Downloads the content of a page from a URL
extract_gitbook_links
Extracts all documentation links from a GitBook site
is_gitbook
Checks if a URL points to a GitBook site
markdown_to_text
Converts markdown to plain text
save_markdown
Saves the markdown content to a file
save_text
Saves the text content to a file
txt_sanitize
Cleans and sanitizes the text by removing special GitBook tags
url_to_filename
Converts a URL into a safe filename
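
The listing above does not spell out what "safe filename" means for `url_to_filename`. As a minimal sketch of one plausible approach, assuming the scheme is dropped and every character outside ASCII letters, digits, `-`, `_`, and `.` is replaced with `_`, a conversion could look like this. The function `sketch_url_to_filename` and the `index` fallback are hypothetical and may differ from what the crate actually does.

```rust
/// Hypothetical illustration; NOT the crate's `url_to_filename`.
fn sketch_url_to_filename(url: &str) -> String {
    // Drop the scheme, if present.
    let without_scheme = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))
        .unwrap_or(url);

    // Replace anything that is not an ASCII letter, digit, '-', '_' or '.'.
    let mapped: String = without_scheme
        .chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() || c == '-' || c == '_' || c == '.' {
                c
            } else {
                '_'
            }
        })
        .collect();

    // Trim separators left at the edges; fall back to a fixed name if empty.
    let trimmed = mapped.trim_matches('_');
    if trimmed.is_empty() {
        "index".to_string()
    } else {
        trimmed.to_string()
    }
}

fn main() {
    let name = sketch_url_to_filename("https://docs.example.com/guide/intro?lang=en");
    println!("{}", name); // prints "docs.example.com_guide_intro_lang_en"
}
```

Mapping to a small allow-list of characters, rather than removing a deny-list, keeps the result valid on common filesystems at the cost of possible collisions between distinct URLs.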