# nosy
nosy is a CLI tool that fetches content from URLs or local files, extracts text using appropriate extractors, and summarizes it using LLMs. It supports multiple formats including HTML, PDF, Office documents, audio/video, and more.
## Features
- Input from URL or local file
- Supports HTTP GET and headless browser for URLs
- Selects the extractor automatically based on MIME type and file extension, or lets you specify one manually
- Supports various extractors:
  - Plain Text
  - HTML
  - Pandoc (docx, doc, odt, rtf, epub, latex, etc.)
  - Whisper (mp3, wav, mp4, m4a, etc.)
    - Transcribes audio to text
- Supports major LLM providers (OpenAI, Anthropic, Gemini, etc.)
- Customize output tone via templates
- Tab completion support for bash, zsh, fish, and more
## Installation

### Homebrew

### Cargo

```shell
# Or from source (at the repository root)
```
## Usage

> [!NOTE]
> Running `nosy` alone is equivalent to `nosy summarize`.
### Examples

```shell
# Summarize a web article
# Summarize a local PDF file
# Summarize a local PDF file in Japanese
# Summarize using a specific LLM model (provider will be inferred)
```
## Auxiliaries

This CLI's main use case is summarization via `nosy` (`summarize`), but it also provides several auxiliary subcommands to support it.
### extract

Extracts text suitable for LLM input from the given source.
### completion

Generates shell completion scripts.

```shell
# Enable zsh completion for the current shell
# Persist by appending to ~/.zshrc
```
### download-whisper

Downloads the Whisper models used for text extraction from audio/video files. After downloading, set `WHISPER_MODEL_PATH` to the downloaded model so it is used for extraction.
## Options
## Environment Variables

### LLM API Keys

LLM provider API keys are read from the following environment variables:

- e.g. `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `XXXX_API_KEY` (for other supported providers, see the full list in the help output)

If `--provider` is specified, the key for that provider is used. If omitted, the provider is inferred from the model name and that provider's key is used.
### Path to Whisper Model

`WHISPER_MODEL_PATH` specifies the path to the Whisper model file used for audio/video text extraction.
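A sketch of how an extractor might resolve this variable before transcribing; `whisper_model_path` is a hypothetical helper for illustration, not part of nosy:

```python
import os
from pathlib import Path

def whisper_model_path() -> Path:
    # Resolve the Whisper model file from the environment,
    # failing early with a clear error if it is missing.
    raw = os.environ.get("WHISPER_MODEL_PATH")
    if not raw:
        raise RuntimeError("WHISPER_MODEL_PATH is not set")
    path = Path(raw).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"Whisper model not found: {path}")
    return path
```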
## Templates

Templates are written in Handlebars. The default templates are located in `assets/`.

The following variables are available in each template:

- System template
  - `{{language}}`: Language for the summary
- User template
  - `{{content}}`: Extracted content to be summarized
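To illustrate how these variables are substituted, here is a minimal regex-based stand-in for Handlebars `{{var}}` interpolation. The template strings are made up for the example; the real default templates live in `assets/`:

```python
import re

def render(template: str, variables: dict[str, str]) -> str:
    # Replace each {{name}} with its value; Handlebars proper also
    # supports helpers and HTML escaping, which this sketch ignores.
    return re.sub(r"\{\{(\w+)\}\}",
                  lambda m: variables.get(m.group(1), ""),
                  template)

system = "Summarize the following content in {{language}}."
user = "{{content}}"

print(render(system, {"language": "Japanese"}))
# → Summarize the following content in Japanese.
```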
## Flowchart to Summarization

```mermaid
flowchart TD
    A[Detect scheme]
    A --> B[Fetch contents]
    B --> C[Detect extractor]
    C --> D[Extract LLM input]
    D --> E[Summarize]
```
1. Detect scheme from input (URL or local file)
2. Fetch contents based on the scheme
3. Detect extractor based on MIME type and file extension (or use the forced extractor if specified)
4. Extract LLM input using the selected extractor
5. Summarize using the specified LLM model and templates
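The scheme- and extractor-detection steps above can be sketched as follows. This is a simplified illustration: the extractor names and detection rules here are assumptions, not nosy's actual implementation:

```python
import mimetypes
from urllib.parse import urlparse

def detect_scheme(source: str) -> str:
    # Step 1: treat http(s) inputs as URLs, everything else as a local file.
    scheme = urlparse(source).scheme
    return scheme if scheme in ("http", "https") else "file"

def detect_extractor(path: str) -> str:
    # Step 3: choose an extractor from the MIME type / file extension.
    mime, _ = mimetypes.guess_type(path)
    if mime == "text/html":
        return "html"
    if mime == "application/pdf":
        return "pdf"
    if mime and mime.startswith(("audio/", "video/")):
        return "whisper"
    if mime in ("application/vnd.openxmlformats-officedocument"
                ".wordprocessingml.document",
                "application/epub+zip"):
        return "pandoc"
    return "plain-text"

print(detect_scheme("https://example.com/post"))  # → https
print(detect_extractor("notes.pdf"))              # → pdf
print(detect_extractor("talk.mp3"))               # → whisper
```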
## Capabilities

This section describes what nosy supports:

- which input formats are accepted
- how text is extracted from the data
- which LLM providers are available
### Inputs

- HTTP/HTTPS URLs
- Local files (`/` or `file://`)
### Extractors (auto-detected)

- Plain Text
  - Passes the input through as-is
- HTML (built-in)
- PDF (built-in)
- Pandoc (for docx, doc, odt, rtf, epub, latex, ...)
  - Requires the `pandoc` command to be installed
- Whisper (for mp3, wav, mp4, m4a, ...)
  - Requires a Whisper model file specified by `WHISPER_MODEL_PATH`
### LLM Providers

See the help output for the full list of supported LLM providers (i.e., `nosy summarize --help`).