Crate yammer

§yammer

Yammer provides asynchronous bindings to the Ollama API and the following CLI tools:

  • shellm: pass a file (or stdin if no file is given) to the generate endpoint and stream the result.
  • oneshot: open a temporary file in an editor, pass it to the generate endpoint, and stream the result.
  • prompt: pass a prompt given on the command line to the generate endpoint and stream the result.
  • chat: chat with a model using the chat endpoint.
  • chats: manage chat sessions.

§Installation

$ cargo install yammer

§Usage

The shellm tool multiplexes files over a model:

$ shellm --model llama3.2:3b << EOF
Why is the sky red?
EOF
I'm sorry.  The sky is not red.
$ shellm --model llama3.2:3b foo bar
Response to foo...
Response to bar...

The oneshot tool is conceptually the same as editing a temporary file and passing it to shellm:

$ oneshot llama3.2:3b gemma2
Opens $EDITOR with a temporary file.  Write your prompt and save the file.
Output of llama3.2:3b...
Output of gemma2....

The prompt tool is similar to shellm but takes prompts on the command line rather than files:

$ prompt llama3.2:3b "Why is the sky red?"
I'm sorry.  The sky is not red.

The chat command is used to chat with a model:

$ chat
>>> Why is the sky red?
The sky often appears red at sunrise and sunset. ...
>>> :edit
>>> :model llama3.2:3b
>>> :retry
The sky often appears red at sunrise and sunset due to Rayleigh scattering. ....
>>> :param --num-ctx 4096
>>> :exit

The chats command is used to manage chat sessions:

$ chats
recent:
2024-12-01T18:26 FP8MC gemma2              Why is the sky red?
2024-12-01T17:34 H5HMV llama3.2:3b         Hi there!  Tell me about first and follow sets for parsers.
> pin FP8MC
> status
pinned:
2024-12-01T18:29 FP8MC gemma2              Why is the sky red?

recent:
2024-12-01T17:34 H5HMV llama3.2:3b         Hi there!  Tell me about first and follow sets for parsers.
> archive H5HMV
> status
pinned:
2024-12-01T18:29 FP8MC gemma2              Why is the sky red?
> chat FP8MC
>>> Why is the sky red?
The sky often appears red at sunrise and sunset. ...
>>> exit
> new "Act like Mario, the video game character."
>>> Hi!
Hiya!  It'sa me, Mario!
>>> exit
> exit

§Help

§shellm

$ shellm --help
USAGE: shellm [OPTIONS] [FILE]

Options:
    -h, -help           Print this help menu.
        -ollama-host    The host to connect to.
        -model          The model to use from the ollama library.
        -suffix         The suffix to append to the response.
        -system         The system to use in the template.
        -template       The template to use for the prompt.
        -json           Format the response in JSON. You must also ask the
                        model to do so.
        -raw            Whether to bypass formatting of the prompt.
        -keep-alive     Duration to keep the model in memory after the
                        call.
        -param-mirostat
                        Enable Mirostat sampling for controlling perplexity.
                        (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat
                        2.0)
        -param-mirostat-eta
                        Influences how quickly the algorithm responds to
                        feedback from the generated text.
        -param-mirostat-tau
                        Controls the balance between coherence and diversity
                        of the output.
        -param-num-ctx  The number of tokens worth of context to allocate.
        -param-repeat-last-n
                        Sets how far back the model looks to prevent
                        repetition.
        -param-repeat-penalty
                        Sets how strongly to penalize repetitions.
        -param-temperature
                        The temperature of the model.
        -param-seed     Sets the random number seed to use for generation.
        -param-tfs-z    Tail free sampling is used to reduce the impact of
                        less probable tokens from the output.
        -param-num-predict
                        Maximum number of tokens to predict when generating
                        text.
        -param-top-k    Reduces the probability of generating nonsense.
        -param-top-p    Works together with top-k.
        -param-min-p    Alternative to top_p; aims to ensure a balance of
                        quality and variety.

§oneshot

$ oneshot --help
USAGE: oneshot [OPTIONS] [MODEL]

Options:
    -h, -help           Print this help menu.
        -ollama-host    The host to connect to.
        -suffix         The suffix to append to the response.
        -system         The system to use in the template.
        -template       The template to use for the prompt.
        -json           Format the response in JSON. You must also ask the
                        model to do so.
        -raw            Whether to bypass formatting of the prompt.
        -keep-alive     Duration to keep the model in memory after the
                        call.
        -param-mirostat
                        Enable Mirostat sampling for controlling perplexity.
                        (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat
                        2.0)
        -param-mirostat-eta
                        Influences how quickly the algorithm responds to
                        feedback from the generated text.
        -param-mirostat-tau
                        Controls the balance between coherence and diversity
                        of the output.
        -param-num-ctx  The number of tokens worth of context to allocate.
        -param-repeat-last-n
                        Sets how far back the model looks to prevent
                        repetition.
        -param-repeat-penalty
                        Sets how strongly to penalize repetitions.
        -param-temperature
                        The temperature of the model.
        -param-seed     Sets the random number seed to use for generation.
        -param-tfs-z    Tail free sampling is used to reduce the impact of
                        less probable tokens from the output.
        -param-num-predict
                        Maximum number of tokens to predict when generating
                        text.
        -param-top-k    Reduces the probability of generating nonsense.
        -param-top-p    Works together with top-k.
        -param-min-p    Alternative to top_p; aims to ensure a balance of
                        quality and variety.

§prompt

$ prompt --help
USAGE: prompt [OPTIONS] [PROMPT]

Options:
    -h, -help           Print this help menu.
        -ollama-host    The host to connect to.
        -model          The model to use from the ollama library.
        -suffix         The suffix to append to the response.
        -system         The system to use in the template.
        -template       The template to use for the prompt.
        -json           Format the response in JSON. You must also ask the
                        model to do so.
        -raw            Whether to bypass formatting of the prompt.
        -keep-alive     Duration to keep the model in memory after the
                        call.
        -param-mirostat
                        Enable Mirostat sampling for controlling perplexity.
                        (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat
                        2.0)
        -param-mirostat-eta
                        Influences how quickly the algorithm responds to
                        feedback from the generated text.
        -param-mirostat-tau
                        Controls the balance between coherence and diversity
                        of the output.
        -param-num-ctx  The number of tokens worth of context to allocate.
        -param-repeat-last-n
                        Sets how far back the model looks to prevent
                        repetition.
        -param-repeat-penalty
                        Sets how strongly to penalize repetitions.
        -param-temperature
                        The temperature of the model.
        -param-seed     Sets the random number seed to use for generation.
        -param-tfs-z    Tail free sampling is used to reduce the impact of
                        less probable tokens from the output.
        -param-num-predict
                        Maximum number of tokens to predict when generating
                        text.
        -param-top-k    Reduces the probability of generating nonsense.
        -param-top-p    Works together with top-k.
        -param-min-p    Alternative to top_p; aims to ensure a balance of
                        quality and variety.

§chat

$ chat --help
USAGE: chat [OPTIONS]

Options:
    -h, -help           Print this help menu.
        -ollama-host    The host to connect to.
        -model          The model to use from the ollama library.
        -keep-alive     Duration to keep the model in memory after the
                        call.
        -param-mirostat
                        Enable Mirostat sampling for controlling perplexity.
                        (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat
                        2.0)
        -param-mirostat-eta
                        Influences how quickly the algorithm responds to
                        feedback from the generated text.
        -param-mirostat-tau
                        Controls the balance between coherence and diversity
                        of the output.
        -param-num-ctx  The number of tokens worth of context to allocate.
        -param-repeat-last-n
                        Sets how far back the model looks to prevent
                        repetition.
        -param-repeat-penalty
                        Sets how strongly to penalize repetitions.
        -param-temperature
                        The temperature of the model.
        -param-seed     Sets the random number seed to use for generation.
        -param-tfs-z    Tail free sampling is used to reduce the impact of
                        less probable tokens from the output.
        -param-num-predict
                        Maximum number of tokens to predict when generating
                        text.
        -param-top-k    Reduces the probability of generating nonsense.
        -param-top-p    Works together with top-k.
        -param-min-p    Alternative to top_p; aims to ensure a balance of
                        quality and variety.

§chats

$ chats
> help
chats
=====

Commands:

status      Show the status of all chats.
archive     Archive a chat.
unarchive   Unarchive a chat.
archived    Show all archived chats.
pin         Pin a chat.
unpin       Unpin a chat.
pinned      Show all pinned chats.
new         Start a new chat.
chat        Continue a chat.
editor      Start a chat with a system message written in EDITOR.

§Status

Active development.

§Documentation

The latest documentation is always available at docs.rs.

§Structs

Chat
The chat command.
ChatMessage
A message sent or received in a chat.
ChatOptions
Options for the chat command.
ChatRequest
A request to chat with a model.
ChatResponse
A response to a chat request.
Chats
The chats command.
ChatsOptions
Command-line options for the chats command.
EmbedRequest
A request to embed multiple input documents.
EmbedResponse
A response to an embed request.
GenerateRequest
Generate a response to a prompt.
GenerateResponse
A response to a generate request.
OneshotOptions
Options for the oneshot command.
Parameters
Parameters for the model.
PromptOptions
Options for the prompt command.
ShellmOptions
Options for the shellm command.
Spinner
A spinner widget.
ToolBuilder
Build a tool for use in chat completions.
WordWrap
A word-wrapping struct.

§Enums

Error
An error that can occur when interacting with the ollama API.

§Constants

OLLAMA_HOST
The default host to connect to.

§Traits

JsonSchema
Implement JsonSchema to derive the schema for GenerateRequest automatically.

§Functions

chat_shell
Start the chat shell.
chats_shell
Start the chats shell.
ollama_host
Return the Ollama host, preferring the value passed in, falling back to the env var, falling back to the hard-coded default.
oneshot
The oneshot command.
prompt
The prompt command.
shellm
The shellm command.
stream
Stream the response of a request, calling for_each on each JSON object in the response.
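
To give a feel for what GenerateRequest, ollama_host, and stream do under the hood, here is a minimal sketch of the Ollama generate call that shellm, oneshot, and prompt wrap. It deliberately does not guess at yammer's own constructors or signatures: it talks to the HTTP API directly using reqwest, tokio, and serde_json (assumed dependencies, used here purely for illustration). The endpoint path, field names, and streaming format are standard Ollama API behavior; the OLLAMA_HOST environment variable and the http://localhost:11434 default are assumptions consistent with the description of ollama_host above.

// Illustration only: a hand-rolled call to Ollama's streaming generate
// endpoint. This is not yammer's implementation; it only shows the shape of
// the API that GenerateRequest and stream abstract over.
use std::io::Write;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Echoes the documented ollama_host precedence: explicit value, else the
    // environment, else a default. The env var name and default value are
    // assumptions based on Ollama's usual conventions.
    let host = std::env::var("OLLAMA_HOST")
        .unwrap_or_else(|_| "http://localhost:11434".to_string());

    // The generate endpoint takes a model and a prompt; with streaming on it
    // answers with one JSON object per line, each carrying a "response" chunk.
    let body = serde_json::json!({
        "model": "llama3.2:3b",
        "prompt": "Why is the sky red?",
        "stream": true,
    });

    let mut resp = reqwest::Client::new()
        .post(format!("{host}/api/generate"))
        .json(&body)
        .send()
        .await?
        .error_for_status()?;

    // Reassemble newline-delimited JSON across network chunks and print the
    // streamed text as it arrives, roughly what the prompt tool does.
    let mut buf: Vec<u8> = Vec::new();
    while let Some(chunk) = resp.chunk().await? {
        buf.extend_from_slice(&chunk);
        while let Some(pos) = buf.iter().position(|&b| b == b'\n') {
            let line: Vec<u8> = buf.drain(..=pos).collect();
            if line.iter().all(|b| b.is_ascii_whitespace()) {
                continue;
            }
            let obj: serde_json::Value = serde_json::from_slice(&line)?;
            if let Some(text) = obj["response"].as_str() {
                print!("{text}");
                std::io::stdout().flush()?;
            }
            if obj["done"].as_bool() == Some(true) {
                return Ok(());
            }
        }
    }
    Ok(())
}

The real tools layer prompt templating, model parameters, and terminal niceties (Spinner, WordWrap) on top of a loop like this; consult the item docs above for the crate's actual types and signatures.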