code2prompt
:computer::memo:
A command-line tool to generate an LLM prompt from codebases of any size, fast.
With the advent of LLMs with multi-million-token context windows, prompting over a whole project is often just a matter of pasting the entire codebase into your prompt.
Features
- Quickly generate LLM prompts from codebases of any size.
- Customize prompt generation with Handlebars templates. (See the default template)
- Follows .gitignore.
- Filter and exclude files by extension.
- Display the token count of the generated prompt. (See Tokenizers for more details)
- Copy the generated prompt to the clipboard on generation.
- Save the generated prompt to an output file.
Installation
Download the latest binary for your OS from Releases, or install with cargo:
$ cargo install code2prompt
Usage
Generate a prompt from a codebase directory:
code2prompt path/to/codebase
Use a custom Handlebars template file:
code2prompt path/to/codebase -t path/to/template.hbs
Filter files by extension:
code2prompt path/to/codebase -f rs,toml
Exclude files by extension:
code2prompt path/to/codebase -e txt,md
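As a rough illustration of what extension filtering amounts to, the Python sketch below keeps a file when its extension is in the include list (if one is given) and not in the exclude list. The helper names `parse_extensions` and `keep_file`, and the rule that exclusion wins over inclusion, are assumptions made for this sketch, not a description of code2prompt's internals.

```python
from pathlib import Path


def parse_extensions(arg: str) -> set[str]:
    """Turn a comma-separated list such as "rs,toml" into {"rs", "toml"}."""
    return {ext.strip().lstrip(".").lower() for ext in arg.split(",") if ext.strip()}


def keep_file(path: str, include: set[str], exclude: set[str]) -> bool:
    """Decide whether a file passes the extension filters.

    Assumptions for this sketch: an empty include set means "keep everything",
    and exclusion takes precedence over inclusion.
    """
    ext = Path(path).suffix.lstrip(".").lower()
    if ext in exclude:
        return False
    return not include or ext in include


if __name__ == "__main__":
    include = parse_extensions("rs,toml")
    exclude = parse_extensions("txt,md")
    for name in ["src/main.rs", "Cargo.toml", "README.md", "notes.txt"]:
        print(name, keep_file(name, include, exclude))
```

Normalizing the extensions (lowercase, no leading dot) keeps `-f rs` and `-f .RS` equivalent in this sketch; whether the real tool does the same is not stated here.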
Display token count of the generated prompt:
code2prompt path/to/codebase --tokens
Specify tokenizer for token count:
code2prompt path/to/codebase --tokens --encoding=p50k
Supported tokenizers: cl100k, p50k, p50k_edit, r50k_base.
Note: See Tokenizers for more details.
Save the generated prompt to an output file:
code2prompt path/to/codebase -o output.txt
Tokenizers
Tokenization is implemented using tiktoken-rs. tiktoken supports these encodings used by OpenAI models:
Encoding name | OpenAI models |
---|---|
cl100k_base | ChatGPT models, text-embedding-ada-002 |
p50k_base | Code models, text-davinci-002, text-davinci-003 |
p50k_edit | Use for edit models like text-davinci-edit-001, code-davinci-edit-001 |
r50k_base (or gpt2) | GPT-3 models like davinci |
For more context on the different tokenizers, see the OpenAI Cookbook.
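To make the relationship between short tokenizer names and the encodings in the table concrete, here is a small Python sketch. It uses OpenAI's `tiktoken` Python package rather than code2prompt or tiktoken-rs. The `SHORT_NAMES` mapping, the helper names, and the `cl100k` default are assumptions for illustration only.

```python
# Encoding names accepted by tiktoken.get_encoding() that appear in the table.
KNOWN_ENCODINGS = {"cl100k_base", "p50k_base", "p50k_edit", "r50k_base", "gpt2"}

# Hypothetical short names mapped to full encoding names (an assumption).
SHORT_NAMES = {
    "cl100k": "cl100k_base",
    "p50k": "p50k_base",
    "r50k": "r50k_base",
}


def resolve_encoding(name: str) -> str:
    """Return the full encoding name for a short or full tokenizer name."""
    full = SHORT_NAMES.get(name, name)
    if full not in KNOWN_ENCODINGS:
        raise ValueError(f"unknown tokenizer: {name}")
    return full


def count_tokens(text: str, tokenizer: str = "cl100k") -> int:
    """Count tokens with tiktoken; requires `pip install tiktoken`."""
    import tiktoken  # imported lazily so resolve_encoding works without it

    encoding = tiktoken.get_encoding(resolve_encoding(tokenizer))
    # disallowed_special=() treats strings like "<|endoftext|>" as plain text,
    # which matters when the prompt contains arbitrary source code.
    return len(encoding.encode(text, disallowed_special=()))
```

Different encodings split the same text differently, so a count is only meaningful for models that use the chosen encoding.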
How is it useful?
code2prompt makes it easy to generate prompts for LLMs from your codebase. It traverses the directory, builds a tree structure, and collects information about each file. You can customize the prompt generation using Handlebars templates. The generated prompt is automatically copied to your clipboard and can also be saved to an output file. code2prompt helps streamline the process of creating LLM prompts for code analysis, generation, and other tasks.
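The traversal described above can be sketched in Python roughly as follows. This is a simplified illustration, not code2prompt's actual implementation: the function names `build_tree`, `collect_files`, and `render_prompt` are hypothetical, the output layout is made up, and .gitignore handling, templates, and binary-file detection are left out.

```python
from pathlib import Path


def build_tree(root: Path, prefix: str = "") -> list[str]:
    """Return the lines of an indented tree listing for `root`."""
    lines = []
    # Directories first, then files, each group sorted by name.
    entries = sorted(root.iterdir(), key=lambda p: (p.is_file(), p.name))
    for index, entry in enumerate(entries):
        last = index == len(entries) - 1
        lines.append(f"{prefix}{'└── ' if last else '├── '}{entry.name}")
        if entry.is_dir():
            lines.extend(build_tree(entry, prefix + ("    " if last else "│   ")))
    return lines


def collect_files(root: Path) -> list[dict]:
    """Collect the relative path and text content of every file under `root`."""
    files = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            files.append({
                "path": path.relative_to(root).as_posix(),
                "code": path.read_text(encoding="utf-8", errors="replace"),
            })
    return files


def render_prompt(root: Path) -> str:
    """Join the tree and the file contents into one prompt string."""
    parts = ["Source Tree:", root.name, *build_tree(root), ""]
    for item in collect_files(root):
        parts += [f"{item['path']}:", item["code"], ""]
    return "\n".join(parts)
```

Calling `print(render_prompt(Path("path/to/codebase")))` would print the tree followed by each file's contents, which is roughly the raw material a template would then arrange.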
Build From Source
Prerequisites
To build code2prompt from source, you need Git and the Rust toolchain (which includes Cargo) installed. Then run:
$ git clone https://github.com/mufeedvh/code2prompt.git
$ cd code2prompt/
$ cargo build --release
The first command clones the code2prompt repository to your local machine. The next two commands change into the code2prompt directory and build it in release mode.
Contribution
Ways to contribute:
- Suggest a feature
- Report a bug
- Fix something and open a pull request
- Help me document the code
- Spread the word
License
Licensed under the MIT License. See LICENSE for more information.
Liked the project?
If you liked the project and found it useful, please give it a :star: and consider supporting the author!