
Crate autogpt


§🤖 AutoGPT


|          | 🐧 Linux (Recommended) | 🪟 Windows | 🐋 Docker (`autogpt`) | 🐋 Docker (`orchgpt`) |
|----------|------------------------|------------|------------------------|------------------------|
| Method 1 | Download Executable File | Download `.exe` File | - | - |
| Method 2 | `cargo install autogpt --all-features` | `cargo install autogpt --all-features` | `docker pull kevinrsdev/autogpt` | `docker pull kevinrsdev/orchgpt` |
| Setup    | Set Environment Variables | Set Environment Variables | Set Environment Variables | Set Environment Variables |
| Run      | `autogpt -h`<br>`orchgpt -h` | `autogpt.exe -h` | `docker run kevinrsdev/autogpt -h` | `docker run kevinrsdev/orchgpt -h` |

> [!NOTE]
> This project is under active development. There is also a parallel project, lmm, under equally active development; it does not use LLMs at all. Instead, it uses equation-based intelligence to predict new words and reason without gradient-trained models. Check it out if you’re interested in a fundamentally different approach to machine intelligence!

AutoGPT is a pure-Rust framework that simplifies AI agent creation and management for various tasks. Its speed and versatility are complemented by a mesh of built-in, interconnected GPTs, ensuring strong performance and adaptability.

§🧠 Framework Overview

§⚙️ Agent Core Architecture

AutoGPT agents are modular and autonomous, built from composable components:

  • 🔌 Tools & Sensors: Interface with the real world via actions (e.g., file I/O, APIs) and perception (e.g., audio, video, data).
  • 🧠 Memory & Knowledge: Combines long-term vector memory with structured knowledge bases for reasoning and recall.
  • 📝 No-Code Agent Configs: Define agents and their behaviors with simple, declarative YAML; no coding required.
  • 🧭 Planner & Goals: Breaks down complex tasks into subgoals and tracks progress dynamically.
  • 🧍 Persona & Capabilities: Customizable behavior profiles and access controls define how agents act.
  • 🧑‍🤝‍🧑 Collaboration: Agents can delegate, swarm, or work in teams with other agents.
  • 🪞 Self-Reflection: Introspection module to debug, adapt, or evolve internal strategies.
  • 🔄 Context Management: Manages active memory (context window) for ongoing tasks and conversations.
  • 📅 Scheduler: Time-based or reactive triggers for agent actions.

§🚀 Developer Features

AutoGPT is designed for flexibility, integration, and scalability:

  • 🧪 Custom Agent Creation: Build tailored agents for different roles or domains.
  • 📋 Task Orchestration: Manage and distribute tasks across agents efficiently.
  • 🧱 Extensibility: Add new tools, behaviors, or agent types with ease.
  • 💻 CLI Tools: Command-line interface for rapid experimentation and control.
  • 🧰 SDK Support: Embed AutoGPT into existing projects or systems seamlessly.
  • 🔀 Mixture of Providers (MoP): Parallel fan-out and weighted scoring across multiple AI backends for optimal response quality.

§📦 Installation

Please refer to our tutorial for guidance on installing, running, or building the CLI from source with either Cargo or Docker.

> [!NOTE]
> For optimal performance and compatibility, we strongly recommend running this CLI on Linux.

§🔄 Workflow

AutoGPT supports four modes of operation: interactive, direct prompt, standalone agentic, and distributed agentic.

§0. 🤖 GenericGPT Interactive Mode (Default)

When you run autogpt with no subcommand or flags, it launches an interactive AI shell powered by GenericGPT, a production-hardened autonomous software engineering agent with session persistence, model switching, and multi-provider support:

```shell
autogpt
```

The interactive shell supports the following commands:

| Command | Description |
|---------|-------------|
| `<your prompt>` | Send a task to the GenericGPT autonomous agent |
| `/help` | Show available commands |
| `/provider` | Switch AI provider (Gemini, OpenAI, Anthropic, XAI, Cohere) |
| `/models` | Browse and switch between provider-native models |
| `/sessions` | List and resume previous sessions |
| `/status` | Show current model, provider, and directory |
| `/workspace` | Show the current workspace path |
| `/clear` | Clear the terminal |
| `exit` / `quit` | Save session and quit |

Press ESC at any time to interrupt a running generation.

§🔀 Mixture of Providers (MoP)

AutoGPT introduces a high-availability Mixture of Providers architecture. When enabled via the `--mixture` or `-m` flag, every prompt is fanned out concurrently to all configured AI providers (Gemini, OpenAI, etc.). A weighted scoring engine evaluates responses based on:

  1. Length calibration (rewarding detail, penalizing fluff).
  2. Code quality (bonus for language-tagged Markdown blocks).
  3. Structural richness (headings, lists, hygiene).
  4. Reasoning depth (connectivity words and logical flow).
  5. Completeness (punctuation and closing delimiters).

The highest-scored response is selected as the winner and injected into the agent’s context, promoting the best “intelligence” available from your configured keys.
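As a rough sketch, the winner-selection step might look like the following. The heuristics and weights here are illustrative assumptions for a few of the criteria above, not the crate's actual scoring engine.

```rust
// Hypothetical sketch of MoP winner selection: score each provider's
// response with a few heuristics and keep the highest-scoring one.
// The heuristics and weights are illustrative, not AutoGPT's real ones.

fn score(response: &str) -> f64 {
    // Length calibration: reward detail, with a cap to penalize fluff.
    let length = (response.len() as f64 / 100.0).min(10.0);
    // Code quality: bonus for a language-tagged Markdown fence.
    let code = if response.contains("```rust") { 5.0 } else { 0.0 };
    // Structural richness: headings and list items.
    let structure = response
        .lines()
        .filter(|l| l.starts_with('#') || l.starts_with("- "))
        .count() as f64;
    length + code + structure
}

/// Pick the best (provider, response) pair among the fan-out results.
fn select_winner<'a>(responses: &[(&'a str, &'a str)]) -> Option<(&'a str, &'a str)> {
    responses
        .iter()
        .copied()
        .max_by(|a, b| score(a.1).total_cmp(&score(b.1)))
}

fn main() {
    let responses = [
        ("gemini", "Short answer."),
        ("openai", "# Plan\n- step one\n```rust\nfn main() {}\n```"),
    ];
    let (provider, _) = select_winner(&responses).unwrap();
    println!("winner: {provider}"); // prints "winner: openai"
}
```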

§The `.autogpt` Directory

GenericGPT maintains all persistent state inside the workspace root (defaults to the current directory):

```text
.autogpt/
├── sessions/          # Markdown conversation snapshots, auto-saved after every response
│   ├── <uuid>.md
│   └── ...
└── skills/            # TOML lesson files, injected into future prompts automatically
    ├── rust.toml
    ├── web.toml
    └── python.toml
```

Control the workspace root with `AUTOGPT_WORKSPACE`:

```shell
export AUTOGPT_WORKSPACE=/my/project   # scope all file ops to a specific directory
autogpt
```
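Resolving the workspace root amounts to an environment lookup with a current-directory fallback. The sketch below is an assumption about that logic based on the documented behavior, not the crate's actual code.

```rust
use std::env;
use std::path::PathBuf;

/// Resolve the workspace root: `AUTOGPT_WORKSPACE` if set, otherwise the
/// current directory. (Sketch of the documented behavior; the real crate
/// may resolve this differently.)
fn workspace_root() -> PathBuf {
    env::var("AUTOGPT_WORKSPACE")
        .map(PathBuf::from)
        .unwrap_or_else(|_| env::current_dir().expect("current dir unavailable"))
}

fn main() {
    // Persistent state lives under <root>/.autogpt/.
    println!("{}", workspace_root().join(".autogpt").display());
}
```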

§Model Selection

Models are sourced dynamically from each provider’s crate. Override the active model without entering the shell:

```shell
export GEMINI_MODEL=gemini-2.5-pro-preview-05-06
export OPENAI_MODEL=gpt-4o
export MODEL=<any-model-id>    # global fallback for any provider
```
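The override chain above can be sketched as a simple lookup order; the built-in default used here is a placeholder assumption, not the crate's real default.

```rust
use std::env;

/// Resolve the active model (sketch): the provider-specific variable wins,
/// then the global `MODEL` fallback, then a built-in default. The default
/// value here is a placeholder, not the crate's real default.
fn resolve_model(provider_var: &str, default: &str) -> String {
    env::var(provider_var)
        .or_else(|_| env::var("MODEL"))
        .unwrap_or_else(|_| default.to_string())
}

fn main() {
    println!("{}", resolve_model("GEMINI_MODEL", "gemini-default"));
}
```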

§How GenericGPT Works

Each prompt goes through a seven-step pipeline:

  1. MoP fan-out (optional): parallel execution across multiple providers.
  2. Reasoning: structured internal monologue stored in the session log.
  3. Task synthesis: decomposition into typed actions (CreateFile, PatchFile, RunCommand, …).
  4. Execution: file edits via PatchFile; shell execution via RunCommand.
  5. Build-and-verify: auto-detects Cargo.toml / package.json / Makefile and runs the build; retries on failure up to 3 times.
  6. Reflection: reviews outcomes and lesson candidates.
  7. Skill extraction: lessons written to .autogpt/skills/<domain>.toml and injected into future sessions.

```mermaid
flowchart TD
    A([User enters prompt]) --> B{Mixture mode?}
    B -- Yes --> C[Run Mixture of Providers]
    B -- No --> D[Standard Provider]
    C & D --> E[Reasoning pre-step]
    E --> F[Task synthesis]
    F --> G{User approves?}
    G -- yolo mode / yes --> H[Execute actions]
    H --> I[Build-and-verify loop]
    I -- pass --> J[Reflection]
    I -- fail, retry ≤3 --> H
    J --> K[Save skills & session]
    K --> L([Ready for next prompt])
```
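Step 5, build-and-verify, can be sketched roughly as follows. The marker-file detection order and the 3-retry cap come from the pipeline description above; the exact commands and control flow are assumptions.

```rust
use std::path::Path;

/// Detect the project's build system from marker files (sketch; the
/// detection order mirrors the docs, the commands are assumptions).
fn build_command(root: &Path) -> Option<(&'static str, &'static [&'static str])> {
    if root.join("Cargo.toml").exists() {
        Some(("cargo", &["build"]))
    } else if root.join("package.json").exists() {
        Some(("npm", &["run", "build"]))
    } else if root.join("Makefile").exists() {
        Some(("make", &[]))
    } else {
        None
    }
}

/// Run the detected build, retrying on failure up to 3 times.
#[allow(dead_code)]
fn build_and_verify(root: &Path) -> bool {
    let Some((cmd, args)) = build_command(root) else {
        return true; // nothing to verify
    };
    for attempt in 1..=3 {
        let ok = std::process::Command::new(cmd)
            .args(args)
            .current_dir(root)
            .status()
            .map(|s| s.success())
            .unwrap_or(false);
        if ok {
            return true;
        }
        eprintln!("build failed (attempt {attempt}/3)");
    }
    false
}

fn main() {
    match build_command(Path::new(".")) {
        Some((cmd, _)) => println!("detected build tool: {cmd}"),
        None => println!("no build system detected"),
    }
}
```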
```mermaid
flowchart TD
    A([User launches autogpt]) --> B{Any args?}
    B -- No --> C[GenericGPT Interactive Shell]
    B -- Yes --> D{Subcommand}
    C --> E[Select Provider & Model]
    E --> F[Enter Prompt Loop]
    F --> G{Mixture enabled?}
    G -- Yes --> H[Mixture of Providers]
    G -- No --> I[Standard Prompt]
    H & I --> J[Agent Generates Response]
    J --> F
    D -- arch --> K[ArchitectGPT]
    D -- back --> L[BackendGPT]
    D -- front --> M[FrontendGPT]
    D -- design --> N[DesignerGPT]
    D -- manage --> O[ManagerGPT]
    D -- -p prompt --> P[Direct LLM Prompt]
```

§1. 💬 Direct Prompt Mode

In this mode, you can use the CLI to interact with the LLM directly; there is no need to define or configure agents. Use the `-p` flag to send prompts to your preferred LLM provider quickly and easily. Combine it with `--mixture` to get the best answer from all your providers at once.

```shell
# Single provider
autogpt -p "Explain the Rust borrow checker in simple terms"

# Mixture of Providers (fanned out)
autogpt -m -p "Implement a Red-Black tree in Rust"
```

§2. 🧠 Agentic Networkless Mode (Standalone)

In this mode, the user runs an individual `autogpt` agent directly via a subcommand (e.g., `autogpt arch`). Each agent operates independently without needing a networked orchestrator.

```mermaid
flowchart TD
    User([User Provides Project Prompt]) --> M[ManagerGPT\nDistributes Tasks]
    M --> B[BackendGPT]
    M --> F[FrontendGPT]
    M --> D[DesignerGPT\nOptional]
    M --> A[ArchitectGPT]
    B --> BL[Backend Logic]
    F --> FL[Frontend Logic]
    D --> DL[Design Assets]
    A --> AL[Architecture Diagram]
    BL & FL & DL & AL --> M2[ManagerGPT\nCollects & Consolidates]
    M2 --> Result([User Receives Final Output])
```
  • ✍️ User Input: Provide the project goal (e.g. “Develop a full stack app that fetches today’s weather. Use the axum web framework for the backend and the Yew Rust framework for the frontend.”).
  • 🚀 Initialization: AutoGPT initializes based on the user’s input, creating essential components such as the ManagerGPT and individual agent instances (ArchitectGPT, BackendGPT, FrontendGPT).
  • 🛠️ Agent Configuration: Each agent is configured with its unique objectives and capabilities, aligning them with the project’s defined goals.
  • 📋 Task Allocation: ManagerGPT distributes tasks among agents, considering their capabilities and project requirements.
  • ⚙️ Task Execution: Agents execute tasks asynchronously, leveraging their specialized functionalities.
  • 🔄 Feedback Loop: Continuous feedback keeps users informed of project progress and surfaces issues as they arise.
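The task-allocation step above can be sketched as a simple goal-to-agent routing function. The routing rules below are illustrative assumptions, not ManagerGPT's actual logic.

```rust
/// Sketch of how a manager might split a project goal into per-agent
/// tasks. The keyword-based routing is illustrative, not ManagerGPT's logic.
fn allocate(goal: &str) -> Vec<(&'static str, String)> {
    // The architect always gets a planning task first.
    let mut tasks = vec![("ArchitectGPT", format!("plan the architecture for: {goal}"))];
    if goal.contains("backend") || goal.contains("full stack") {
        tasks.push(("BackendGPT", format!("implement the backend for: {goal}")));
    }
    if goal.contains("frontend") || goal.contains("full stack") {
        tasks.push(("FrontendGPT", format!("implement the frontend for: {goal}")));
    }
    tasks
}

fn main() {
    for (agent, task) in allocate("full stack weather app") {
        println!("{agent}: {task}");
    }
}
```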

§3. 🌐 Agentic Networking Mode (Orchestrated)

In networking mode, autogpt connects to an external orchestrator (orchgpt) over a secure TLS-encrypted TCP channel. This orchestrator manages agent lifecycles, routes commands, and enables rich inter-agent collaboration using a unified protocol.

AutoGPT introduces a novel and scalable communication protocol called IAC (Inter/Intra-Agent Communication), enabling seamless and secure interactions between agents and orchestrators, inspired by operating system IPC mechanisms.

```mermaid
flowchart TD
    U([User sends prompt via CLI]) -- TLS + Protobuf over TCP --> O[Orchestrator\nReceives & Routes Commands]
    O --> AG[ArchitectGPT]
    O --> MG[ManagerGPT]
    AG <-- IAC --> MG
    subgraph IAC [" IAC - Inter/Intra-Agent Communication Layer"]
        MG
        BG[BackendGPT]
        FG[FrontendGPT]
        DG[DesignerGPT]
    end
    MG -- IAC --> BG
    MG -- IAC --> FG
    MG -- IAC --> DG
    BG & FG & DG --> Exec[Task Execution & Collection]
    Exec --> R([User Receives Final Output])
```

All communication happens securely over TLS + TCP, with messages encoded in Protocol Buffers (protobuf) for efficiency and structure.

  1. User Input: The user provides a project prompt like:

    /arch create "fastapi app" | python

    This is securely sent to the Orchestrator over TLS.

  2. Initialization: The Orchestrator parses the command and initializes the appropriate agent (e.g., ArchitectGPT).

  3. Agent Configuration: Each agent is instantiated with its specialized goals:

    • ArchitectGPT: Plans system structure
    • BackendGPT: Generates backend logic
    • FrontendGPT: Builds frontend UI
    • DesignerGPT: Handles design
  4. Task Allocation: ManagerGPT dynamically assigns subtasks to agents using the IAC protocol. It determines which agent should perform what based on capabilities and the original user goal.

  5. Task Execution: Agents execute their tasks, communicate with their subprocesses or other agents via IAC, and push updates or results back to the orchestrator.

  6. Feedback Loop: Throughout execution, agents return status reports. The ManagerGPT collects all output, and the Orchestrator sends it back to the user.
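To make the command shape concrete, here is a hypothetical parser for a `/arch create ... | python` line like the one shown above. The real IAC layer exchanges Protocol Buffers over TLS; the struct fields and parsing rules here are assumptions for illustration only.

```rust
/// Illustrative shape of an IAC command; the real protocol encodes
/// messages as Protocol Buffers over TLS, and these fields are assumptions.
#[derive(Debug)]
struct IacCommand {
    target_agent: String, // e.g. "arch"
    action: String,       // e.g. "create"
    input: String,        // e.g. "fastapi app"
    language: String,     // e.g. "python"
}

/// Parse a CLI line like `/arch create "fastapi app" | python`
/// (simplified, assumption-laden parser for illustration only).
fn parse_command(line: &str) -> Option<IacCommand> {
    let (head, language) = line.split_once('|')?;
    let head = head.trim().strip_prefix('/')?;
    let (agent, rest) = head.split_once(' ')?;
    let (action, input) = rest.trim().split_once(' ')?;
    Some(IacCommand {
        target_agent: agent.to_string(),
        action: action.to_string(),
        input: input.trim().trim_matches('"').to_string(),
        language: language.trim().to_string(),
    })
}

fn main() {
    let cmd = parse_command("/arch create \"fastapi app\" | python").unwrap();
    println!("{} {} {:?} in {}", cmd.target_agent, cmd.action, cmd.input, cmd.language);
}
```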

§🤖 Available Agents

As of the current release, AutoGPT ships with 9 built-in specialized autonomous AI agents ready to assist you in bringing your ideas to life! Refer to our guide to learn more about how the built-in agents work.

§📌 Examples

You can refer to our examples for guidance on using the CLI in a Jupyter environment.

§📚 Documentation

For detailed usage instructions and API documentation, refer to the AutoGPT Documentation.

§🤝 Contributing

Contributions are welcome! See the Contribution Guidelines for more information on how to get started.

§📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

Modules§

  • agents: Agents module.
  • cli (feature: cli)
  • collaboration (feature: net)
  • common: Common module.
  • macros
  • message (feature: cli)
  • orchestrator (features: cli and net)
  • prelude: 📦 Installation
  • prompts (feature: gpt or cli): Prompts module.
  • traits: Traits module.

Macros§

agents