adk-rust-mcp-composer
MCP server for composite media generation. Part of the ADK Rust MCP toolkit.
Overview
Composite media server that orchestrates multiple AI models (Veo, Gemini, Lyria) and FFmpeg to produce rich media outputs — animated GIFs, short-form videos, memes, narrated presentations, and multi-speaker podcasts.
Currently implemented: Google Gemini API (Veo 3.1, Gemini 2.5 Flash, Lyria 3 Clip)
Features
- GIF Generation — Text-to-GIF via Veo video + FFmpeg palette optimization
- Short-Form Video — Vertical 9:16 videos with optional caption overlay
- Meme Generation — AI image with baked-in meme text (Impact style)
- Presentations — Slide-based narrated videos with images, TTS, and background music
- Podcasts — Multi-speaker dialogue with auto-assigned voices and background music
- Dual API — Works with Gemini API key or Vertex AI ADC
Prerequisites
FFmpeg must be installed (for GIF conversion, captions, and audio mixing):
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg
Installation
Configuration
# Gemini API (required)
# Or Vertex AI
Tools
gif_generate
Generate an animated GIF from a text prompt. Uses Veo to generate a short video, then converts it to a palette-optimized GIF with FFmpeg.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | Yes | - | Text describing the animation |
| duration_seconds | int | No | 4 | Duration (4-8 seconds) |
| fps | int | No | 12 | GIF frame rate |
| width | int | No | 480 | Output width in pixels |
| output_file | string | No | - | Output file path |
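The conversion step can be pictured as a single FFmpeg invocation that builds a palette from the clip and maps the frames onto it. The sketch below is an illustration, not the server's actual code: `build_gif_command` is a hypothetical helper, and the filter chain (`fps`, `scale`, `split`, `palettegen`, `paletteuse`) is the common FFmpeg recipe for palette-optimized GIFs, assumed here to resemble what the tool does with its `fps` and `width` parameters.

```python
import shlex


def build_gif_command(src, dst, fps=12, width=480):
    """Return an ffmpeg argument list for a one-pass palette-optimized GIF.

    Hypothetical helper: the defaults mirror the gif_generate table,
    but the exact filters the server uses are an assumption.
    """
    if fps <= 0 or width <= 0:
        raise ValueError("fps and width must be positive")
    # Resample and scale first, then split the stream: one copy feeds
    # palettegen, the other is mapped onto the generated palette.
    filters = (
        f"fps={fps},scale={width}:-1:flags=lanczos,"
        "split[a][b];[a]palettegen[p];[b][p]paletteuse"
    )
    return ["ffmpeg", "-y", "-i", src, "-vf", filters, dst]


cmd = build_gif_command("clip.mp4", "clip.gif", fps=15, width=640)
print(shlex.join(cmd))
```

Returning an argument list rather than a shell string avoids quoting problems when the list is passed to `subprocess.run`.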
short_generate
Generate a vertical short-form video (9:16) with audio and optional caption overlay. Optimized for social media.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | Yes | - | Video content description |
| caption | string | No | - | Text overlay (bottom of screen) |
| duration_seconds | int | No | 8 | Duration (4-8 seconds) |
| generate_audio | bool | No | true | Generate audio with video |
| output_file | string | No | - | Output file path |
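For clients that speak MCP directly, a call to this tool is a JSON-RPC `tools/call` request whose `arguments` object carries the parameters above. The following is a minimal sketch: `short_generate_request` is a hypothetical helper, and the client-side range check is an assumption, since this README does not say whether the server rejects or clamps out-of-range durations.

```python
import json


def short_generate_request(prompt, caption=None, duration_seconds=8,
                           generate_audio=True, output_file=None,
                           request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for short_generate."""
    if not prompt:
        raise ValueError("prompt is required")
    # Assumption: validate the documented 4-8 second range before sending.
    if not 4 <= duration_seconds <= 8:
        raise ValueError("duration_seconds must be between 4 and 8")
    arguments = {
        "prompt": prompt,
        "duration_seconds": duration_seconds,
        "generate_audio": generate_audio,
    }
    # Optional parameters are sent only when the caller provides them.
    if caption is not None:
        arguments["caption"] = caption
    if output_file is not None:
        arguments["output_file"] = output_file
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "short_generate", "arguments": arguments},
    }


request = short_generate_request(
    "Satisfying coffee pour in slow motion, steam rising",
    caption="Monday motivation",
    duration_seconds=6,
    output_file="coffee_short.mp4",
)
print(json.dumps(request, indent=2))
```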
meme_generate
Generate a meme image with top/bottom text overlay from a text prompt.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | Yes | - | Image description (what the meme shows) |
| top_text | string | No | - | Top text |
| bottom_text | string | No | - | Bottom text |
| font_size | int | No | 48 | Font size |
| output_file | string | No | - | Output file path |
presentation_generate
Generate a narrated video presentation from slides. Each slide gets an AI-generated image and TTS narration; background music for the whole presentation is optional.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| slides | array | Yes | - | Array of slide objects |
| style | string | No | professional | Visual style for images |
| voice | string | No | Kore | TTS voice name |
| background_music | string | No | - | Music prompt (Lyria 3 Clip) |
| music_volume | float | No | 0.15 | Music volume (0.0-1.0) |
| output_file | string | Yes | - | Output MP4 file path |
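One way to read `music_volume` is as a linear gain applied to the music before it is summed with the narration. The sketch below illustrates that reading on plain float samples. It is an assumption about the behaviour, not the server's mixing code (which relies on FFmpeg), and `mix_with_music` is a hypothetical name.

```python
def mix_with_music(narration, music, music_volume=0.15):
    """Mix float samples in [-1.0, 1.0]; the music loops under the narration.

    Assumptions: linear gain, looping music, hard clipping at full scale.
    """
    if not 0.0 <= music_volume <= 1.0:
        raise ValueError("music_volume must be between 0.0 and 1.0")
    mixed = []
    for i, voice in enumerate(narration):
        # Loop the music bed if it is shorter than the narration.
        bed = music[i % len(music)] if music else 0.0
        sample = voice + music_volume * bed
        # Clamp to the valid range so loud passages do not wrap around.
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed


print(mix_with_music([0.5, -0.5, 0.95], [1.0, -1.0], music_volume=0.2))
```

At the default of 0.15 the music contributes at most 15% of full scale, which keeps it well below the narration.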
Slide format:
| Field | Type | Required | Description |
|---|---|---|---|
| title | string | Yes | Slide title |
| content | string | Yes | Narration text |
| image_prompt | string | No | Custom image prompt (auto-generated if omitted) |
podcast_generate
Generate a multi-speaker podcast or dialogue with optional background music. Each speaker gets a distinct voice, assigned automatically unless the script specifies one.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| script | array | Yes | - | Dialogue script segments |
| background_music | string | No | - | Music prompt (Lyria 3 Clip) |
| music_volume | float | No | 0.1 | Music volume (0.0-1.0) |
| intro_music | bool | No | false | Add 3s music intro |
| output_file | string | Yes | - | Output file path |
Script segment format:
| Field | Type | Required | Description |
|---|---|---|---|
| speaker | string | Yes | Speaker name |
| text | string | Yes | What they say |
| voice | string | No | Voice name (auto-assigned if omitted) |
| style | string | No | Delivery style (cheerful, serious, etc.) |
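The auto-assignment of voices can be sketched as a two-pass walk over the script: explicit `voice` values win, and every remaining speaker takes the next unused voice from a pool. This is an illustration under stated assumptions. `assign_voices` is hypothetical, and the pool is a guess at what the server might draw from (only `Kore` appears in this README).

```python
from itertools import cycle


def assign_voices(script, pool=("Kore", "Puck", "Charon", "Aoede")):
    """Map each speaker name to a voice name.

    Assumption: the pool contents and the assignment order are
    illustrative; the server's real policy is not documented here.
    """
    voices = {}
    # Pass 1: honour explicit choices (the first one seen per speaker).
    for segment in script:
        if segment.get("voice"):
            voices.setdefault(segment["speaker"], segment["voice"])
    # Pass 2: hand out voices nobody picked, in order of first appearance,
    # cycling through the pool if there are more speakers than voices.
    taken = set(voices.values())
    unused = [v for v in pool if v not in taken] or list(pool)
    rotation = cycle(unused)
    for segment in script:
        if segment["speaker"] not in voices:
            voices[segment["speaker"]] = next(rotation)
    return voices


script = [
    {"speaker": "Ana", "text": "Welcome to the show."},
    {"speaker": "Ben", "text": "Glad to be here.", "voice": "Kore"},
    {"speaker": "Ana", "text": "Let's dive in."},
]
print(assign_voices(script))
```

Keying the mapping by speaker name keeps a speaker's voice stable across all of their segments.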
Usage Examples
# Stdio (default) — for Claude Desktop, Kiro
# HTTP — for web apps, ADK agents
# SSE — for streaming applications
Generate a GIF
prompt: "A cat chasing a laser pointer across a living room"
fps: 15
width: 640
output_file: "cat_laser.gif"
Create a short-form video
prompt: "Satisfying coffee pour in slow motion, steam rising"
caption: "Monday motivation ☕"
duration_seconds: 6
output_file: "coffee_short.mp4"
Generate a meme
prompt: "A developer staring at a screen with code"
top_text: "IT WORKS ON MY MACHINE"
bottom_text: "SHIPS TO PRODUCTION"
output_file: "dev_meme.png"
Create a presentation
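A possible `arguments` object for `presentation_generate`, built and checked in Python. The slide text, file name, and music prompt are made up for illustration, `presentation_arguments` is a hypothetical helper, and requiring at least one slide is an assumption.

```python
import json


def presentation_arguments(slides, output_file, **options):
    """Build the arguments object for presentation_generate."""
    if not slides:
        raise ValueError("slides must contain at least one slide")
    for index, slide in enumerate(slides):
        # title and content are the two required slide fields.
        for field in ("title", "content"):
            if not slide.get(field):
                raise ValueError(f"slide {index} is missing {field!r}")
    allowed = {"style", "voice", "background_music", "music_volume"}
    unknown = set(options) - allowed
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    return {"slides": slides, "output_file": output_file, **options}


args = presentation_arguments(
    slides=[
        {"title": "Why Rust",
         "content": "Rust gives us memory safety without a garbage collector."},
        {"title": "Next steps",
         "content": "Try the examples and share feedback.",
         "image_prompt": "A road stretching toward a sunrise"},
    ],
    output_file="intro_deck.mp4",
    background_music="Calm ambient piano",
    music_volume=0.15,
)
print(json.dumps(args, indent=2))
```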
Generate a podcast
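A possible `arguments` object for `podcast_generate`, again as a hedged sketch: the dialogue, file name, and music prompt are invented for illustration, and `podcast_arguments` is a hypothetical helper that only checks the fields documented in the tables above.

```python
import json


def podcast_arguments(script, output_file, background_music=None,
                      music_volume=0.1, intro_music=False):
    """Build the arguments object for podcast_generate."""
    for index, segment in enumerate(script):
        # speaker and text are required; voice and style are optional.
        if not segment.get("speaker") or not segment.get("text"):
            raise ValueError(f"segment {index} needs speaker and text")
    if not 0.0 <= music_volume <= 1.0:
        raise ValueError("music_volume must be between 0.0 and 1.0")
    arguments = {
        "script": script,
        "output_file": output_file,
        "music_volume": music_volume,
        "intro_music": intro_music,
    }
    if background_music:
        arguments["background_music"] = background_music
    return arguments


podcast = podcast_arguments(
    script=[
        {"speaker": "Host", "text": "Welcome back to the show.",
         "style": "cheerful"},
        {"speaker": "Guest", "text": "Thanks for having me.",
         "voice": "Kore"},
        {"speaker": "Host", "text": "Let's get started."},
    ],
    output_file="episode_01.wav",
    background_music="Light lo-fi beat",
    intro_music=True,
)
print(json.dumps(podcast, indent=2))
```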
Output Specs
| Tool | Format | Details |
|---|---|---|
| gif_generate | GIF | Palette-optimized, configurable FPS/width |
| short_generate | MP4 (H.264) | 9:16 vertical, with audio |
| meme_generate | PNG | Square (1:1) with text overlay |
| presentation_generate | MP4 (H.264) | Slide images + narration + music |
| podcast_generate | WAV | 24 kHz mono, multi-speaker dialogue |
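To check that a podcast file matches the WAV row above, the header can be read with Python's standard `wave` module. This is a sketch: `wav_info` and `silent_wav` are hypothetical helpers, and the 16-bit sample width used for the test clip is an assumption, since the table states only the sample rate and channel count.

```python
import io
import wave


def wav_info(data):
    """Return channel count, sample rate, and duration of WAV bytes."""
    with wave.open(io.BytesIO(data), "rb") as reader:
        rate = reader.getframerate()
        return {
            "channels": reader.getnchannels(),
            "sample_rate": rate,
            "seconds": reader.getnframes() / rate,
        }


def silent_wav(seconds, sample_rate=24000):
    """Create an in-memory mono WAV of silence (16-bit is an assumption)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # bytes per sample
        writer.setframerate(sample_rate)
        writer.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buffer.getvalue()


info = wav_info(silent_wav(2))
matches_spec = info["channels"] == 1 and info["sample_rate"] == 24000
print(info, matches_spec)
```

For a real file, read its bytes with `open(path, "rb").read()` and pass them to `wav_info`.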
License
Apache-2.0