Deploying a Personal AI Assistant on AWS with Bedrock AgentCore Runtime

A hands-on walkthrough of deploying OpenClaw on AWS using AgentCore Runtime for serverless agent execution, Graviton ARM instances, and multi-model Bedrock access — from CloudFormation template to customizing the agent's personality.

Alexandre Agius

AWS Solutions Architect

13 min read

Running an AI assistant on your own infrastructure gives you control over data, costs, and model selection that managed platforms don’t offer. This post walks through deploying OpenClaw on AWS using Bedrock AgentCore Runtime — a serverless execution layer that auto-scales agent containers and charges only when agents run — then goes deep into how the system actually works under the hood.

The Problem

Most personal AI assistants require managing API keys across multiple providers, running fixed compute capacity that costs money even when idle, and dealing with scaling bottlenecks on a single machine. If you want enterprise-grade security (audit trails, no API keys in config files, private networking), you’re looking at significant infrastructure work.

The specific challenges:

  • API key sprawl — juggling keys from Anthropic, OpenAI, etc., with credentials sitting in plaintext config files
  • Fixed compute costs — an EC2 instance running 24/7 whether anyone’s using the assistant or not
  • No audit trail — no way to track which models were invoked, when, or by whom
  • Single model lock-in — switching between Claude, Nova, or DeepSeek requires code changes and redeployment
  • Manual scaling — if traffic spikes, the single instance becomes a bottleneck

The Solution

OpenClaw is an open-source AI assistant that connects to WhatsApp, Telegram, Discord, and Slack. The AWS deployment template wraps it in a CloudFormation stack that deploys an EC2 gateway (Graviton ARM for cost efficiency) paired with Bedrock AgentCore Runtime for serverless agent execution. IAM roles replace all API keys, CloudTrail logs every model invocation, and you can switch between 8+ Bedrock models with a single parameter change.

OpenClaw on AWS architecture — User sends messages through messaging platforms to EC2 Gateway, which routes to AgentCore Runtime running containerized agents that invoke Bedrock models

The architecture splits into three layers:

  1. EC2 Gateway — handles messaging platform connections (WhatsApp, Telegram, etc.), serves the Web UI, and routes agent requests to AgentCore
  2. AgentCore Runtime — serverless execution environment that runs containerized OpenClaw agents in isolated microVMs, auto-scaling based on demand
  3. Bedrock Models — the LLMs (Opus, Sonnet, Nova) invoked by the agent container via IAM authentication

How It Works

The Three-Layer Architecture

Understanding how a message flows through the system makes debugging and customization much easier.

1. You send "Hello" on WhatsApp
2. WhatsApp Web → Gateway receives the message
3. Gateway identifies the channel + session
4. Gateway → AgentCore Runtime (IAM auth)
5. AgentCore starts the container in an isolated microVM
6. Container loads context (SOUL.md, session history, memory)
7. Container → Bedrock InvokeModel (Opus 4.6)
8. Bedrock returns the response
9. Container → Gateway → WhatsApp → You receive the reply
10. CloudTrail logs the Bedrock API call

The Gateway is a Node.js server running on the EC2 instance. It manages all messaging channel connections — each channel (WhatsApp, Telegram, Discord, Slack) is a plugin that can be enabled/disabled. It also serves the Web UI on port 18789 and maintains session state.

AgentCore Runtime is the AWS-managed layer. When the gateway routes a request, AgentCore spins up a Docker container in an isolated microVM. The container runs the OpenClaw agent logic, calls Bedrock, and returns the response. If 10 messages arrive simultaneously, 10 microVMs start. When done, they shut down. No idle costs.

The Agent Container holds the actual intelligence. It reads workspace files (SOUL.md for personality, TOOLS.md for available capabilities) and session history, then constructs a prompt and calls Bedrock. The container image is built from the OpenClaw source and stored in ECR.
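The context-loading step can be sketched as follows. This is a minimal illustration under assumptions, not OpenClaw's actual loader: the file names match the workspace layout described in this post, but `build_system_prompt` is a hypothetical helper.

```python
from pathlib import Path

# Hypothetical sketch: concatenate the markdown workspace files into one
# system prompt. File names match the workspace layout described in this
# post; OpenClaw's real loader may differ.
PROMPT_FILES = ["SOUL.md", "IDENTITY.md", "USER.md", "AGENTS.md", "TOOLS.md"]

def build_system_prompt(workspace: Path) -> str:
    """Join whichever workspace files exist into a single prompt string."""
    sections = []
    for name in PROMPT_FILES:
        path = workspace / name
        if path.exists():
            # Each file becomes its own section, headed by the file name
            sections.append(f"## {name}\n{path.read_text()}")
    return "\n\n".join(sections)
```

Because the files are read fresh on each session, editing them changes the assembled prompt on the next message with no redeploy.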

Choosing Your Models

Bedrock provides access to multiple model families through a single API. The CloudFormation template accepts any of these as the default:

| Model | Model ID | Input / Output per 1M tokens | Best for |
|---|---|---|---|
| Nova 2 Lite | global.amazon.nova-2-lite-v1:0 | $0.30 / $2.50 | Daily tasks, 90% cheaper than Claude |
| Claude Opus 4.6 | global.anthropic.claude-opus-4-6-v1:0 | ~$3 / $15 | Complex reasoning, deep analysis |
| Claude Sonnet 4.5 | global.anthropic.claude-sonnet-4-5-20250929-v1:0 | $3 / $15 | Balanced capability and speed |
| Nova Pro | us.amazon.nova-pro-v1:0 | $0.80 / $3.20 | Multimodal, balanced cost |
| DeepSeek R1 | us.deepseek.r1-v1:0 | $0.55 / $2.19 | Open-source reasoning |

The OpenClawModel parameter sets the default, but the IAM role grants bedrock:InvokeModel* on all resources — so you can configure model routing in the Web UI to use different models per channel. Use Nova Lite for casual conversations and route complex tasks to Opus.

One caveat: models with a us. prefix (DeepSeek, Llama, Nova Pro) only work in US regions. For eu-west-1, stick to global. prefixed models.
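The per-channel routing plus the region caveat can be expressed as a small routing function. The channel-to-model mapping below is hypothetical (you would configure the real one in the Web UI); the `us.`/`global.` prefix rule follows the constraint just described.

```python
# Illustrative per-channel model routing with a region availability check.
# The CHANNEL_MODELS mapping is a made-up example configuration; the
# us./global. prefix rule mirrors Bedrock's cross-region profile naming.
CHANNEL_MODELS = {
    "whatsapp": "global.amazon.nova-2-lite-v1:0",          # casual chat: cheap
    "slack":    "global.anthropic.claude-opus-4-6-v1:0",   # complex tasks
    "telegram": "us.deepseek.r1-v1:0",                     # US-only profile
}
DEFAULT_MODEL = "global.amazon.nova-2-lite-v1:0"
US_REGIONS = ("us-east-1", "us-east-2", "us-west-2")

def pick_model(channel: str, region: str) -> str:
    """Return the model for a channel, rejecting us.-profiles outside the US."""
    model = CHANNEL_MODELS.get(channel, DEFAULT_MODEL)
    if model.startswith("us.") and region not in US_REGIONS:
        raise ValueError(f"{model} is not available in {region}")
    return model
```

In eu-west-1 this would route WhatsApp traffic to Nova Lite and Slack to Opus, while refusing the `us.`-prefixed DeepSeek profile outright.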

Adding a Model to the Template

The original template didn’t include Claude Opus 4.6. Adding it required one edit to the CloudFormation AllowedValues:

OpenClawModel:
  Type: String
  Default: "global.amazon.nova-2-lite-v1:0"
  AllowedValues:
    - "global.amazon.nova-2-lite-v1:0"
    - "global.anthropic.claude-sonnet-4-5-20250929-v1:0"
    - "global.anthropic.claude-opus-4-5-20251101-v1:0"
    - "global.anthropic.claude-opus-4-6-v1:0"      # Added
    # ... other models

The actual Bedrock model ID (anthropic.claude-opus-4-6-v1) was confirmed by querying the API:

aws bedrock list-foundation-models --region eu-west-1 \
  --query "modelSummaries[?contains(modelId, 'opus')].[modelId,modelName]"

Building and Pushing the Agent Container

AgentCore Runtime runs your agent as a Docker container in isolated microVMs. The container needs to be built from the OpenClaw source and pushed to ECR before deploying the CloudFormation stack.

# Create ECR repository
aws ecr create-repository \
  --repository-name openclaw-agentcore-agent \
  --region eu-west-1

# Clone and build
git clone https://github.com/openclaw/openclaw.git
cd openclaw
docker build -t openclaw-agentcore-agent:latest .

# Push to ECR
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
aws ecr get-login-password --region eu-west-1 | \
  docker login --username AWS --password-stdin \
  ${ACCOUNT_ID}.dkr.ecr.eu-west-1.amazonaws.com

docker tag openclaw-agentcore-agent:latest \
  ${ACCOUNT_ID}.dkr.ecr.eu-west-1.amazonaws.com/openclaw-agentcore-agent:latest
docker push \
  ${ACCOUNT_ID}.dkr.ecr.eu-west-1.amazonaws.com/openclaw-agentcore-agent:latest

The CloudFormation template references this image URI directly — it needs to exist in ECR before the AgentCore Runtime resource can be created.

Deploying the Stack

One command deploys everything — VPC, subnets, IAM roles, EC2 instance, AgentCore Runtime, and SSM configuration:

aws cloudformation create-stack \
  --stack-name openclaw-agentcore \
  --template-body file://clawdbot-bedrock-agentcore.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --region eu-west-1 \
  --parameters \
    ParameterKey=KeyPairName,ParameterValue=openclaw-agentcore \
    ParameterKey=InstanceType,ParameterValue=c7g.large \
    ParameterKey=OpenClawModel,ParameterValue=global.anthropic.claude-opus-4-6-v1:0 \
    ParameterKey=EnableAgentCore,ParameterValue=true \
    ParameterKey=CreateVPCEndpoints,ParameterValue=false

Key parameter decisions:

  • c7g.large (Graviton ARM) — 20-40% better price-performance than equivalent x86 instances. A c7g.large runs ~$30-40/month vs ~$50+ for c5.xlarge
  • EnableAgentCore=true — agents run serverless in microVMs instead of on the EC2 instance itself. Pay-per-invocation, no idle costs
  • CreateVPCEndpoints=false — saves ~$22/month. Traffic goes through the public internet instead of staying within the AWS network. Acceptable for a personal deployment; for production, enable them

The stack takes about 10-15 minutes. The EC2 UserData script handles Node.js installation, OpenClaw setup, gateway configuration, daemon initialization, and messaging plugin enablement automatically.

How SSM Port Forwarding Works

No public ports are exposed. The Web UI at http://localhost:18789 works through an SSM Session Manager tunnel:

Your laptop (localhost:18789)
        │
        │  SSM Session Manager (encrypted WebSocket)
        │  via AWS Systems Manager API (HTTPS, port 443)
        ▼
EC2 instance (127.0.0.1:18789)
        │
        │  OpenClaw Gateway (Node.js, loopback-only)
        ▼
Web UI served

When you run the port forwarding command, the SSM plugin on your machine opens a WebSocket to the AWS SSM API. The SSM Agent on the EC2 instance maintains its own connection to the same API. AWS bridges the two — any TCP traffic to localhost:18789 gets tunneled through and forwarded to the instance. The gateway binds to loopback only ("bind": "loopback" in the config), so the SSM tunnel is the only way in.

# Start port forwarding (keep terminal open)
aws ssm start-session \
  --target $INSTANCE_ID \
  --region eu-west-1 \
  --document-name AWS-StartPortForwardingSession \
  --parameters '{"portNumber":["18789"],"localPortNumber":["18789"]}'

Then retrieve the gateway token and open the UI:

aws ssm get-parameter \
  --name "/openclaw/openclaw-agentcore/gateway-token" \
  --region eu-west-1 \
  --with-decryption \
  --query 'Parameter.Value' --output text

Open http://localhost:18789/?token=<TOKEN> — from there you can connect WhatsApp (QR code scan), Telegram (BotFather token), Discord, or Slack.

Why not just open a port? No public IP exposure, no SSH keys to manage, encrypted end-to-end via HTTPS/WebSocket, and every session logged in CloudTrail. Close the terminal and the tunnel dies.

Workspace Files and Customization

OpenClaw’s personality and behavior are defined by markdown files in ~/.openclaw/workspace/. These are read by the agent at the start of every session — they’re effectively the system prompt.

| File | Purpose |
|---|---|
| SOUL.md | Personality, communication style, core behaviors |
| IDENTITY.md | Agent name, avatar, vibe |
| USER.md | Information about the user (you) |
| AGENTS.md | Agent definitions and routing |
| TOOLS.md | Available tool configurations |

SOUL.md is the most important one. The default template is generic — “Be genuinely helpful, not performatively helpful.” For a useful deployment, customize it with your context:

# SOUL.md

You are a personal AI assistant for [Your Name], [Your Role].

## Core Truths
- Be direct and technical — skip the hand-holding
- Be bilingual — match the user's language
- Have opinions on architecture — take a position, flag costs

## Context
- Primary AWS regions: eu-west-1, us-east-1
- Focus areas: AI/ML, serverless, enterprise architecture
- Model: Claude Opus 4.6 via Amazon Bedrock

## Communication Style
- Concise by default, tables over paragraphs
- Include CLI commands when practical
- No corporate filler, no sycophancy

The agent reads these files every session. Update them and the behavior changes immediately — no redeployment needed.

The Skills System

Skills are plugin packages that extend the agent’s capabilities. Each skill is a folder containing a SKILL.md (instructions for the agent on when and how to use a CLI tool) plus optional install scripts.

OpenClaw ships with 50 bundled skills. On a fresh deployment, 8 are ready out of the box:

| Ready | Notable missing (installable) |
|---|---|
| clawhub (skill marketplace) | obsidian (vault management) |
| github (gh CLI) | slack (messaging) |
| healthcheck (security audit) | coding-agent (Claude Code, Codex) |
| himalaya (email via IMAP) | blogwatcher (RSS feeds) |
| mcporter (MCP servers) | summarize (URLs, podcasts) |
| skill-creator | notion |
| tmux (session control) | session-logs |
| weather | voice-call |
Install additional skills from the CLI or the Web UI at /skills:

openclaw skills install obsidian
openclaw skills install slack
npx clawhub search <keyword>

Skills are essentially structured prompt injection — the agent reads the skill’s SKILL.md to learn what commands are available and when to use them. No code changes, no redeployment. Drop a markdown file and the agent gains new capabilities.
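The discovery side of that pattern is simple enough to sketch. The loader below is illustrative, not OpenClaw's implementation; the one-folder-per-skill layout with a SKILL.md inside is the layout described above.

```python
from pathlib import Path

# Minimal sketch of skill discovery: each skill folder contributes its
# SKILL.md to the agent's context. Illustrative only -- not OpenClaw's
# actual loader.
def load_skills(skills_dir: Path) -> dict[str, str]:
    """Map skill name -> SKILL.md text for every skill folder that has one."""
    return {
        d.name: (d / "SKILL.md").read_text()
        for d in sorted(skills_dir.iterdir())
        if d.is_dir() and (d / "SKILL.md").exists()
    }
```

A folder without a SKILL.md is simply ignored, which is why dropping a single markdown file is all it takes to register a new capability.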

Using the CLI Remotely

You can interact with OpenClaw from your local machine without SSM-ing into the instance, using aws ssm send-command:

# Check status
aws ssm send-command \
  --instance-ids $INSTANCE_ID \
  --region eu-west-1 \
  --document-name "AWS-RunShellScript" \
  --parameters 'commands=["sudo -u ubuntu bash -c '\''export HOME=/home/ubuntu && export NVM_DIR=$HOME/.nvm && . $NVM_DIR/nvm.sh && openclaw status'\''"]' \
  --query "Command.CommandId" --output text

# Get the output
aws ssm get-command-invocation \
  --command-id <COMMAND_ID> \
  --instance-id $INSTANCE_ID \
  --region eu-west-1 \
  --query "StandardOutputContent" --output text

To send a message to the agent, wrap the openclaw CLI call in the same send-command pattern:

openclaw agent --agent main --message "What model are you running on?"

If you SSM directly into the instance, the OpenClaw CLI needs NVM loaded first:

bash    # the default SSM shell is sh; switch to bash first
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && . "$NVM_DIR/nvm.sh"
openclaw status

The default SSM shell is sh, not bash — NVM won’t load without switching to bash first.

Debugging Empty Model Responses

After deploying, I hit a frustrating issue: the chat UI accepted messages but the assistant returned empty responses. The gateway logs showed agent runs completing in ~490ms — way too fast for a real Bedrock API call — with no errors.

The root cause was in ~/.openclaw/openclaw.json. The CloudFormation template had set the model ID to global.anthropic.claude-opus-4-6-v1:0, but the correct Bedrock cross-region inference profile ID is global.anthropic.claude-opus-4-6-v1, without the :0 suffix.

Verify correct inference profile IDs with:

aws bedrock list-inference-profiles --region eu-west-1 --output json | \
  python3 -c "
import json, sys
data = json.load(sys.stdin)
for p in data.get('inferenceProfileSummaries', []):
    print(f\"{p['inferenceProfileId']}  ->  {p['inferenceProfileName']}\")
" | grep -i opus

The fix was a one-line sed on the instance via SSM, then a gateway restart. The subtle distinction: CloudFormation AllowedValues accepts any string, so the :0 suffix doesn’t cause a deployment error. It only fails silently at runtime.
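The mapping between the two ID formats is mechanical, so it can be normalized before the value ever reaches the config. `to_inference_profile_id` is a hypothetical helper, not part of OpenClaw or the template:

```python
# Sketch of the gotcha above: foundation model IDs carry a ':version'
# suffix (e.g. ':0'), but cross-region inference profile IDs drop it.
# Hypothetical helper to normalize before writing openclaw.json.
def to_inference_profile_id(model_id: str) -> str:
    """Strip the trailing ':version' segment from a foundation model ID."""
    return model_id.rsplit(":", 1)[0] if ":" in model_id else model_id
```

Running CloudFormation parameter values through a check like this at deploy time would turn the silent runtime failure into an early, visible one.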

Cost Breakdown

| Component | Monthly cost |
|---|---|
| EC2 c7g.large (Graviton) | ~$30-40 |
| EBS 30GB gp3 | ~$2.40 |
| AgentCore Runtime | Pay-per-use (serverless) |
| VPC Endpoints (disabled) | $0 |
| Bedrock usage (100 conv/day, Opus 4.6) | ~$15-25 |
| Total | ~$48-68 |

Switching to Nova 2 Lite as the default model drops Bedrock costs to ~$5-8/month. The smart play is routing simple conversations to Nova and only escalating to Opus for complex tasks.
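The arithmetic behind that tradeoff is worth making explicit. The per-1M-token prices come from the pricing table earlier in the post; the tokens-per-conversation figures are assumptions for illustration, not measurements.

```python
# Back-of-envelope monthly Bedrock spend. Prices ($/1M tokens in, out) are
# from the model table in this post; the token counts per conversation are
# assumed for illustration.
PRICES = {
    "nova-2-lite": (0.30, 2.50),
    "opus-4.6":    (3.00, 15.00),
}

def monthly_cost(model: str, convs_per_day: int,
                 in_tok: int = 1_000, out_tok: int = 300) -> float:
    """Estimated monthly spend for a given daily conversation volume."""
    p_in, p_out = PRICES[model]
    daily = convs_per_day * (in_tok * p_in + out_tok * p_out) / 1_000_000
    return round(daily * 30, 2)
```

Under these assumptions, 100 conversations a day on Opus lands in the ~$15-25 range from the table above, while the same volume on Nova 2 Lite costs a few dollars, which is the whole case for routing by task complexity.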

What I Learned

  • AgentCore Runtime is the key differentiator — it eliminates idle compute costs entirely. Agents spin up in microVMs on demand and shut down when done. For a personal assistant with sporadic usage, this means you’re not paying for an EC2 instance to run agent workloads 24/7
  • VPC limits are a real deployment blocker — the default limit of 5 VPCs per region can catch you off guard. The CloudFormation stack creates its own VPC, so if you’re already at the limit you’ll hit a CREATE_FAILED with no helpful error message until you dig into the stack events
  • Cross-region model availability matters — models with us. prefix IDs (DeepSeek R1, Llama 3.3, Nova Pro) aren’t available in EU regions. Only global. prefixed models work cross-region
  • Bedrock inference profile IDs don’t match foundation model IDs — foundation model IDs use modelId:version format (e.g., anthropic.claude-opus-4-6-v1:0), but cross-region inference profiles drop the version suffix. If your agent returns empty responses with no errors, check the model ID first
  • SSM port forwarding replaces SSH entirely — no public ports, no key management, encrypted WebSocket tunnel through the AWS API. The gateway binds to loopback only, making SSM the single entry point. Better security posture than opening port 22
  • Skills are just markdown files — the entire plugin system is structured prompt injection. Drop a SKILL.md that documents a CLI tool, and the agent learns to use it. No code changes, no container rebuilds, no redeployment. This is a powerful pattern for extending AI agents
  • Workspace files are the real customization layer — SOUL.md defines who the agent is, USER.md tells it who you are. Changing these files changes the agent’s behavior immediately. Think of them as a persistent system prompt that survives across sessions

What’s Next

  • Configure multi-model routing — Nova Lite as default, Opus for complex reasoning
  • Connect WhatsApp and Telegram as primary messaging channels
  • Install obsidian and slack skills for workflow integration
  • Test voice message transcription capabilities
  • Benchmark AgentCore cold start latency vs always-on EC2 agent execution
  • Build a custom skill for AWS architecture review automation
Alexandre Agius

AWS Solutions Architect

Passionate about AI & Security. Building scalable cloud solutions and helping organizations leverage AWS services to innovate faster. Specialized in Generative AI, serverless architectures, and security best practices.
