Browser Automation Agents - Amazon Bedrock AgentCore

Enterprise workflows often require interacting with web applications that lack APIs. Traditional automation scripts are brittle and break when UIs change.

Alexandre Agius

AWS Solutions Architect

The Problem

Many enterprise workflows require interacting with web applications that lack APIs:

  • Legacy systems with web-only interfaces
  • Third-party portals for data extraction
  • Form filling across multiple platforms
  • QA testing of dynamic web applications

Traditional approaches (Selenium scripts, Playwright automation) are brittle and require constant maintenance when UIs change.

The Solution

Amazon Bedrock AgentCore provides managed infrastructure for deploying production-ready AI agents with browser automation capabilities. Instead of writing brittle selectors, you describe actions in natural language and let AI-driven tools handle the execution.

Key components:

  • Browser Tool: Managed headless Chrome with VM-level isolation
  • Nova Act: Specialized model for UI automation (90%+ reliability)
  • Strands: AWS's native agent framework with browser integration

How It Works

AgentCore Architecture

┌──────────────────────────────────────────────────────────────────────────────┐
│                               YOUR AGENT CODE                                │
│              (Strands / Nova Act / LangGraph / CrewAI / Custom)              │
└──────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                           AMAZON BEDROCK AGENTCORE                           │
│                                                                              │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │   Runtime   │  │   Gateway   │  │   Memory    │  │      Identity       │  │
│  │  Serverless │  │  API→MCP    │  │  Context    │  │   Cross-service     │  │
│  │  execution  │  │  conversion │  │  storage    │  │   auth              │  │
│  └─────────────┘  └─────────────┘  └─────────────┘  └─────────────────────┘  │
│                                                                              │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │                              BROWSER TOOL                              │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌────────────────────────────────┐  │  │
│  │  │  Headless   │  │  Playwright │  │  Session Isolation (1:1)       │  │  │
│  │  │  Chrome     │  │  Library    │  │  VM-level security             │  │  │
│  │  └─────────────┘  └─────────────┘  └────────────────────────────────┘  │  │
│  └────────────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────────┘

Framework Comparison

| Framework | Best For | AI-Driven Actions | Integration Effort |
|---|---|---|---|
| Nova Act | UI automation, form filling | Yes (90%+ reliability) | Low |
| Strands | Multi-agent workflows | Yes | Low |
| LangGraph | Complex reasoning chains | Via tools | Medium |
| Raw Playwright | Explicit control | Manual coding | High |

Method 1: Nova Act

Nova Act uses a specialized model trained for browser automation:

import os

from nova_act import NovaAct

# Illustrative constructor arguments; check the parameter names against
# the Nova Act SDK version you have installed.
agent = NovaAct(
    api_key=os.environ["NOVA_ACT_API_KEY"],
    browser="agentcore"
)

# Natural language instruction - AI figures out the actions
result = agent.act(
    "Go to the shipping portal, log in with credentials from environment, "
    "navigate to the rates section, and extract all Capesize route prices"
)

print(result.extracted_data)

Method 2: Strands Agents

AWS's native agent framework with browser integration:

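from strands import Agent
from strands_tools.browser import AgentCoreBrowser

# AgentCore-backed browser tool from the strands-agents-tools package;
# the region is an example value.
browser_tool = AgentCoreBrowser(region="us-west-2")

agent = Agent(
    # Replace with a Bedrock model ID that is enabled in your account
    model="anthropic.claude-sonnet-4",
    tools=[browser_tool.browser],
    system_prompt="""You are a data extraction agent.
    Use the browser to navigate websites and extract structured data.
    Always return data in JSON format."""
)

# A Strands Agent is invoked by calling it directly
response = agent(
    "Go to the Baltic Exchange website and extract today's dry bulk rates"
)
print(response)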
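The system prompt above asks for JSON, but a model reply can still arrive wrapped in prose or a code fence. The following is a minimal sketch of a defensive parser, assuming the agent's reply is available as a plain string. The helper name `extract_json` is hypothetical and not part of Strands.

```python
import json
import re

# Built at runtime so this example contains no literal code-fence marker.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json(reply: str):
    """Return the first JSON value that parses out of an agent reply.

    Tries the whole reply, then a fenced block, then the widest
    brace- or bracket-delimited span. Raises ValueError if none parse.
    """
    candidates = [reply.strip()]
    fenced = FENCED_BLOCK.search(reply)
    if fenced:
        candidates.append(fenced.group(1).strip())
    for opener, closer in (("{", "}"), ("[", "]")):
        start, end = reply.find(opener), reply.rfind(closer)
        if start != -1 and end > start:
            candidates.append(reply[start:end + 1])
    for text in candidates:
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            continue
    raise ValueError("no JSON found in agent reply")
```

Calling `extract_json(str(response))` after the agent call would then yield a Python object or fail loudly, which is easier to handle downstream than free-form text.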

Method 3: Playwright with AgentCore Browser

For cases requiring explicit control:

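import asyncio

from bedrock_agentcore.tools.browser_client import browser_session
from playwright.async_api import async_playwright


async def run_with_agentcore(region="us-west-2"):
    # Start a managed browser session; it is stopped when the block exits.
    # browser_session and generate_ws_headers come from the bedrock-agentcore
    # Python SDK; verify both against the version you have installed.
    with browser_session(region) as client:
        ws_url, headers = client.generate_ws_headers()

        async with async_playwright() as p:
            # Attach Playwright to the remote Chrome over CDP with signed headers
            browser = await p.chromium.connect_over_cdp(ws_url, headers=headers)
            page = await browser.new_page()

            await page.goto("https://example.com")
            await page.fill("#username", "user@example.com")
            await page.click("button[type='submit']")

            # Collect each rate row as a {route, rate} object
            rates = await page.evaluate("""
                () => Array.from(document.querySelectorAll('.rate-row'))
                    .map(row => ({
                        route: row.querySelector('.route').textContent,
                        rate: row.querySelector('.rate').textContent
                    }))
            """)
            return rates


if __name__ == "__main__":
    print(asyncio.run(run_with_agentcore()))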
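The `page.evaluate` call returns raw `textContent` strings, so values will often carry whitespace, currency symbols, and thousands separators. The following is a minimal sketch of a cleanup step. The input format (for example "$12,345.50") and the helper name `normalize_rates` are assumptions, not something the portal or Playwright guarantees.

```python
import re
from decimal import Decimal, InvalidOperation


def normalize_rates(rows):
    """Trim route names and convert rate strings to Decimal.

    Rows with an empty route or an unparseable rate are skipped
    rather than guessed.
    """
    cleaned = []
    for row in rows:
        route = (row.get("route") or "").strip()
        # Keep only digits, sign, and decimal point (drops "$", ",", spaces).
        digits = re.sub(r"[^0-9.\-]", "", row.get("rate") or "")
        try:
            rate = Decimal(digits)
        except InvalidOperation:
            continue
        if route:
            cleaned.append({"route": route, "rate": rate})
    return cleaned
```

Using `Decimal` instead of `float` avoids rounding surprises when the prices are later compared or summed.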

Deployment to AgentCore Runtime

# Configure
agentcore configure -e my_agent.py

# Launch (creates AWS resources automatically)
agentcore launch

# Test
agentcore invoke '{"prompt": "Extract data from example.com"}'
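Hand-writing JSON inside shell quotes becomes error-prone once prompts contain quotes or apostrophes. The following is a small sketch that builds the test command with the standard library. Only the `agentcore invoke '<json>'` form shown above is taken from this post; the helper name `build_invoke_command` is hypothetical.

```python
import json
import shlex


def build_invoke_command(prompt: str) -> str:
    """Return a shell-safe `agentcore invoke` command for the given prompt."""
    payload = json.dumps({"prompt": prompt})
    return f"agentcore invoke {shlex.quote(payload)}"


print(build_invoke_command("Extract data from example.com"))
```

For the prompt above this prints the same command as the hand-written test line, and prompts containing quotes are escaped for both JSON and the shell.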

Security Features

| Feature | Description |
|---|---|
| Session Isolation | Each session runs in an isolated VM |
| Ephemeral Sessions | Browser state cleared after use |
| CloudTrail Logging | All actions logged for audit |
| IAM Integration | Fine-grained access control |
| VPC Connectivity | Private network access supported |

What I Learned

  • Use Nova Act or Strands for AI-driven browser automation: They handle the complexity of mapping natural language to browser actions
  • Playwright + AgentCore for explicit control: When you need deterministic behavior
  • Custom wrappers are possible but costly: Integrating other frameworks (Google ADK, etc.) requires significant custom code
  • Security is built-in: VM isolation, ephemeral sessions, and audit logging come standard

What's Next

  • Build a multi-agent workflow for end-to-end data pipeline
  • Implement retry logic and error recovery patterns
  • Add session recording for debugging failed runs
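As a starting point for the retry item above, here is a minimal sketch of exponential backoff around any callable that drives the browser. The function name, the defaults, and the choice to retry on every `Exception` are assumptions. A real agent would likely retry only on timeouts or navigation errors.

```python
import time


def retry(action, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `action` until it succeeds or `attempts` is exhausted.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between tries and
    re-raises the last error if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return action()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without actually waiting.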
