Give your agent control over mobile apps.

The mobile-native control layer for AI agents. See the UI. Act on it. No hacks.


Inspired by the browser automation patterns that made web agents practical, but built for native mobile apps.

Terminal
# An agent exploring a real iOS app
$ agent-device open Settings --platform ios
$ agent-device snapshot -i
# @e1 [heading] "Settings"
# @e2 [button] "Wi-Fi"
$ agent-device press @e2
$ agent-device snapshot -i

Give your agent "eyes" on the UI

Agents need a structured way to understand the current screen. The snapshot command exposes the accessibility tree and assigns stable refs (like @e2), which keeps context smaller than raw screenshots or verbose dumps.

# Capture only interactive elements, formatted for LLMs
$ agent-device snapshot -i -c -d 4
# Output:
# @e1 [heading] "Sample App"
# @e2 [button] "Sign In"
# @e3 [input] "Email Address"
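
For an agent harness that consumes this output as text, the refs can be turned into structured data before deciding what to press. The following is a minimal sketch, assuming each element line has exactly the shape shown above (a ref, a role in square brackets, then a quoted label). `SnapshotNode`, `parseSnapshot`, and `findRef` are hypothetical helpers written for illustration and are not part of agent-device; real snapshot output may carry more fields than this pattern handles.

```typescript
// Hypothetical helpers, not part of agent-device. They assume only the
// line shape shown in the example output: @ref [role] "label".
interface SnapshotNode {
  ref: string; // e.g. "@e2"
  role: string; // e.g. "button"
  label: string; // e.g. "Sign In"
}

// Optional leading "#" tolerates output pasted as shell comments.
const LINE_PATTERN = /^#?\s*(@e\d+)\s+\[([^\]]+)\]\s+"(.*)"\s*$/;

function parseSnapshot(output: string): SnapshotNode[] {
  const nodes: SnapshotNode[] = [];
  for (const line of output.split("\n")) {
    const match = LINE_PATTERN.exec(line.trim());
    if (match) {
      nodes.push({ ref: match[1], role: match[2], label: match[3] });
    }
    // Lines that do not match the assumed shape are skipped, not guessed at.
  }
  return nodes;
}

// Look up the ref for an element by exact role and label.
function findRef(
  nodes: SnapshotNode[],
  role: string,
  label: string,
): string | undefined {
  return nodes.find((n) => n.role === role && n.label === label)?.ref;
}

const sample = [
  '@e1 [heading] "Sample App"',
  '@e2 [button] "Sign In"',
  '@e3 [input] "Email Address"',
].join("\n");

console.log(findRef(parseSnapshot(sample), "button", "Sign In")); // prints @e2
```

Skipping unrecognized lines keeps the parser conservative: if the output format differs from the assumption, the agent sees fewer nodes instead of wrong ones.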

Act with semantic precision

Once the agent understands the screen, it acts. It can use stable refs, or rely on semantic selectors that find elements by text, label, or role, which keeps flows readable and resilient.

# Interact using stable refs...
$ agent-device fill @e3 "agent@example.com"
# ...or semantic selectors
$ agent-device find "Sign In" click
$ agent-device find role button click

Cross-platform by default

One mental model across iOS, Android, and TV. The same agent workflow moves seamlessly between development simulators, physical QA devices, and TV-class targets.

$ agent-device open com.example.app --platform android --serial emulator-5554
$ agent-device open Settings --platform ios
$ agent-device open YouTube --platform android --target tv

From Exploration to E2E Replay

This is where exploration becomes infrastructure. Agents can explore a UI, save the flow to a script, and re-run it later. Experimental auto-updating (-u) can even heal stale selectors on the fly.

# 1. Record an exploratory session
$ agent-device open Settings --platform ios --session e2e --save-script
# 2. Replay it deterministically later
$ agent-device replay ~/.agent-device/sessions/e2e-run.ad
# 3. Heal stale selectors automatically (Experimental)
$ agent-device replay -u ~/.agent-device/sessions/e2e-run.ad
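
In a CI job, steps 2 and 3 can be combined into one policy: replay the script as recorded first, and retry with `-u` only if that fails. The sketch below wraps the two commands shown above. It assumes that `agent-device replay` exits with a non-zero status when a replay fails, which the text above does not confirm. `Runner`, `runCli`, and `replayWithHeal` are hypothetical names introduced for illustration.

```typescript
import { spawnSync } from "node:child_process";

// Runs a command and returns its exit code. Injectable so the retry
// policy can be exercised without a device attached.
type Runner = (command: string, args: string[]) => number;

const runCli: Runner = (command, args) => {
  const result = spawnSync(command, args, { stdio: "inherit" });
  // status is null when the process was terminated by a signal or could
  // not be spawned; treat both as a failure.
  return result.status ?? 1;
};

type ReplayOutcome = "passed" | "healed" | "failed";

// Assumption: a failed replay exits non-zero. Only the `replay` command
// and the experimental `-u` flag shown above are used.
function replayWithHeal(
  scriptPath: string,
  run: Runner = runCli,
): ReplayOutcome {
  if (run("agent-device", ["replay", scriptPath]) === 0) {
    return "passed";
  }
  if (run("agent-device", ["replay", "-u", scriptPath]) === 0) {
    return "healed";
  }
  return "failed";
}
```

Reporting "healed" separately from "passed" lets a pipeline flag scripts whose selectors drifted, so someone can review the updated script before it becomes the new baseline.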

Make your agent productive from the first prompt.

Don't make your agent learn the tool from scratch. Install the official agent-device skills to give it the canonical workflows, targeting rules, debugging patterns, and replay guidance it needs to automate mobile apps effectively.

# Give your agent the official toolset
$ npx skills add callstackincubator/agent-device
# Add the structured exploratory QA skill
$ npx skills add callstackincubator/agent-device --skill dogfood

Not just a CLI. A fully typed integration layer.

Use the typed TypeScript client to integrate directly into your codebase. Manage sessions, provision devices, and capture snapshots programmatically.

client.ts
import { createAgentDeviceClient } from 'agent-device';

const client = createAgentDeviceClient({
  session: 'qa-ios',
  lockPolicy: 'reject',
});

// Boot the right target
const ensured = await client.simulators.ensure({
  device: 'iPhone 16',
  boot: true,
});

// Navigate and capture
await client.apps.open({ app: 'com.apple.Preferences', platform: 'ios' });
const snapshot = await client.capture.snapshot({ interactiveOnly: true });

await client.sessions.close();
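
In the example above, `client.sessions.close()` only runs if every earlier call succeeds. One way to guarantee cleanup is a small try/finally wrapper. The sketch below is a hypothetical helper, not part of the agent-device package. Its `ClosableClient` interface is a structural assumption that covers only the `sessions.close()` call shown above, and the real client type may differ.

```typescript
// Structural subset of the client used in the example above. This is an
// assumption for illustration; the type exported by agent-device may differ.
interface ClosableClient {
  sessions: { close(): Promise<unknown> };
}

// Runs `work` and always closes the session afterwards, whether the work
// resolves or throws. The original error, if any, still propagates.
async function withSession<C extends ClosableClient, T>(
  client: C,
  work: (client: C) => Promise<T>,
): Promise<T> {
  try {
    return await work(client);
  } finally {
    await client.sessions.close();
  }
}

// Usage with the client from the example above (not executed here):
//
// const snapshot = await withSession(client, async (c) => {
//   await c.apps.open({ app: 'com.apple.Preferences', platform: 'ios' });
//   return c.capture.snapshot({ interactiveOnly: true });
// });
```

Closing in `finally` means a failed step does not leave the named session held open for the next run.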

Built for debugging, not just the happy path.

When things break, agents and humans both need evidence. Capture parsed logs, network dumps, and screen recordings without leaving the terminal.

Log & Network Parsing

$ agent-device logs mark "before submit"
$ agent-device network dump 25

Visual Proof

$ agent-device screenshot error.png
$ agent-device record start fail-state.mp4

Environment & Push Triggers

$ agent-device settings airplane on
$ agent-device push com.example.app '{"aps":{"alert":"Welcome"}}'

What agents build with it

Exploratory QA

Navigate apps, identify problems, collect visual evidence.

Bug Reproduction

Deep link to a state, capture logs, and preserve exact steps.

Regression Loops

Turn successful exploration into replayable .ad scripts.

Clean-State Testing

Reinstall builds, manage permissions, test onboarding.