Cloud agents in practice: Devin, Codex, and when a cloud AI developer makes sense
Cloud agents are AI systems that run in an isolated cloud environment, with their own VM, browser, and editor, and handle entire tasks autonomously. It sounds like the future. The reality is more nuanced, and much more practical than the marketing suggests.
What cloud agents are and how they differ from CLI agents
Unlike CLI agents (Claude Code, Aider) that run on your machine, cloud agents run remotely. They have their own isolated environment — VM, file system, browser. They accept a task, work on it autonomously, and return the result (typically a PR).
- CLI agent (Claude Code, Aider): runs locally, full access to your environment, interactive, immediate feedback
- Cloud agent (Devin, Codex): runs remotely, isolated environment, autonomous, returns result after completion
- Key difference: CLI agent has YOUR context (environment, files, DB). Cloud agent only has what you give it.
Key players in 2026
Devin (Cognition)
The first major cloud agent. Own IDE, browser, terminal. Can clone a repo, understand code, implement changes, run tests, create a PR. Pricing: from $500/month. In practice: good for clearly defined, isolated tasks. Ambitious tasks need a lot of guidance and iterations.
Real experience with Devin: on a simple bug fix (clear reproduction, isolated file) it works great. On 'add authentication to the API' — it needed 3 iterations and human correction. Expect 'junior developer' level, not 'senior engineer.'
OpenAI Codex (cloud)
Sandbox environment connected to GitHub. Works with GPT and o3 models. Safe isolated environment: code has no access to your systems. In practice: limits similar to Devin's. It handles routine tasks and struggles with complex ones. Advantage: GitHub ecosystem integration.
GitHub Copilot Workspace
Plans and implements changes directly from GitHub issues. Integrated with GitHub ecosystem — you see the plan and can modify it before implementation. In practice: best for small, well-described changes. Complex tasks require a lot of manual correction.
Where cloud agents work — and where they don't
Work well
- Bug fixes with clear reproduction steps and isolated files
- Routine migrations (update dependency, rename across codebase)
- Generating boilerplate code from specifications
- Simple feature implementations with clear specs
- Automated code review and static analysis
- Documentation from existing code
Rule: anything you'd describe to a junior as 'do exactly this' is a good candidate for a cloud agent.
Don't work well
- Vague specs ('improve this page')
- Cross-cutting concerns (changes across the entire architecture)
- Tasks requiring deep domain context
- Anything touching auth, payments, or data integrity
- Architecture decisions requiring knowledge of historical decisions
- Debugging that requires production access
Cloud agents lack the context of your team, your conventions, your unwritten rules. They don't know your deployment pipeline, your monitoring, your incident history.
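One way to hold the line on auth, payments, and data integrity is to check which files an agent's PR touches before you review it. The sketch below is a minimal example under stated assumptions: the path pattern is hypothetical and has to be adapted to your repository layout, and `flag_sensitive_paths` is an illustrative name, not part of any tool.

```shell
#!/usr/bin/env bash
# Hypothetical pattern for "sensitive" paths; adjust to your repository layout.
SENSITIVE_PATTERN='(^|/)(auth|payments?|billing|migrations?)(/|\.|_|-)'

# Reads changed file paths on stdin and prints the ones matching the pattern.
# `|| true` keeps the exit status at 0 when nothing matches.
flag_sensitive_paths() {
  grep -E "$SENSITIVE_PATTERN" || true
}
```

On a checked-out agent branch this could be used as `git diff --name-only main...HEAD | flag_sensitive_paths`. Any output means the PR deserves a slower, human-led review.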
Hybrid approach: best of both worlds
Claude Code remote
An interesting alternative: Claude Code runs locally on your machine (full context) but you control it remotely via phone or browser. Combines local agent advantages (your environment, your context) with cloud control convenience.
# Hybrid approach: Claude Code + tmux
# On your machine (or server):
tmux new-session -s claude-work
claude
# From anywhere (phone, another PC):
ssh your-server
tmux attach -t claude-work
# Full access to your environment,
# control from anywhere. No cloud
# agent limitations.
For most teams, this is more practical than a pure cloud agent. You have full context, full access, and can work from anywhere.
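If you reconnect often, it helps to make the session setup idempotent so you never start a second copy by accident. The following is a small sketch that reuses the `claude-work` session name and the `claude` command from the example above. The helper name `ensure_session` is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Creates the tmux session in the background if it does not exist yet,
# then reports which case applied. Defaults to the "claude-work" session.
ensure_session() {
  local name="${1:-claude-work}"
  if tmux has-session -t "$name" 2>/dev/null; then
    echo "exists"
  else
    # -d starts the session detached; `claude` runs as its first command.
    tmux new-session -d -s "$name" claude
    echo "created"
  fi
}
```

A typical call would be `ensure_session claude-work && tmux attach -t claude-work`. For interactive use, `tmux new-session -A -s claude-work` covers the same ground in one command, since `-A` attaches when the session already exists.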
Cloud agents for specific tasks
Best use of cloud agents: overnight tasks. Before leaving work, assign the cloud agent a routine task — 'update all dependencies, run tests, fix what broke.' In the morning you have a PR to review. No human capacity wasted.
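An overnight task goes better when the brief states the goal, the scope, and a concrete check for "done". The sketch below prints such a brief as plain text to paste into whichever agent you use. The field names and the `task_brief` helper are illustrative assumptions, not a format any agent requires.

```shell
#!/usr/bin/env bash
# Prints a four-line task brief.
# $1 = goal, $2 = allowed scope, $3 = command that must pass before the PR is opened.
task_brief() {
  local goal="$1" scope="$2" check="$3"
  printf 'Goal: %s\n' "$goal"
  printf 'Scope: %s\n' "$scope"
  printf 'Done when: %s passes\n' "$check"
  printf 'Deliverable: one pull request, no direct pushes to main\n'
}
```

For the dependency example this might be `task_brief 'Update all dependencies' 'manifest and lockfile only' 'npm test'`, where `npm test` stands in for your project's own test command.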
Economics of cloud agents
- Devin: from $500/month. Worth it only if it saves more than one senior developer day per month, which implies a loaded day rate of roughly $500
- Codex: integrated in OpenAI plan — lower barrier to entry
- Copilot Workspace: included in Copilot Enterprise — no additional costs
- Claude Code (local): $100/month Max — best value for most use cases
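The break-even logic behind these numbers is simple division: monthly price over the cost of one developer day. Here is a rough sketch. The $500 day rate used in the usage note is an assumption chosen to match the Devin figure above, so substitute your own loaded rate.

```shell
#!/usr/bin/env bash
# Developer days a tool must save per month to pay for itself.
# $1 = monthly price in USD, $2 = loaded developer day rate in USD (your own estimate).
break_even_days() {
  awk -v price="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", price / rate }'
}
```

With an assumed $500 day rate, `break_even_days 500 500` prints `1.00` and `break_even_days 100 500` prints `0.20`. The cheaper plan pays for itself after about a fifth of a day saved.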
Recommendation
Cloud agents are a supplement, not a replacement. Start with local tools (Claude Code, Cursor), master the basics. Add cloud agents for specific use cases — routine tasks, overnight migrations, parallel work on isolated tasks.
Don't get caught in the hype: a fully autonomous AI developer doesn't exist yet. What exists is a powerful tool for specific, well-defined tasks. And that's still very useful.
Realistic expectations: cloud agent = junior developer who's fast, tireless, but needs precise specs and review. If your specs aren't precise, invest time in improving them — not in a more powerful agent.
Karel Čech
Developer and AI consultant. I help technical teams adopt AI in their daily workflow — from workshops to long-term strategies.