Four generations in five years.
Code is the most AI-native thing you do. Models were trained on enormous quantities of it. Tests and compilers give instant, mechanical feedback. Repos have structure. It's not a coincidence that coding is where AI productivity leaps have been the largest and most visible.
**In plain English.** Coding is the job AI got good at first because the test (does it run? does it pass?) is unambiguous. Other knowledge work is catching up.
```mermaid
journey
  title A backend engineer's day with a Gen-4 coding agent
  section Morning
    Inbox triage by agent: 4: Engineer
    Pick a ticket: 5: Engineer
    Agent drafts a plan: 4: Engineer
  section Build
    Engineer approves plan: 5: Engineer
    Agent edits files: 4: Engineer
    Agent runs tests: 4: Engineer
    Engineer reads diff: 5: Engineer
  section Ship
    Agent writes PR description: 5: Engineer
    Agent runs CI: 4: Engineer
    Engineer requests changes once: 4: Engineer
    Engineer merges: 5: Engineer
```
The skill the engineer keeps, and gets paid for, is reading the diff — knowing whether the change is correct, safe, and idiomatic. That muscle is the new senior-engineer differentiator.
This chapter traces the four generations of coding assistants and tells you which ones to use, when, and for what.
```mermaid
flowchart LR
  G1["Gen 1<br/>2021-2023<br/>Autocomplete<br/>Copilot"] --> G2["Gen 2<br/>2023-2024<br/>Chat + inline edits<br/>Cursor, Continue"]
  G2 --> G3["Gen 3<br/>2024-2025<br/>Agentic in-repo<br/>Claude Code, Cursor agents"]
  G3 --> G4["Gen 4<br/>2025-2026<br/>Autonomous engineer<br/>Devin, Claude subagents"]
```
GitHub Copilot (June 2021) was the first product everyone loved. It suggested the next line, sometimes the next block. It saved typing. You still designed the software; it filled in the syntax.
What it changed:
What it didn't change:
Cursor, Continue.dev, and Zed AI added a chat panel and inline edits. You could highlight code and say "make this async," or ask "why is this endpoint 500ing?"
This was a bigger step than it looked. For the first time, the model had real context about your codebase — open files, symbols, a whole folder. Quality jumped.
Claude Code (CLI) and Cursor's agent mode pushed further: the model plans, edits multiple files, runs tests, iterates, and summarizes. You describe a ticket; the agent writes the PR.
```mermaid
flowchart TB
  T[Ticket] --> P[Plan]
  P --> E[Edit files]
  E --> R[Run tests / type-check]
  R -->|fail| E
  R -->|pass| D[Diff + PR description]
  D --> H[You review]
```
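The edit-test-iterate loop can be sketched in a few lines. This is a hypothetical harness, not any vendor's real API: the `Model` interface, `runTests` stub, and `agentLoop` function are all assumptions for illustration.

```typescript
// Sketch of the Gen-3 agent loop: plan, edit, test, iterate, hand off.
// `Model` and `runTests` are stand-ins, not a real SDK.
type StepResult = { diff: string; testsPassed: boolean };

interface Model {
  plan(ticket: string): string[];                 // Ticket -> Plan
  edit(plan: string, failures: string): string;   // returns a unified diff
}

function runTests(_diff: string): boolean {
  // Stand-in for `pnpm test` / type-check; a real harness shells out here.
  return true;
}

function agentLoop(model: Model, ticket: string, maxIters = 5): StepResult {
  const steps = model.plan(ticket);
  let diff = "";
  let failures = "";
  for (let i = 0; i < maxIters; i++) {
    diff = model.edit(steps.join("\n"), failures); // Plan -> Edit files
    if (runTests(diff)) {
      return { diff, testsPassed: true };          // pass -> Diff + PR description
    }
    failures = `tests failed on iteration ${i}`;   // fail -> back to Edit
  }
  return { diff, testsPassed: false };             // give up; the human reviews anyway
}
```

The essential design choice is the feedback edge: test failures are fed back into the next edit call, which is what lets the agent converge instead of guessing once.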
What changed:
Devin (Cognition), Claude Code with subagents, GitHub Copilot Workspace, Jules (Google), and others push further still. Give the system an issue; it opens a workspace, sets up the branch, does the work, runs CI, fixes failures, and surfaces a PR.
2026 reality check:
| Task | Best tool (2026) |
|---|---|
| Day-to-day typing + inline edits | Cursor / Zed / VS Code + Copilot |
| Refactor across many files | Claude Code |
| Bug hunt: "why is this failing?" | Claude Code or Cursor agent |
| New feature from a crisp ticket | Claude Code or Gen-4 autonomous |
| Write docs / ADRs / RFCs | Claude Desktop or Cursor chat |
| Unknown repo exploration | Claude Code's repo mapping |
| Code review on a PR | A Claude/GPT-5 reviewer agent |
| Writing / tuning prompts and evals | Claude chat or promptfoo |
| Pure boilerplate + tests | Copilot autocomplete |
| SQL / data exploration | A notebook agent (Hex, Deepnote AI) |
Under the hood, a coding agent is a loop with a specific tool palette. Claude Code's tools, as an example:
```mermaid
flowchart TB
  A[Agent] --> T1[Read file]
  A --> T2[Write / Edit file]
  A --> T3[Grep / Glob]
  A --> T4["Bash: run tests, build"]
  A --> T5[Git diff / status]
  A --> T6[WebFetch / WebSearch]
  A --> T7[Subagent spawning]
  A --> T8[TodoList]
```
The magic is not any one tool; it's the curation of a small, safe, well-described set, and a system prompt that encodes engineering discipline (read before write, test before commit, small incremental changes).
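A curated palette like this boils down to a small registry of well-described tools that the harness dispatches by name. A minimal sketch, with made-up tool bodies (the names echo the diagram; the signatures and `dispatch` helper are assumptions, not Claude Code's actual internals):

```typescript
// A tool is a description (shown to the model) plus an executor.
type Tool = {
  description: string;
  run: (args: Record<string, string>) => string;
};

// Small, safe, well-described set — the curation is the point.
const tools: Record<string, Tool> = {
  read_file: {
    description: "Read a file before editing it",
    run: ({ path }) => `contents of ${path}`,  // stand-in for a real fs read
  },
  grep: {
    description: "Search the repo for a pattern",
    run: ({ pattern }) => `matches for ${pattern}`,
  },
  bash: {
    description: "Run tests or builds; never destructive commands",
    run: ({ cmd }) => `output of ${cmd}`,      // a real agent sandboxes this
  },
};

// The dispatch step: the model emits a tool name + args, the harness
// executes it and feeds the result back into context.
function dispatch(name: string, args: Record<string, string>): string {
  const tool = tools[name];
  if (!tool) return `unknown tool: ${name}`;   // fail loudly, not silently
  return tool.run(args);
}
```

Keeping the registry small is deliberate: every tool description spends prompt tokens and widens the blast radius, so each entry has to earn its place.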
Both. A common 2026 setup:
The switching cost between these is low; the productivity ceiling is higher if you use them together.
The hardest habit to keep: read every diff.
Why it matters:
Practices that help:
**The `.cursorrules` / `CLAUDE.md` file.** Most modern coding assistants read a project-local configuration file:

- `CLAUDE.md` for Claude Code — conventions, commands, key files.
- `.cursorrules` / `.cursor/rules/*.mdc` for Cursor — per-topic rules.
- `.github/copilot-instructions.md` for Copilot.

Writing these once pays back for every ticket afterward. A minimal template:
```markdown
# Conventions
- Language: TypeScript with strict mode
- Framework: Next.js 15 (App Router)
- Styles: Tailwind v4
- Tests: Vitest + React Testing Library
- Linters: eslint (airbnb-base), prettier

# Commands
- `pnpm dev` — start the dev server
- `pnpm test` — run tests
- `pnpm typecheck` — run tsc

# Engineering rules
- Never use `any`.
- All new code must have tests.
- Prefer pure functions; use server components unless interactivity requires otherwise.
- No new dependencies without a short justification in the PR.
```
Use the agent the way you'd use a smart, tireless pair programmer:
Coding agents are not free. A non-trivial ticket can spend 100k–1M tokens of context across a session. Habits that help:
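To make that budget concrete, here is a back-of-the-envelope session-cost estimate. The per-million-token prices below are illustrative assumptions, not any provider's current rates:

```typescript
// Rough session-cost estimator. Prices are ILLUSTRATIVE ASSUMPTIONS;
// check your provider's actual pricing before budgeting.
const PRICE_PER_MTOK = { input: 3.0, output: 15.0 }; // USD per million tokens, assumed

function sessionCostUSD(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * PRICE_PER_MTOK.input +
    (outputTokens / 1_000_000) * PRICE_PER_MTOK.output
  );
}

// A non-trivial ticket at the high end: ~1M input tokens of context
// re-read across iterations, plus ~50k tokens of generated diffs.
const cost = sessionCostUSD(1_000_000, 50_000);
console.log(cost.toFixed(2)); // 3.00 input + 0.75 output = 3.75 under these assumed prices
```

The asymmetry matters: most of a coding session's spend is re-read context, not generated code, which is why trimming what the agent re-reads each iteration is the highest-leverage habit.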
Risks specific to coding agents:
Predictions for 2027 with middling confidence:
The coding assistant generation that wins won't be the one with the best model; it will be the one with the best memory, tools, and workflow integration.
"coding agent comparison 2026")