Chapter 14 · The Coding Assistant Evolution

Four generations in five years.


Code is the most-AI-native thing you do. Models were trained on enormous quantities of it. Tests and compilers give instant, mechanical feedback. Repos have structure. It's not a coincidence that coding is where AI productivity leaps have been the largest and most visible.

In plain English: coding is the job AI got good at first because the test (does it run? does it pass?) is unambiguous. Other knowledge work is catching up.

A coding assistant on a Tuesday

journey
    title A backend engineer's day with a Gen-4 coding agent
    section Morning
      Inbox triage by agent: 4: Engineer
      Pick a ticket: 5: Engineer
      Agent drafts a plan: 4: Engineer
    section Build
      Engineer approves plan: 5: Engineer
      Agent edits files: 4: Engineer
      Agent runs tests: 4: Engineer
      Engineer reads diff: 5: Engineer
    section Ship
      Agent writes PR description: 5: Engineer
      Agent runs CI: 4: Engineer
      Engineer requests changes once: 4: Engineer
      Engineer merges: 5: Engineer

The skill the engineer keeps, and gets paid for, is reading the diff — knowing whether the change is correct, safe, and idiomatic. That muscle is the new senior-engineer differentiator.

This chapter traces the four generations of coding assistants and tells you which ones to use, when, and for what.

14.1 Four generations

flowchart LR
    G1["Gen 1<br/>2021-2023<br/>Autocomplete<br/>Copilot"] --> G2["Gen 2<br/>2023-2024<br/>Chat + inline edits<br/>Cursor, Continue"]
    G2 --> G3["Gen 3<br/>2024-2025<br/>Agentic in-repo<br/>Claude Code, Cursor agents"]
    G3 --> G4["Gen 4<br/>2025-2026<br/>Autonomous engineer<br/>Devin, Claude subagents"]

Gen 1 — Autocomplete (2021–2023)

GitHub Copilot (June 2021) was the first product everyone loved. It suggested the next line, sometimes the next block. It saved typing. You still designed the software; it filled in the syntax.

What it changed:

What it didn't change:

Gen 2 — Chat + inline edits (2023–2024)

Cursor, Continue.dev, and Zed AI added a chat panel and inline edits. You could highlight code and say "make this async," or ask "why is this endpoint 500ing?"

This was a bigger step than it looked. For the first time, the model had real context about your codebase — open files, symbols, a whole folder. Quality jumped.

Gen 3 — Agentic in-repo (late 2024 – 2025)

Claude Code (CLI) and Cursor's agent mode pushed further: the model plans, edits multiple files, runs tests, iterates, and summarizes. You describe a ticket; the agent writes the PR.

flowchart TB
    T[Ticket] --> P[Plan]
    P --> E[Edit files]
    E --> R[Run tests / type-check]
    R -->|fail| E
    R -->|pass| D[Diff + PR description]
    D --> H[You review]
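
The loop in the diagram can be sketched in a few lines. `proposeEdit` and `runTests` are hypothetical stand-ins for the model call and the project's test command; the point is the shape of the control flow, not any particular tool's API:

```typescript
// Sketch of the Gen-3 agent loop: edit, test, retry until green or give up.
// `proposeEdit` and `runTests` are hypothetical stand-ins for the model call
// and the test runner.

type TestResult = { pass: boolean; log: string };

function agentLoop(
  ticket: string,
  proposeEdit: (ticket: string, lastLog: string) => string,
  runTests: (patch: string) => TestResult,
  maxAttempts = 5,
): { patch: string; attempts: number } | null {
  let lastLog = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const patch = proposeEdit(ticket, lastLog); // model drafts or revises a diff
    const result = runTests(patch);             // mechanical feedback
    if (result.pass) return { patch, attempts: attempt }; // hand off for review
    lastLog = result.log;                       // failure log feeds the next attempt
  }
  return null; // escalate to the engineer instead of looping forever
}
```

The bounded attempt count matters: a real agent that loops without a budget burns tokens and hides failure from the engineer.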

What changed:

Gen 4 — Autonomous engineer (2025–2026)

Devin (Cognition), Claude Code with subagents, GitHub Copilot Workspace, Jules (Google), and others push further still. Give the system an issue; it opens a workspace, sets up the branch, does the work, runs CI, fixes failures, and surfaces a PR.

2026 reality check:

14.2 Which tool for which job

| Task | Best tool (2026) |
| --- | --- |
| Day-to-day typing + inline edits | Cursor / Zed / VS Code + Copilot |
| Refactor across many files | Claude Code |
| Bug hunt: "why is this failing?" | Claude Code or Cursor agent |
| New feature from a crisp ticket | Claude Code or Gen-4 autonomous |
| Write docs / ADRs / RFCs | Claude Desktop or Cursor chat |
| Unknown repo exploration | Claude Code's repo mapping |
| Code review on a PR | A Claude/GPT-5 reviewer agent |
| Writing / tuning prompts and evals | Claude chat or promptfoo |
| Pure boilerplate + tests | Copilot autocomplete |
| SQL / data exploration | A notebook agent (Hex, Deepnote AI) |

14.3 The anatomy of a modern coding agent

Under the hood, a coding agent is a loop with a specific tool palette. Claude Code's tools, as an example:

flowchart TB
    A[Agent] --> T1[Read file]
    A --> T2[Write / Edit file]
    A --> T3[Grep / Glob]
    A --> T4[Bash: run tests, build]
    A --> T5[Git diff / status]
    A --> T6[WebFetch / WebSearch]
    A --> T7[Subagent spawning]
    A --> T8[TodoList]

The magic is not any one tool; it's the curation of a small, safe, well-described set, and a system prompt that encodes engineering discipline (read before write, test before commit, small incremental changes).
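
That loop can be made concrete as a small dispatch table. The tool names loosely mirror the diagram; the `model` function and the stubbed tool bodies are illustrative placeholders, not any product's actual interface:

```typescript
// Minimal agent core: the model picks a tool by name, the runtime executes it,
// and the observation is fed back into the transcript. Tool set and `model`
// are illustrative stubs.

type Tool = (input: string) => string;

const tools: Record<string, Tool> = {
  read: (path) => `<contents of ${path}>`,       // Read file
  grep: (pattern) => `<matches for ${pattern}>`, // Grep / Glob
  bash: (cmd) => `<output of ${cmd}>`,           // run tests, build
};

type Action = { tool: keyof typeof tools; input: string } | { done: string };

function run(model: (transcript: string[]) => Action, maxSteps = 10): string {
  const transcript: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = model(transcript);
    if ("done" in action) return action.done;         // final summary / diff
    const observation = tools[action.tool](action.input);
    transcript.push(`${action.tool}(${action.input}) -> ${observation}`);
  }
  return "step budget exhausted"; // discipline: bounded loops, explicit failure
}
```

Everything the chapter attributes to "engineering discipline" lives outside this loop, in the system prompt and in which tools make it into the table.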

14.4 IDE, CLI, or both?

Both. A common 2026 setup:

The switching cost between these is low; the productivity ceiling is higher if you use them together.

14.5 Reviewing the agent's work

The hardest habit to keep: read every diff.

Why it matters:

Practices that help:

14.6 Configuration files that matter

Most modern coding assistants read a project-local configuration file (CLAUDE.md for Claude Code, .cursorrules for Cursor):

Writing these once pays back for every ticket afterward. A minimal template:

# Conventions

- Language: TypeScript with strict mode
- Framework: Next.js 15 (App Router)
- Styles: Tailwind v4
- Tests: Vitest + React Testing Library
- Linters: eslint (airbnb-base), prettier

# Commands

- `pnpm dev` — start the dev server
- `pnpm test` — run tests
- `pnpm typecheck` — run tsc

# Engineering rules

- Never use `any`.
- All new code must have tests.
- Prefer pure functions; use server components unless interactivity requires otherwise.
- No new dependencies without a short justification in the PR.

14.7 Pair-programming hygiene

Use the agent the way you'd use a smart, tireless pair programmer:

14.8 Cost and latency realism

Coding agents are not free. A non-trivial ticket can spend 100k–1M tokens of context across a session. Habits that help:
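
To make the token numbers concrete, here is the back-of-envelope arithmetic. The per-million-token prices below are made-up placeholders for illustration, not any vendor's actual rates:

```typescript
// Back-of-envelope session cost. Prices are hypothetical placeholders
// (USD per million tokens), not real vendor pricing.

function sessionCostUSD(
  inputTokens: number,
  outputTokens: number,
  pricePerMIn = 3,   // hypothetical $/1M input tokens
  pricePerMOut = 15, // hypothetical $/1M output tokens
): number {
  return (inputTokens / 1e6) * pricePerMIn + (outputTokens / 1e6) * pricePerMOut;
}

// A "non-trivial ticket" at the low and high ends of the 100k-1M input range:
const low = sessionCostUSD(100_000, 20_000);     // = $0.60 at these rates
const high = sessionCostUSD(1_000_000, 200_000); // = $6.00 at these rates
```

An order of magnitude between a tight session and a rambling one, which is why context hygiene is a cost lever, not just a quality lever.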

14.9 Security and agent-in-the-loop coding

Risks specific to coding agents:
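
One mitigation that generalizes across tools is gating the agent's shell access behind an allowlist, so a prompt-injected `rm -rf` or `curl | sh` never executes. A minimal sketch (the allowed commands are examples to tune per project):

```typescript
// Gate the agent's Bash tool behind an allowlist of command prefixes.
// The allowed set here is an example; adjust it per project.

const ALLOWED_PREFIXES = ["pnpm test", "pnpm typecheck", "git diff", "git status"];

function isAllowed(cmd: string): boolean {
  const trimmed = cmd.trim();
  // Reject shell chaining, pipes, and substitution outright; injected
  // payloads usually hide behind them.
  if (/[;&|`$<>]/.test(trimmed)) return false;
  return ALLOWED_PREFIXES.some((p) => trimmed === p || trimmed.startsWith(p + " "));
}
```

Prefix matching plus a metacharacter ban is crude but cheap; stricter setups run the allowed commands in a sandboxed container as well.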

14.10 Where this is going

Predictions for 2027 with middling confidence:

The coding assistant generation that wins won't be the one with the best model; it will be the one with the best memory, tools, and workflow integration.

Further reading & watching