# Self-Documenting CLI Design for LLMs

> Agents start fresh every session. Instead of dumping docs upfront, build tools they can query. One take on agent-friendly tooling.

URL: https://kumak.dev/self-documenting-cli-design-for-llms/
Published: 2025-11-28
Category: philosophy

I'm building a CLI tool for browser debugging. It lets AI agents control Chrome through the DevTools Protocol: capture screenshots, inspect network requests, execute JavaScript. The Chrome DevTools Protocol has 53 domains and over 600 methods. That's a lot of capability, and a lot of documentation.

Here's the problem: how do I teach an agent what's possible without dumping thousands of tokens into context every session?

Documentation is a wall of text about things you don't need yet. Worse, it drifts. The tool ships a new version, someone forgets to update the docs, and now the agent is following instructions for a method that was renamed three months ago. The tool and its documentation are two artifacts pretending to be one.

When Claude gets stuck with CLI tools, it naturally reaches for `--help`. When that's not enough, it tries `command subcommand --help`. The pattern is consistent: ask the tool, learn from the response, try again. If `--help` is the agent's natural discovery method, how far can you push it?

## Progressive Disclosure

Instead of documenting everything upfront, make every layer queryable. Watch the conversation unfold:

```shell
# Agent asks: "What can you do?"
bdg --help --json

# Agent asks: "What domains exist?"
bdg cdp --list

# Agent asks: "What can I do with Network?"
bdg cdp Network --list

# Agent asks: "How do I get cookies?"
bdg cdp Network.getCookies --describe

# Agent executes with confidence
bdg cdp Network.getCookies
```

Each answer reveals exactly what's needed for the next question. Five interactions, zero documentation. The tool taught itself.
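One way to guarantee those answers never drift is to serve `--list` and `--describe` from the same registry that dispatches the calls. Here's a minimal sketch of that idea in Python — the registry contents and function names are illustrative, not the real `bdg` internals or the actual CDP surface:

```python
# Minimal sketch: one registry backs both discovery (--list, --describe)
# and execution, so the tool's self-description cannot drift from its
# real capabilities. The two entries below are illustrative stand-ins.

REGISTRY = {
    "Network.getCookies": {
        "summary": "Returns all browser cookies for the current URL",
        "params": {"urls": "optional list of URLs to filter by"},
    },
    "Network.setCookie": {
        "summary": "Sets a cookie with the given cookie data",
        "params": {"name": "cookie name", "value": "cookie value"},
    },
}

def list_methods(domain: str) -> list[str]:
    """Answer `bdg cdp <Domain> --list`: every method in one domain."""
    prefix = domain + "."
    return sorted(m for m in REGISTRY if m.startswith(prefix))

def describe(method: str) -> str:
    """Answer `bdg cdp <Method> --describe`: summary plus parameters."""
    entry = REGISTRY[method]
    lines = [f"{method}: {entry['summary']}"]
    for name, doc in entry["params"].items():
        lines.append(f"  {name}: {doc}")
    return "\n".join(lines)
```

Because the help text is generated from the dispatch table itself, renaming a method automatically renames it everywhere the agent can discover it.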
When the agent doesn't know the exact method name, semantic search bridges the gap:

```shell
$ bdg cdp --search cookie
Found 14 methods matching "cookie":
  Network.getCookies      # Returns all browser cookies for the current URL
  Network.setCookie       # Sets a cookie with the given cookie data
  Network.deleteCookies   # Deletes browser cookies with matching name
  ...
```

The agent thinks "I need something with cookies" and the tool finds everything relevant. No guessing required.

## Errors That Teach

Actionable error messages have been a UX best practice for decades. What's different for agents is the stakes: humans can work around bad UX by searching Stack Overflow. Agents can't. They're stuck with what you give them, racing against a context window that's always shrinking.

And agents make mistakes constantly. They'll type `Network.getCokies` instead of `Network.getCookies`. They'll invent plausible-sounding methods that don't exist. A typical error:

```shell
$ bdg cdp Network.getCokies
Error: Method not found
```

Now what? The agent has to guess, search, retry. Burn tokens. Teaching errors provide the path forward:

```shell
$ bdg cdp Network.getCokies
Error: Method 'Network.getCokies' not found

Did you mean:
  - Network.getCookies
  - Network.setCookies
  - Network.setCookie
```

The correction arrives in the same response as the error. No round trip. The agent adapts immediately.

The fuzzy matching goes beyond typos. Try `Networking.getCookies` with the wrong domain name, and it still suggests `Network.getCookies`. The tool understands what you meant, not just what you typed.

Even empty results guide forward:

```shell
$ bdg dom query "article h2"
No nodes found matching "article h2"

Suggestions:
  Verify: bdg dom eval "document.querySelector('article h2')"
  List:   bdg dom query "*"
```

And success states show next steps:

```shell
$ bdg dom query "h1, h2, h3"
Found 5 nodes:
  [0] Recent Posts
  [1] Testing in the Age of AI Agents
  ...

Next steps:
  Get HTML: bdg dom get 0
  Details:  bdg details dom 0
```

Every interaction answers "what now?" Errors suggest fixes. Empty results suggest alternatives. Success shows what to do with the data.

## Semantic Exit Codes

Most tools return 1 for any error. Not helpful. Semantic exit codes create ranges with meaning:

- **80-89**: User errors. Bad input, fix it before retrying.
- **100-109**: External errors. API timeout, retry with backoff.

The agent can branch its logic without parsing error messages. Message, suggestion, exit code: three layers of guidance stacked together.

## The Result

I tested this with an agent starting from zero knowledge. No prior context, no documentation provided. Just the tool.

Five commands later, it was executing CDP methods successfully. It discovered the tool's structure, explored the domains, found the method it needed, understood the parameters, and executed. When I introduced typos deliberately, the suggestions caught them. When commands failed, the exit codes pointed toward solutions. The agent recovered without external help.

The context cost? Roughly 500 tokens for discovery, versus thousands for a documentation dump. And those 500 tokens bought understanding, not just information.

## Design for Dialogue

External documentation will always drift from reality. The tool itself never lies about its own capabilities.

Tools designed for agents aren't dumbed down. They're more explicit. They expose their structure. They teach through interaction rather than requiring upfront reading.

Design for dialogue, not documentation.
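As a postscript: the semantic exit-code contract described above is easy to consume programmatically. A minimal sketch from the caller's side, in Python — the ranges (80-89 user error, 100-109 external error) come from this post, while the wrapper function and retry policy are my own illustration:

```python
# Sketch of an agent (or wrapper script) branching on semantic exit
# codes instead of parsing error text. Ranges follow the post:
#   80-89  user error  -> fix the input, don't retry as-is
#   100-109 external   -> transient, retry with backoff
import subprocess
import time

def run_with_policy(cmd: list[str], retries: int = 3) -> subprocess.CompletedProcess:
    for attempt in range(retries):
        result = subprocess.run(cmd, capture_output=True, text=True)
        code = result.returncode
        if code == 0:
            return result
        if 80 <= code <= 89:
            # User error: the same input will fail again, so surface it.
            raise ValueError(f"fix input before retrying: {result.stderr.strip()}")
        if 100 <= code <= 109:
            # External error: back off exponentially, then retry.
            time.sleep(2 ** attempt)
            continue
        raise RuntimeError(f"unexpected exit code {code}")
    return result
```

Three lines of branching replace a pile of fragile regexes over error strings, which is the whole point of giving exit codes meaning.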