Self-Improving Company with AI

🚀 The core idea

Most companies are organised like Roman legions: nested hierarchies in which humans are the conduit for information moving up and down. Jared Friedman’s argument (leaning on Jack Dorsey and Diana’s earlier talk) is that AI breaks that model. The right mental model is not a co-pilot bolted onto existing workflows that makes engineers 20–30% more productive. It is a company rebuilt as a set of recursively self-improving AI loops that keep getting better while you sleep.

The unit of value shifts from headcount to domain knowledge made legible to AI, plus the token budget spent running loops over it. Software becomes ephemeral; context becomes the durable asset.

This connects closely to ideas in Context Engineering, Agentic Coding, Progressive Disclosure, LLM Knowledge Bases and gstack (Garry Tan’s own YC-flavoured Claude Code stack).

🗒️ Description

The talk is conceptual, ~13 minutes, structured around 12 chapters:

  1. Companies are Roman legions — hierarchies built to project power; humans as message bus.
  2. Co-pilots are the wrong mental model — strapping a faster engine onto the old machine.
  3. Extract the domain knowledge — the knowhow in heads, Slack, email, Notion is the company.
  4. The recursive self-improving loop — sensor → policy → tools → quality gate → learning, looping back.
  5. The holy-shit moment at YC — a monitoring agent watched every query, identified failures overnight, wrote code, opened the PR, had it reviewed and merged, and by morning the same query succeeded.
  6. Self-optimizing product and support loops — agent analyses funnel friction, designs an A/B test, runs it, ships the winner; or triages customer suggestions like a CPO/CTO and ships overnight.
  7. Burn tokens, not headcount — YC companies hitting demo day with ~5× revenue per employee vs 18 months ago; constraint is shifting from headcount to token usage.
  8. Middle management is over — two roles matter: IC builder/operator + a single DRI (named human, not a committee).
  9. Make everything legible to AI — record everything: emails, Slack, DMs, office hours. “If it wasn’t recorded, it didn’t happen to your intelligence.”
  10. Regenerating the YC user manual: Harj rebuilt the 5–10-year-old user manual over a weekend from 2,000 hours of recorded office hours. It is now a living, monthly-updated 150-page “brain” piped into an agent that carries the combined wisdom of 16 YC partners.
  11. Software is ephemeral, context is valuable — dashboards and internal tools are one-shottable now; preserve the data and the instructions, regenerate the software when the next model lands.
  12. Where humans still matter — the edge where the company touches reality: novel situations, ethics, high-stakes/high-emotion moments (co-founder breakups, sales conversations).

🧩 The recursive AI loop (5 layers)

| Layer | What it is | YC example |
| --- | --- | --- |
| Sensor | Inputs from the world | Founder emails, support tickets, cancellations, telemetry, office-hour recordings |
| Policy / decision | Rules about what is allowed, what needs human approval, what must be logged | YC partner approval gates |
| Tools | Deterministic APIs/skills the AI can call | “Query our founder DB”, “look at my calendar”, introduction-finder over the YC graph |
| Quality gate | Evals, safety filters, human review for risk | PR review agent before code merges |
| Learning | Watch where it fails → propose new tool/skill/index/view → patch → redeploy | Monitoring agent that rewrites failed queries overnight |

The whole loop must be runnable with minimal human intervention for the self-improving property to kick in.
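
The five layers can be sketched as one pass of a loop. This is a minimal illustration under stated assumptions, not the system described in the talk: every name here (`Event`, `LoopState`, `policy`, `run_tool`, `quality_gate`, `learn`, `run_loop`) is hypothetical, and the tools and the gate are plain functions standing in for model calls, evals and review.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One sensor reading, e.g. a failed query or a support ticket."""
    kind: str
    payload: str


@dataclass
class LoopState:
    """Hypothetical state: known tools, proposed fixes, and escalations."""
    tools: dict = field(default_factory=dict)
    proposals: list = field(default_factory=list)
    needs_human: list = field(default_factory=list)


def policy(event: Event) -> str:
    """Policy layer: decide whether the loop may act on its own."""
    return "escalate" if event.kind == "high_stakes" else "auto"


def run_tool(state: LoopState, event: Event):
    """Tools layer: call a deterministic tool, or return None if none fits."""
    tool = state.tools.get(event.kind)
    return tool(event.payload) if tool else None


def quality_gate(result) -> bool:
    """Quality gate: stand-in for evals or review; any non-empty result passes."""
    return bool(result)


def learn(state: LoopState, event: Event) -> None:
    """Learning layer: record the failure as a proposal for a new tool or skill."""
    state.proposals.append(f"add tool for '{event.kind}': {event.payload}")


def run_loop(state: LoopState, events: list) -> list:
    """One pass over sensor events; returns the results that passed the gate."""
    shipped = []
    for event in events:
        if policy(event) == "escalate":
            state.needs_human.append(event)  # a human handles this one
            continue
        result = run_tool(state, event)
        if quality_gate(result):
            shipped.append(result)
        else:
            learn(state, event)  # failure feeds the next iteration
    return shipped
```

In this sketch the self-improving property comes from `learn`: each failure leaves a proposal behind, and once a proposal is turned into a new entry in `state.tools`, the same event succeeds on the next pass.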

☘️ Implications for how to operate

  • Burn tokens, not headcount. Directionally measure per-person token usage — gameable as a KPI, but useful as a signal of who is “token-maxing” and who isn’t.
  • Two roles only. Everyone is an IC / builder / operator, with a single DRI per outcome. Middle management is over.
  • Record everything. Partner emails into the DB, Slack messages, DMs, office hours recorded for months. Wire up rooms with mics; consider on-body capture (phones, smart glasses) — if it wasn’t recorded, it isn’t part of the company brain.
  • Diarize and synthesize. You can’t shove 100,000 hours of audio into a context window. Aggregate down, leave breadcrumbs the AI can navigate (see also Progressive Disclosure).
  • Artifact rule. If an interaction creates an artifact that can self-improve, it’s legible. If it doesn’t, throw it away.
  • Software is disposable, context is precious. “Codex 55” can one-shot most internal dashboards. Treat the data and the original instructions as the part worth preserving; regenerate the software whenever models get smarter.
  • If you were starting today, would you build the company in this shape? For early-stage teams there is “no excuse” not to.
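
The "aggregate down, leave breadcrumbs" point above can be sketched as a two-level index: a tiny top-level view that always fits in context, and per-topic summaries that each keep a pointer back to the raw recording. This is only an illustration. `Recording`, `summarize`, `build_index`, `top_level_view` and `expand` are hypothetical names, and `summarize` is a truncation placeholder where a real pipeline would presumably call a model.

```python
from dataclasses import dataclass


@dataclass
class Recording:
    """A diarized transcript; `source_id` is the breadcrumb back to the raw text."""
    source_id: str
    topic: str
    transcript: str


def summarize(text: str, max_words: int = 12) -> str:
    """Placeholder summarizer (assumption): keeps only the first few words."""
    words = text.split()
    suffix = " ..." if len(words) > max_words else ""
    return " ".join(words[:max_words]) + suffix


def build_index(recordings: list) -> dict:
    """Aggregate down: topic -> list of (source_id, short summary)."""
    index = {}
    for rec in recordings:
        entry = (rec.source_id, summarize(rec.transcript))
        index.setdefault(rec.topic, []).append(entry)
    return index


def top_level_view(index: dict) -> str:
    """First thing loaded into context: topics and counts only."""
    return "\n".join(
        f"{topic}: {len(entries)} recording(s)"
        for topic, entries in sorted(index.items())
    )


def expand(index: dict, topic: str) -> list:
    """Second level, loaded on demand: summaries plus their breadcrumbs."""
    return index.get(topic, [])
```

An agent would read `top_level_view(index)` first, call `expand` only for the topic it needs, and follow a `source_id` to the full transcript only when a summary is not enough.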

✍️ Quotes worth keeping

“If it is recorded, it happened to the AI. If it did not get recorded, it did not happen to your intelligence.”

“You can just throw tokens at this problem and your company will get better.”

“Software is ephemeral; context is valuable.”

“Humans sit around the edge of this — where the intelligence makes contact with reality.”

📒 Takeaways

  • The next leverage isn’t a smarter co-pilot, it’s a loop that closes overnight — failure detected, code written, PR reviewed, deployed, problem gone by morning.
  • The durable moat for a small team is the company brain: every email, every conversation, every decision captured and synthesizable into context and skills.
  • Tooling and dashboards become disposable. Don’t over-invest in internal software — invest in the data + skills + instructions layer underneath.
  • Organisationally: collapse to ICs + DRIs; drop middle management; reserve humans for novel/ethical/high-stakes contact with reality.
  • Direct mapping to my own setup: this is the same principle behind Brain (this vault) and Personal AI Infrastructure — context as the asset, skills/agents as the renewable layer on top.
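
One way to picture "invest in the layer underneath" is to store the instructions and data pointer as the durable record, and tag every generated tool with what produced it so it can be thrown away and rebuilt. The sketch below is an assumption about how that could look, not something shown in the talk; `ToolSpec`, `BuiltTool` and `needs_regeneration` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolSpec:
    """The durable part: original instructions plus a pointer to the data."""
    name: str
    instructions: str
    data_path: str

    def fingerprint(self) -> str:
        """Stable short hash of the instructions, used to detect spec changes."""
        digest = hashlib.sha256(self.instructions.encode("utf-8")).hexdigest()
        return digest[:12]


@dataclass(frozen=True)
class BuiltTool:
    """The disposable part: generated code, tagged with what produced it."""
    spec_fingerprint: str
    model: str
    code: str


def needs_regeneration(
    spec: ToolSpec, built: Optional[BuiltTool], current_model: str
) -> bool:
    """Rebuild when nothing exists yet, the instructions changed, or the model changed."""
    if built is None:
        return True
    return (
        built.spec_fingerprint != spec.fingerprint()
        or built.model != current_model
    )
```

Only `ToolSpec` and the data it points at need backing up; a `BuiltTool` is a cache entry that goes stale whenever the instructions or the model change.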
