GitHire

AI-native engineering · A new way of working

Humans think, AI executes, humans verify.

A public operating log: six-step workflow, real incident case, installable Skill. For anyone who wants to see how an AI-native team actually works.

What is GitHire · In one line

GitHire is an AI-native engineering method. It codifies "humans think, AI executes, humans verify" into a reusable six-step workflow, validates it with real production incidents, and packages it as a Skill any agent can install.

Issue-first
Every task starts from an Issue that someone can articulate: the spec is the doc. Every blank in the spec is a blank check for the AI.
Human-orchestrated
AI builds, opens PRs, and cross-reviews in a sandbox; humans keep framing, architectural direction, and the merge call. AI executes, humans decide.
Production-bound
The end state is a PR merged to main, not a demo, not an exercise. Reusable Skills and decisions are written back to the Issue as the starting point for the next round.

STEP 01 / 06

Start with an
Issue.

Spell out, in a sentence or two, what you are solving and for whom; only specs you can articulate deserve to move forward. The Issue is both the kickoff and the archive of this work.

ISSUE · Frame the problem.

STEP 02 / 06

Enter the dev
sandbox.

A sandbox is a long-running dev environment where dependencies, caches, and real data stay put, so the AI works in actual context, not a throwaway container.

SANDBOX · A persistent dev environment.

STEP 03 / 06

Execute and
open a PR.

Hand the Issue to Claude Code or Codex. The agent builds inside the sandbox (writes code, runs tests, iterates on failures) and finishes by opening a PR.

EXECUTE · Claude Code or Codex.

STEP 04 / 06

A second AI
reviews the PR.

Hand the PR to a different Claude Code or Codex instance. It reads the whole diff alongside CI and static checks, and returns actionable feedback: two agents cross-reviewing one PR.

AI REVIEW · Two agents, one PR.

STEP 05 / 06 · the key step

Human architect
reviews.

The pivot of the whole workflow. A human architect decides whether this change matches the system direction, whether it adds hidden debt, whether it deserves to ship. AI surfaces signals; humans decide.

ARCHITECT · Where humans decide.

STEP 06 / 06

Merge and
ship.

Once the architect signs off, the PR merges into main and deploys to production. Reusable Skills, checklists, and review notes are written back to the Issue: the kickoff for the next round.

PRODUCTION · Ship it.

Overall flow · End to end

Issue · Sandbox · AI · Architect · Ship
one complete change, delivered.

  1. Issue
  2. Sandbox
  3. PR
  4. AI Review
  5. Architect
  6. Production

Six steps in a pipeline · from spec to ship · every decision and artifact written back to the Issue
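
One way to read the pipeline is as an ordered state machine in which the only gated transition is the one a human owns. The sketch below is illustrative: the names `Step` and `advance` are assumptions made for this example, not part of the GitHire Skill.

```python
from enum import IntEnum


class Step(IntEnum):
    """The six steps, in pipeline order."""
    ISSUE = 1
    SANDBOX = 2
    PR = 3
    AI_REVIEW = 4
    ARCHITECT = 5
    PRODUCTION = 6


def advance(current: Step, architect_approved: bool = False) -> Step:
    """Return the step after `current`.

    Leaving ARCHITECT (that is, merging and shipping) requires an
    explicit human sign-off; every other transition is unconditional.
    """
    if current is Step.PRODUCTION:
        raise ValueError("already in production; start a new Issue")
    if current is Step.ARCHITECT and not architect_approved:
        raise PermissionError("architect sign-off is required before merging")
    return Step(current + 1)


if __name__ == "__main__":
    print(advance(Step.ISSUE).name)                               # SANDBOX
    print(advance(Step.ARCHITECT, architect_approved=True).name)  # PRODUCTION
```

The design choice worth noticing is that AI review never blocks or unblocks anything by itself in this sketch; only the architect flag does.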

Walk these six steps through a real production incident ↗

Rituals · habits that make it work

Beyond the pipeline,
the small rituals that make it work.

Process answers 'what'; rituals answer 'at what rhythm'. These four habits are the minimum set we use to keep AI-native collaboration actually running.

  • RITUAL

    The shape of a day

    Spend a minute every morning writing down the single most important Issue for the day. Not a todo list, but an anchor. All conversation, all changes, all PRs orbit it.

    MORNING
    Write #today, the most important Issue for today
    DURING
    Every change hangs off this Issue
    EVENING
    Write back today’s progress or decision in one sentence
  • RITUAL

    Architects and developers

    The architect owns the final call on direction; developers build around the Issue and talk to the architect through Issues and PRs. Reviews and decisions live on GitHub; chat tools are reserved for things that have to align right now.

    ARCHITECT
    Final call on direction and boundaries
    DEVELOPER
    Build around Issues · talk in PRs
    CHANNEL
    Reviews on GitHub · live sync on chat
  • RITUAL

    Every Issue is tracked

    Every Issue is mirrored by a PR that tracks its execution, and the PR comments are kept verbatim, so the next person who picks it up can read straight through from Issue to PR and reconstruct how the call was made.

    LINK
    One Issue → one PR · 1:1
    KEEP
    PR comments preserved as decision record
    READ
    Newcomers reconstruct history from Issue → PR
  • RITUAL

    From prompt to Skill

    A prompt you have reused enough to remember deserves to become a Skill: named, described, installable by any agent. This is the artifact most worth carrying forward.

    SKILL
    GitHire · six-step workflow
    SPEC
    Prompt Spec · six-section Issue template
    USE
    npx skills add realRoc/skills

FAQ · About humans-orchestrate-AI

AI writes fast;
that is not the same as humans thinking clearly.

These are the questions most easily skipped, and the ones GitHire actually exists to answer.

AI writes code in minutes. Why bother with six steps?

Because speed isn't the problem. Code that AI writes in five minutes still takes a human thirty minutes to recognise as the wrong path. The six steps aren't guardrails for AI; they're decision points for humans. The Issue is the framing decision; architect review is the direction decision; sandbox and AI review are two more points where there is still time to back out. Skip them, and AI's speed turns into the speed of incidents. See the cost of skipping a decision point →

The architect spots it in 30 seconds. Why does AI review miss it after 5 minutes?

They look at different dimensions. AI review reads the 'context the code carries with it': internal consistency of the PR, edge cases, naming, ignored nullability. The architect reads the 'context only the system has': QPS curves, past incidents, capacity plans. Complementary, not overlapping. In Case 01 those 22 lines of SCAN looked perfectly correct to AI review, but the architect saw in 30 seconds that 'this endpoint scans the entire keyspace on every request' is unacceptable.

What does a good Issue look like?

Six sections. Goal: what you are solving. Constraints: what cannot move. Non-goals: what we are not doing. Verification: how we know it worked. Architecture notes: system boundaries. Existing context: what is already there. Drop a section and the AI freelances inside it. See the same spec written two ways in Case 01: BAD vs GOOD →.
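
The six-section rule is mechanical enough to check before an Issue is handed to an agent. A minimal sketch, assuming a hypothetical layout in which each section starts with a `## <name>` heading; the actual Prompt Spec template may be laid out differently, and `missing_sections` is a name invented for this example.

```python
REQUIRED_SECTIONS = (
    "Goal",
    "Constraints",
    "Non-goals",
    "Verification",
    "Architecture notes",
    "Existing context",
)


def missing_sections(issue_body: str) -> list[str]:
    """Return the required sections that are absent or left empty.

    Assumption: a section begins at a line "## <name>" and runs until
    the next "## " heading. Heading names are matched case-insensitively.
    """
    sections: dict[str, list[str]] = {}
    current = None
    for line in issue_body.splitlines():
        if line.startswith("## "):
            current = line[3:].strip().lower()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return [
        name
        for name in REQUIRED_SECTIONS
        if not "".join(sections.get(name.lower(), [])).strip()
    ]


if __name__ == "__main__":
    draft = "## Goal\nCut p99 latency\n## Constraints\n\n## Non-goals\nNo schema change\n"
    print(missing_sections(draft))
    # ['Constraints', 'Verification', 'Architecture notes', 'Existing context']
```

An empty heading counts as missing here, since a blank section leaves the AI as much room to freelance as an absent one.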

When AI ships an incident, who carries it?

The architect. AI does not carry it; carrying implies agency, and AI has no agency. Every change that merges to main has an architect's signature, and that signature is an explicit claim on the system-side consequences. GitHire does not let review accountability get diluted by 'an AI looked at it.' AI review is supporting evidence; architect review is the decision.

How is conceptual integrity preserved when AI is on the team?

Through an explicit architecture owner. AI agents can generate PRs, but the Architect step is held by a human. Every PR must be summarisable by one architect in one sentence: motivation and trade-off. A PR that fails this test does not merge. The constraint keeps conceptual integrity from being diluted by however many agents are running in parallel.

Why does sandbox code not land in main by default?

Because the value of the first cut is clarifying the problem, not shipping. GitHire makes Brooks’s "Plan to throw one away; you will, anyhow" explicit: sandbox code is not destined for main by default; its job is to validate the direction and surface unknowns. By the time real code is written for main, the problem is clear enough that AI has a chance of getting it right in one shot.

Should a team that is falling behind add people?

The classic answer is Brooks's law from The Mythical Man-Month: adding people to a late project makes it later. Newcomers have to learn context, communication cost grows quadratically with headcount, and progress slows in the short term. The law still holds in an AI-native team; only the scarce resource has changed. Coding hands aren't scarce, because AI is already fast enough. What is scarce is heads that can frame an Issue and make architectural calls. One person who can write a six-section Issue can orchestrate several agents; hiring another coder just dilutes direction. Before adding people, ask: is framing the bottleneck, or is throughput?
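
The quadratic growth in communication cost comes from counting pairs: a team of n people has n(n-1)/2 possible person-to-person channels. A small sketch of that arithmetic (the function name is made up for this example):

```python
def pairwise_channels(people: int) -> int:
    """Number of distinct person-to-person channels in a team."""
    if people < 0:
        raise ValueError("team size cannot be negative")
    return people * (people - 1) // 2


if __name__ == "__main__":
    for size in (2, 4, 8, 16):
        print(size, pairwise_channels(size))
    # 2 1
    # 4 6
    # 8 28
    # 16 120
```

Doubling the team from 8 to 16 roughly quadruples the channels, which is why adding coders to a framing-bound team mostly adds coordination.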

Is the 'person-month' a fiction?

The 'person-month' was the traditional estimation unit: 'this feature is six person-months.' Brooks already noted that it is a rough approximation: people and months are not linearly interchangeable, and doubling headcount does not double speed. In an AI-native team it is not just rough; it has broken down. The person-month models coding effort, but coding is no longer the bottleneck: an AI agent ships a former person-month of code in five minutes. Meanwhile, one badly framed Issue makes the AI produce 22 lines of SCAN (Case 01) and costs 25 hours of firefighting, a cost the person-month cannot express at all. The new unit is 'decision-point cadence': framings per week, architect reviews per week, sandbox → PR loops closed per week.

Why do deadlines keep lying?

Software projects almost never ship on time. The classic explanation is that estimates are made when information is thinnest and promises are made when pressure is highest, and neither is reliable. Deadlines still slip in an AI-native team, but the root cause has shifted. They used to slip because coding was slow; now coding is fast, and they slip because framing is poor and review cannot keep up. A poorly written Issue sends AI down the wrong path for 20 minutes and costs 25 hours to fully fix (Case 01). GitHire does not promise a deadline. It promises 'decision-point completion': Issue framed / architect signed off / sandbox validated. Observable state replaces subjective time, and the closer to delivery, the tighter the prediction.
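
'Decision-point completion' can be reported as plain observable state rather than a date. A minimal sketch, assuming three boolean flags named after the decision points above; the class and field names are illustrative, not a GitHire API.

```python
from dataclasses import dataclass


@dataclass
class DecisionPoints:
    """Observable state of one change, in place of a promised date."""
    issue_framed: bool = False
    sandbox_validated: bool = False
    architect_signed_off: bool = False

    def status(self) -> str:
        # vars() returns the fields in declaration order.
        done = [name for name, value in vars(self).items() if value]
        return f"{len(done)}/3 decision points complete: {', '.join(done) or 'none'}"


if __name__ == "__main__":
    change = DecisionPoints(issue_framed=True)
    print(change.status())  # 1/3 decision points complete: issue_framed
```

Each flag flips only when someone can point at the artifact behind it (the Issue, the sandbox run, the review), which is what makes the state observable.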

End of tour · You’ve reached the end

AI executes,
humans frame and decide.

Stop assisting. Start operating.