What This Is, Right Now
Foundry is a multi-tenant agentic delivery platform that transforms requirements into working code. It structures delivery knowledge and powers AI agents that reason about full project context, decompose tasks, generate code in sandboxes, and push changes autonomously.
Agencies and delivery teams feed in plans and conversations. Foundry decomposes them into structured requirements, reasons about implementation, provisions AI sandboxes, and ships code to repos.
Stage: Active development (v0.1.0). Solo founder build by Quintin Henry. Reference client: Burlington Medical (118 requirements, 8 skills, 7 workstreams). Engagement-type agnostic—platform migrations, greenfield builds, system integrations.
Frontend: Next.js 16 + React 19 + Tailwind 4.1
Backend: Convex (reactive BaaS, 55+ tables)
Auth: Clerk (multi-tenant orgs)
AI: Claude (Opus 4.6 / Sonnet 4.5)
Sandbox: Cloudflare Workers + Docker
Desktop: Tauri 2 (Rust + Vite)
268 Convex backend files
405 UI component files (34 domains)
562 Next.js app files (52 routes)
2,794 lines in schema.ts
System as It Exists Today
Four-process distributed system in development, managed platform deployment in production. The diagram shows conceptual modules and data flow, not individual files.
Client
- Next.js app: 52 routes, thin wrappers
- Tauri Desktop: same @foundry/ui components

Backend (Convex Cloud)
- Schema: 55+ tables, 6 domains
- Server Functions: queries, mutations, actions
- AI Actions: Claude API, context assembly
- Webhooks: GitHub, Atlassian, Clerk

AI Inference
- Agent Worker: Hono + Anthropic SDK
- Analysis Routes: /analyze-requirement, /analyze-task-subtasks

Sandbox System
- Sandbox Worker: Durable Objects
- Docker Containers: ephemeral Claude Code envs

External Services
- GitHub App: repos, PRs, webhooks
- Clerk: auth, orgs, JWT
- Claude API: 3-tier model deployment

Data flow
- Next.js app → Server Functions (WebSocket)
- Tauri Desktop → Server Functions (WebSocket)
- Server Functions → Schema
- Server Functions → AI Actions
- AI Actions → Agent Worker
- Agent Worker → Analysis Routes
- Server Functions → Sandbox Worker (HTTP)
- Sandbox Worker → Docker Containers
- Webhooks → Server Functions
- GitHub App → Webhooks
- Clerk → Next.js app
- Clerk → Server Functions
- AI Actions → Claude API
What Happened and Why
Six major workstreams in two weeks, grouped by theme. The dominant pattern: deepening AI integration and building the observability layer for agent-driven delivery.
repositoryIds field added to the tasks and workstreams tables. RepoBadge component, RepoCreateModal, RepoPickerDropdown, settings page. GitHub repos are now visible everywhere work is managed.
AI Observability
Three of the four PRs (#30, #31, #32) add visibility into what AI agents are doing. The codebase analysis feature closes the loop: AI now analyzes its own implementation progress against requirements.
Quality Infrastructure
Biome enforcement, test coverage spec, audit trail instrumentation. The codebase is shifting from "build fast" to "build with guardrails."
- Google Drive import source (PR #27)
- Mission Control consolidation (PR #26)
- Service resilience phases 1–3 (PR #24)
- Sprint usability updates (PR #25)
- UX overhaul (PR #23)
- Billing system (PR #22)
Why Things Are the Way They Are
Key design decisions from this window. Extracted from commit messages and planning docs. This is the highest-value section for fighting cognitive debt.
Biome over ESLint + Prettier
Single tool for format + lint across all 6 workspaces. Enforced via PostToolUse hook (auto-fixes on every edit) and pre-commit hook (blocks errors).
Dashboard-first for Agent Activity
Landing page is now health metrics (acceptance rate, velocity, token spend, coverage), not a chronological list. Trace drill-down groups executions by requirement.
AI analysis with human review queue
Codebase analysis runs Claude against repos, but results go through a review queue before updating requirement status. Batch approve/reject supported.
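The gating idea above can be sketched in a few lines. This is a minimal illustration, not Foundry's actual code: the AnalysisResult shape, the status values, and applyReviewDecisions are all hypothetical names chosen for the example. The point it shows is that a proposed status only reaches a requirement after an explicit approval, and that one decision can be applied to a batch of results.

```typescript
// Hypothetical types: field names and status values are assumptions for illustration.
type RequirementStatus = "not_started" | "in_progress" | "implemented";
type Decision = "approved" | "rejected";

interface AnalysisResult {
  id: string;
  requirementId: string;
  proposedStatus: RequirementStatus;
  review: "pending" | Decision;
}

// Apply one decision to a batch of pending results. Only approved results
// change requirement status; rejected ones are marked and otherwise ignored.
function applyReviewDecisions(
  queue: AnalysisResult[],
  statuses: Map<string, RequirementStatus>,
  resultIds: string[],
  decision: Decision,
): AnalysisResult[] {
  const selected = new Set(resultIds);
  return queue.map((result) => {
    // Skip results outside the batch and results that were already reviewed.
    if (!selected.has(result.id) || result.review !== "pending") return result;
    if (decision === "approved") {
      statuses.set(result.requirementId, result.proposedStatus);
    }
    return { ...result, review: decision };
  });
}

const requirementStatuses = new Map<string, RequirementStatus>([
  ["req-1", "not_started"],
  ["req-2", "not_started"],
]);
const reviewQueue: AnalysisResult[] = [
  { id: "a1", requirementId: "req-1", proposedStatus: "implemented", review: "pending" },
  { id: "a2", requirementId: "req-2", proposedStatus: "in_progress", review: "pending" },
];

const afterApprove = applyReviewDecisions(reviewQueue, requirementStatuses, ["a1"], "approved");
const afterReject = applyReviewDecisions(afterApprove, requirementStatuses, ["a2"], "rejected");
```

In this sketch the rejected result for req-2 leaves its status untouched, which is the property the review queue is meant to guarantee.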
repositoryIds as arrays, not single values
Tasks and workstreams can reference multiple GitHub repositories. RepoBadge shows the primary, picker allows multi-select.
Design context as cascading pipeline
Design tokens cascade program > workstream > requirement with merge semantics. Snapshots are immutable—created at task creation time.
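The cascade and snapshot behavior can be sketched as follows. The flat token shape, the function names, and the rule that a more specific level overrides a broader one are assumptions made for illustration; the source only states the program > workstream > requirement order and that snapshots are immutable.

```typescript
// Hypothetical token shape: a flat map of token name to value.
type DesignTokens = Readonly<Record<string, string>>;

// Merge in cascade order. Assumption: later (more specific) levels override
// earlier (broader) ones, and unset tokens fall through from the parent level.
function resolveDesignContext(
  program: DesignTokens,
  workstream: DesignTokens,
  requirement: DesignTokens,
): DesignTokens {
  return { ...program, ...workstream, ...requirement };
}

// Take a frozen copy at task creation time so later edits to the program,
// workstream, or requirement tokens cannot change what the task sees.
function snapshotDesignContext(resolved: DesignTokens): DesignTokens {
  return Object.freeze({ ...resolved });
}

const programTokens = { primary: "#2563eb", radius: "8px" };
const workstreamTokens = { radius: "4px" };
const requirementTokens = { font: "Inter" };

const taskSnapshot = snapshotDesignContext(
  resolveDesignContext(programTokens, workstreamTokens, requirementTokens),
);

// Editing the source tokens afterwards does not affect the existing snapshot.
workstreamTokens.radius = "12px";
```

A task created after the edit would resolve a new snapshot with the updated radius, while taskSnapshot keeps the value it was created with.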
4-agent parallel strategy for test coverage
Domain-clustered agents: source-control+pipeline, discovery+audit, tasks+programs+skills, videos+sandbox+layout. Each agent writes ~30–50 test files.
Working, In Progress, Broken, Blocked
- Core delivery pipeline (requirements, skills, tasks, workstreams)
- Sandbox execution system (10-stage provisioning, Docker containers)
- Agent Activity dashboard with audit trail
- Design context pipeline with AI vision analysis
- Repository picker across tasks and workstreams
- Task verification pipeline
- Google Drive import source
- Service resilience layer (auto-reconnect, health monitoring)
- Billing system (3 tiers)
- Biome lint + format enforcement
- GitHub App + Atlassian integrations
- Clerk multi-tenant auth with row-level security
- Codebase analysis: on development, not yet merged to main
- Semantic code search: on semantic-code-search branch, adds vector embeddings and cosine similarity search
- Test coverage initiative: spec written, 4-agent strategy designed, not yet executed
- Test coverage at 28% — 153 of 261 source files in apps/web have zero tests. 46 tests total in packages/ui for 405 source files.
- 12 unmerged branches — accumulating stale feature branches that may need cleanup
The 10 Things to Hold in Your Head
Key invariants, non-obvious coupling, and gotchas that will bite you if you forget them.
- Every query must use .withIndex(), never .filter(). Convex's .filter() causes full table scans and kills reactive performance. Define indexes in schema.ts for every query pattern.
- Clerk wraps Convex, never the reverse. The Convex client needs the Clerk JWT. Breaking the provider nesting order breaks authentication silently.
- All feature UI lives in packages/ui/, not apps/web/. Page files are 3–7 line wrappers. If you add logic to a page file, you break the shared component model with the desktop app.
- Mutations cannot call Node.js APIs. Only actions can use Node.js APIs. Utility files shared between mutations and actions need separate entry points. If a shared util uses "use node", importing it from a mutation will fail.
- assertOrgAccess() is mandatory on every query and mutation. Row-level security. Skip it and you get cross-tenant data leaks. Exception: health check endpoints must skip auth because they run before Clerk initializes.
- params and searchParams are Promises in Next.js 16. Must await them. Also headers() and cookies() are async. Use the "skip" token on useQuery when auth state hasn't resolved.
- Sandbox orchestrator is 4,142 lines with a formal state machine. The ALLOWED_TRANSITIONS map governs all lifecycle changes. Don't add transitions without updating the map: the system will silently reject them.
- Webhooks follow the durable event buffer pattern. Store raw event → scheduler.runAfter(0) for async processing → return 200 OK immediately. Failed operations get exponential backoff retry (up to 5 attempts, 1h cap).
- Design context cascades then snapshots. Program > workstream > requirement merge. Snapshots are immutable and created at task creation. Don't mutate a snapshot expecting sandboxes to pick up the change.
- Never use purple/violet in UI. Design system rule. Blue/slate palette for AI features and interactive elements. Enforced by code review and Biome (informally).
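The retry policy in the webhook rule above (exponential backoff, up to 5 attempts, 1h cap) can be sketched as a pure delay function. The 5-attempt limit and the 1-hour cap come from the rule; the 1-minute base delay and the 4x growth factor are assumptions picked so the cap is visible, and retryDelayMs is a hypothetical helper name.

```typescript
const MAX_ATTEMPTS = 5;               // from the rule above
const MAX_DELAY_MS = 60 * 60 * 1000;  // 1h cap, from the rule above
const BASE_DELAY_MS = 60 * 1000;      // assumption: first retry after 1 minute
const GROWTH_FACTOR = 4;              // assumption: each retry waits 4x longer

// Returns the delay before the given retry attempt (1-based),
// or null once the attempt budget is exhausted.
function retryDelayMs(attempt: number): number | null {
  if (!Number.isInteger(attempt) || attempt < 1) {
    throw new RangeError("attempt must be a positive integer");
  }
  if (attempt > MAX_ATTEMPTS) return null;
  const uncapped = BASE_DELAY_MS * GROWTH_FACTOR ** (attempt - 1);
  return Math.min(uncapped, MAX_DELAY_MS);
}

// Delays for attempts 1 through 6; the sixth is past the budget.
const retrySchedule = [1, 2, 3, 4, 5, 6].map(retryDelayMs);
```

With these assumed constants, attempts 4 and 5 both hit the one-hour cap. In a Convex handler the non-null delay would presumably be passed to the scheduler in place of the 0 used for first-time processing.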
Where Understanding Is Weakest
Areas where the code changed faster than documentation and tests could follow. Each flagged with severity and a concrete action.
convex/sandbox/orchestrator.ts — 4,142 lines, 3 changes in 2 weeks
The largest file in the codebase. Contains the 10-stage sandbox provisioning state machine, session management, fleet orchestration, and auto-commit logic. Changed 3 times this window but has no inline documentation for the state machine transitions.
Action: document the 10 provisioning stages and the ALLOWED_TRANSITIONS map in orchestrator.ts. Extract the state machine into a separate stateMachine.ts module.
Codebase analysis feature: 11 UI components, 0 test files
The newest and largest feature has zero tests. packages/ui/src/codebase-analysis/ has 11 components (ReviewQueue, AnalysisConfigPanel, TaskAnalysisPanel, etc.) with no coverage. The agent worker routes (/analyze-requirement, /analyze-task-subtasks) are also untested.
Activity page rebuild — 14 files, 0 tests
Complete rewrite of the Agent Activity page with 14 component files in packages/ui/src/activity/. Dashboard metrics, trace drill-down, audit trail sections, coverage detail. All zero tests despite being the primary monitoring surface.
convex/schema.ts — 2,794 lines, 10 changes in 2 weeks
The single source of truth for the data model is approaching 3,000 lines. Every feature adds tables and indexes here. The file was changed 10 times in 2 weeks—the highest-churn file in the codebase.
semantic-code-search branch diverging from development
This branch adds vector embeddings, cosine similarity search, and analysis UX improvements. It's been open while codebase analysis features shipped on development. Merge distance is growing.
Action: sync semantic-code-search with development. The longer it stays diverged, the more painful the merge, especially since both branches touch the analysis feature.
12 unmerged feature branches accumulating
Branches like feat/add-design-analysis, fix-agent-logs, ubiquitous-github-picker appear to be stale (work merged via other branch names). They add noise to branch listings.
Action: run git branch --merged development | grep -v main | grep -v development to find candidates.
Where Momentum Was Pointing
Inferred from recent activity, open specs, and project trajectory. Not prescriptive—just the direction of travel.
Execute test coverage initiative
Spec is written at spec.md. Four parallel builder agents, domain-clustered. Target: 28% → 90% across apps/web + packages/ui. Pre-commit gate + PostToolUse hook to enforce afterwards.
Merge semantic code search
The semantic-code-search branch adds vector embeddings to replace GitHub code search API in requirement analysis. Should be merged before the branches diverge further.
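For readers unfamiliar with the technique, cosine similarity search over embeddings can be sketched as below. This is a brute-force illustration under assumptions, not the branch's implementation: CodeChunk, rankChunks, the file paths, and the toy 3-dimensional vectors are all hypothetical, and a real deployment would likely use a vector index rather than scoring every chunk.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1 to 1.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction; treat it as unrelated to everything.
  return denominator === 0 ? 0 : dot / denominator;
}

// Hypothetical record: one embedded slice of a source file.
interface CodeChunk {
  path: string;
  embedding: number[];
}

// Score every chunk against the query embedding and keep the best topK.
function rankChunks(query: number[], chunks: CodeChunk[], topK: number): CodeChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((entry) => entry.chunk);
}

// Toy 3-dimensional embeddings; real embeddings have far more dimensions.
const codeChunks: CodeChunk[] = [
  { path: "auth.ts", embedding: [1, 0, 0] },
  { path: "billing.ts", embedding: [0, 1, 0] },
  { path: "session.ts", embedding: [0.9, 0.1, 0] },
];
const topMatches = rankChunks([1, 0, 0], codeChunks, 2);
```

Because the score depends only on direction, not magnitude, chunks of different lengths can be compared on equal footing as long as they come from the same embedding model.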
Merge development → main
12 PRs have been merged to development but not yet promoted to main. The gap between the branches represents the full body of work from this 2-week window.
Orchestrator decomposition
The 4,142-line sandbox orchestrator is the biggest risk to maintainability. Extract the state machine, provisioning stages, and fleet management into focused modules before the next feature touches it.
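As a starting point for the extraction, a standalone state machine module might look like the sketch below. The state names are hypothetical (this document does not list the 10 real provisioning stages), and the helper functions are illustrative; only the ALLOWED_TRANSITIONS name comes from the codebase.

```typescript
// Hypothetical states: the real orchestrator has 10 provisioning stages
// whose names are not listed in this document.
type SandboxState = "queued" | "provisioning" | "running" | "completed" | "failed";

// Every legal lifecycle change must appear here; anything else is rejected.
const ALLOWED_TRANSITIONS: Record<SandboxState, readonly SandboxState[]> = {
  queued: ["provisioning", "failed"],
  provisioning: ["running", "failed"],
  running: ["completed", "failed"],
  completed: [], // terminal
  failed: [],    // terminal
};

// Pure lookup, easy to unit test in isolation from the orchestrator.
function canTransition(from: SandboxState, to: SandboxState): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}

// Design choice in this sketch: throw on an illegal transition so a missing
// map entry is visible, instead of being rejected silently.
function transition(from: SandboxState, to: SandboxState): SandboxState {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal sandbox transition: ${from} -> ${to}`);
  }
  return to;
}

const nextState = transition("queued", "provisioning");
```

Typing the map as Record<SandboxState, ...> means the compiler flags any state added to the union without a matching entry, which addresses the "forgot to update the map" failure mode at build time.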
Deepening AI integration
The codebase analysis feature is the seed of a closed-loop system: requirements → AI analysis → implementation status → agent task assignment → sandbox execution → PR. The next features likely close more gaps in this loop.