# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

---

## Project Overview

**AiWorker** is an AI agent orchestration platform that uses Claude Code agents running in Kubernetes pods to autonomously complete development tasks. The system manages the full workflow from task creation to production deployment.

**Core Flow**: Task → Agent (via MCP) → Code → PR → Preview Deploy → Approval → Staging → Production

**Current Status**: Infrastructure complete (K8s HA cluster), backend initialized (20% done), frontend and agents pending.

---

## Architecture

### Three-Tier System

1. **Infrastructure Layer**: K3s HA cluster (8 VPS servers in Houston)
   - 3 control planes with etcd HA
   - 3 workers with Longhorn distributed storage (3 replicas)
   - 2 HAProxy load balancers for HTTP/HTTPS
   - Private network (10.100.0.0/24) for inter-node communication

2. **Platform Layer**: MariaDB, Redis, Gitea, ArgoCD
   - MariaDB 11.4 LTS with HA storage (database: `aiworker`)
   - Gitea 1.25.3 with built-in container registry
   - Gitea Actions for CI/CD (runner in K8s)
   - Automatic TLS via Cert-Manager + Let's Encrypt

3. **Application Layer**: Backend (Bun), Frontend (React), Agents (Claude Code pods)
   - Backend uses the native **Bun.serve()** API (NOT Express, despite the dependency)
   - Drizzle ORM with auto-migrations on startup
   - MCP protocol for agent communication

### Data Model (Drizzle schema in `backend/src/db/schema.ts`)

- **projects**: User projects linked to Gitea repos and K8s namespaces
- **agents**: Claude Code pods running in K8s (status: idle/busy/error/offline)
- **tasks**: Development tasks with a state machine (backlog → in_progress → needs_input → ready_to_test → approved → staging → production)

Relations: projects → many tasks, tasks → one agent, agents → one current task

---

## Development Commands

### Backend (Bun 1.3.6)

```bash
cd backend

# Development with hot-reload
bun run dev

# Start production
bun run start

# Database migrations
bun run db:generate   # Generate new migration from schema changes
bun run db:migrate    # Apply migrations (also runs on app startup)
bun run db:studio     # Visual database explorer

# Code quality
bun run lint
bun run format
```

**IMPORTANT**: Use Bun native APIs:

- `Bun.serve()` for the HTTP server (NOT Express)
- `Bun.sql()` or `mysql2` for MariaDB (decision pending)
- Native WebSocket support in `Bun.serve()`
- `.env` is auto-loaded by Bun

### Kubernetes Operations

```bash
# Set kubeconfig (ALWAYS required)
export KUBECONFIG=~/.kube/aiworker-config

# Cluster status
kubectl get nodes
kubectl get pods -A

# Deploy to K8s
kubectl apply -f k8s/backend/
kubectl apply -f k8s/frontend/

# Logs
kubectl logs -f -n control-plane deployment/backend
kubectl logs -n gitea gitea-0
kubectl logs -n gitea-actions deployment/gitea-runner -c runner
```

### CI/CD Workflow

Pushing to the main branch triggers an automatic build:

1. Git push → Gitea receives the webhook
2. Gitea Actions Runner (in K8s) picks up the job
3. Docker build inside the runner pod (DinD)
4. Push to `git.fuq.tv/admin/<image>:latest`
5. View progress: https://git.fuq.tv/admin/aiworker-backend/actions

**Registry format**: `git.fuq.tv/<owner>/<image>:<tag>`

---

## Critical Architecture Details

### Database Migrations

**Migrations run automatically on app startup** in `src/index.ts`:

```typescript
await runMigrations()   // First thing on startup
await testConnection()
```

**Never** manually port-forward to run migrations. The app handles this in production when pods start.
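For orientation only, a minimal, hypothetical sketch of what `runMigrations()` could look like using Drizzle's mysql2 migrator follows. This is an assumption, not the actual project code: the mysql2-vs-`Bun.sql()` decision is still pending, and the `./drizzle` migrations folder is drizzle-kit's default rather than a confirmed project path.

```typescript
// Hypothetical sketch only; the real implementation in src/db may differ.
// Assumes the mysql2 driver (the Bun.sql() vs mysql2 decision is still open)
// and migration files generated by `bun run db:generate` into ./drizzle.
import { drizzle } from 'drizzle-orm/mysql2'
import { migrate } from 'drizzle-orm/mysql2/migrator'
import mysql from 'mysql2/promise'

export async function runMigrations() {
  // Connection settings come from the .env values listed later in this file.
  const connection = await mysql.createConnection({
    host: process.env.DB_HOST,
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD,
    database: process.env.DB_NAME,
  })

  const db = drizzle(connection)

  // Applies any pending SQL migrations; Drizzle tracks applied migrations,
  // so running this on every pod start is safe.
  await migrate(db, { migrationsFolder: './drizzle' })

  await connection.end()
}
```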
### Bun.serve() Routing Pattern

Unlike Express, Bun.serve() uses a single `fetch(req)` function:

```typescript
Bun.serve({
  async fetch(req) {
    const url = new URL(req.url)

    if (url.pathname === '/api/health') {
      return Response.json({ status: 'ok' })
    }

    if (url.pathname.startsWith('/api/projects')) {
      return handleProjectRoutes(req, url)
    }

    return new Response('Not Found', { status: 404 })
  }
})
```

Route handlers should be organized in `src/api/routes/` and imported into main fetch.

### MCP Communication Flow

Agents communicate via Model Context Protocol:

1. Agent calls MCP tool (e.g., `get_next_task`)
2. Backend MCP server (port 3100) handles request
3. Backend queries database, performs actions
4. Returns result to agent
5. Agent continues work autonomously

MCP tools to implement (see `docs/05-agents/mcp-tools.md`):

- `get_next_task`, `update_task_status`, `ask_user_question`, `create_branch`, `create_pull_request`, `trigger_preview_deploy`
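The MCP server itself is still to be implemented. As a rough, hedged sketch (not the project's actual code), registering one of these tools with the TypeScript `@modelcontextprotocol/sdk` could look like the following. The stdio transport is shown purely for brevity (how the server is exposed on port 3100 is an open design decision), and the task lookup is a placeholder for a real Drizzle query.

```typescript
// Hypothetical sketch; the backend MCP server (port 3100) does not exist yet.
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js'
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'
import { z } from 'zod'

const server = new McpServer({ name: 'aiworker-backend', version: '0.1.0' })

// Example tool: hand the calling agent its next task.
server.tool(
  'get_next_task',
  { agentId: z.string() },   // input schema (zod shape)
  async ({ agentId }) => {
    // Placeholder: the real handler would query the `tasks` table via Drizzle
    // and mark the task as assigned to `agentId`.
    const task = { id: 'task-123', title: 'Example task', status: 'in_progress', agentId }
    return { content: [{ type: 'text', text: JSON.stringify(task) }] }
  }
)

// stdio shown for brevity; the deployed server would use an HTTP-based
// transport on port 3100 instead.
await server.connect(new StdioServerTransport())
```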
### Preview Environments

Each task gets an isolated namespace: `preview-task-{taskId}`

- Auto-deploy on PR creation
- Accessible at `task-{shortId}.r.fuq.tv`
- Auto-cleanup after 7 days (TTL label)

---

## Key Environment Variables

**Backend** (`.env` file):

```bash
# Database (MariaDB in K8s)
DB_HOST=mariadb.control-plane.svc.cluster.local
DB_USER=aiworker
DB_PASSWORD=AiWorker2026_UserPass!
DB_NAME=aiworker

# Redis
REDIS_HOST=redis.control-plane.svc.cluster.local

# Gitea
GITEA_URL=https://git.fuq.tv
GITEA_TOKEN=159a5de2a16d15f33e388b55b1276e431dbca3f3

# Kubernetes
K8S_IN_CLUSTER=false   # true when running in K8s
K8S_CONFIG_PATH=~/.kube/aiworker-config
```

**Local development**: Port-forward services from K8s

```bash
kubectl port-forward -n control-plane svc/mariadb 3306:3306 &
kubectl port-forward -n control-plane svc/redis 6379:6379 &
```

---

## Important Constraints

### Storage HA Strategy

All stateful data uses Longhorn with **3 replicas** for high availability:

- MariaDB PVC: 20Gi replicated across 3 workers
- Gitea PVC: 50Gi replicated across 3 workers
- Can tolerate 2 worker node failures without data loss

### DNS and Domains

All services use `*.fuq.tv` with DNS round-robin pointing to the 2 load balancers:

- `api.fuq.tv` → Backend API
- `app.fuq.tv` → Frontend dashboard
- `git.fuq.tv` → Gitea
- `*.r.fuq.tv` → Preview environments (e.g., `task-abc.r.fuq.tv`)

Load balancers (108.165.47.221, 108.165.47.203) run HAProxy, balancing traffic to worker NodePorts.

### Namespace Organization

- `control-plane`: Backend API, MariaDB, Redis
- `agents`: Claude Code agent pods
- `gitea`: Git server
- `gitea-actions`: CI/CD runner with Docker-in-Docker
- `preview-*`: Temporary namespaces for preview deployments

---

## Documentation Structure

Extensive documentation in `/docs` (40+ files):

- **Start here**: `ROADMAP.md`, `NEXT-SESSION.md`, `QUICK-REFERENCE.md`
- **Infrastructure**: `CLUSTER-READY.md`, `AGENT-GUIDE.md`, `TROUBLESHOOTING.md`
- **Gitea**: `GITEA-GUIDE.md` - Complete guide for Git, Registry, API, CI/CD, and webhooks
- **Detailed**: `docs/01-arquitectura/` through `docs/06-deployment/`

**For agent AI operations**: Read `AGENT-GUIDE.md` - contains all kubectl commands and workflows needed to manage the cluster autonomously.

**For Gitea operations**: Read `GITEA-GUIDE.md` - complete API usage, registry, tokens, webhooks, and CI/CD setup.

**For credentials**: See `CLUSTER-CREDENTIALS.md` (not in git, local only)

---

## Next Development Steps

Current phase: **Backend API implementation** (see `NEXT-SESSION.md` for the detailed checklist)

Priority order:

1. Verify the CI/CD build succeeded → image in the registry
2. Implement REST API routes (`/api/projects`, `/api/tasks`, `/api/agents`)
3. Implement the MCP server (port 3100) for agent communication
4. Integrate a Gitea API client (repos, PRs, webhooks)
5. Integrate a Kubernetes client (create namespaces, deployments, ingress; a rough namespace-creation sketch appears at the end of this file)
6. Deploy the backend to K8s at `api.fuq.tv`

Frontend and agents come after the backend is functional.

---

## External References

- **Lucia Auth** (for React frontend): https://github.com/lucia-auth/lucia
- **Vercel Agent Skills** (for React frontend): https://github.com/vercel-labs/agent-skills
- **Gitea API**: https://git.fuq.tv/api/swagger
- **MCP SDK**: `@modelcontextprotocol/sdk` documentation

---

## Deployment Flow

### Backend Deployment

```
Code change → Git push → Gitea Actions → Docker build → Push to git.fuq.tv → ArgoCD sync → K8s deploy
```

### Agent Deployment

```
Backend creates pod → Agent starts → Registers via MCP → Polls for tasks → Works autonomously → Reports back
```

### Preview Deployment

```
Agent completes task → Create PR → Trigger preview → K8s namespace created → Deploy at task-{id}.r.fuq.tv → User tests
```

---

**Read `NEXT-SESSION.md` for detailed next steps. All credentials and cluster access info in `QUICK-REFERENCE.md`.**
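---

As a companion to the preview deployment flow and priority step 5 above, here is a rough, hypothetical sketch of how the backend might create a preview namespace. It is an assumption rather than existing code: it uses `@kubernetes/client-node` with its pre-1.x call signatures (which changed in later releases), the `K8S_IN_CLUSTER` / `K8S_CONFIG_PATH` variables listed earlier, and illustrative label names that are not an agreed project convention.

```typescript
// Hypothetical sketch; the Kubernetes integration (priority step 5) is not implemented yet.
// Assumes @kubernetes/client-node with pre-1.x method signatures.
import * as k8s from '@kubernetes/client-node'

export async function createPreviewNamespace(taskId: string) {
  const kc = new k8s.KubeConfig()

  if (process.env.K8S_IN_CLUSTER === 'true') {
    kc.loadFromCluster()   // in-cluster ServiceAccount
  } else {
    // Note: loadFromFile does not expand "~", so pass an absolute path.
    kc.loadFromFile(process.env.K8S_CONFIG_PATH!)
  }

  const core = kc.makeApiClient(k8s.CoreV1Api)

  // One namespace per task, matching the preview-task-{taskId} convention.
  await core.createNamespace({
    metadata: {
      name: `preview-task-${taskId}`,
      labels: {
        'aiworker/preview': 'true',   // illustrative label
        'aiworker/ttl-days': '7',     // placeholder for the 7-day TTL cleanup
      },
    },
  })
}
```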