Complete documentation for future sessions

- CLAUDE.md for AI agents to understand the codebase
- GITEA-GUIDE.md centralizes all Gitea operations (API, Registry, Auth)
- DEVELOPMENT-WORKFLOW.md explains complete dev process
- ROADMAP.md, NEXT-SESSION.md for planning
- QUICK-REFERENCE.md, TROUBLESHOOTING.md for daily use
- 40+ detailed docs in /docs folder
- Backend as submodule from Gitea

Everything documented for autonomous operation.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
2026-01-20 00:36:53 +01:00
commit db71705842
49 changed files with 19162 additions and 0 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,273 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+---
+
+## Project Overview
+
+**AiWorker** is an AI agent orchestration platform that uses Claude Code agents running in Kubernetes pods to autonomously complete development tasks. The system manages a full workflow from task creation to production deployment.
+
+**Core Flow**: Task → Agent (via MCP) → Code → PR → Preview Deploy → Approval → Staging → Production
+
+**Current Status**: Infrastructure complete (K8s HA cluster), backend initialized (20% done), frontend and agents pending.
+
+---
+
+## Architecture
+
+### Three-Tier System
+1. **Infrastructure Layer**: K3s HA cluster (8 VPS servers in Houston)
+   - 3 control planes with etcd HA
+   - 3 workers with Longhorn distributed storage (3 replicas)
+   - 2 HAProxy load balancers for HTTP/HTTPS
+   - Private network (10.100.0.0/24) for inter-node communication
+
+2. **Platform Layer**: MariaDB, Redis, Gitea, ArgoCD
+   - MariaDB 11.4 LTS with HA storage (database: `aiworker`)
+   - Gitea 1.25.3 with built-in container registry
+   - Gitea Actions for CI/CD (runner in K8s)
+   - Automatic TLS via cert-manager + Let's Encrypt
+
+3. **Application Layer**: Backend (Bun), Frontend (React), Agents (Claude Code pods)
+   - Backend uses the native **Bun.serve()** API (NOT Express, even though Express is listed as a dependency)
+   - Drizzle ORM with auto-migrations on startup
+   - MCP protocol for agent communication
+
+### Data Model (Drizzle schema in `backend/src/db/schema.ts`)
+- **projects**: User projects linked to Gitea repos and K8s namespaces
+- **agents**: Claude Code pods running in K8s (status: idle/busy/error/offline)
+- **tasks**: Development tasks with state machine (backlog → in_progress → needs_input → ready_to_test → approved → staging → production)
+
+Relations: projects → many tasks, tasks → one agent, agents → one current task
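+
+The task state machine above can be sketched as a small transition table. This is only an illustration: the state names come from the list above, but the transition rules (in particular treating `needs_input` as a detour that returns to `in_progress`) and the names `TRANSITIONS`, `canTransition`, and `transition` are assumptions, not the contents of `backend/src/db/schema.ts`.
+
+```typescript
+// Hypothetical sketch: state names are taken from the data model above;
+// the allowed transitions are an assumption, not the real schema.
+type TaskState =
+  | 'backlog'
+  | 'in_progress'
+  | 'needs_input'
+  | 'ready_to_test'
+  | 'approved'
+  | 'staging'
+  | 'production'
+
+// Assumed rules: the linear path listed above, with needs_input as a
+// detour that returns to in_progress once the user has answered.
+const TRANSITIONS: Record<TaskState, TaskState[]> = {
+  backlog: ['in_progress'],
+  in_progress: ['needs_input', 'ready_to_test'],
+  needs_input: ['in_progress'],
+  ready_to_test: ['approved'],
+  approved: ['staging'],
+  staging: ['production'],
+  production: [],
+}
+
+// True when moving from `from` to `to` is allowed by the table.
+function canTransition(from: TaskState, to: TaskState): boolean {
+  return TRANSITIONS[from].includes(to)
+}
+
+// Returns the new state, or throws if the move is not allowed.
+function transition(from: TaskState, to: TaskState): TaskState {
+  if (!canTransition(from, to)) {
+    throw new Error(`Invalid task transition: ${from} -> ${to}`)
+  }
+  return to
+}
+```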
+
+---
+
+## Development Commands
+
+### Backend (Bun 1.3.6)
+```bash
+cd backend
+
+# Development with hot-reload
+bun run dev
+
+# Start production
+bun run start
+
+# Database migrations
+bun run db:generate  # Generate new migration from schema changes
+bun run db:migrate   # Apply migrations (also runs on app startup)
+bun run db:studio    # Visual database explorer
+
+# Code quality
+bun run lint
+bun run format
+```
+
+**IMPORTANT**: Use Bun native APIs:
+- `Bun.serve()` for HTTP server (NOT Express)
+- `Bun.sql` (Bun's tagged-template SQL client) or `mysql2` for MariaDB (decision pending)
+- Native WebSocket support in `Bun.serve()`
+- `.env` is auto-loaded by Bun
+
+### Kubernetes Operations
+```bash
+# Set kubeconfig (ALWAYS required)
+export KUBECONFIG=~/.kube/aiworker-config
+
+# Cluster status
+kubectl get nodes
+kubectl get pods -A
+
+# Deploy to K8s
+kubectl apply -f k8s/backend/
+kubectl apply -f k8s/frontend/
+
+# Logs
+kubectl logs -f -n control-plane deployment/backend
+kubectl logs -n gitea gitea-0
+kubectl logs -n gitea-actions deployment/gitea-runner -c runner
+```
+
+### CI/CD Workflow
+A push to the `main` branch triggers an automatic build:
+1. Git push → Gitea receives the push and queues the workflow
+2. Gitea Actions Runner (in K8s) picks up job
+3. Docker build inside runner pod (DinD)
+4. Push to `git.fuq.tv/admin/<repo>:latest`
+5. View progress: https://git.fuq.tv/admin/aiworker-backend/actions
+
+**Registry format**: `git.fuq.tv/<owner>/<package>:<tag>`
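+
+As a small illustration of the registry format, the helper below builds an image reference from its parts. The function name `imageRef` and the constant `REGISTRY_HOST` are hypothetical; only the `git.fuq.tv/<owner>/<package>:<tag>` shape comes from this document.
+
+```typescript
+// Hypothetical helper: only the reference format is taken from this guide.
+const REGISTRY_HOST = 'git.fuq.tv'
+
+// Builds "<registry>/<owner>/<package>:<tag>", defaulting the tag to "latest".
+function imageRef(owner: string, pkg: string, tag: string = 'latest'): string {
+  if (!owner || !pkg || !tag) {
+    throw new Error('owner, package and tag are all required')
+  }
+  return `${REGISTRY_HOST}/${owner}/${pkg}:${tag}`
+}
+```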
+
+---
+
+## Critical Architecture Details
+
+### Database Migrations
+**Migrations run automatically on app startup** in `src/index.ts`:
+```typescript
+await runMigrations()  // First thing on startup
+await testConnection()
+```
+
+**Never** manually port-forward to run migrations. The app handles this in production when pods start.
+
+### Bun.serve() Routing Pattern
+Unlike Express, this backend routes every request through a single `fetch(req)` handler passed to Bun.serve():
+```typescript
+Bun.serve({
+  async fetch(req) {
+    const url = new URL(req.url)
+
+    if (url.pathname === '/api/health') {
+      return Response.json({ status: 'ok' })
+    }
+
+    if (url.pathname.startsWith('/api/projects')) {
+      return handleProjectRoutes(req, url)
+    }
+
+    return new Response('Not Found', { status: 404 })
+  }
+})
+```
+
+Route handlers should be organized in `src/api/routes/` and imported into the main `fetch` handler.
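+
+One way to keep route matching testable is to separate it from the server, as sketched below. The `Route` type, the `resolveRoute` function, and the `/api/projects/<id>` route are hypothetical names chosen for illustration; the real handlers may be organized differently.
+
+```typescript
+// Hypothetical helper: resolves a method + pathname to a named route so the
+// matching logic can be unit-tested without starting Bun.serve().
+type Route =
+  | { name: 'health' }
+  | { name: 'projects.list' }
+  | { name: 'projects.get'; id: string }
+  | { name: 'not_found' }
+
+function resolveRoute(method: string, pathname: string): Route {
+  if (method === 'GET' && pathname === '/api/health') return { name: 'health' }
+  if (method === 'GET' && pathname === '/api/projects') return { name: 'projects.list' }
+
+  // Matches /api/projects/<id> with exactly one path segment after the prefix.
+  const match = /^\/api\/projects\/([^/]+)$/.exec(pathname)
+  if (method === 'GET' && match) return { name: 'projects.get', id: match[1] }
+
+  return { name: 'not_found' }
+}
+```
+
+Inside `fetch(req)`, a handler could call `resolveRoute(req.method, new URL(req.url).pathname)` and switch on the returned `name`.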
+
+### MCP Communication Flow
+Agents communicate via Model Context Protocol:
+1. Agent calls MCP tool (e.g., `get_next_task`)
+2. Backend MCP server (port 3100) handles request
+3. Backend queries database, performs actions
+4. Returns result to agent
+5. Agent continues work autonomously
+
+MCP tools to implement (see `docs/05-agents/mcp-tools.md`):
+- `get_next_task`, `update_task_status`, `ask_user_question`, `create_branch`, `create_pull_request`, `trigger_preview_deploy`
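+
+The request flow above can be sketched as a plain dispatch table. This is deliberately NOT the `@modelcontextprotocol/sdk` API: the `tools` map, the `callTool` function, and the in-memory `tasks` array are hypothetical stand-ins for the MCP server and the database, and only two of the listed tools are shown.
+
+```typescript
+// Hypothetical sketch: plain TypeScript dispatch, not the MCP SDK.
+interface Task {
+  id: string
+  status: string
+}
+
+// In-memory stand-in for the tasks table (for illustration only).
+const tasks: Task[] = [
+  { id: 't1', status: 'in_progress' },
+  { id: 't2', status: 'backlog' },
+]
+
+type ToolHandler = (args: Record<string, string>) => unknown
+
+const tools: Record<string, ToolHandler> = {
+  // Returns the first task still in the backlog, or null if none is waiting.
+  get_next_task: () => tasks.find((t) => t.status === 'backlog') ?? null,
+
+  // Updates a task's status in place and returns the updated task.
+  update_task_status: ({ taskId, status }) => {
+    const task = tasks.find((t) => t.id === taskId)
+    if (!task) throw new Error(`Unknown task: ${taskId}`)
+    task.status = status
+    return task
+  },
+}
+
+// Looks up the tool by name and runs it, mirroring steps 1-4 above.
+function callTool(name: string, args: Record<string, string> = {}): unknown {
+  const handler = tools[name]
+  if (!handler) throw new Error(`Unknown MCP tool: ${name}`)
+  return handler(args)
+}
+```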
+
+### Preview Environments
+Each task gets an isolated namespace: `preview-task-{taskId}`
+- Auto-deploy on PR creation
+- Accessible at `task-{shortId}.r.fuq.tv`
+- Auto-cleanup after 7 days (TTL label)
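+
+The naming and TTL rules above can be expressed as a few small helpers. The function names are hypothetical, and because this guide does not say how `shortId` is derived from the task ID, it is taken as an input rather than computed.
+
+```typescript
+// Hypothetical helpers: only the name formats and the 7-day TTL come from this guide.
+const PREVIEW_TTL_DAYS = 7
+const MS_PER_DAY = 24 * 60 * 60 * 1000
+
+// Namespace for a task's preview environment.
+function previewNamespace(taskId: string): string {
+  return `preview-task-${taskId}`
+}
+
+// Public hostname for a preview; shortId derivation is not specified here.
+function previewHost(shortId: string): string {
+  return `task-${shortId}.r.fuq.tv`
+}
+
+// True once the preview has existed for at least PREVIEW_TTL_DAYS.
+function isPreviewExpired(createdAt: Date, now: Date = new Date()): boolean {
+  return now.getTime() - createdAt.getTime() >= PREVIEW_TTL_DAYS * MS_PER_DAY
+}
+```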
+
+---
+
+## Key Environment Variables
+
+**Backend** (`.env` file):
+```bash
+# Database (MariaDB in K8s)
+DB_HOST=mariadb.control-plane.svc.cluster.local
+DB_USER=aiworker
+DB_PASSWORD=AiWorker2026_UserPass!
+DB_NAME=aiworker
+
+# Redis
+REDIS_HOST=redis.control-plane.svc.cluster.local
+
+# Gitea
+GITEA_URL=https://git.fuq.tv
+GITEA_TOKEN=159a5de2a16d15f33e388b55b1276e431dbca3f3
+
+# Kubernetes
+K8S_IN_CLUSTER=false  # true when running in K8s
+K8S_CONFIG_PATH=~/.kube/aiworker-config
+```
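+
+A startup check for these variables might look like the sketch below. The `BackendConfig` shape and the `loadConfig` and `required` names are assumptions for illustration, not the backend's actual config module; which variables are mandatory is also assumed.
+
+```typescript
+// Hypothetical config loader: variable names come from the .env example above,
+// the structure and the required/optional split are assumptions.
+interface BackendConfig {
+  db: { host: string; user: string; password: string; name: string }
+  redisHost: string
+  gitea: { url: string; token: string }
+  k8s: { inCluster: boolean; configPath: string | undefined }
+}
+
+type Env = Record<string, string | undefined>
+
+// Returns the variable's value or throws if it is missing or empty.
+function required(env: Env, key: string): string {
+  const value = env[key]
+  if (!value) throw new Error(`Missing required environment variable: ${key}`)
+  return value
+}
+
+function loadConfig(env: Env): BackendConfig {
+  return {
+    db: {
+      host: required(env, 'DB_HOST'),
+      user: required(env, 'DB_USER'),
+      password: required(env, 'DB_PASSWORD'),
+      name: required(env, 'DB_NAME'),
+    },
+    redisHost: required(env, 'REDIS_HOST'),
+    gitea: {
+      url: required(env, 'GITEA_URL'),
+      token: required(env, 'GITEA_TOKEN'),
+    },
+    k8s: {
+      // Only the literal string "true" enables in-cluster mode.
+      inCluster: env['K8S_IN_CLUSTER'] === 'true',
+      configPath: env['K8S_CONFIG_PATH'],
+    },
+  }
+}
+```
+
+Because Bun auto-loads `.env`, calling `loadConfig(process.env)` once at startup would fail fast on a missing variable.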
+
+**Local development**: Port-forward services from K8s
+```bash
+kubectl port-forward -n control-plane svc/mariadb 3306:3306 &
+kubectl port-forward -n control-plane svc/redis 6379:6379 &
+```
+
+---
+
+## Important Constraints
+
+### Storage HA Strategy
+All stateful data uses Longhorn with **3 replicas** for high availability:
+- MariaDB PVC: 20Gi replicated across 3 workers
+- Gitea PVC: 50Gi replicated across 3 workers
+- Can tolerate 2 worker node failures without data loss
+
+### DNS and Domains
+All services use `*.fuq.tv` with DNS round-robin pointing to 2 load balancers:
+- `api.fuq.tv` → Backend API
+- `app.fuq.tv` → Frontend dashboard
+- `git.fuq.tv` → Gitea
+- `*.r.fuq.tv` → Preview environments (e.g., `task-abc.r.fuq.tv`)
+
+Load balancers (108.165.47.221, 108.165.47.203) run HAProxy balancing to worker NodePorts.
+
+### Namespace Organization
+- `control-plane`: Backend API, MariaDB, Redis
+- `agents`: Claude Code agent pods
+- `gitea`: Git server
+- `gitea-actions`: CI/CD runner with Docker-in-Docker
+- `preview-*`: Temporary namespaces for preview deployments
+
+---
+
+## Documentation Structure
+
+Extensive documentation in `/docs` (40+ files):
+- **Start here**: `ROADMAP.md`, `NEXT-SESSION.md`, `QUICK-REFERENCE.md`
+- **Infrastructure**: `CLUSTER-READY.md`, `AGENT-GUIDE.md`, `TROUBLESHOOTING.md`
+- **Gitea**: `GITEA-GUIDE.md` - Complete guide for Git, Registry, API, CI/CD, and webhooks
+- **Detailed**: `docs/01-arquitectura/` through `docs/06-deployment/`
+
+**For AI agent operations**: Read `AGENT-GUIDE.md` - contains all kubectl commands and workflows needed to manage the cluster autonomously.
+
+**For Gitea operations**: Read `GITEA-GUIDE.md` - complete API usage, registry, tokens, webhooks, and CI/CD setup.
+
+**For credentials**: See `CLUSTER-CREDENTIALS.md` (not in git, local only)
+
+---
+
+## Next Development Steps
+
+Current phase: **Backend API implementation** (see `NEXT-SESSION.md` for detailed checklist)
+
+Priority order:
+1. Verify that the CI/CD build succeeded and the image is in the registry
+2. Implement REST API routes (`/api/projects`, `/api/tasks`, `/api/agents`)
+3. Implement MCP Server (port 3100) for agent communication
+4. Integrate Gitea API client (repos, PRs, webhooks)
+5. Integrate Kubernetes client (create namespaces, deployments, ingress)
+6. Deploy backend to K8s at `api.fuq.tv`
+
+Frontend and agents come after the backend is functional.
+
+---
+
+## External References
+
+- **Lucia Auth** (for React frontend): https://github.com/lucia-auth/lucia
+- **Vercel Agent Skills** (for React frontend): https://github.com/vercel-labs/agent-skills
+- **Gitea API**: https://git.fuq.tv/api/swagger
+- **MCP SDK**: `@modelcontextprotocol/sdk` documentation
+
+---
+
+## Deployment Flow
+
+### Backend Deployment
+```
+Code change → Git push → Gitea Actions → Docker build → Push to git.fuq.tv → ArgoCD sync → K8s deploy
+```
+
+### Agent Deployment
+```
+Backend creates pod → Agent starts → Registers via MCP → Polls for tasks → Works autonomously → Reports back
+```
+
+### Preview Deployment
+```
+Agent completes task → Create PR → Trigger preview → K8s namespace created → Deploy at task-{id}.r.fuq.tv → User tests
+```
+
+---
+
+**Read `NEXT-SESSION.md` for detailed next steps. All credentials and cluster access info in `QUICK-REFERENCE.md`.**