# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
OTS Signs Orchestrator — a .NET 9.0 system for provisioning and managing Xibo CMS instances on Docker Swarm. Two projects in one solution:
- **OTSSignsOrchestrator** — ASP.NET Core API + React web UI + SignalR + Quartz scheduler. PostgreSQL 16. Contains all services, models, configuration, and business logic.
- **OTSSignsOrchestrator.Tests** — xUnit test project.
External integrations: Xibo CMS API (OAuth2), Authentik (SAML IdP), Bitwarden Secrets, Docker Swarm (SSH), Git (LibGit2Sharp), MySQL 8.4, Stripe, SendGrid, NFS volumes.
## Build & Run Commands
```bash
# Build entire solution
dotnet build
# Run application
dotnet run --project OTSSignsOrchestrator/OTSSignsOrchestrator.csproj
# Run tests
dotnet test OTSSignsOrchestrator.Tests/OTSSignsOrchestrator.Tests.csproj
# Run a single test
dotnet test OTSSignsOrchestrator.Tests --filter "FullyQualifiedName~TestClassName.TestMethodName"
# Frontend dev server (from ClientApp/)
cd OTSSignsOrchestrator/ClientApp && npm run dev
# Build frontend for production (outputs to wwwroot/)
cd OTSSignsOrchestrator/ClientApp && npm run build
# EF Core migrations
dotnet ef migrations add <Name> --project OTSSignsOrchestrator --startup-project OTSSignsOrchestrator
# Local dev PostgreSQL
docker compose -f docker-compose.dev.yml up -d
```
## Architecture
The application uses a job-queue architecture with a React web UI:
1. **React Web UI** (`ClientApp/`) — Vite + React + TypeScript + Tailwind CSS, served from `wwwroot/`. Cookie-based JWT auth (httpOnly, Secure, SameSite=Strict).
2. **REST API** (`Api/`) — Minimal API endpoint groups via `.MapGroup()` with JWT auth
3. **SignalR Hub** (`Hubs/FleetHub.cs`) — Real-time updates to web UI clients
4. **ProvisioningWorker** (`Workers/`) — Background service that polls `Jobs` table, claims jobs, resolves the correct `IProvisioningPipeline`, and executes steps
5. **Pipelines** — Each job type has a pipeline (Phase1, Phase2, BYOI SAML, Suspend, Reactivate, Decommission, etc.). Steps emit `JobStep` records broadcast via SignalR
6. **HealthCheckEngine** (`Health/`) — Background service running 16 health check types
7. **Quartz Jobs** (`Jobs/`) — Scheduled tasks (cert expiry, daily snapshots, reports)
8. **Stripe Webhooks** (`Webhooks/`) — Idempotent webhook processing
**Data flow:** Web UI creates `Job` → `ProvisioningWorker` claims it → pipeline runs steps → `JobStep` records broadcast via SignalR → Web UI updates in real-time.
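The data flow above can be pictured with a minimal in-memory sketch. Every type name here (`JobSketch`, `JobStepSketch`, `IPipelineSketch`, `WorkerSketch`) is hypothetical and only illustrates the claim → resolve → execute → broadcast shape; the repository's `ProvisioningWorker` and `IProvisioningPipeline` may look quite different. A `Queue` stands in for the `Jobs` table and a delegate stands in for the SignalR broadcast.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-ins for the Job and JobStep entities.
public sealed record JobSketch(Guid Id, string JobType);
public sealed record JobStepSketch(Guid JobId, string Name, string Status);

// Hypothetical pipeline contract: one pipeline per job type.
public interface IPipelineSketch
{
    string HandlesJobType { get; }
    Task RunAsync(JobSketch job, Func<JobStepSketch, Task> emitStep);
}

public sealed class WorkerSketch
{
    private readonly Queue<JobSketch> _jobs;                 // stands in for the Jobs table
    private readonly IReadOnlyList<IPipelineSketch> _pipelines;
    private readonly Func<JobStepSketch, Task> _broadcast;   // stands in for the SignalR hub

    public WorkerSketch(
        Queue<JobSketch> jobs,
        IReadOnlyList<IPipelineSketch> pipelines,
        Func<JobStepSketch, Task> broadcast)
    {
        _jobs = jobs;
        _pipelines = pipelines;
        _broadcast = broadcast;
    }

    // One polling iteration. Returns false when no job was waiting.
    public async Task<bool> PollOnceAsync()
    {
        // "Claim" the next job; a real worker would do this atomically in the database.
        if (!_jobs.TryDequeue(out var job)) return false;

        // Resolve the pipeline registered for this job type.
        var pipeline = _pipelines.FirstOrDefault(p => p.HandlesJobType == job.JobType)
            ?? throw new InvalidOperationException($"No pipeline for job type '{job.JobType}'.");

        // The pipeline reports each step through the broadcast delegate.
        await pipeline.RunAsync(job, _broadcast);
        return true;
    }
}
```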
## Project Structure
```
OTSSignsOrchestrator/
├── Api/ # Minimal API endpoint groups
├── Auth/ # JWT/auth services
├── ClientApp/ # React + Vite frontend
├── Clients/ # External API clients (Xibo, Authentik)
├── Configuration/ # AppConstants, AppOptions
├── Data/ # OrchestratorDbContext
│ └── Entities/ # EF Core entity models
├── Health/ # Health check engine + checks
├── Hubs/ # SignalR hubs
├── Jobs/ # Quartz scheduled jobs
├── Models/DTOs/ # Data transfer objects
├── Reports/ # PDF report generation
├── Services/ # Business logic + integrations
├── Webhooks/ # Stripe webhook handler
├── Workers/ # Provisioning pipelines + worker
└── wwwroot/ # Built frontend assets
```
## Critical Rules
### Xibo API — non-negotiable
- `GET /api/application` is **BLOCKED** — only POST and DELETE exist
- Group endpoints are `/api/group`, never `/api/usergroup`
- Feature assignment is `POST /api/group/{id}/acl`, NOT `/features`
- **Always pass `length=200`** and use `GetAllPagesAsync()` — default pagination is 10 items, causing silent data truncation
- OAuth2 client secret returned **ONCE** on creation — capture immediately
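As an illustration of the pagination rule above, here is a minimal sketch of a page-collecting loop. It is not the repository's `GetAllPagesAsync()`, whose signature is not shown in this file. `CollectAllPagesAsync` and the `fetchPage` delegate are hypothetical, and the use of a `start` offset alongside `length` is an assumption about how the endpoint pages.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class XiboPagingSketch
{
    // Page size required by the rule above.
    public const int PageLength = 200;

    // fetchPage(start, length, ct) is a hypothetical delegate standing in for an
    // HTTP call such as GET /api/group?start={start}&length={length}.
    public static async Task<List<T>> CollectAllPagesAsync<T>(
        Func<int, int, CancellationToken, Task<IReadOnlyList<T>>> fetchPage,
        CancellationToken ct = default)
    {
        var all = new List<T>();
        for (var start = 0; ; start += PageLength)
        {
            var page = await fetchPage(start, PageLength, ct);
            all.AddRange(page);

            // A short (or empty) page means there is nothing left to fetch.
            if (page.Count < PageLength) return all;
        }
    }
}
```

Stopping on the first short page costs one extra request when the total is an exact multiple of 200, but it avoids depending on a total-count header.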
### Stripe webhooks — idempotency mandatory
- Check `StripeEvents` table for `stripe_event_id` before processing
- Insert the `StripeEvent` row first, then process
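A minimal sketch of the insert-first pattern, assuming the `StripeEvents` table has a unique constraint on `stripe_event_id`. `IStripeEventStore`, `InMemoryStripeEventStore`, and `StripeWebhookSketch` are hypothetical names for illustration. The check and the insert are folded into one `TryInsertAsync` call so that two concurrent deliveries of the same event cannot both pass the check.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical store standing in for the StripeEvents table.
public interface IStripeEventStore
{
    // Returns false when a row with this event id already exists.
    Task<bool> TryInsertAsync(string stripeEventId);
}

// In-memory stand-in so the sketch runs without a database.
public sealed class InMemoryStripeEventStore : IStripeEventStore
{
    private readonly HashSet<string> _seen = new();

    public Task<bool> TryInsertAsync(string stripeEventId)
    {
        // HashSet.Add returns false for duplicates, like a unique-constraint violation.
        lock (_seen) return Task.FromResult(_seen.Add(stripeEventId));
    }
}

public static class StripeWebhookSketch
{
    // Returns true when the event was processed, false when it was a duplicate.
    public static async Task<bool> HandleAsync(
        IStripeEventStore store, string stripeEventId, Func<Task> process)
    {
        // Record the event first; skip processing if it was already recorded.
        if (!await store.TryInsertAsync(stripeEventId)) return false;

        await process();
        return true;
    }
}
```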
### No AI autonomy in infrastructure actions
- All infrastructure actions must flow through the `ProvisioningWorker` job queue via an operator-initiated `Job` record
### Immutability
`AuditLog`, `Message`, and `Evidence` are append-only. Never generate Update/Delete methods on their repositories.
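One way to make the append-only rule hard to violate is a repository contract that has no update or delete members at all. The sketch below is an assumption about shape, not the repository's actual interfaces; `IAppendOnlyRepository<T>` and its in-memory implementation are hypothetical.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical contract: add and read only, so callers cannot update or
// delete rows through it.
public interface IAppendOnlyRepository<T> where T : class
{
    Task AddAsync(T entry, CancellationToken ct = default);
    Task<IReadOnlyList<T>> ListAsync(CancellationToken ct = default);
}

// In-memory stand-in so the sketch runs without a database.
public sealed class InMemoryAppendOnlyRepository<T> : IAppendOnlyRepository<T> where T : class
{
    private readonly List<T> _rows = new();

    public Task AddAsync(T entry, CancellationToken ct = default)
    {
        lock (_rows) _rows.Add(entry);
        return Task.CompletedTask;
    }

    public Task<IReadOnlyList<T>> ListAsync(CancellationToken ct = default)
    {
        // Return a copy so callers cannot mutate the stored rows.
        lock (_rows) return Task.FromResult<IReadOnlyList<T>>(_rows.ToArray());
    }
}
```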
### Credential handling
Never store secrets in the database. Secrets go to Bitwarden only. `OauthAppRegistry` stores `clientId` only.
## Naming Conventions
- Customer abbreviation: exactly 3 lowercase letters (`^[a-z]{3}$`)
- Stack name: `{abbrev}-cms-stack`, Service: `{abbrev}-web`, DB: `{abbrev}_cms_db`
- Secret names via `AppConstants` helpers
- `AppConstants.SanitizeName()` filters to `[a-z0-9_-]`
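The conventions above can be sketched as follows. `NamingSketch` and its members are hypothetical; the real helpers live in `AppConstants` and may differ in name and behavior. In particular, this sketch only drops disallowed characters and does not lowercase input first, which is an assumption.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class NamingSketch
{
    private static readonly Regex AbbrevPattern =
        new(@"^[a-z]{3}$", RegexOptions.CultureInvariant);

    // In .NET, `$` also matches before a trailing newline, so the length is
    // checked as well to reject input such as "abc\n".
    public static bool IsValidAbbrev(string? abbrev) =>
        abbrev is { Length: 3 } && AbbrevPattern.IsMatch(abbrev);

    public static string StackName(string abbrev) => $"{Require(abbrev)}-cms-stack";
    public static string ServiceName(string abbrev) => $"{Require(abbrev)}-web";
    public static string DatabaseName(string abbrev) => $"{Require(abbrev)}_cms_db";

    // Keeps only characters in [a-z0-9_-]; everything else is dropped.
    public static string SanitizeName(string input) =>
        new string(input.Where(IsAllowed).ToArray());

    private static bool IsAllowed(char c) =>
        c is (>= 'a' and <= 'z') or (>= '0' and <= '9') or '_' or '-';

    private static string Require(string abbrev) =>
        IsValidAbbrev(abbrev)
            ? abbrev
            : throw new ArgumentException("Abbreviation must be exactly 3 lowercase letters.", nameof(abbrev));
}
```

For example, `NamingSketch.StackName("abc")` yields `abc-cms-stack`, while `NamingSketch.StackName("ABC")` throws an `ArgumentException`.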
## Testing Requirements
- Integration tests **require** Testcontainers with real PostgreSQL 16 — no SQLite substitutions
- Unit tests required for: evidence hashing, AI context assembly, pattern detection, abbreviation uniqueness, Stripe idempotency
## Pitfalls
- **SSH host state**: `SshDockerCliService.SetHost()` must be called before each host operation in loops
- **Bitwarden cache**: Call `FlushCacheAsync()` after creating secrets before reading them back
- **No saga/rollback**: Partial failures across Git → MySQL → Docker → Xibo leave orphaned resources; cleanup is manual
- **Docker volumes are sticky**: Failed deploys leave volumes with old NFS driver options
- **Template CIFS→NFS compat**: Old `{{CIFS_*}}` tokens still render correctly as NFS equivalents
## Code Generation Checklist
- After generating a class implementing an interface, verify all members are implemented
- After generating a pipeline, verify all steps produce `JobStep` entities with progress broadcast via `IHubContext<FleetHub>`
- Do not stub steps as TODO — implement fully or flag explicitly