Hostityourself/docs/plan.md
Claude 48b9ccf152
feat: M4 Hardening — encryption, resource limits, monitoring, backups
## Env var encryption at rest (AES-256-GCM)
- server/src/crypto.rs: new module — encrypt/decrypt with AES-256-GCM
  Key = SHA-256(HIY_SECRET_KEY); non-prefixed values pass through
  transparently for zero-downtime migration
- Cargo.toml: aes-gcm = "0.10"
- routes/envvars.rs: encrypt on SET; list returns masked values (••••)
- routes/databases.rs: pg_password and DATABASE_URL stored encrypted
- routes/ui.rs: decrypt pg_password when rendering DB card
- builder.rs: decrypt env vars when writing the .env file for containers
- .env.example: add HIY_SECRET_KEY entry

## Per-app resource limits
- apps table: memory_limit (default 512m) + cpu_limit (default 0.5)
  added via idempotent ALTER TABLE in db.rs migration
- models.rs: App, CreateApp, UpdateApp gain memory_limit + cpu_limit
- routes/apps.rs: persist limits on create, update via PUT
- builder.rs: pass MEMORY_LIMIT + CPU_LIMIT to build script
- builder/build.sh: use $MEMORY_LIMIT / $CPU_LIMIT in podman run
  (replaces hardcoded --cpus="0.5"; --memory now also set)

## Monitoring (opt-in compose profile)
- infra/docker-compose.yml: gatus + netdata under `monitoring` profile
  Enable: podman compose --profile monitoring up -d
  Gatus on :8080, Netdata on :19999
- infra/gatus.yml: Gatus config checking HIY /api/status every minute

## Backup cron job
- infra/backup.sh: dumps SQLite, copies env files + git repos into a
  dated .tar.gz; optional rclone upload; 30-day local retention
  Suggested cron: 0 3 * * * /path/to/infra/backup.sh

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-24 15:06:42 +00:00

# HostItYourself — Heroku Clone MVP Plan

A self-hosted PaaS running on a Raspberry Pi that auto-deploys apps from GitHub pushes.

---
## Goals
- Single operator, multiple apps (target: 5-15 small services)
- `git push` → live in under 2 minutes
- Subdomain routing with automatic HTTPS
- Central dashboard for logs, env vars, status
- Low idle resource footprint (Raspberry Pi 4, 4-8 GB RAM)
---
## Architecture Overview
```
                 Internet
                     │
              Cloudflare DNS
                     │
              ┌──────▼──────┐
              │    Caddy    │  ← Reverse proxy + auto TLS
              │ (port 443)  │
              └──────┬──────┘
                     │ routes by subdomain
      ┌──────────────┼──────────────┐
      │              │              │
┌─────▼──┐     ┌─────▼──┐     ┌─────▼──┐
│ App A  │     │ App B  │     │Control │
│(Docker)│     │(Docker)│     │ Plane  │
└────────┘     └────────┘     └───┬────┘
                                  │
                         ┌────────▼────────┐
                         │  Build Engine   │
                         │ (clone→build→   │
                         │  run container) │
                         └────────┬────────┘
                                  │
                         ┌────────▼────────┐
                         │ GitHub Webhook  │
                         │    Listener     │
                         └─────────────────┘
```
---
## Components
### 1. Reverse Proxy — Caddy
- Automatic TLS via Let's Encrypt (ACME DNS challenge through Cloudflare)
- Dynamic config: each deployed app gets a `<appname>.yourdomain.com` subdomain
- Caddy reloads config via API on each deploy; no restart needed
- Runs as a systemd service, not inside Docker (avoids bootstrapping issues)
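
As a rough sketch of what a per-deploy config update could carry, the snippet below builds a single route in Caddy's JSON config format: a host matcher plus a `reverse_proxy` handler. The function name `caddy_route`, the `hiy-<appname>` upstream address, and the port are assumptions for illustration; the plan does not specify them.

```python
import json


def caddy_route(app_name: str, domain: str, port: int) -> dict:
    """Build one Caddy JSON route: <app_name>.<domain> -> hiy-<app_name>:<port>."""
    return {
        "match": [{"host": [f"{app_name}.{domain}"]}],
        "handle": [
            {
                "handler": "reverse_proxy",
                # Assumed upstream address; it must be reachable from Caddy.
                "upstreams": [{"dial": f"hiy-{app_name}:{port}"}],
            }
        ],
    }


route = caddy_route("my-api", "yourdomain.com", 3000)
print(json.dumps(route, indent=2))
```

A payload like this could be sent to Caddy's admin API (which listens on `localhost:2019` by default) so the running proxy picks up the new app without a restart. Because Caddy runs on the host rather than on `hiy-net`, the real upstream address may need to be a container IP instead of a container name.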
### 2. Control Plane — `hiy-server`
A small Go or Node.js HTTP API + web UI. Responsibilities:
- CRUD for apps (name, GitHub repo URL, branch, env vars, port)
- Trigger manual deploys
- Proxy log streams from running containers
- Store state in a local SQLite database (`~/.hiy/db.sqlite`)
- Exposes a REST API consumed by the dashboard and CLI

Dashboard pages:
- App list with status badges (running / building / stopped / crashed)
- Per-app: logs, env vars editor, deploy history, resource sparklines
- System overview: CPU, RAM, disk on the Pi
### 3. GitHub Webhook Listener
- Receives `push` events from GitHub (configured per-repo in GitHub settings)
- Validates HMAC signature (`X-Hub-Signature-256`)
- Filters on configured branch (default: `main`)
- Enqueues a deploy job; responds 200 immediately
- Can also be triggered via the dashboard or CLI
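
The signature and branch checks above can be sketched as follows. GitHub sends the header as `sha256=<hex digest>` computed over the raw request body; the secret value and the helper names `signature_is_valid` and `should_deploy` are hypothetical.

```python
import hashlib
import hmac


def signature_is_valid(secret: bytes, body: bytes, header: str) -> bool:
    """Check an X-Hub-Signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest takes constant time, which avoids timing side channels.
    return hmac.compare_digest(expected.encode(), header.encode())


def should_deploy(ref: str, branch: str = "main") -> bool:
    """Push events carry the ref as 'refs/heads/<branch>'."""
    return ref == f"refs/heads/{branch}"


secret = b"example-webhook-secret"  # hypothetical value
body = b'{"ref": "refs/heads/main"}'
good_header = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()

valid = signature_is_valid(secret, body, good_header)
forged = signature_is_valid(secret, body, "sha256=00")
on_branch = should_deploy("refs/heads/main")
off_branch = should_deploy("refs/heads/feature-x")
print(valid, forged, on_branch, off_branch)
```

The check has to run on the exact bytes received, before any JSON parsing, because re-serialising the payload can change the digest.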
### 4. Build Engine — `hiy-builder`
Sequential steps run in a build container:
```
1. git clone --depth=1 --branch <branch> <repo> /build/<appname>/<sha>
2. Detect build strategy:
   a. Dockerfile present              → docker build
   b. Procfile + recognised language  → buildpack (Cloud Native Buildpacks via `pack`)
   c. static/ directory               → Caddy file server image
3. docker build -t hiy/<appname>:<sha> .
4. docker stop hiy-<appname> && docker rm hiy-<appname>   (if it exists; frees the name for step 5)
5. docker run -d --name hiy-<appname> \
     --env-file /etc/hiy/<appname>.env \
     --restart unless-stopped \
     --network hiy-net \
     --memory <memory_limit> --cpus <cpu_limit> \
     hiy/<appname>:<sha>
6. Update Caddy upstream to new container
7. Prune old images (keep last 2)
```
Build logs are streamed to the control plane and stored per-deploy.
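
The strategy detection in step 2 could look roughly like this. It is a simplified sketch: it treats any `Procfile` as a buildpack candidate and does not attempt the "recognised language" check, and the returned labels are hypothetical names.

```python
import tempfile
from pathlib import Path


def detect_strategy(repo_dir: str) -> str:
    """Return 'dockerfile', 'buildpack', 'static', or 'unknown' for a checkout."""
    root = Path(repo_dir)
    if (root / "Dockerfile").is_file():
        return "dockerfile"
    if (root / "Procfile").is_file():
        # Simplification: the language check from step 2b is omitted here.
        return "buildpack"
    if (root / "static").is_dir():
        return "static"
    return "unknown"


# Demo: add files one at a time and watch the detected strategy change.
results = []
with tempfile.TemporaryDirectory() as checkout:
    results.append(detect_strategy(checkout))
    (Path(checkout) / "static").mkdir()
    results.append(detect_strategy(checkout))
    (Path(checkout) / "Procfile").touch()
    results.append(detect_strategy(checkout))
    (Path(checkout) / "Dockerfile").touch()
    results.append(detect_strategy(checkout))
print(results)
```

The checks run in priority order, so a repository that ships both a `Dockerfile` and a `static/` directory is built from the Dockerfile.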
### 5. App Runtime — Docker
- Each app runs in its own container on the `hiy-net` bridge network
- Containers are **not** port-exposed to host; only Caddy reaches them
- Resource limits set at start: default 512 MB RAM, 0.5 CPU (configurable per app)
- `--restart unless-stopped` handles crash recovery
- Persistent data: apps that need storage mount named volumes (`hiy-<appname>-data`)
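
A minimal sketch of how the runtime rules above might translate into a `docker run` invocation, assuming the per-app limits are stored as strings such as `512m` and `0.5`. The helper name `docker_run_args` is hypothetical; the flags themselves are standard `docker run` options.

```python
import shlex


def docker_run_args(app: str, image: str, memory: str = "512m", cpus: str = "0.5") -> list:
    """Assemble the argv for starting one app container with resource limits."""
    return [
        "docker", "run", "-d",
        "--name", f"hiy-{app}",
        "--env-file", f"/etc/hiy/{app}.env",
        "--restart", "unless-stopped",
        "--network", "hiy-net",   # no -p/--publish: only Caddy reaches the app
        "--memory", memory,
        "--cpus", cpus,
        image,
    ]


args = docker_run_args("my-api", "hiy/my-api:abc1234")
print(shlex.join(args))
```

Building the command as a list and passing it to `subprocess.run` without a shell keeps app names and image tags from being interpreted by the shell.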
### 6. CLI — `hiy`
Thin shell script or Go binary for operator convenience:
```bash
hiy apps # list apps
hiy create myapp --repo <url> # register new app
hiy deploy myapp # trigger manual deploy
hiy logs myapp -f # tail logs
hiy env:set myapp KEY=value # set env var
hiy env:get myapp # list env vars
hiy restart myapp
hiy destroy myapp
```
---
## Data Model (SQLite)
```sql
CREATE TABLE apps (
  id          TEXT PRIMARY KEY,          -- slug e.g. "my-api"
  repo_url    TEXT NOT NULL,
  branch      TEXT DEFAULT 'main',
  port        INTEGER NOT NULL,          -- internal container port
  created_at  DATETIME,
  updated_at  DATETIME
);

CREATE TABLE deploys (
  id            TEXT PRIMARY KEY,        -- uuid
  app_id        TEXT REFERENCES apps(id),
  sha           TEXT,                    -- git commit sha
  status        TEXT,                    -- queued|building|success|failed
  log           TEXT,                    -- full build log
  triggered_by  TEXT,                    -- webhook|manual|cli
  started_at    DATETIME,
  finished_at   DATETIME
);

CREATE TABLE env_vars (
  app_id  TEXT REFERENCES apps(id),
  key     TEXT,
  value   TEXT,                          -- encrypted at rest (AES-256-GCM, see M4)
  PRIMARY KEY (app_id, key)
);
```
---
## Generic Infrastructure (Non-Functional Requirements)
### Security
| Layer | Mechanism |
|---|---|
| SSH access | Key-only, disable password auth, non-standard port |
| Firewall | `ufw`: allow only the SSH port (22, or the non-standard port chosen above), 80, and 443 inbound |
| Fail2ban | Bans IPs with repeated SSH/HTTP failures |
| Webhook auth | HMAC-SHA256 signature verification |
| Env var encryption | Encrypted at rest with AES-256-GCM (key derived from `HIY_SECRET_KEY`); decrypted into container env at deploy time |
| Container isolation | No `--privileged`, no host network, drop capabilities |
| Dashboard auth | HTTP Basic Auth behind Caddy (or simple JWT session) |
| Secrets never in logs | Build logs redact env var values |
| OS updates | `unattended-upgrades` for security patches |
### Monitoring
- **Metrics**: [Netdata](https://github.com/netdata/netdata) (single binary, ~50 MB RAM) — CPU, RAM, disk, network, per-container stats
- **Uptime checks**: [Gatus](https://github.com/TwiN/gatus) — HTTP health checks per app, alerts via email/Telegram
- **Alerting thresholds**: disk > 80%, RAM > 85%, any app down > 2 min
- **Dashboard link**: Netdata and Gatus UIs accessible via subdomains on the Pi
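
The alerting thresholds amount to three simple comparisons. The sketch below only illustrates the rules; in practice they would live in Netdata and Gatus configuration, and the function name and message wording are assumptions.

```python
def check_thresholds(disk_pct: float, ram_pct: float, down_seconds: dict) -> list:
    """Return alert messages for: disk > 80%, RAM > 85%, app down > 2 min."""
    alerts = []
    if disk_pct > 80:
        alerts.append(f"disk usage {disk_pct:.0f}% > 80%")
    if ram_pct > 85:
        alerts.append(f"RAM usage {ram_pct:.0f}% > 85%")
    for app, seconds in sorted(down_seconds.items()):
        if seconds > 120:
            alerts.append(f"{app} down for {seconds}s (> 2 min)")
    return alerts


alerts = check_thresholds(91.0, 40.0, {"my-api": 300, "blog": 30})
print(alerts)
```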
### Logging
- Container stdout/stderr captured by Docker logging driver
- `hiy logs <app>` tails `docker logs`
- Control plane stores last 10,000 lines per app in SQLite (ring buffer)
- Optional: ship to a free Loki/Grafana Cloud tier for persistence
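
One way to get ring-buffer behaviour out of SQLite is to trim on every insert. This is a sketch under assumptions: the `app_logs` table and its columns are hypothetical and do not appear in the data model above, and the demo uses a limit of 3 instead of 10,000 to keep the output short.

```python
import sqlite3

MAX_LINES = 10_000


def append_log(conn, app_id: str, line: str, max_lines: int = MAX_LINES) -> None:
    """Insert one log line, then drop everything older than the newest max_lines."""
    conn.execute("INSERT INTO app_logs (app_id, line) VALUES (?, ?)", (app_id, line))
    conn.execute(
        """
        DELETE FROM app_logs
        WHERE app_id = ?
          AND id NOT IN (
              SELECT id FROM app_logs
              WHERE app_id = ?
              ORDER BY id DESC
              LIMIT ?
          )
        """,
        (app_id, app_id, max_lines),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE app_logs ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, app_id TEXT NOT NULL, line TEXT NOT NULL)"
)
for i in range(5):
    append_log(conn, "my-api", f"line {i}", max_lines=3)

kept = [
    row[0]
    for row in conn.execute(
        "SELECT line FROM app_logs WHERE app_id = ? ORDER BY id", ("my-api",)
    )
]
print(kept)  # ['line 2', 'line 3', 'line 4']
```

Trimming on every insert is simple but costs an extra query per line; batching the delete (for example every few hundred lines) would be a reasonable trade-off on a Pi.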
### Backups
Daily cron job:
1. `sqlite3 ~/.hiy/db.sqlite .dump > backup.sql`
2. Tar all env var files from `/etc/hiy/`
3. Copy to an attached USB drive or remote (rclone to S3/Backblaze B2)
4. Retain 30 days of backups
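
The 30-day retention in step 4 could be enforced with a small pruning pass like the one below. The archive naming pattern `hiy-backup-*.tar.gz` and the function name are assumptions made for this sketch.

```python
import os
import tempfile
import time
from pathlib import Path


def prune_backups(backup_dir: str, keep_days: int = 30, now: float = None) -> list:
    """Delete backup archives whose mtime is older than keep_days; return their names."""
    now = time.time() if now is None else now
    cutoff = now - keep_days * 86400
    removed = []
    for path in sorted(Path(backup_dir).glob("hiy-backup-*.tar.gz")):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed


# Demo: one archive aged 40 days, one fresh.
with tempfile.TemporaryDirectory() as backup_dir:
    old = Path(backup_dir) / "hiy-backup-old.tar.gz"
    new = Path(backup_dir) / "hiy-backup-new.tar.gz"
    old.touch()
    new.touch()
    forty_days_ago = time.time() - 40 * 86400
    os.utime(old, (forty_days_ago, forty_days_ago))
    removed = prune_backups(backup_dir)
    remaining = sorted(p.name for p in Path(backup_dir).iterdir())
print(removed, remaining)
```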
### Deployment of the Platform Itself
The containerised parts of the platform (hiy-server, hiy-builder) are managed via a single `docker-compose.yml` at `~/hiy-platform/`. Upgrade process:
```bash
git pull origin main
docker compose up -d --build
```
Caddy runs as a systemd service outside Compose (avoids chicken-and-egg on port 443).

---
## Repository Layout
```
hostityourself/
├── plan.md ← this file
├── server/ ← control plane API + dashboard
│ ├── main.go (or index.ts)
│ ├── db/
│ ├── routes/
│ └── ui/ ← simple HTML/HTMX dashboard
├── builder/ ← build engine scripts
│ ├── build.sh
│ └── detect-strategy.sh
├── cli/ ← hiy CLI tool
│ └── hiy.sh
├── proxy/
│ └── Caddyfile.template
├── infra/
│ ├── docker-compose.yml ← platform compose file
│ ├── ufw-setup.sh
│ ├── fail2ban/
│ └── backup.sh
└── docs/
├── setup.md ← Pi OS bootstrap guide
└── add-app.md
```
---
## MVP Milestones
### M1 — Foundation (Week 1-2)
- [ ] Pi OS setup: Raspberry Pi OS Lite, Docker, Caddy, ufw, fail2ban
- [ ] `hiy-server` skeleton: SQLite, REST endpoints for apps + deploys
- [ ] Manual deploy via CLI: clone → build → run container → Caddy config reload
### M2 — Auto-Deploy (Week 3)
- [ ] GitHub webhook listener integrated into server
- [ ] HMAC validation
- [ ] Build queue (single worker, sequential builds)
- [ ] Build log streaming to dashboard
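
A single-worker build queue can be sketched with the standard library alone. The `build` callable stands in for the real clone/build/run pipeline, and the job names and status strings are illustrative.

```python
import queue
import threading


def run_build_queue(jobs, build):
    """Run jobs one at a time on a single worker thread; return (job, status) pairs."""
    pending = queue.Queue()
    finished = []

    def worker():
        while True:
            job = pending.get()
            if job is None:          # sentinel: no more work
                pending.task_done()
                break
            try:
                build(job)
                finished.append((job, "success"))
            except Exception:
                finished.append((job, "failed"))
            finally:
                pending.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    for job in jobs:
        pending.put(job)
    pending.put(None)
    pending.join()
    thread.join()
    return finished


def fake_build(job):
    # Stand-in for the real build; one app fails to show error handling.
    if job == "app-b":
        raise RuntimeError("build failed")


statuses = run_build_queue(["app-a", "app-b", "app-c"], fake_build)
print(statuses)
```

With one worker, builds never compete for the Pi's CPU and RAM, and a failed build does not block the jobs queued behind it.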
### M3 — Dashboard (Week 4)
- [ ] App list + status
- [ ] Log viewer (last 500 lines, live tail via SSE)
- [ ] Env var editor
- [ ] Deploy history
### M4 — Hardening (Week 5)
- [x] Env var encryption at rest (AES-256-GCM via `HIY_SECRET_KEY`; transparent plaintext passthrough for migration)
- [x] Resource limits on containers (per-app `memory_limit` + `cpu_limit`; defaults 512m / 0.5 CPU)
- [x] Netdata + Gatus setup (`monitoring` compose profile; `infra/gatus.yml`)
- [x] Backup cron job (`infra/backup.sh` — SQLite dump + env files + git repos; local + rclone remote)
- [x] Dashboard auth (multi-user sessions, bcrypt, API keys — done in earlier milestone)
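
The key derivation and plaintext passthrough from the first M4 item can be illustrated without the cipher itself. This is a sketch, not the shipped code: the `enc:v1:` prefix is a hypothetical marker (the actual prefix is not stated here), and the AES-256-GCM step is replaced by a stand-in callable.

```python
import hashlib

ENC_PREFIX = "enc:v1:"  # hypothetical marker for encrypted values
MASK = "••••"


def derive_key(secret: str) -> bytes:
    """Key = SHA-256(HIY_SECRET_KEY): 32 bytes, the key size AES-256 expects."""
    return hashlib.sha256(secret.encode("utf-8")).digest()


def read_value(stored: str, decrypt) -> str:
    """Decrypt prefixed values; return legacy plaintext values unchanged."""
    if stored.startswith(ENC_PREFIX):
        return decrypt(stored[len(ENC_PREFIX):])
    return stored


def masked_listing(env: dict) -> dict:
    """What a list endpoint might return: keys visible, values masked."""
    return {name: MASK for name in env}


def fake_decrypt(payload: str) -> str:
    # Stand-in for the real AES-256-GCM decrypt, which is outside this sketch.
    return f"<decrypted {payload}>"


key = derive_key("example-secret")  # hypothetical secret
legacy = read_value("postgres://legacy", fake_decrypt)
migrated = read_value(ENC_PREFIX + "abc", fake_decrypt)
listing = masked_listing({"DATABASE_URL": legacy})
print(len(key), legacy, migrated, listing)
```

Because unprefixed values are returned as-is, rows written before encryption was enabled keep working and can be re-saved in encrypted form at any later point.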
### M5 — Polish (Week 6)
- [ ] Buildpack detection (Dockerfile / Node / Python / static)
- [ ] `hiy` CLI binary
- [ ] `docs/setup.md` end-to-end bootstrap guide
- [ ] Smoke test: deploy a real app end-to-end from a GitHub push
---
## Hardware Recommendation
| Component | Recommendation |
|---|---|
| Pi model | Raspberry Pi 4 or 5, 4 GB RAM minimum, 8 GB preferred |
| Storage | 64 GB+ A2-rated microSD **or** USB SSD (much faster builds) |
| Network | Wired Ethernet; configure static IP or DHCP reservation |
| DNS | Cloudflare — free, supports ACME DNS challenge for wildcard TLS |
| Domain | Any registrar; point `*.yourdomain.com` → home IP + Cloudflare proxy |
| Power | Official Pi USB-C PSU + UPS hat (PiJuice or Geekworm X120) |
---
## Key Design Decisions & Trade-offs
| Decision | Rationale |
|---|---|
| Docker over bare processes | Isolation, restart policy, easy image rollback |
| Caddy over Nginx/Traefik | Automatic HTTPS, simple API-driven config, single binary |
| SQLite over Postgres | Zero-ops, sufficient for one operator and dozens of apps |
| Sequential build queue | Avoids saturating the Pi CPU/RAM; builds rarely overlap for one person |
| Buildpacks as optional | Most personal projects have a Dockerfile; buildpacks add complexity for MVP |
| No Kubernetes | Massive overkill for a single Pi and one person |
| AES-256-GCM encryption for env vars (originally planned: `age`) | Simple, no daemon needed vs. Vault; key derived from `HIY_SECRET_KEY` |