50 commits

Author SHA1 Message Date
Claude
48b9ccf152
feat: M4 Hardening — encryption, resource limits, monitoring, backups
## Env var encryption at rest (AES-256-GCM)
- server/src/crypto.rs: new module — encrypt/decrypt with AES-256-GCM
  Key = SHA-256(HIY_SECRET_KEY); values without the encryption prefix
  are treated as plaintext and pass through decrypt unchanged, so
  existing rows keep working during migration
- Cargo.toml: aes-gcm = "0.10"
- routes/envvars.rs: encrypt on SET; list returns masked values (••••)
- routes/databases.rs: pg_password and DATABASE_URL stored encrypted
- routes/ui.rs: decrypt pg_password when rendering DB card
- builder.rs: decrypt env vars when writing the .env file for containers
- .env.example: add HIY_SECRET_KEY entry
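
A minimal shell sketch of the two ideas above: deriving a 32-byte key from the secret with SHA-256, and telling encrypted values apart from legacy plaintext by a prefix. It does not perform AES-256-GCM itself (that lives in server/src/crypto.rs). The prefix string `enc:` and all function names are assumptions made for illustration; the commit does not name the real marker.

```shell
# Hypothetical marker for encrypted values; the commit does not name the real one.
ENC_PREFIX="enc:"

# Derive a 32-byte key (printed as 64 hex characters) by hashing the secret.
derive_key_hex() {
    printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}

# Succeeds when the stored value carries the encrypted-value marker.
is_encrypted() {
    case "$1" in
        "$ENC_PREFIX"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Decide how a stored value should be handled when it is read back:
# marked values are decrypted, anything else is returned as-is.
classify_value() {
    if is_encrypted "$1"; then
        echo "decrypt"
    else
        echo "passthrough"
    fi
}
```

With this shape, rows written before encryption was enabled classify as "passthrough" and keep working, while rows written afterwards classify as "decrypt".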

## Per-app resource limits
- apps table: memory_limit (default 512m) + cpu_limit (default 0.5)
  added via idempotent ALTER TABLE in db.rs migration
- models.rs: App, CreateApp, UpdateApp gain memory_limit + cpu_limit
- routes/apps.rs: persist limits on create, update via PUT
- builder.rs: pass MEMORY_LIMIT + CPU_LIMIT to build script
- builder/build.sh: use $MEMORY_LIMIT / $CPU_LIMIT in podman run
  (replaces hardcoded --cpus="0.5"; --memory now also set)
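
As a rough illustration of how the build script might turn the two variables into `podman run` flags, here is a small sketch. The function name `limit_flags` is hypothetical; the defaults mirror the column defaults listed above.

```shell
# Print the resource-limit flags for `podman run`, falling back to the
# defaults named in this commit (512m memory, 0.5 CPUs) when the
# environment variables are unset or empty.
limit_flags() {
    printf '%s\n' "--memory=${MEMORY_LIMIT:-512m} --cpus=${CPU_LIMIT:-0.5}"
}
```

A call site could then look like `podman run -d $(limit_flags) --name hiy-demo demo-image`, where the container and image names are placeholders.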

## Monitoring (opt-in compose profile)
- infra/docker-compose.yml: gatus + netdata under `monitoring` profile
  Enable: podman compose --profile monitoring up -d
  Gatus on :8080, Netdata on :19999
- infra/gatus.yml: Gatus config checking HIY /api/status every minute

## Backup cron job
- infra/backup.sh: dumps SQLite, copies env files + git repos into a
  dated .tar.gz; optional rclone upload; 30-day local retention
  Suggested cron: 0 3 * * * /path/to/infra/backup.sh
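
The dated archive name and the 30-day retention could be sketched as follows. This is not the contents of infra/backup.sh; the `hiy-backup-` file name pattern and both function names are assumptions.

```shell
# Print a dated archive name such as hiy-backup-2026-03-24.tar.gz.
# The "hiy-backup-" prefix is a made-up naming scheme.
backup_name() {
    printf 'hiy-backup-%s.tar.gz\n' "$(date +%F)"
}

# Delete archives in directory $1 that are older than $2 days.
prune_backups() {
    find "$1" -type f -name '*.tar.gz' -mtime +"$2" -exec rm -f {} +
}
```

Calling `prune_backups /path/to/backups 30` at the end of each run keeps roughly the last 30 days of local archives.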

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-24 15:06:42 +00:00
Shautvast
2654a26b06
rust 1.94
2026-03-24 14:27:12 +01:00
Claude
f4aa6972e1
feat: shared Postgres with per-app schemas
One Postgres 16 instance runs in the infra stack (docker-compose).
Each app can be given its own isolated schema with a dedicated,
scoped Postgres user via the new Database card on the app detail page.

What was added:

infra/
  docker-compose.yml  — postgres:16-alpine service + hiy-pg-data
                        volume; POSTGRES_URL injected into server
  .env.example        — POSTGRES_PASSWORD entry

server/
  Cargo.toml          — sqlx postgres feature
  src/db.rs           — databases table (SQLite) migration
  src/models.rs       — Database model
  src/main.rs         — PgPool (lazy) added to AppState;
                        /api/apps/:id/database routes registered
  src/routes/mod.rs   — databases module
  src/routes/databases.rs — GET / POST / DELETE handlers:
      provision  — creates schema + scoped PG user, sets search_path,
                   injects DATABASE_URL env var
      deprovision — DROP OWNED BY + DROP ROLE + DROP SCHEMA CASCADE,
                   removes SQLite record
  src/routes/ui.rs    — app_detail queries databases table, renders
                        db_card based on provisioning state
  templates/app_detail.html — {{db_card}} placeholder +
                              provisionDb / deprovisionDb JS

Apps connect via:
  postgres://hiy-<app>:<pw>@postgres:5432/hiy
search_path is set on the role so no URL option is needed.
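
To make the provisioning step concrete, the sketch below prints the kind of SQL involved and the resulting connection URL. It is an approximation, not the code in routes/databases.rs: naming the schema after the role is an assumption, the function names are hypothetical, and the inputs are assumed to be validated already (no quotes or other special characters).

```shell
# Print the connection URL in the form shown above.
database_url() {
    printf 'postgres://hiy-%s:%s@postgres:5432/hiy\n' "$1" "$2"
}

# Print SQL that creates a login role and a schema owned by it, then pins
# the role's search_path to that schema so clients need no URL option.
# $1 = app name, $2 = password (both assumed to be pre-validated).
provision_sql() {
    role="hiy-$1"
    cat <<EOF
CREATE ROLE "$role" LOGIN PASSWORD '$2';
CREATE SCHEMA "$role" AUTHORIZATION "$role";
ALTER ROLE "$role" SET search_path = "$role";
EOF
}
```

Because the role name contains a hyphen, it has to be double-quoted as an identifier in every statement.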

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-24 13:16:39 +00:00
Claude
8267b30b15
fix: restore app reachability after platform restart
Two root causes:

1. Caddy was started without --resume, so every restart wiped all
   dynamically-registered app routes (only the base Caddyfile survived).
   Adding --resume makes Caddy reload its auto-saved config (stored in
   the caddy-config volume) which includes all app routes.

2. App routes used the container IP address, which changes whenever
   hiy-net is torn down and recreated by compose. Switch to the
   container name as the upstream dial address; Podman's aardvark-dns
   resolves it by name within hiy-net, so it stays valid across
   network recreations.

Together with the existing reconnect loop in start.sh these two
changes mean deployed apps survive a platform restart without needing
a redeploy.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-24 12:47:32 +00:00
Claude
a8e73df2c3
fix: reconnect app containers to hiy-net after platform restart
compose down destroys hiy-net and evicts running hiy-* containers
from it. compose up recreates the network but leaves those containers
disconnected, making them unreachable until a redeploy.

After compose up, reconnect all running hiy-* containers to hiy-net.
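
A sketch of what such a reconnect step could look like, assuming app containers are the ones whose names start with `hiy-`. The function name is hypothetical and the real loop in start.sh may filter differently.

```shell
# Reconnect every running container whose name starts with "hiy-" to hiy-net.
reconnect_apps() {
    podman ps --filter 'name=^hiy-' --format '{{.Names}}' |
    while read -r name; do
        # Ignore failures, for example when the container is already attached.
        podman network connect hiy-net "$name" || true
    done
}
```

Running `reconnect_apps` right after `podman compose up -d` would restore connectivity without a redeploy.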

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-24 12:42:26 +00:00
Claude
31944d128b
remove: unnecessary app restart loop from start.sh
Without podman system migrate, compose down/up only touches infra
containers. Deployed hiy-* containers are never stopped during a
platform restart so they need no special handling there.

The restart loop stays in boot.sh where it is needed (system reboot
stops all containers).

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-24 12:36:03 +00:00
Claude
9fbbdf62ee
remove: podman system migrate (wrong tool for the wrong problem)
It was added to "pick up subuid/subgid mappings" but that's not what it
does — it migrates container storage after a Podman version upgrade.
Subuid/subgid changes are picked up by restarting the Podman socket,
which the script already does. The only effect of running it was stopping
all containers on every platform start.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-24 12:34:16 +00:00
Claude
852e3f6ccb
fix: restart deployed app containers after platform start
podman system migrate explicitly stops all containers, which overrides
the --restart unless-stopped policy set on deployed apps. After compose
up -d brings the infra stack back, any exited hiy-* container is now
restarted automatically.

Same logic added to boot.sh for the on-boot path.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-24 12:32:45 +00:00
Claude
88f6e02d4e
feat: auto-restart stack on boot via systemd user service
- Add infra/boot.sh: lightweight startup (no build) that brings up the
  Podman stack — used by the systemd unit on every system boot
- start.sh now installs/refreshes hiy.service (a systemd --user unit)
  and enables loginctl linger so it runs without an active login session

After the next `infra/start.sh` run the Pi will automatically restart
the stack after a reboot or power cut.
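
For orientation, a user unit of this kind might be generated and installed roughly as below. The unit contents (oneshot type, description, target) and both function names are assumptions, not a copy of what start.sh writes.

```shell
# Print a systemd user unit that runs the boot script given as $1.
render_unit() {
    cat <<EOF
[Unit]
Description=HIY platform stack

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=$1

[Install]
WantedBy=default.target
EOF
}

# Write the unit, enable it, and turn on lingering so the user manager
# (and therefore the unit) starts at boot without a login session.
install_unit() {
    unit_dir="${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user"
    mkdir -p "$unit_dir"
    render_unit "$1" > "$unit_dir/hiy.service"
    systemctl --user daemon-reload
    systemctl --user enable hiy.service
    loginctl enable-linger "$(id -un)"
}
```

`install_unit /path/to/infra/boot.sh` would be called once from start.sh; depending on the system, `loginctl enable-linger` may need elevated privileges.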

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-24 12:22:34 +00:00
Claude
031c3bdd41
fix: defer podman system migrate to after the build to eliminate early downtime
podman system migrate was stopping all containers immediately (visible in
the terminal output as "stopped <id>" lines), before the build even began.

Moving it to just before compose down/up means running containers stay
alive for the entire duration of the image build.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-24 10:48:45 +00:00
Claude
a16ccdcef4
fix: build images before tearing down compose to reduce downtime
Old behaviour: compose down → long build → compose up
New behaviour: long build (service stays live) → compose down → compose up

Downtime is now limited to the few seconds of the swap instead of the
entire duration of the Rust/image build.
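
The new ordering can be sketched as a small function. The name `redeploy` and the assumption that the Makefile sits next to the compose file are illustrative only.

```shell
# Build images while the old stack keeps serving, then swap.
# $1 = path to the compose file; the Makefile is assumed to live beside it.
redeploy() {
    compose_file=$1
    # Build first; if it fails, return early and leave the running stack alone.
    make -C "$(dirname "$compose_file")" build || return 1
    podman compose -f "$compose_file" down
    podman compose -f "$compose_file" up -d
}
```

A side benefit of this order is that a failed build no longer takes the platform down at all.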

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-24 10:43:36 +00:00
Claude
e7fd2a4365
fix: auto-enable cgroup swap accounting on Pi before starting containers
runc (used by Podman) always writes memory.swap.max when initializing the
cgroup v2 memory controller, even without explicit --memory flags. On
Raspberry Pi OS this file is absent because swap accounting is disabled
by default in the kernel, causing every container start to fail with:

  openat2 …/memory.swap.max: no such file or directory

start.sh now detects this condition early, patches the kernel cmdline
(cgroup_enable=memory cgroup_memory=1 swapaccount=1) in either
/boot/firmware/cmdline.txt (Pi OS Bookworm) or /boot/cmdline.txt
(older releases), and tells the user to reboot once before continuing.
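
The cmdline patch can be done idempotently, since cmdline.txt is a single line of space-separated parameters. The sketch below shows one way to append only the missing parameters; the function name is hypothetical, and writing the real file requires root.

```shell
# Append any missing cgroup parameters to the single-line kernel cmdline
# file given as $1. Running it twice leaves the file unchanged.
patch_cmdline() {
    file=$1
    line=$(head -n 1 "$file")
    for param in cgroup_enable=memory cgroup_memory=1 swapaccount=1; do
        case " $line " in
            *" $param "*) ;;                 # already present
            *) line="$line $param" ;;        # append missing parameter
        esac
    done
    printf '%s\n' "$line" > "$file"
}
```

The caller would pick /boot/firmware/cmdline.txt when it exists and /boot/cmdline.txt otherwise, then ask for a reboot if the file changed.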

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-22 18:05:11 +00:00
Claude
2fdffc0acb
Fix builds delegating to host Podman via CONTAINER_HOST
build.sh calls `podman build` inside the server container.
DOCKER_HOST is a Docker CLI variable; Podman does not use it to
automatically switch to remote mode.  Without CONTAINER_HOST set,
Podman runs locally inside the (unprivileged) container, has no
user-namespace support, and lchown fails for any layer file owned
by a non-zero GID (e.g. gid=42 for /etc/shadow).

Setting CONTAINER_HOST=tcp://podman-proxy:2375 makes Podman
automatically operate in remote mode and delegate all operations
to the host Podman service, which has the correct subuid/subgid
mappings and full user-namespace support.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-22 10:50:41 +00:00
Claude
b5e6c8fcd3
Fix rootless Podman lchown EINVAL by ensuring uidmap and fresh service
Two root causes for "invalid argument" when chowning non-root UIDs/GIDs
in image layers:

1. Missing uidmap package: without setuid newuidmap/newgidmap binaries,
   Podman can only map a single UID (0 → current user) in the user
   namespace.  Any layer file owned by gid=42 (shadow) or similar then
   has no mapping and lchown returns EINVAL.  Now install uidmap if absent.

2. Stale Podman service: a service started before subuid/subgid entries
   existed silently keeps the single-UID mapping for its lifetime even
   after the entries are added and podman system migrate is run.  Now
   always kill and restart the service on each start.sh run so it always
   reads the current subuid/subgid configuration.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-22 10:32:13 +00:00
Claude
b64195c58a
Always run podman system migrate, not only when subuid/subgid entries are added
If entries already existed before this script first ran, _HIY_SUBID_CHANGED
stayed 0 and migrate was skipped, leaving Podman storage out of sync with
the namespace mappings and causing lchown errors on layer extraction.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-22 10:25:25 +00:00
Claude
4f5c2e8432
Add subuid/subgid entries for rootless Podman user namespace mapping
Without entries in /etc/subuid and /etc/subgid, Podman cannot map the
UIDs/GIDs present in image layers (e.g. gid 42 for /etc/shadow) into
the user namespace, causing 'lchown: invalid argument' on layer extraction.

Add a 65536-ID range starting at 100000 for the current user if missing,
then run podman system migrate so existing storage is updated.
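
A sketch of the "add if missing" step, using the range from this commit (start 100000, count 65536). The function name and the status words it prints are made up for illustration.

```shell
# Ensure file $1 (e.g. /etc/subuid or /etc/subgid) has an entry for user $2.
# Prints "added" when a line was appended and "present" otherwise.
ensure_subid() {
    file=$1
    user=$2
    if grep -q "^$user:" "$file" 2>/dev/null; then
        echo "present"
    else
        printf '%s:100000:65536\n' "$user" >> "$file"
        echo "added"
    fi
}
```

It would be called once per file with root privileges, for example for /etc/subuid and again for /etc/subgid.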

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-22 10:19:21 +00:00
Claude
dae5fd3b53
Allow rootless Podman to bind ports 80 and 443
Rootless processes cannot bind privileged ports (<1024) by default.
Lower net.ipv4.ip_unprivileged_port_start to 80 at startup, and persist
it to /etc/sysctl.conf so the setting survives reboots.
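
Persisting the setting without duplicating it on every run might look like the sketch below. The function name is hypothetical, and an existing line with a different value is deliberately left untouched here.

```shell
# Append the port setting to the sysctl config file $1 unless a line for
# that key already exists (an existing value is not rewritten).
persist_port_start() {
    conf=$1
    grep -q '^net\.ipv4\.ip_unprivileged_port_start' "$conf" 2>/dev/null ||
        echo 'net.ipv4.ip_unprivileged_port_start=80' >> "$conf"
}
```

The live value is changed separately with `sysctl -w net.ipv4.ip_unprivileged_port_start=80`; both steps need root.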

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-22 10:11:21 +00:00
Claude
d2cba788ab
Fix rootless Podman by owning /run/user/<uid> instead of redirecting to /tmp
Podman rootless unconditionally resets XDG_RUNTIME_DIR to /run/user/<uid>
if that directory exists, overriding any env var we set. Redirecting to
/tmp is therefore ineffective.

Instead, ensure /run/user/<uid> exists and is owned by the current user
(using sudo if needed), mirroring what PAM/logind does for login sessions.
All Podman runtime state (socket, events, netavark) then works correctly.

Remove the now-unnecessary storage.conf/containers.conf writes and the
inline env override on podman system service.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-22 08:02:10 +00:00
Claude
0932308ed6
Fix make and podman compose to use correct paths when run from repo root
make build was looking for Makefile in cwd (repo root) instead of infra/.
Use -C "$SCRIPT_DIR" so it always finds infra/Makefile regardless of where
the script is invoked from.

Add -f flag to podman compose up so it finds infra/docker-compose.yml
from any working directory.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-22 07:55:58 +00:00
Claude
ea5b6e5594
Write containers.conf tmp_dir and force env var inline on podman call
Podman's events engine reads tmp_dir from containers.conf, not from
XDG_RUNTIME_DIR directly. Write both storage.conf and containers.conf
to /tmp/podman-<uid> so no path under /run/user/<uid> is ever used.
Also use `env XDG_RUNTIME_DIR=...` prefix on podman invocation to
override any stale value in the calling shell environment.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-22 07:49:00 +00:00
Claude
0690e3c48a
Unconditionally redirect Podman runtime to /tmp; override storage.conf
Stop relying on conditional checks. Always point XDG_RUNTIME_DIR and
storage.conf runroot to /tmp/podman-<uid> so Podman never touches
/run/user/<uid>, which requires PAM/logind to create.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-22 07:42:54 +00:00
Claude
cf50332a8f
Check XDG_RUNTIME_DIR is writable, not just set
SSH sessions may export XDG_RUNTIME_DIR=/run/user/<uid> even when that
directory doesn't exist or isn't writable. Check writability rather than
emptiness before falling back to /tmp/podman-<uid>.
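
The writability check described here could be expressed as follows. This sketch reflects the approach of this commit only (commit d2cba788ab later replaced the /tmp redirect); the function name is an assumption.

```shell
# Print the runtime directory to use: XDG_RUNTIME_DIR when it is set, exists
# and is writable, otherwise the per-user fallback under /tmp.
pick_runtime_dir() {
    if [ -n "${XDG_RUNTIME_DIR:-}" ] && [ -d "$XDG_RUNTIME_DIR" ] && [ -w "$XDG_RUNTIME_DIR" ]; then
        printf '%s\n' "$XDG_RUNTIME_DIR"
    else
        printf '/tmp/podman-%s\n' "$(id -u)"
    fi
}
```

A caller would then run `export XDG_RUNTIME_DIR="$(pick_runtime_dir)"` and create the directory before the first podman call.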

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-22 07:40:53 +00:00
Claude
139a03c774
Set XDG_RUNTIME_DIR before any podman call in non-login shells
Podman uses XDG_RUNTIME_DIR for its RunRoot, events dirs, and default
socket path. Without it pointing to a writable location, podman fails
with 'mkdir /run/user/<uid>: permission denied' even before the socket
is created. Export it to /tmp/podman-<uid> when unset.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-22 07:39:34 +00:00
Claude
26701675f2
Use XDG_RUNTIME_DIR or /tmp fallback for Podman socket dir
/run/user/<uid> is created by PAM/logind and doesn't exist in non-login
shells. Fall back to /tmp/podman-<uid> when XDG_RUNTIME_DIR is unset so
mkdir always succeeds.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-22 07:38:40 +00:00
Claude
5359c43cb8
Replace systemctl --user with podman system service for socket activation
systemctl --user fails in non-interactive shells (no D-Bus session bus).
podman system service starts the socket directly without systemd/D-Bus,
backgrounding the process and waiting up to 5 s for the socket to appear.
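
Waiting for the socket can be done with a short polling loop. The sketch below polls once per second rather than at whatever interval start.sh uses; the function name is hypothetical.

```shell
# Wait until $1 exists as a Unix socket, giving up after $2 seconds.
# Returns 0 when the socket appeared and non-zero on timeout.
wait_for_socket() {
    sock=$1
    tries=$2
    while [ "$tries" -gt 0 ]; do
        [ -S "$sock" ] && return 0
        sleep 1
        tries=$(( tries - 1 ))
    done
    [ -S "$sock" ]
}
```

The service itself would be started beforehand with something like `podman system service --time=0 "unix://$sock" &`, followed by `wait_for_socket "$sock" 5`.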

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-22 07:37:02 +00:00
Claude
06ababa7c6
Fix Podman socket for rootless setup on Raspberry Pi
start.sh now activates the Podman user socket via systemctl --user if it
isn't running yet, then exports DOCKER_HOST and PODMAN_SOCK so that
podman compose (which delegates to the docker-compose plugin) can connect.

docker-compose.yml mounts ${PODMAN_SOCK} into the socat proxy container
at a fixed internal path (/podman.sock), so it works for both rootful
(/run/podman/podman.sock) and rootless (/run/user/<UID>/podman/podman.sock)
without hardcoding the UID.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-21 18:08:10 +00:00
Claude
dd107aacdb
Fix start.sh: docker compose → podman compose
Missed in the previous Podman migration commit.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-21 18:03:41 +00:00
Claude
4319b99102
Replace Docker with Podman throughout
- builder/build.sh: all docker commands → podman (build, run, stop, rm,
  network create, images, rmi, inspect)
- server/src/routes/apps.rs: docker stop/restart → podman
- server/src/routes/ui.rs: docker inspect → podman
- infra/Dockerfile.server: install podman instead of docker.io
- infra/docker-compose.yml: rename docker-proxy → podman-proxy, mount
  /run/podman/podman.sock (rootful Podman socket), update DOCKER_HOST
- infra/Makefile: docker compose → podman compose

Podman is daemonless and rootless by default; OCI images are identical so
no build-pipeline changes are needed beyond renaming the CLI.

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-20 14:58:52 +00:00
Claude
4454744cba
Add session-based auth to dashboard and API
- New HIY_ADMIN_USER / HIY_ADMIN_PASS env vars control access
- Login page at /login with redirect-after-login support
- Cookie-based sessions (HttpOnly, SameSite=Strict); cleared on restart
- Auth middleware applied to all routes except /webhook/:app_id (HMAC) and /login
- Auth is skipped when credentials are not configured (dev mode, warns at startup)
- Logout link in both dashboard nav bars
- Caddy admin port 2019 no longer published to the host in docker-compose

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-20 13:45:16 +00:00
Claude
b9171d2504
Fix env_file path: .env is in project root, not infra/
2026-03-20 13:06:29 +00:00
Claude
44c1bf03b4
Load .env directly via env_file so DOMAIN_SUFFIX reaches containers
Using compose-level ${DOMAIN_SUFFIX} substitution only works when docker
compose is run from the same directory as the .env file. env_file loads
the file relative to the compose file, so it works regardless of CWD.
2026-03-20 12:55:12 +00:00
Claude
a9490da8a8
Fix Caddy startup: remove empty ACME_EMAIL that caused parse error
Caddy's email directive requires a non-empty argument. Since ACME_EMAIL
wasn't set, Caddy failed to parse the config. Email is optional for
Let's Encrypt — remove the directive entirely and document it as a
manual opt-in comment.
2026-03-20 12:49:39 +00:00
Claude
dc59293c5e
Replace Cloudflare DNS challenge with standard Let's Encrypt HTTP-01
Caddy's built-in ACME support handles TLS automatically — no CF_API_TOKEN,
no Cloudflare account, no DNS plugin needed. Requires ports 80+443 forwarded
to the Pi and ACME_EMAIL set in infra/.env.
2026-03-20 11:41:40 +00:00
Claude
3794f4cf36
Fix Dockerfile heredoc parse error in RUN if block
Use printf instead of heredoc for cargo config — heredoc inside a
conditional RUN block confuses Docker's parser (fi becomes an unknown
instruction). The config is always written; unused linker entries are
harmless on native builds.
2026-03-20 10:42:46 +00:00
Claude
3096d251c6
Fix Dockerfile: skip cross-compilers when building natively
gcc-aarch64-linux-gnu is an x86→arm64 cross-compiler; it doesn't exist
on arm64 hosts (like the Pi). Only install cross-toolchains and set cargo
linker config when BUILDPLATFORM != TARGETPLATFORM.
2026-03-20 10:40:12 +00:00
Claude
2060606adc
Consolidate to single .env at repo root
Add ACME_EMAIL to root .env.example.
start.sh now reads root .env and passes it to docker compose.
Removed infra/.env.example.
2026-03-20 10:21:35 +00:00
Claude
d5a5875899
Add TLS setup to start.sh; drop Cloudflare requirement
start.sh now generates proxy/caddy.json at launch time with Let's Encrypt
automatic HTTPS (HTTP-01 or TLS-ALPN-01 challenge — no Cloudflare needed).

Reads DOMAIN_SUFFIX and ACME_EMAIL from infra/.env before starting.
Added infra/.env.example to document required vars.
2026-03-20 10:18:01 +00:00
Claude
b060ec68af
Add start.sh and Makefile build-only targets
start.sh builds via 'make build' (platform auto-detected) then starts
services detached with 'docker compose up -d'.

Makefile gains build/build-<platform> targets that build images without
starting, mirroring the existing up/<platform> targets.
2026-03-20 10:06:24 +00:00
Claude
00da63ec80
Auto-detect platform by default; use DOCKER_DEFAULT_PLATFORM for cross-compile targets
Remove hardcoded platform from compose file so plain 'make up' (or
'docker compose up --build') always builds natively for the host.
Explicit targets (up-arm64, up-armv7, etc.) set DOCKER_DEFAULT_PLATFORM.
2026-03-20 10:03:36 +00:00
Claude
0fecb9a4fe
Add up-win alias (Windows Docker Desktop uses linux/amd64 via WSL2)
2026-03-20 10:02:21 +00:00
Claude
5484b29af6
Add up-x64 alias for up-amd64 in Makefile
2026-03-20 10:01:26 +00:00
Claude
588e74a626
Multi-platform Docker build: amd64, arm64, armv7, armv6
Dockerfile now uses BuildKit TARGETARCH/TARGETVARIANT to pick the Rust
cross-compilation target automatically. The build stage always runs on
the host platform for speed.

Makefile provides named targets:
  make up-amd64   # Mac Intel / Linux desktop
  make up-arm64   # Mac M1/M2/M3, Pi 4/5 (64-bit OS)
  make up-armv7   # Pi 2/3/4 (32-bit OS)
  make up-armv6   # Pi Zero / Pi 1
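
The architecture-to-target selection might be written as a shell `case` inside the Dockerfile's build stage, along these lines. The musl target triples are an assumption based on the earlier static-linking commit, and the function name is hypothetical.

```shell
# Map BuildKit's TARGETARCH ($1) and TARGETVARIANT ($2, may be empty)
# to a Rust target triple. Musl targets are assumed here.
rust_target() {
    case "$1${2:+/$2}" in
        amd64)           echo x86_64-unknown-linux-musl ;;
        arm64|arm64/v8)  echo aarch64-unknown-linux-musl ;;
        arm/v7)          echo armv7-unknown-linux-musleabihf ;;
        arm/v6)          echo arm-unknown-linux-musleabihf ;;
        *) echo "unsupported platform: $1 $2" >&2; return 1 ;;
    esac
}
```

In a Dockerfile this would be used as `cargo build --release --target "$(rust_target "$TARGETARCH" "$TARGETVARIANT")"`.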
2026-03-20 09:55:53 +00:00
Shautvast
f92545ed4e
armv7 target for my old pi
2026-03-19 15:55:43 +01:00
Shautvast
fd7d417471
latest rust slim
2026-03-19 12:33:08 +01:00
Claude
2df3c579e4
fix: switch Docker access to TCP via socat proxy; add Caddy error logging
- Add docker-proxy (alpine/socat) sidecar that exposes the Docker Unix
  socket as TCP on port 2375, so server needs no privileged socket mount
- Set DOCKER_HOST=tcp://docker-proxy:2375 in server environment
- App containers are still spawned on the host daemon and join hiy-net,
  so Caddy can still reach them
- Log actual Caddy PUT response body and HTTP status on failure
  instead of a silent warning
2026-03-19 11:24:50 +00:00
Claude
a8b22d8e2d
fix: switch to Caddy JSON config so dynamic routes work correctly
The Caddyfile created a server with an auto-generated name, not 'hiy',
so build.sh's PUT to /config/apps/http/servers/hiy/routes was creating
a parallel server that never received traffic.

- Replace Caddyfile with caddy.json that names the server 'hiy' with
  the dashboard as a catch-all fallback route
- Insert app routes at index 0 so host-matched routes are evaluated
  before the catch-all dashboard fallback
- Update docker-compose to mount caddy.json and pass --config flag
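
To illustrate the route insertion, the sketch below builds a host-matched reverse-proxy route and PUTs it at index 0 of the `hiy` server's route list, which makes Caddy insert it ahead of the catch-all. The function names and the CADDY_API variable are hypothetical; the real request in build.sh may differ in detail.

```shell
# Print a Caddy route that matches host $1 and proxies to upstream $2
# (a "host:port" dial address).
app_route_json() {
    printf '{"match":[{"host":["%s"]}],"handle":[{"handler":"reverse_proxy","upstreams":[{"dial":"%s"}]}]}\n' "$1" "$2"
}

# Insert the route at index 0 of server "hiy" through the admin API.
# CADDY_API is a made-up variable; localhost:2019 is Caddy's default admin address.
register_route() {
    app_route_json "$1" "$2" |
        curl -fsS -X PUT -H 'Content-Type: application/json' --data-binary @- \
            "${CADDY_API:-http://localhost:2019}/config/apps/http/servers/hiy/routes/0"
}
```

For example, `register_route demo.example.com hiy-demo:3000` would send traffic for that host to the named upstream; both values are placeholders.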
2026-03-19 11:02:57 +00:00
Claude
bddc1a8027
fix: use musl static linking to eliminate glibc version dependency
Build hiy-server targeting aarch64-unknown-linux-musl so the binary
has no glibc dependency at all, making the runtime image irrelevant
to glibc version mismatches. Uses rustls (already in Cargo.toml) so
no OpenSSL vendoring needed. SQLite is bundled by sqlx.
2026-03-19 10:48:46 +00:00
Claude
f6e6d1f8a3
fix: upgrade builder to rust:1.94-slim-bookworm
2026-03-19 10:46:37 +00:00
Claude
4e8aa1614e
fix: pin builder to rust:1.77-slim-bookworm to match runtime glibc
rust:1.77-slim has drifted to a newer Debian base with glibc 2.39,
but debian:bookworm-slim only has glibc 2.36, causing a GLIBC_2.39
not found error at runtime. Pinning to the explicit bookworm variant
keeps both stages on the same glibc version.
2026-03-19 10:45:51 +00:00
Claude
8f5bb158cb
M1: Rust control plane, builder, dashboard, and infra
- Cargo workspace with hiy-server (axum 0.7 + sqlx SQLite + tokio)
- SQLite schema: apps, deploys, env_vars (inline migrations, no daemon)
- Background build worker: sequential queue, streams stdout/stderr to DB
- REST API: CRUD for apps, deploys, env vars; GitHub webhook with HMAC-SHA256
- SSE endpoint for live build log streaming
- Monospace HTMX-free dashboard: app list + per-app detail, log viewer, env editor
- builder/build.sh: clone/pull → detect strategy (Dockerfile/buildpack/static)
  → docker build → swap container → update Caddy via admin API → prune images
- infra/docker-compose.yml + Dockerfile.server for local dev (no Pi needed)
- proxy/Caddyfile: auto-HTTPS off for local, comment removed for production
- .env.example

Compiles clean (zero warnings). Run locally:
  cp .env.example .env && cargo run --bin hiy-server
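
For testing the webhook by hand, the HMAC-SHA256 signature that GitHub sends in the X-Hub-Signature-256 header can be reproduced with openssl. This is a shell sketch with hypothetical function names, not the server's Rust verification code, and the string comparison is not constant-time.

```shell
# Print the signature header value ("sha256=<hex>") for secret $1 and
# request body $2.
github_signature() {
    printf '%s' "$2" | openssl dgst -sha256 -hmac "$1" | sed 's/^.*= *//' | sed 's/^/sha256=/'
}

# Succeed when the header value $3 matches the signature computed from
# secret $1 and body $2. Plain comparison; fine for manual testing only.
verify_signature() {
    [ "$(github_signature "$1" "$2")" = "$3" ]
}
```

The output of `github_signature "$SECRET" "$BODY"` can be passed as the X-Hub-Signature-256 header when posting a test payload to /webhook/:app_id.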

https://claude.ai/code/session_01FKCW3FDjNFj6jve4niMFXH
2026-03-19 08:25:59 +00:00