docs: log full 01-06 validation and hardening outcomes
This commit is contained in:
parent
2ef3bae7cd
commit
17525c88a5
2 changed files with 277 additions and 0 deletions
69
roadmap.md
Normal file
69
roadmap.md
Normal file
|
|
@ -0,0 +1,69 @@
|
|||
# Workshop Roadmap
|
||||
|
||||
## Exercise Map
|
||||
|
||||
| # | Exercise | Type | Est. Time | Status |
|
||||
|---|----------|------|-----------|--------|
|
||||
| 01 | Bootstrap ArgoCD | Core | 30 min | ✅ Implemented |
|
||||
| 02 | Deploy podinfo via GitOps | Core | 30 min | ✅ Implemented |
|
||||
| 03 | MetalLB + Ingress-Nginx (LAN exposure) | Core | 45 min | ✅ Implemented |
|
||||
| 04 | Tekton pipeline (image tag bump → GitOps loop) | Core | 45 min | ✅ Implemented |
|
||||
| 05 | App upgrade via GitOps | Core | 15 min | ✅ Implemented |
|
||||
| 06 | Monitoring: Prometheus + Grafana | Bonus | 60 min | ✅ Implemented |
|
||||
|
||||
**Total core: ~2.5–3h. Beginners may stop after Exercise 03 (~1h45m).**
|
||||
|
||||
---
|
||||
|
||||
## Solution Branches
|
||||
|
||||
Model: solution branches are **standalone per exercise** (not cumulative).
|
||||
|
||||
| Branch | State |
|
||||
|--------|-------|
|
||||
| `solution/01-argocd-bootstrap` | ArgoCD running, root app applied |
|
||||
| `solution/02-deploy-podinfo` | podinfo synced via ArgoCD |
|
||||
| `solution/03-metallb-ingress` | MetalLB + Ingress-Nginx + podinfo reachable on LAN; CRD `caBundle` drift handling included |
|
||||
| `solution/04-tekton-pipeline` | Full Tekton GitOps loop working |
|
||||
| `solution/05-app-upgrade` | deployment.yaml bumped to 6.7.0 |
|
||||
| `solution/06-monitoring` | Prometheus + Grafana running |
|
||||
|
||||
---
|
||||
|
||||
## Verification Status
|
||||
|
||||
| Exercise | Smoke-tested |
|
||||
|----------|-------------|
|
||||
| 01 | ✅ Validated (clean VM + bootstrap + root sync) |
|
||||
| 02 | ✅ Validated (podinfo app deploy + healthy) |
|
||||
| 03 | ✅ Validated (MetalLB + ingress + podinfo URL reachable) |
|
||||
| 04 | ✅ Validated after hardening fixes (PSA patch + pipeline runtime fixes) |
|
||||
| 05 | ✅ Validated (upgrade/drift workflow over working 04 stack) |
|
||||
| 06 | ✅ Validated (Prometheus/Grafana app healthy + Grafana ingress reachable) |
|
||||
|
||||
Full end-to-end test: completed on `ops-demo-tryout` from clean baseline through 01–06.
|
||||
|
||||
---
|
||||
|
||||
## Recent Changes (2026-03-01)
|
||||
|
||||
- End-to-end smoke test executed in clean tryout environment (`vagrant destroy && vagrant up`).
|
||||
- Exercise 04 hardening to make tutorial reproducible:
|
||||
- Tekton namespace PodSecurity patch (`pod-security.kubernetes.io/enforce=privileged`)
|
||||
- pipeline validate step switched to pure client-side `kubectl create --dry-run=client`
|
||||
- clone task now ensures workspace writeability for later task images (`chmod -R a+rwX .`)
|
||||
- git clone/push switched to HTTP auth header flow (no URL credential embedding)
|
||||
- Exercise 04 docs clarified with explicit PSA semantics and workshop trade-offs.
|
||||
- Assignment clarity improvements across docs/01..06:
|
||||
- every shell snippet clearly marked as `VM` or `HOST`
|
||||
- removed large per-page top callout blocks; context now lives at snippet level
|
||||
- Exercise 03 docs expanded with practical explanation around MetalLB manifests and key Kubernetes terms.
|
||||
- Exercise 04 docs expanded with:
|
||||
- explicit mandatory credential step before PipelineRun
|
||||
- clear distinction between Argo wrapper manifest vs full Tekton pipeline manifest
|
||||
- Tekton Dashboard + ingress walkthrough
|
||||
- `scripts/vm/set-git-credentials.sh` now prints a context-correct PipelineRun path (`/vagrant/...` fallback included).
|
||||
- Earlier branch-level fixes remain in place:
|
||||
- root recursive discovery
|
||||
- MetalLB CRD `caBundle` drift handling
|
||||
- Tekton empty `kustomize` drift fix in solution flow
|
||||
208
sessions.md
Normal file
208
sessions.md
Normal file
|
|
@ -0,0 +1,208 @@
|
|||
# Sessions Log
|
||||
|
||||
Per-session progress notes. Newest entry first.
|
||||
|
||||
---
|
||||
|
||||
## 2026-03-01 — Full end-to-end validation + out-of-box hardening (SESSION 5)
|
||||
|
||||
**Session goal**: Run the complete workshop flow in a clean tryout repo/VM and close all blockers until 01→06 works out of the box.
|
||||
|
||||
**Validated flow**:
|
||||
- Fresh baseline in `ops-demo-tryout`:
|
||||
- force-reset to `upstream/main`
|
||||
- `vagrant destroy -f && vagrant up`
|
||||
- bootstrap + Argo repo registration + root commit
|
||||
- Progressive exercise validation completed through 01→06.
|
||||
- Final runtime state confirmed:
|
||||
- all Argo apps `Synced/Healthy`
|
||||
- podinfo image at `ghcr.io/stefanprodan/podinfo:6.7.0`
|
||||
- URLs responding: podinfo `200`, tekton dashboard `200`, grafana `302` (login redirect)
|
||||
|
||||
**Critical blockers found and fixed**:
|
||||
1. **Tekton TaskRuns rejected by Pod Security Admission**
|
||||
Symptom: `PodAdmissionFailed` in `tekton-pipelines` namespace.
|
||||
Fix:
|
||||
- `manifests/ci/tekton/kustomization.yaml` now patches existing Namespace
|
||||
- new `manifests/ci/tekton/namespace-podsecurity-patch.yaml`
|
||||
- docs/04 updated with explicit rationale (what PSA means and why this trade-off is used in workshop)
|
||||
|
||||
2. **Pipeline validate step required unintended RBAC**
|
||||
Symptom: `validate` task failed with `Forbidden` on reads in `podinfo` namespace.
|
||||
Fix:
|
||||
- switched validate command from `kubectl apply --dry-run=client` to
|
||||
`kubectl create --dry-run=client` (pure client-side validation)
|
||||
|
||||
3. **Workspace file ownership/mode mismatch between task images**
|
||||
Symptom: `bump-image-tag` failed with permission denied writing `deployment.yaml`.
|
||||
Fix:
|
||||
- clone task now runs `chmod -R a+rwX .` so subsequent task images/users can write.
|
||||
|
||||
4. **Git push URL credential embedding failed**
|
||||
Symptom: `git-commit-push` failed with URL parse error (`Port number was not a decimal number...`).
|
||||
Fix:
|
||||
- clone/push now use `http.extraHeader=Authorization: Basic ...`
|
||||
instead of embedding credentials in remote URL.
|
||||
|
||||
**Docs hardened**:
|
||||
- `docs/04-tekton-pipeline.md` on `main` expanded with practical explanations:
|
||||
- clear PSA meaning (`enforce=privileged` does **not** mean pods must be privileged)
|
||||
- why namespace patch is needed in this workshop
|
||||
- task-level explanation and stronger troubleshooting guidance
|
||||
- removed obsolete troubleshooting about `validate Forbidden` after validate-step fix.
|
||||
|
||||
**Branches updated**:
|
||||
- `main`:
|
||||
- `f7a54b6` docs(ex04): clarify PodSecurity patch meaning and rationale
|
||||
- `2ef3bae` docs(ex04): align validate explanation with client-side check
|
||||
- `solution/04-tekton-pipeline`:
|
||||
- `acf6be0` fix(ex04): patch Tekton namespace pod-security label
|
||||
- `09262dc` docs(ex04): clarify PodSecurity patch meaning and rationale
|
||||
- includes validated pipeline runtime fixes (validate mode, workspace perms, auth header clone/push)
|
||||
|
||||
**Notes**:
|
||||
- Tryout required repoURL substitutions to its fork URL where solution manifests referenced `ops-demo`.
|
||||
- No unresolved runtime blockers remained at end of session.
|
||||
|
||||
---
|
||||
|
||||
## 2026-03-01 — Assignment clarity pass + Tekton docs hardening (SESSION 4)
|
||||
|
||||
**Session goal**: Remove ambiguity in exercise instructions and align docs with real execution flow.
|
||||
|
||||
**Completed this session**:
|
||||
- Exercise 03 expanded with explanatory text around key manifests:
|
||||
- MetalLB speaker/tolerations explanation
|
||||
- IPAddressPool + L2Advertisement purpose
|
||||
- Argo app split and sync-wave reasoning
|
||||
- Ingress intent for podinfo and ArgoCD
|
||||
- Exercise 04 clarified and hardened:
|
||||
- Explicitly states `apps/ci/pipeline.yaml` is only an Argo wrapper
|
||||
- Makes `set-git-credentials.sh` a mandatory pre-step
|
||||
- Added Tekton Dashboard + ingress walkthrough in assignment text
|
||||
- Added troubleshooting for common Tekton/root drift
|
||||
- Command-context UX improved across assignments:
|
||||
- Shell snippets now clearly labeled `VM` or `HOST` in quote-style blocks
|
||||
- Removed oversized top callout blocks from exercise pages per user preference
|
||||
- `scripts/vm/set-git-credentials.sh` improved:
|
||||
- Next-step output now prints a usable PipelineRun manifest path (`manifests/...` or `/vagrant/...`)
|
||||
depending on where the script is run.
|
||||
|
||||
**Key commits pushed (main)**:
|
||||
- `83d227a` docs(ex04): document tekton kustomize drift fix
|
||||
- `a2c15d6` docs(ex04): add Tekton Dashboard UI + ingress walkthrough
|
||||
- `0212f4b` docs: clarify command context and workshop flow
|
||||
|
||||
**Open follow-up**:
|
||||
- If dashboard setup should be mandatory in `solution/04`, validate in tryout and backport explicitly to that branch.
|
||||
|
||||
---
|
||||
|
||||
## 2026-02-28 — Workflow hardening + docs alignment (SESSION 3)
|
||||
|
||||
**Session goal**: Fix operator-facing workflow issues, prevent wrong-cluster mistakes, and align docs/solutions with real usage.
|
||||
|
||||
**Completed this session**:
|
||||
- Host/VM script split is now the working model in docs and flow:
|
||||
- host: `scripts/host/bootstrap-from-host.*`, `scripts/host/argocd-ui-tunnel.*`
|
||||
- vm: `scripts/vm/bootstrap.sh`, `scripts/vm/set-git-credentials.sh`, `scripts/vm/argocd-port-forward.sh`
|
||||
- Bootstrap safety improved:
|
||||
- cluster target checks enforced in `scripts/vm/bootstrap.sh`
|
||||
- recursive app discovery fix merged (`cc0d36b`)
|
||||
- README and exercise docs updated multiple times for:
|
||||
- `vagrant ssh` usage
|
||||
- Argo repo registration requirement
|
||||
- GitHub PAT guidance (fine-grained token path and permissions context)
|
||||
- host/VM execution clarity and troubleshooting
|
||||
- MetalLB OutOfSync drift investigated and fixed:
|
||||
- root cause: CRD webhook `caBundle` drift behavior in Argo comparison
|
||||
- validated against `pms15-cluster` behavior
|
||||
- `solution/03-metallb-ingress` updated to ignore CRD `caBundle` drift generically (not single CRD name)
|
||||
- docs/03 troubleshooting updated on main
|
||||
- Formatting pass landed for markdown readability (`fc0eb1b`), then targeted wording corrections.
|
||||
- `CLAUDE.md` refreshed to current architecture and branch model.
|
||||
|
||||
**Key commits pushed (main)**:
|
||||
- `c68292e` docs: clarify VM access via vagrant ssh only
|
||||
- `71c1f79` improve bootstrap safety + host-side Argo access scripts
|
||||
- `4d77c82` fix host KUBECONFIG leakage in host/vm scripts
|
||||
- `cb912cf` split host/vm scripts + Argo tunnel workflow fix
|
||||
- `d59818d` docs refinements (workshop flow + Argo repo credentials)
|
||||
- `cc0d36b` bootstrap: recursive app discovery in root app
|
||||
- `fc0eb1b` markdown formatting/readability
|
||||
- `0dc7062` ex03 docs: Metallb CRD drift troubleshooting
|
||||
|
||||
**Key commit pushed (solution branch)**:
|
||||
- `solution/03-metallb-ingress`: `2e6b4fb` (ignore Metallb CRD `caBundle` drift across CRDs)
|
||||
|
||||
**Notes / follow-up**:
|
||||
- Keep `sessions.md`/`roadmap.md` in sync after every significant change.
|
||||
- Verify all solution branches still obey "standalone per exercise" constraints before next content edits.
|
||||
|
||||
---
|
||||
|
||||
## 2026-02-28 — Branching restructure + Dutch translation (SESSION 2, INCOMPLETE)
|
||||
|
||||
**Session goal**: Restructure branches, translate docs to Dutch, rebuild solution branches off thin main.
|
||||
|
||||
**Completed this session**:
|
||||
- `reference-solution` branch created from old main (full working solution) ✓
|
||||
- Solution files removed from `main` (staged, NOT committed) ✓
|
||||
- `scripts/bootstrap.sh` rewritten: Dutch, auto-detects fork URL (SSH→HTTPS), generates apps/root.yaml ✓
|
||||
- `README.md` rewritten in Dutch ✓
|
||||
- `docs/vm-setup.md` rewritten in Dutch ✓
|
||||
- `docs/01-argocd-bootstrap.md` through `docs/06-monitoring.md` rewritten in Dutch ✓
|
||||
- `docs/presentation/final-talk.md` — **STILL EMPTY** (1 line) — NOT YET DONE
|
||||
- Old solution/NN-* branches NOT yet deleted/recreated
|
||||
- **NOTHING COMMITTED on new main yet**
|
||||
|
||||
**Git status on `main`**:
|
||||
- STAGED: deletions of all solution files (apps/apps/, apps/ci/, etc., manifests/apps/, etc.)
|
||||
- UNSTAGED MODIFIED: README.md, all docs/*.md, scripts/bootstrap.sh
|
||||
- UNTRACKED: none relevant
|
||||
|
||||
**What to do next session**:
|
||||
1. Write `docs/presentation/final-talk.md` in Dutch (translate from reference-solution branch, natural dev-Dutch)
|
||||
2. `git add -A` + ONE commit on main (all deletions + Dutch docs + new bootstrap.sh)
|
||||
3. Delete old solution branches: solution/01 through solution/06
|
||||
4. Recreate solution/01-argocd-bootstrap through solution/06-monitoring cumulatively off new thin main, each with ONE commit
|
||||
5. Push everything to GitHub (paulharkink/ops-demo)
|
||||
6. Continue smoke-testing exercises 02–05
|
||||
|
||||
**Key Dutch translation rules** (user was very clear):
|
||||
- Natural dev-Dutch, written as if Paul wrote it
|
||||
- Technical terms stay English: "branches", "cluster", "pipeline", "deployment", "namespace", etc.
|
||||
- "takken" is NEVER acceptable
|
||||
- No Apple Silicon warnings
|
||||
- No "Co-Authored-By: Claude" in commits
|
||||
|
||||
---
|
||||
|
||||
## 2026-02-28 — Initial implementation (SESSION 1)
|
||||
|
||||
**Session goal**: Full repo scaffold from implementation plan.
|
||||
|
||||
**Completed**:
|
||||
- Phase 1: CLAUDE.md, sessions.md, roadmap.md, Vagrantfile, scripts/bootstrap.sh,
|
||||
apps/root.yaml, apps/project.yaml, apps/argocd.yaml, manifests/argocd/values.yaml
|
||||
→ `solution/01-argocd-bootstrap` branch created
|
||||
- Phase 2: apps/apps/podinfo.yaml, manifests/apps/podinfo/, docs/01-argocd-bootstrap.md,
|
||||
docs/02-deploy-podinfo.md
|
||||
→ `solution/02-deploy-podinfo` branch created
|
||||
- Phase 3: MetalLB + Ingress-Nginx apps/manifests, podinfo ingress, ArgoCD ingress,
|
||||
docs/03-metallb-ingress.md
|
||||
→ `solution/03-metallb-ingress` branch created
|
||||
- Phase 4: Tekton app/manifests, pipeline resources, scripts/set-git-credentials.sh,
|
||||
docs/04-tekton-pipeline.md
|
||||
→ `solution/04-tekton-pipeline` branch created
|
||||
- Phase 5: docs/05-app-upgrade.md → `solution/05-app-upgrade` branch (deployment at 6.7.0)
|
||||
- Phase 6: Monitoring app/manifests, docs/06-monitoring.md → `solution/06-monitoring` branch
|
||||
- Phase 7: docs/vm-setup.md, README.md, docs/presentation/final-talk.md
|
||||
|
||||
**Vagrantfile fixes applied**:
|
||||
- yq arch-aware: ARCH=$(dpkg --print-architecture)
|
||||
- Tekton images: ghcr.io (not gcr.io)
|
||||
- Docker Hub images: docker.io/ prefix required for k3s ctr
|
||||
- kubeconfig chmod 600
|
||||
|
||||
**Not yet verified**: Full end-to-end smoke test pending.
|
||||
Loading…
Add table
Reference in a new issue