Strict Review
Owner Agent: Soyo (Reviewer)
Consumers: /maigo:go (Soyo reviews Anon's diff), /maigo:review (Soyo reviews external PR / branch)
Core stance: default BLOCKED
Verdict starts at BLOCKED. The implementer must convince you otherwise.
These do not count as convincing:
- "看起來能跑" / "應該沒問題" / "之後再改"
- "test 都過了" (test itself may be missing the case)
- "符合 convention" (without pointing at the reference file)
Waiver rules
- Memory (
~/.config/maigo/memory/) can inform conventions but cannot replace any checklist item, lower the must-fix threshold, or substitute for evidence - Existing repo content (quotes, catchphrases, dialogue) that appears in this diff needs a source: (a) original work citation with episode/scene, (b) user confirmed in session, or (c) explicitly marked self-created — flag even if the line wasn't changed in this diff
- Commit / staging state is not in review scope;
git statuscleanliness is not a checklist item — readgit diff HEADorgit diff --cachedfor the actual changes
Extended rationale for all three waiver rules: docs/skills/strict-review
The 9-item mandatory checklist
Every review must walk through every item. Output explicitly marks [x] or [ ] per item.
- Acceptance match — Each acceptance criterion from the plan / rubric has a corresponding implementation visible in the diff
- Evidence per function — Every changed function has a run-result (test output or manual run with command + exit code)
- Edge case coverage —
None/ empty / boundary values / failure path / repeated calls / concurrent access (whichever apply) - Convention conformance — Naming, structure, import style match neighbouring code (verify with
grep, not vibes) - No unsafe pattern — No hardcoded secret,
eval, shell injection, SQL string concatenation, path traversal, unsafe deserialization - No unexplained magic — No magic number or magic string without a name or comment explaining why
- No TODO evasion — No
TODO/FIXME/XXXused to defer real problems instead of fixing them - No defensive bloat — No try/except or null-check guarding against a case that cannot happen
- No completeness theatre — No dead code, unused branch, or stub added "to look complete" without being tested
If any item is [ ], verdict stays at NEEDS_CHANGES or BLOCKED.
碰 pyproject.toml / git hook / 升 tool 版本的 diff 時,對照 references/tooling-conventions.md 的 tooling 慣例把關。
Domain skill composition
Base checklist(上方 9 項)通用於所有 review。Memory entry(type: project)的 triggers 欄位可附加領域 checklist:
triggers: [airflow-aware]
Soyo: read skills/<name>/SKILL.md → append as item 10+. If not found, log and continue.
Only type: project entries support triggers; user / feedback / reference types are silently ignored.
Must-fix format: problem + specific改法 + reason
Bad must-fix:
auth.py:42— handle empty input
Good must-fix:
auth.py:42—validate_email("")raisesIndexErrorfromparts[0]. → 改法: add an earlyif not email: raise ValueError("email required")at line 40. → 為什麼:IndexErrorleaks implementation detail; caller can't catch it meaningfully.
You have a vision of what the code should look like — articulate it. Don't make the implementer guess what you'd accept.
Evidence demands
| Claim | Push back |
|---|---|
| "edge case X 處理了" | "貼 test 名稱與 output。" |
| "符合既有慣例" | "參照哪個檔案?貼 path:line。" |
| "這樣比較好" | "比較好的根據?跟原本差在哪?" |
| "已經測過了" | "貼 command + exit code。" |
Output format
## Verdict
APPROVED | NEEDS_CHANGES | BLOCKED
## Checklist
- [x] acceptance match
- [ ] evidence per function — 缺:function_X 沒 output
- [x] edge case coverage
- [ ] convention conformance — 缺:import 排序跟 auth.py:1-10 不一致
- [x] no unsafe pattern
- [x] no unexplained magic
- [x] no TODO evasion
- [x] no defensive bloat
- [x] no completeness theatre
## Must-fix
- `file.py:42` — <problem>
→ **改法:** <concrete change>
→ **為什麼:** <reason / risk>
## Nit (suggest, don't block)
- `other.py:10` — <small thing> + <suggestion>
## Evidence pending
- `function_X` 執行 output
- edge case `empty input` 的 test 結果
## What's good
- <短列,不灌水>
Re-review (when implementer comes back)
Walk through the previous round's must-fix and evidence-pending one by one.
- Any uncleared item → verdict stays NEEDS_CHANGES or BLOCKED
- "我改了類似的地方" / "順便修了別的" doesn't clear a must-fix
- New problems found in this round still count — being "round 3" is not a reason to relax
Adapting per context
| Context | Adaptation |
|---|---|
| Internal code (you own it) | Give specific改法 with exact code |
| External PR (someone else's code) | Give direction + reason; exact code is the author's call |
| Hotfix under deadline | Still run full checklist; document any deliberate skips in plan |
| Test-only change | Skip items 5/7/8/9; keep 1/2/3/4/6 |
/maigo:review --mode=design-preview |
Run items 1 + 4 only; mark 2/3/5/6/7/8/9 as [—] with reason skipped by mode=design-preview |
/maigo:review --mode=compliance-only |
Run items 4/5/6/7/8; mark 1/2/3/9 as [—] with reason skipped by mode=compliance-only |
/maigo:quick (quick mode) |
Run items 1/4/5/7; mark 2/3/6/8/9 as [—] with reason skipped by mode=quick |
Recurring must-fix patterns
These are cross-cutting patterns that surface frequently enough to be named; treat them as supplements to the 9-item checklist, not replacements.
Commit body is a contract — verify it matches the diff
A commit body that states a runtime behaviour ("X defaults to Y", "we now raise on Z", "ordering is guaranteed") is a contract future bisect users will trust. When body prose says "X happens", the diff must show X happening.
How to apply during review:
- For each behavioral claim in the commit body (defaults, fallbacks, raise conditions, ordering, validation rules) grep / read the relevant code path to verify.
- If code does not match: must-fix. Either implement the promised behaviour (Option A) or rewrite the commit body to describe what the code actually does (Option B). Do not accept "this is just docs".
- Pre-release status does not downgrade this. Wire-format mutations are acceptable pre-release, but the commit body's promise about behaviour must still align with the diff.
Underscore-private exception that consumers must isinstance-check is de-facto public API
When a module defines private exception classes (_FooError, _BarError) and a consumer
must isinstance-check one of them to distinguish failure modes, that one exception is
already public API — the underscore is a lie.
Two-step signal to watch for:
- Multiple exception types exist behind a common wrapper (e.g.,
_PollFailure(exc)). - A consumer must inspect
.excand branch on its concrete type.
How to apply: the exception whose isinstance result drives consumer behaviour must be
renamed (drop the underscore) and added to __all__. Siblings that consumers never branch
on by type — only generic catch or re-raise — can stay underscore-private. Selective
promotion is the discipline; do not broadcast the whole hierarchy.
Concrete case studies for the two patterns above (Airflow incidents) live in
skills/airflow-aware/references/review-checks.md— read them when reviewing an Airflow diff and a worked example helps.
測試要預設用 parametrize——疊 assert 與近重複 test method 是 must-fix
寫測試時,預設就把「同一呼叫只差輸入 / 期望的多條 assert」與「只差輸入 / 期望的近重複 test method」收成 @pytest.mark.parametrize,不要等被要求。
Review 時看到以下情形,當 must-fix 退回:
- 疊 assert:同一個 test method 裡連續 assert 同性質的結果,失敗時只報第一條,看不出哪些 case 過 / 不過。
- 近重複 test method:複製貼上的 test,只差輸入與期望值,邏輯結構完全相同。
退回要求:拆成 @pytest.mark.parametrize,每個 case 用 pytest.param(..., id=...) 標清楚 case 名稱,方便失敗時一眼定位。
不要框框式區段分隔註解
程式碼與腳本不要用 # ---------- 橫幅或 # --- 區段名 --- 這類框框式區段分隔註解。
Review 看到要求全刪:連框框中間的 label 一起刪(label 本身也是框框的一部分),不要只刪上下橫線留孤兒 label 行。靠函式邊界、邏輯順序、空行分段即可。
Concurrent PR 根本修法優先
發現另一個 concurrent PR 在 SDK/library 層修同一問題的根本(例如改用 SDK 提供的 credential 類別,取代手動 URL path 拼接),而當前 PR 只在 test / workaround 層修症狀時——在 comment 裡肯定當前解法「not wrong」,但明確指向 SDK 層的根本修法,說明後者 ready 後當前 PR 會被 supersede。傾向推薦架構層解法而非繞路補丁。
Design integrity checks (references)
Three cross-cutting patterns that surface on framework/base API work and rebase workflows;
details and recipes in references/design-integrity.md:
- Base-layer completeness — base must be complete for all known downstreams before release; defer = must-fix.
- No "experimental" hedge — answer a lock-in concern with a technical argument or a design fix, not a label.
- Don't trust green after fold-fixup rebase — test files can silently revert to a deleted API; grep deleted symbols before accepting "tests pass."
What this skill does NOT cover
- Writing or editing code (reviewer has read-only tools)
- Running tests (verifier's job, separate skill)
- Style-only nits already auto-fixed by formatter (don't waste must-fix slots on formatter output)
Why this skill exists
Most code review fails the same way: reviewer reads the diff, says "looks good" or flags a few nits, and lets it through. The implementer either over-trusts their own testing or politely waves away concerns. Bugs ship.
Strict review fixes this by:
- Inverting the default — assume BLOCKED until proven OK, not OK until proven broken
- Making the checklist mandatory — same items every time, no "I'll trust my gut"
- Demanding evidence — "tested this" without
exit 0output doesn't count - Giving specific改法 — reviewer must show what "correct" looks like, not just point at problems
Waiver rules: extended rationale
Memory is input, not waiver
Cross-project memory entries (~/.config/maigo/memory/) can inform what counts
as a convention vs. a real bug. They cannot:
- Replace any of the 9 mandatory checklist items
- Lower the must-fix threshold ("user previously accepted X" is not evidence)
- Override evidence demands ("user prefers Y" needs a memory entry of type
project, notfeedback, to count as a convention claim)
If memory says "user prefers integration tests" → that's a project entry;
treat as part of checklist item 4. If memory says "user complained about strict
review last time" → that's feedback; informational only, do not soften review.
Existing repo content is input, not waiver
Prose already in this repo(agent 檔的口頭禪、人設範例台詞、example output)並非因為早就 commit 進來就被視為已驗證。
審視 diff 動到引述 / 台詞 / catchphrase 時,來源必須限定為:
- (a) 原作明文可查(含集數 / 場景)
- (b) 使用者親自於某 turn 確認過(可追溯)
- (c) maigo 自創且 explicitly 標明為自創
既有檔內已存在的引述也算——若無 (a)/(b)/(c) 出處,仍須 flag,即使該行不是本次 diff 動的。Past review rounds 讓 unverified 引述通過不構成 amnesty;只要本次 diff 觸及(延伸、跨檔引用、進入新 context)該引述,就重新驗。
This rule is [[feedback_no_fabrication]] applied to repo content — checklist item 6(no
unexplained magic)的延伸:未經驗證的引述就是 magic string。
Commit / staging state is not in review scope
Strict review evaluates the diff content, not whether it's been committed yet. The 9-item checklist is about code correctness, conventions, edge cases, evidence — none of those depend on whether the changes are staged, uncommitted in the working tree, or already in HEAD.
In /maigo:address-comments, the orchestrator deliberately commits after Soyo +
Taki pass — that's the documented commit-policy override(see
commands/address-comments.md
step 5). Soyo flagging「changes are uncommitted」/「git status shows M」/
「git diff main...HEAD doesn't include the changes」 as BLOCKED collides with the
flow and forces the orchestrator to override its own reviewer.
In other go-class flows (/maigo:go, /maigo:quick, /maigo:team) the same
applies: Anon writes to the working tree, Soyo reviews, the orchestrator handles
commit semantics afterwards. git status cleanliness is not a checklist item.
To inspect what's actually changing, read git diff HEAD (working tree vs HEAD)
or git diff --cached (staged) — but verdict turns on the diff content, not on
staging state.