agent init
skills/debugging/incident-response-stabilization.md

# Incident Response and Stabilization

## Purpose

Guide high-pressure response to live or high-impact issues by separating immediate stabilization from deeper root-cause correction.

## When to use

- A production issue is actively impacting users or operators
- A regression needs containment before a complete fix is ready
- The team needs a calm sequence for triage, mitigation, and follow-up
- Communication and operational clarity matter as much as code changes

## Inputs to gather

- Current symptoms, severity, affected users, and timing
- Available logs, metrics, alerts, dashboards, and recent changes
- Safe rollback, feature flag, degrade, or traffic-shaping options
- Stakeholders who need updates and what they need to know
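
One way to make this list concrete is to capture the inputs in a small record and check which ones are still missing. The sketch below is only an illustration: `IncidentIntake`, its field names, and `missing_inputs` are hypothetical names chosen for this example, not a prescribed schema.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class IncidentIntake:
    """Hypothetical intake record; the fields mirror the inputs listed above."""
    symptoms: str = ""
    severity: str = ""
    affected_users: str = ""
    started_at: str = ""
    evidence: List[str] = field(default_factory=list)             # logs, metrics, alerts, dashboards
    recent_changes: List[str] = field(default_factory=list)
    containment_options: List[str] = field(default_factory=list)  # rollback, flag, degrade, traffic shaping
    stakeholders: Dict[str, str] = field(default_factory=dict)    # who -> what they need to know

    def missing_inputs(self) -> List[str]:
        """Return the names of fields that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


intake = IncidentIntake(
    symptoms="checkout requests time out",
    severity="high",
    containment_options=["roll back latest deploy"],
)
print(intake.missing_inputs())
```

A gap in this record is not a reason to delay containment; it simply shows what to ask for next.
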

## How to work

- Stabilize user impact first if a safe containment path exists.
- Keep mitigation, diagnosis, and communication distinct but coordinated.
- Prefer reversible steps under uncertainty.
- Record what is confirmed versus assumed while the incident is active.
- After stabilization, convert the incident into structured debugging and prevention work.
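
Two of the points above, preferring reversible steps and separating confirmed facts from assumptions, can be sketched in a few lines. This is a minimal illustration under stated assumptions: `Mitigation`, `pick_mitigation`, `IncidentLog`, and the 1 to 5 risk estimate are all hypothetical and not part of any real tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Mitigation:
    name: str
    reversible: bool
    risk: int  # illustrative estimate from 1 (low) to 5 (high), not a standard scale


def pick_mitigation(options: List[Mitigation]) -> Optional[Mitigation]:
    """Prefer reversible steps, then lower risk; return None if there are no options."""
    if not options:
        return None
    # False sorts before True, so reversible options win before risk is compared.
    return min(options, key=lambda m: (not m.reversible, m.risk))


@dataclass
class IncidentLog:
    confirmed: List[str] = field(default_factory=list)
    assumed: List[str] = field(default_factory=list)

    def confirm(self, note: str) -> None:
        """Move a note to the confirmed list once evidence supports it."""
        if note in self.assumed:
            self.assumed.remove(note)
        if note not in self.confirmed:
            self.confirmed.append(note)


options = [
    Mitigation("hotfix database schema", reversible=False, risk=2),
    Mitigation("disable feature flag", reversible=True, risk=1),
    Mitigation("roll back deploy", reversible=True, risk=3),
]
choice = pick_mitigation(options)
print(choice.name)  # disable feature flag

log = IncidentLog(assumed=["latest deploy caused the errors"])
log.confirm("latest deploy caused the errors")
print(log.confirmed, log.assumed)
```

Sorting on reversibility before risk is a deliberate simplification: under uncertainty, a step that can be undone is treated as safer than a lower-risk step that cannot.
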

## Output expectations

- Stabilization plan or incident response summary
- Clear mitigation status and next actions
- Follow-up work for root cause, observability, and prevention

## Quality checklist

- User impact reduction is prioritized appropriately.
- Risky irreversible changes are avoided under pressure.
- Communication is clear enough for collaborators to act.
- Post-incident follow-up is not lost after immediate recovery.

## Handoff notes

- Note what was mitigated versus actually fixed.
- Pair with debugging workflow and observability once the system is stable enough for deeper work.
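
A handoff note that keeps "mitigated" and "fixed" in separate sections makes the distinction hard to blur. The sketch below shows one possible shape; `HandoffSummary` and its fields are hypothetical names used for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HandoffSummary:
    mitigated: List[str] = field(default_factory=list)   # impact contained, cause may remain
    fixed: List[str] = field(default_factory=list)       # root cause actually corrected
    follow_ups: List[str] = field(default_factory=list)  # root cause, observability, prevention

    def render(self) -> str:
        """Render a plain-text summary; empty sections are shown explicitly."""
        def section(title: str, items: List[str]) -> List[str]:
            return [f"{title}:"] + [f"- {item}" for item in (items or ["(none)"])]

        lines = (
            section("Mitigated", self.mitigated)
            + section("Fixed", self.fixed)
            + section("Follow-up", self.follow_ups)
        )
        return "\n".join(lines)


summary = HandoffSummary(
    mitigated=["checkout timeouts stopped after disabling the new flag"],
    follow_ups=["find root cause of the timeout", "add alert on checkout latency"],
)
print(summary.render())
```

Printing "(none)" under "Fixed" is intentional here: it tells the next person that recovery so far rests on containment alone.
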