

k8s rollout: chukwa pod hangs in Terminating under strategy=Recreate

rejected 184cdd39-219a-4143-af8e-684720fc79b5

created_at: 2026-04-26
updated_at: 2026-04-26
priority: P3
ticket_type: chore
labels: k8s, deploy
resolved_at: 2026-04-26
resolution: rejected

Body

Surfaced from scenario-store ticket 7d14ef0b Phase H deploy.

The chukwa Deployment uses strategy: Recreate (single-writer kernel, so no parallel replicas). On the Phase H deploy, the prior pod (chukwa-67758ff9cd-r5qtv) hung in Terminating for ~115s, which blocked the new pod from scheduling. The kubelet reported no specific reason. Clearing it required kubectl delete pod --grace-period=0 --force, after which the new pod came up immediately and the rollout completed cleanly.
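
One rough way to check whether a Terminating pod has outlived its grace period is sketched below in Python. It assumes the pod object comes from kubectl get pod -o json, and that metadata.deletionTimestamp marks the end of the grace period (delete request time plus grace period), which is how the Kubernetes API reference describes the field. seconds_past_deadline is a hypothetical helper for illustration, not existing tooling.

```python
from datetime import datetime, timezone
from typing import Optional


def seconds_past_deadline(pod: dict, now: datetime) -> Optional[float]:
    """Return how many seconds `now` is past the pod's deletion deadline.

    Returns None if the pod has no deletionTimestamp (it is not terminating).
    A negative value means the grace period has not yet elapsed.
    Assumption: deletionTimestamp uses second precision with a trailing "Z".
    """
    ts = pod.get("metadata", {}).get("deletionTimestamp")
    if ts is None:
        return None
    deadline = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return (now - deadline).total_seconds()
```

A clearly positive result would suggest the pod is stuck beyond a slow graceful shutdown, which is the situation where the forced delete was needed.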

This recurred on the post-Phase-J hash-fix deploy: a different prior pod hung in Terminating for ~30s before resolving on its own. The hang is either intermittent or correlated with in-flight HTTP connections to /tickets/watch, which are long-lived NDJSON streams.
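
If the correlation with /tickets/watch holds, one possible mitigation is to have open streams end themselves when the process receives SIGTERM. The Python sketch below illustrates the idea only. The ticket does not say how chukwa implements the endpoint, so watch_stream, next_event, and the final shutdown line are all hypothetical.

```python
import json
import signal
import threading
from typing import Callable, Iterator, Optional

# Set once shutdown is requested; every open stream checks it.
shutdown = threading.Event()


def install_sigterm_handler() -> None:
    """Flip the shutdown flag on SIGTERM (must be called from the main thread)."""
    signal.signal(signal.SIGTERM, lambda signum, frame: shutdown.set())


def watch_stream(
    next_event: Callable[[], Optional[dict]],
    poll_interval: float = 0.5,
) -> Iterator[str]:
    """Yield NDJSON lines until shutdown is requested.

    next_event is a hypothetical callable returning the next event,
    or None when nothing is pending.
    """
    while not shutdown.is_set():
        event = next_event()
        if event is None:
            # Wait with a timeout so a shutdown request is noticed promptly.
            shutdown.wait(poll_interval)
            continue
        yield json.dumps(event) + "\n"
    # Hypothetical final line telling clients to reconnect elsewhere.
    yield json.dumps({"type": "shutdown"}) + "\n"
```

With this shape, a stream that is idle when SIGTERM arrives closes within roughly poll_interval, rather than holding the connection open until the grace period expires.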

Proposed fix:

Acceptance:

History (4 events)
