Skip to content

JavaJourneyer/ai-pod-explainer

Repository files navigation

AI-Powered Pod Explainer

A tiny AI + Infra CLI: inspects your Kubernetes pods and asks an LLM to produce an SRE-style, actionable health summary.

demo

Why?

  • Showcases platform engineering + AI in a compact demo.
  • Demo-friendly: works with a live cluster or a provided sample_pods.json.
  • LinkedIn-ready: screenshot the deterministic + AI summaries side-by-side.

Quickstart (2–10 minutes)

# 1) Create a venv and install deps
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# 2) (Option A) Run against your current K8s context
python ai_pod_explainer.py -A --provider openrouter --model "anthropic/claude-3.5-haiku"

#    (Option B) No cluster? Use the sample data
python ai_pod_explainer.py --from-file sample_pods.json --no-ai

# 3) Configure AI (optional, for the AI summary)
#    OpenRouter:
export OPENROUTER_API_KEY=YOUR_KEY

#    OpenAI:
# export OPENAI_API_KEY=YOUR_KEY

# Re-run with AI enabled:
python ai_pod_explainer.py --from-file sample_pods.json --provider openrouter --model "anthropic/claude-3.5-haiku"

Tip: If you don't want AI at all, just use --no-ai and screenshot the deterministic summary.

What It Does

  • Executes kubectl get pods -A -o json (or loads sample_pods.json).
  • Aggregates: phase counts, NotReady pods, CrashLoopBackOff, ImagePullBackOff, OOMKilled, high restarts.
  • Highlights "hot" namespaces and example pods.
  • (Optional) Sends a compact snapshot to your LLM provider for an SRE-style summary.

Output Example

=== Deterministic Summary (no AI) ===
Quick health summary:
- Total pods: 42 (by phase: Running:39, Pending:2, Failed:1)
- Not Ready pods: 4
- CrashLoopBackOff: 2 (check config/env/image)
- Image pull issues: 1 (ImagePullBackOff/ErrImagePull)
- High restarts (>=3): 3
- Hot namespaces:
  - payments: total=10, not_ready=3, crashloop=1, pending=0
  - auth: total=6, not_ready=1, crashloop=1, pending=1
- Sample pods:
  - payments/api-7d9f... phase=Running ready=False restarts=5 age=2h issues=CrashLoopBackOff
  - auth/worker-9c8... phase=Pending ready=False restarts=0 age=12m
  ...
=== AI Summary ===
• Config regression in `payments` likely causing CrashLoopBackOff on `api`.  
• One image pull failure suggests tag mismatch or registry access issue.  
• Multiple pods restarting ≥3 times hint at unstable startup (env/secrets).  
Next:  
1) `kubectl -n payments describe pod <pod>` and check env/secrets diff.  
2) Verify image tags & credentials; `kubectl -n <ns> get events | grep -i image`.  
3) For OOMKilled, raise memory requests/limits and inspect heap.  

Built on 2025-10-11.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages