Update README.md
README.md
@@ -6,10 +6,8 @@ AI-powered Kubernetes monitoring and diagnostics tool that automatically detects
 - 🤖 **AI-Powered Diagnostics**: Uses LLM with function calling to analyze Kubernetes issues
 - 🔍 **Smart Event Detection**: Monitors pods and nodes for problems (CrashLoopBackOff, NotReady, high restarts, etc.)
-- 🧠 **Correlation Engine**: Detects patterns like mass pod failures and diagnoses the root cause (node issues) instead of individual symptoms
+- 🧠 **Correlation Engine**: Detects patterns like mass pod failures and diagnoses the root cause (node issues)
-- 📱 **Telegram Notifications**: Optional real-time alerts with automatic "resolved" status updates
+- 📱 **Telegram Notifications**: Optional real-time alerts
-- ⚡ **Concurrent Processing**: Configurable parallel AI diagnosis requests with semaphore-based rate limiting
-- 🎯 **Resource-Aware**: Tracks pod CPU/memory requests and limits for better diagnostics
 
 ## How It Works
 
@@ -17,7 +15,6 @@ AI-powered Kubernetes monitoring and diagnostics tool that automatically detects
 2. **Correlates** issues (e.g., if 5+ pods fail on same node within 60s, diagnose the node)
 3. **Analyzes** using AI with access to k8s API tools (get pod details, logs, node status)
 4. **Notifies** via Telegram with structured diagnostic reports
-5. **Tracks** issue resolution and updates notifications automatically
 
 ## Installation
 
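The correlation step in the hunk above (if 5+ pods fail on the same node within 60s, diagnose the node rather than each pod) can be sketched in Rust. This is an illustrative sketch, not the tool's actual code: the names `PodFailure`, `correlate`, and the constants are hypothetical, with the threshold and window taken from the README's example.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Illustrative thresholds matching the README's "5+ pods within 60s" rule.
const MASS_FAILURE_THRESHOLD: usize = 5;
const WINDOW: Duration = Duration::from_secs(60);

// Hypothetical record of one observed pod failure.
struct PodFailure {
    node: String,
    at: Instant,
}

/// Returns the nodes whose failure count inside the window meets the
/// threshold; those nodes would be diagnosed instead of individual pods.
fn correlate(failures: &[PodFailure], now: Instant) -> Vec<String> {
    let mut per_node: HashMap<&str, usize> = HashMap::new();
    for f in failures {
        // Count only failures that happened within the sliding window.
        if now.duration_since(f.at) <= WINDOW {
            *per_node.entry(f.node.as_str()).or_insert(0) += 1;
        }
    }
    let mut nodes: Vec<String> = per_node
        .into_iter()
        .filter(|&(_, count)| count >= MASS_FAILURE_THRESHOLD)
        .map(|(node, _)| node.to_string())
        .collect();
    nodes.sort();
    nodes
}
```

With five recent failures on `node-a` and one on `node-b`, only `node-a` crosses the threshold and becomes the diagnosis target.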
@@ -40,10 +37,10 @@ cp config.toml.example config.toml
 # API Configuration (OpenAI-compatible)
 api_base = "http://localhost:11434/v1" # Ollama, vLLM, etc.
 api_key = "your-key"
-model = "qwen3-tools:latest"
+model = "qwen3-tools:latest" # any model supporting tools
 
 # Concurrency
-max_concurrent_diagnoses = 1
+max_concurrent_diagnoses = 1 # parallel OpenAI API requests
 
 # Telegram (optional)
 telegram_bot_token = "your-bot-token"
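The `max_concurrent_diagnoses` setting in the config hunk above bounds how many AI requests run in parallel. A minimal std-only sketch of that limiting idea, using fixed-size batches of scoped threads rather than the tool's actual semaphore, is shown below; `diagnose` and `run_diagnoses` are hypothetical stand-ins, not the project's API.

```rust
use std::thread;

// Stand-in for the real AI diagnosis call.
fn diagnose(issue: &str) -> String {
    format!("report for {issue}")
}

/// Runs diagnoses with at most `max_concurrent` in flight at a time.
/// Batching is a simplification of semaphore-based limiting: each batch
/// spawns up to `max_concurrent` threads and joins them before the next.
fn run_diagnoses(max_concurrent: usize, issues: &[String]) -> Vec<String> {
    let mut reports = Vec::new();
    for batch in issues.chunks(max_concurrent.max(1)) {
        let batch_reports: Vec<String> = thread::scope(|s| {
            // Spawn the whole batch first so it actually runs in parallel.
            let handles: Vec<_> = batch
                .iter()
                .map(|issue| s.spawn(move || diagnose(issue)))
                .collect();
            handles.into_iter().map(|h| h.join().unwrap()).collect()
        });
        reports.extend(batch_reports);
    }
    reports
}
```

With `max_concurrent = 1`, as in the example config, diagnoses run strictly one at a time, in order.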
@@ -66,7 +63,7 @@ KUBECONFIG=/path/to/kubeconfig cargo run --release
 ## Requirements
 
 - Rust 1.70+
-- Kubernetes cluster access (via kubeconfig)
+- Kubernetes cluster access (via kubeconfig or kubernetes service account)
 - OpenAI-compatible LLM endpoint with function calling support
 - Tested with: Ollama (qwen3-tools, devstral-small-2)
 - Should work with: OpenAI, vLLM, etc.
@@ -89,4 +86,4 @@ KUBECONFIG=/path/to/kubeconfig cargo run --release
 
 ## License
 
-MIT
+WTFPL