Skills › Business & Commerce › Dashboards & analytics
dashboard-builder
Build monitoring dashboards that answer real operator questions for Grafana, SigNoz, and similar platforms. Use when turning metrics into a working dashboard instead of a vanity board.
The full skill
—
name: dashboard-builder
description: Build monitoring dashboards that answer real operator questions for Grafana, SigNoz, and similar platforms. Use when turning metrics into a working dashboard instead of a vanity board.
origin: ECC direct-port adaptation
version: "1.0.0"
—
# Dashboard Builder
Use this when the task is to build a dashboard people can operate from.
The goal is not "show every metric." The goal is to answer:
– is it healthy?
– where is the bottleneck?
– what changed?
– what action should someone take?
## When to Use
– "Build a Kafka monitoring dashboard"
– "Create a Grafana dashboard for Elasticsearch"
– "Make a SigNoz dashboard for this service"
– "Turn this metrics list into a real operational dashboard"
## Guardrails
– do not start from visual layout; start from operator questions
– do not include every available metric just because it exists
– do not mix health, throughput, and resource panels without structure
– do not ship panels without titles, units, and sane thresholds
## Workflow
### 1. Define the operating questions
Organize around:
– health / availability
– latency / performance
– throughput / volume
– saturation / resources
– service-specific risk
### 2. Study the target platform schema
Inspect existing dashboards first:
– JSON structure
– query language
– variables
– threshold styling
– section layout
### 3. Build the minimum useful board
Recommended structure:
1. overview
2. performance
3. resources
4. service-specific section
### 4. Cut vanity panels
Every panel should answer a real question. If it does not, remove it.
## Example Panel Sets
### Elasticsearch
– cluster health
– shard allocation
– search latency
– indexing rate
– JVM heap / GC
### Kafka
– broker count
– under-replicated partitions
– messages in / out
– consumer lag
– disk and network pressure
### API gateway / ingress
– request rate
– p50 / p95 / p99 latency
– error rate
– upstream health
– active connections
## Quality Checklist
– [ ] valid dashboard JSON
– [ ] clear section grouping
– [ ] titles and units are present
– [ ] thresholds/status colors are meaningful
– [ ] variables exist for common filters
– [ ] default time range and refresh are sensible
– [ ] no vanity panels with no operator value
## Related Skills
– `research-ops`
– `backend-patterns`
– `terminal-ops`