2026-05-10
Kiali: a console for service meshes
Kiali is the management console for the Istio service mesh. It’s the screen you stare at when something is wrong on the mesh and you don’t yet know which service, which workload, which version, or which config rule is at fault. It pulls together Kubernetes metadata, Istio configuration, Prometheus metrics, and (optionally) distributed traces, and presents them as one unified view.
It is not a metrics store, a tracing backend, or a control plane. It’s a thin, opinionated UI on top of all of those.
The problem it solves
A service mesh adds a lot of moving parts: sidecar proxies, an istiod control plane, and a fan of CRDs (VirtualService, DestinationRule, AuthorizationPolicy, PeerAuthentication, Sidecar, Gateway, …) that interact in non-obvious ways. The same configuration knobs that give you canary releases and zero-trust mTLS also make the system hard to reason about under load.
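As a concrete taste of how these CRDs interact, here is a hedged sketch of a canary split: the `VirtualService` below only works because the `DestinationRule` beside it defines the `v1`/`v2` subsets it routes to. Names and weights are illustrative, not taken from any real cluster.

```yaml
# A VirtualService splitting traffic 90/10 between two subsets...
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews            # illustrative service name
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10
---
# ...and the DestinationRule that must define those subsets,
# or the routes above point at nothing.
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
```

Delete the `DestinationRule` (or rename a subset) and the `VirtualService` still applies cleanly, which is exactly the kind of silent cross-resource dependency that makes the mesh hard to reason about.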
Without a console, debugging becomes a multi-tool dance:
- `kubectl get virtualservice -A` to find the relevant config
- `istioctl analyze` to see if it’s valid
- A Prometheus query to check whether traffic is actually flowing
- A Jaeger search to see what an end-to-end request looks like
- `kubectl exec` into a sidecar to dump its Envoy config when nothing else explains the behavior
Kiali unifies those views. It won’t replace any of them entirely, but it cuts the time-to-find-it for roughly 80% of mesh issues.
What you actually do in Kiali
In rough order of how often each gets used:
- Service graph — the live topology of your mesh. Nodes are services, workloads, or apps; edges are the actual traffic, annotated with rate and error rate. Health is color-coded.
- Workload health — click any node to see its replicas, sidecar version, recent errors, and which Istio configs are touching it.
- Config validation — Kiali continuously validates Istio CRDs and flags conflicts: a `VirtualService` routing to a subset that doesn’t exist in any `DestinationRule`, two policies fighting over the same workload, etc. This alone justifies the install.
- Traffic shifting wizards — define canaries, A/B tests, fault injection, and timeouts via a UI that generates the YAML for you.
- mTLS overview — see which workloads are missing `PeerAuthentication`, where mTLS is forced vs. permissive, and where you have plaintext traffic you didn’t expect.
- Drill-out to traces and dashboards — every workload page links into the relevant Jaeger search and Grafana dashboard.
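For the mTLS overview to show what you want, workloads typically need a `PeerAuthentication` in scope. A minimal sketch (the namespace name is illustrative); in `STRICT` mode, plaintext traffic to these workloads is rejected, while `PERMISSIVE` accepts both:

```yaml
# Enforce mTLS for every workload in one namespace (illustrative namespace).
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: prod
spec:
  mtls:
    mode: STRICT   # switch to PERMISSIVE to allow plaintext during migration
```

Kiali’s mTLS view is essentially the rendered sum of these policies per namespace and workload.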
How Kiali gets its data
Kiali doesn’t generate telemetry. It’s a read-mostly aggregator that queries the systems already on your cluster:
Reading the diagram:
- Solid arrows are query paths Kiali walks: Kubernetes API for topology and Istio CRDs, Prometheus for sidecar metrics, Jaeger/Tempo for traces.
- Dashed arrows are the mesh’s own data plane: `istiod` watches K8s for config changes and pushes them to every sidecar via xDS. Sidecars are scraped by Prometheus for metrics and emit spans to the tracing backend.
- Kiali sits one layer above all of that as a pane of glass. It writes back to the K8s API only when you use a wizard to create or edit Istio CRDs.
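Those query paths are wired up in the Kiali CR when you use the Kiali operator. A minimal sketch, assuming in-cluster Prometheus and tracing endpoints; the URLs here are placeholders, and exact field layout varies by Kiali version:

```yaml
# Sketch of a Kiali CR pointing at the telemetry backends it queries.
# URLs are illustrative; check your cluster's actual service endpoints.
apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
  name: kiali
  namespace: istio-system
spec:
  external_services:
    prometheus:
      url: "http://prometheus.istio-system:9090"   # metrics for graph + health
    tracing:
      enabled: true                                # enables the trace drill-out links
```

If Kiali’s graph is blank but the mesh is clearly serving traffic, this block is the first place to look.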
What the service graph looks like
A typical Kiali graph view on a small mesh indicates health by node border color.
In the real UI, edges show RPS and error % live (e.g. 12 rps, 0.3% 5xx), and nodes can be grouped by app, version, or workload. The view updates every 15s.
Strengths
- Tight Istio integration. Wizards generate the exact YAML you’d otherwise hand-write — and validate it before applying.
- Live state, not declared state. The graph reflects actual traffic over a recent window, not just what’s deployed. That difference catches a lot of “config looks right but nothing’s flowing” bugs.
- Validation catches real bugs. A `VirtualService` referencing a non-existent subset, conflicting `AuthorizationPolicy` objects, mTLS misconfig — Kiali surfaces these without you knowing to look.
- Free, Apache-2.0, CNCF-incubating. No vendor lock-in; runs entirely in your cluster.
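The kind of bug the validator catches can be one line. In this sketch (illustrative names, and assuming no `DestinationRule` in the namespace defines a `v3` subset), the config applies cleanly but traffic silently blackholes; Kiali flags the dangling reference on the Istio Config page:

```yaml
# A VirtualService that kubectl accepts but Kiali's validator flags.
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews            # illustrative
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v3   # bug: no DestinationRule defines a "v3" subset
```

`kubectl apply` says nothing here, which is why the continuous validation matters.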
Limitations
- Istio-only. Linkerd users have Buoyant Cloud / `linkerd viz`; Cilium service mesh users have Hubble UI. Kiali doesn’t help you on those.
- Prometheus-bound. Without Prometheus (or a compatible remote-write target), most of Kiali’s views go blank — the graph generation, health, and validation all rely on metric queries.
- Cost on large meshes. Graph generation queries Prom across the selected namespaces and time window. On a 1000-service mesh, the wide-graph + long-window combination is expensive. In practice you scope by namespace and keep the time window short.
- Not a tracing tool. Kiali shows topology and aggregate metrics; for “what happened to this exact request,” you still need Jaeger or Tempo. Kiali just links you there.
- Power-user configs still need YAML. EnvoyFilters, complex routing trees, and lower-level mesh tuning aren’t in any wizard.
Kiali on OpenShift
If you’re on OpenShift, Service Mesh ships Kiali bundled with istiod, Jaeger, and Prometheus, all provisioned through the Service Mesh operator. The trade-off is being a release or two behind upstream Istio while Red Hat curates the bundle — for most enterprise users this is a feature, not a constraint. The Kiali experience itself is the same as upstream.
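On OpenShift the whole bundle is declared in one resource. A hedged sketch, assuming the Service Mesh v2 operator and its `ServiceMeshControlPlane` CRD; the version string and namespace are illustrative:

```yaml
# Sketch: provision the OpenShift Service Mesh bundle with Kiali enabled.
apiVersion: maistra.io/v2
kind: ServiceMeshControlPlane
metadata:
  name: basic
  namespace: istio-system
spec:
  version: v2.6        # illustrative; pinned by the operator channel you subscribe to
  addons:
    kiali:
      enabled: true    # the operator deploys and wires Kiali for you
```

The operator then creates and manages the Kiali CR itself, which is why the bundled versions lag upstream slightly.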
Where to start
- Install Istio + Kiali via the Kiali operator, the helm chart, or (on OpenShift) the Service Mesh operator.
- Deploy a sample app like
bookinfoand watch the graph populate. - Switch the graph to versioned-app view to see how Istio sees your services.
- Open the Istio Config page on day one — even on a clean install, Kiali usually flags something you didn’t know about.
- Add your own namespaces to the mesh one at a time. The graph will tell you immediately when sidecar injection is missing or when traffic isn’t going through the proxy.
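Adding a namespace to the mesh is usually just a label, and the graph makes it obvious when you forget it. A minimal sketch (the namespace name is illustrative):

```yaml
# Label a namespace so istiod injects sidecars into new pods created in it.
apiVersion: v1
kind: Namespace
metadata:
  name: my-app                   # illustrative
  labels:
    istio-injection: enabled     # existing pods need a restart to pick up the sidecar
```

Pods created before the label (or never restarted after it) show up in Kiali as traffic bypassing the proxy, which is exactly the signal the step above describes.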
The common Kiali mistake is treating it as just a topology dashboard. The real value is in the validation and the wizards — those are what save you at 3am when a production rollout is going sideways and the YAML you applied isn’t doing what you expected.