The Observability Blind Spot
Your application traces show you the full journey of a request: from the API gateway through service meshes, database queries, and cache lookups. Every hop is instrumented. Every latency spike is visible.
Then an AI agent makes a decision. Your trace shows that the agent was called and that it returned a response. Everything in between (the governance evaluation, the policy checks, the decision rationale) is a black box. If the agent was throttled by a governance policy, your application trace just shows an unexplained delay. If a request was denied, your team sees a 403 with no context about which policy triggered it or why.
Today we are launching OpenTelemetry (OTEL) integration for MeshGuard, closing the observability gap between your application layer and your governance layer.
What Gets Traced
When OTEL export is enabled, MeshGuard emits spans for every governance operation:
- Policy evaluation spans showing which policies were checked, the evaluation result, and the time spent in each rule.
- Agent registration and heartbeat spans for tracking agent lifecycle events.
- Alert dispatch spans recording when alerts are triggered and how long delivery takes to each channel.
- Audit log write spans capturing the latency and outcome of compliance logging.
Each span carries attributes that make filtering and correlation straightforward:
meshguard.agent.id: "ag_92kLmx"
meshguard.agent.name: "order-processor-v2"
meshguard.policy.name: "rate-limit-orders"
meshguard.policy.decision: "allow"
meshguard.evaluation.duration_ms: 4.2
meshguard.workspace: "prod"
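To give a feel for how these attributes support filtering, here is a minimal sketch that models exported spans as plain dictionaries and selects the denied decisions for one agent. The span records, the `find_spans` helper, the second agent ID, and the `spend-cap` policy are all hypothetical, and the sketch assumes a denied decision is recorded as "deny" (only "allow" appears above). In practice you would run the equivalent query in your trace backend.

```python
from typing import Any

# Hypothetical span records keyed by the attribute names listed above.
# The "deny" value, the second agent ID, and "spend-cap" are assumptions.
spans: list[dict[str, Any]] = [
    {
        "meshguard.agent.id": "ag_92kLmx",
        "meshguard.policy.name": "rate-limit-orders",
        "meshguard.policy.decision": "allow",
        "meshguard.workspace": "prod",
    },
    {
        "meshguard.agent.id": "ag_92kLmx",
        "meshguard.policy.name": "spend-cap",
        "meshguard.policy.decision": "deny",
        "meshguard.workspace": "prod",
    },
    {
        "meshguard.agent.id": "ag_hypo01",
        "meshguard.policy.name": "rate-limit-orders",
        "meshguard.policy.decision": "deny",
        "meshguard.workspace": "prod",
    },
]


def find_spans(records: list[dict[str, Any]], criteria: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the records whose attributes match every key/value in criteria."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]


# All denied decisions for one agent in the sample data.
denied_for_agent = find_spans(
    spans,
    {"meshguard.agent.id": "ag_92kLmx", "meshguard.policy.decision": "deny"},
)
```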
Enabling the Integration
Configure the OTEL exporter in your MeshGuard workspace settings or via the CLI:
meshguard otel configure \
  --endpoint https://otel-collector.internal:4317 \
  --protocol grpc \
  --workspace prod
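MeshGuard supports OTLP over both gRPC and HTTP, so it works with any OTEL-compatible backend: Jaeger, Grafana Tempo, Datadog, Honeycomb, New Relic, or your own collector. Port 4317, used in the command above, is the conventional OTLP/gRPC port; collectors conventionally accept OTLP over HTTP on 4318.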
For SDK-level tracing, enable the OTEL hook when initializing the client:
from meshguard import Client

client = Client(
    api_key="mg_api_key_here",
    otel_enabled=True,  # turn on span export for SDK calls
    otel_endpoint="https://otel-collector.internal:4317",  # your OTLP collector
)
Once enabled, every SDK call automatically creates child spans under your application's active trace context. A single distributed trace now shows the full picture: your application logic, the governance evaluation, and the downstream effects.
Why This Matters for Governance
Observability is not just an operational convenience. It directly strengthens your governance posture in three ways.
Incident response. When an agent behaves unexpectedly, you can trace the governance decision chain in the same tool your on-call team already uses. No context switching between dashboards. The policy evaluation is right there in the trace, next to the application span that triggered it.
Performance budgeting. Governance adds latency. With OTEL spans, you can measure exactly how much. If policy evaluations are adding 50 ms to a latency-sensitive path, you can identify which rules are expensive and optimize them. Without tracing, that latency is invisible overhead.
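As a rough sketch of that analysis, the snippet below ranks policies by time spent and expresses governance time as a share of the request. The sample durations, the `pii-redaction` and `spend-cap` policy names, and the `cost_by_policy` helper are hypothetical, and summing durations assumes the evaluations ran one after another rather than in parallel.

```python
from collections import defaultdict

# Hypothetical policy evaluation spans recorded for a single request.
evaluations = [
    {"meshguard.policy.name": "rate-limit-orders", "meshguard.evaluation.duration_ms": 4.0},
    {"meshguard.policy.name": "pii-redaction", "meshguard.evaluation.duration_ms": 36.0},
    {"meshguard.policy.name": "spend-cap", "meshguard.evaluation.duration_ms": 10.0},
]

# Duration of the enclosing application span (assumed value).
request_duration_ms = 200.0


def cost_by_policy(spans):
    """Total evaluation time per policy, most expensive first."""
    totals = defaultdict(float)
    for span in spans:
        totals[span["meshguard.policy.name"]] += span["meshguard.evaluation.duration_ms"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranked = cost_by_policy(evaluations)
governance_ms = sum(ms for _, ms in ranked)
overhead_pct = 100.0 * governance_ms / request_duration_ms

for name, ms in ranked:
    print(f"{name}: {ms:.1f} ms")
print(f"governance overhead: {overhead_pct:.1f}% of request latency")
```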
Audit correlation. Compliance teams can link a specific governance decision to the exact application request that triggered it, complete with timestamps, trace IDs, and the full decision rationale. This level of traceability is increasingly expected by auditors reviewing AI system governance.
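The correlation itself is a join on trace ID. Here is a minimal sketch, assuming both application requests and governance decisions have been exported as records carrying a `trace_id` field. The record shapes, the sample IDs, the "deny" value, and the `correlate` helper are hypothetical.

```python
# Hypothetical exported records; every field other than the meshguard.*
# attributes is an assumption made for illustration.
app_requests = [
    {"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736", "route": "POST /orders", "status": 403},
    {"trace_id": "0af7651916cd43dd8448eb211c80319c", "route": "GET /orders/17", "status": 200},
]
governance_decisions = [
    {
        "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
        "meshguard.policy.name": "rate-limit-orders",
        "meshguard.policy.decision": "deny",
    },
    {
        "trace_id": "0af7651916cd43dd8448eb211c80319c",
        "meshguard.policy.name": "rate-limit-orders",
        "meshguard.policy.decision": "allow",
    },
]


def correlate(requests, decisions):
    """Attach to each request the governance decisions sharing its trace ID."""
    by_trace = {}
    for decision in decisions:
        by_trace.setdefault(decision["trace_id"], []).append(decision)
    return [{**req, "decisions": by_trace.get(req["trace_id"], [])} for req in requests]


report = correlate(app_requests, governance_decisions)

# Requests where at least one policy denied, with the policy that did it.
denied_requests = [
    (row["route"], d["meshguard.policy.name"])
    for row in report
    for d in row["decisions"]
    if d["meshguard.policy.decision"] == "deny"
]
```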
Dashboards and Alerts
MeshGuard ships pre-built dashboard templates for Grafana and Datadog that visualize governance trace data out of the box:
- Policy evaluation latency (p50, p95, p99)
- Deny rate by agent and policy
- Alert dispatch success rate
- Governance overhead as a percentage of total request latency
Import the templates from docs.meshguard.app/otel/dashboards and point them at your trace backend.
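If you want to sanity-check a dashboard panel against raw span data, the two simplest metrics can be reproduced in a few lines. This sketch uses the nearest-rank percentile definition and made-up sample values. Your backend may interpolate between samples, so its numbers can differ slightly, and the "deny" value is an assumption.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical meshguard.evaluation.duration_ms samples and decisions.
durations_ms = [3.1, 4.2, 4.8, 5.0, 5.5, 6.1, 7.4, 9.0, 12.3, 48.0]
decisions = ["allow"] * 8 + ["deny"] * 2

p50 = percentile(durations_ms, 50)
p95 = percentile(durations_ms, 95)
p99 = percentile(durations_ms, 99)
deny_rate = decisions.count("deny") / len(decisions)
```

With only ten samples, p95 and p99 both land on the single slowest evaluation, which is a reminder that tail percentiles need a reasonable sample size to be meaningful.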
Get Started
OTEL export is available now on all plans. Enable it in your workspace settings or follow the integration guide. If you are already running an OpenTelemetry collector, setup takes less than five minutes.
Your governance layer should be as observable as the rest of your stack. Now it is.