Overview

The Secure MCP Gateway uses OpenTelemetry as its primary observability framework, providing:
  • Structured Logging via OTLP log export to Loki
  • Distributed Tracing with context propagation to Jaeger
  • Metrics Collection with Prometheus export
  • Unified Telemetry through the OpenTelemetry Collector

OpenTelemetry Provider

The OpenTelemetryProvider implements full OpenTelemetry support.
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py

Key Features

  • OTLP Export: gRPC and HTTP protocols supported
  • Resource Attributes: Service name, version, environment metadata
  • Batch Processing: Efficient batching of telemetry data
  • Connectivity Check: Automatic endpoint validation on startup
  • Graceful Degradation: Falls back to no-op if collector unavailable

Provider Implementation

class OpenTelemetryProvider(TelemetryProvider):
    def __init__(self, config: dict[str, Any] | None = None):
        self._initialized = False
        self._logger = None
        self._tracer = None
        self._meter = None
        self._resource = None
        
        if config:
            self.initialize(config)
    
    def initialize(self, config: dict[str, Any]) -> TelemetryResult:
        # Extract configuration
        enabled = self._check_telemetry_enabled(config)
        endpoint = config.get("url", "http://localhost:4317")
        insecure = config.get("insecure", True)
        service_name = config.get("service_name", "secure-mcp-gateway")
        job_name = config.get("job_name", "enkryptai")
        
        if enabled:
            self._setup_enabled_telemetry(
                endpoint, insecure, service_name, job_name, config
            )
        else:
            self._setup_disabled_telemetry()
        
        self._initialized = True
        return TelemetryResult(success=True, provider_name=self.name)

Configuration

Basic Configuration

Add telemetry configuration to enkrypt_mcp_config.json:
{
  "plugins": {
    "telemetry": {
      "provider": "opentelemetry",
      "config": {
        "enabled": true,
        "url": "http://localhost:4317",
        "insecure": true,
        "service_name": "secure-mcp-gateway",
        "job_name": "enkryptai"
      }
    }
  },
  "common_mcp_gateway_config": {
    "enkrypt_log_level": "INFO"
  }
}

Configuration Options

  • enabled (boolean, default: true): Enables OpenTelemetry telemetry. When false, the provider uses no-op implementations.
  • url (string, default: "http://localhost:4317"): OTLP endpoint URL. Supports:
    • gRPC: http://localhost:4317 (default)
    • HTTP: http://localhost:4318
  • insecure (boolean, default: true): Uses an insecure connection (no TLS). Set to false for production with TLS.
  • service_name (string, default: "secure-mcp-gateway"): Service name in resource attributes. Used for filtering in Grafana and Jaeger.
  • job_name (string, default: "enkryptai"): Job name for Prometheus metrics and resource attributes.
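
The options above can be resolved into one settings object before the provider is initialized. The following is a minimal sketch, not the gateway's own loader: TelemetrySettings and resolve_telemetry_settings are hypothetical names, and the defaults simply mirror the documented ones.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class TelemetrySettings:
    """Hypothetical container for the documented telemetry options."""

    enabled: bool = True
    url: str = "http://localhost:4317"
    insecure: bool = True
    service_name: str = "secure-mcp-gateway"
    job_name: str = "enkryptai"


def resolve_telemetry_settings(gateway_config: dict[str, Any]) -> TelemetrySettings:
    """Read plugins.telemetry.config and fill in the documented defaults."""
    raw = gateway_config.get("plugins", {}).get("telemetry", {}).get("config", {})
    # Ignore keys this sketch does not know about instead of failing on them.
    known = {k: v for k, v in raw.items() if k in TelemetrySettings.__dataclass_fields__}
    return TelemetrySettings(**known)


settings = resolve_telemetry_settings(
    {"plugins": {"telemetry": {"config": {"url": "http://localhost:4318"}}}}
)
print(settings.url, settings.service_name)
```

Only the keys present in the file override a default, so a config that sets just "url" still gets the documented service and job names.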

Production Configuration

For production deployments with TLS:
{
  "plugins": {
    "telemetry": {
      "provider": "opentelemetry",
      "config": {
        "enabled": true,
        "url": "https://otel-collector.example.com:4318",
        "insecure": false,
        "service_name": "secure-mcp-gateway-prod",
        "job_name": "production"
      }
    }
  },
  "common_mcp_gateway_config": {
    "enkrypt_log_level": "WARNING"
  }
}

OpenTelemetry Collector Setup

Using Docker Compose

The gateway includes a complete observability stack:
cd infra/
docker-compose up -d
This starts:
  • OpenTelemetry Collector (ports 4317, 4318, 8889)
  • Jaeger (port 16686)
  • Loki (port 3100)
  • Prometheus (port 9090)
  • Grafana (port 3000)

Collector Configuration

Location: infra/otel_collector/otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024

exporters:
  # Traces to Jaeger
  otlp:
    endpoint: jaeger:4317
    tls:
      insecure: true
  
  # Logs to Loki
  otlphttp/loki:
    endpoint: "http://loki:3100/otlp"
    tls:
      insecure: true
  
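  # Metrics to Prometheus
  prometheus:
    endpoint: "0.0.0.0:8889"
    namespace: "otel"
    const_labels:
      service_name: "secure-mcp-gateway"

  # Console output; must be defined because every pipeline below references it
  debug:
    verbosity: basic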

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp, debug]
    
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus, debug]
    
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/loki, debug]

Manual Installation

If not using Docker Compose:
# Download collector
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.134.1/otelcol-contrib_0.134.1_linux_amd64.tar.gz
tar -xzf otelcol-contrib_0.134.1_linux_amd64.tar.gz

# Create config file (use example above)
vim otel-collector-config.yaml

# Run collector
./otelcol-contrib --config=otel-collector-config.yaml

Distributed Tracing

Trace Context Propagation

The gateway automatically propagates trace context across operations:
from secure_mcp_gateway.plugins.telemetry import get_telemetry_config_manager

telemetry_manager = get_telemetry_config_manager()
tracer = telemetry_manager.get_tracer()

# Values normally taken from the incoming request
server_name = "github_server"
tool_name = "create_issue"

# Create a span; spans started inside this block become its children
with tracer.start_as_current_span("tool_execution") as span:
    span.set_attribute("server_name", server_name)
    span.set_attribute("tool_name", tool_name)
    span.set_attribute("user_id", user_id)  # user_id comes from the caller's context
    
    # Execute operation (execute_tool and args stand in for the actual tool call)
    result = execute_tool(server_name, tool_name, args)
    
    # Add result attributes
    span.set_attribute("success", result.success)
    span.set_attribute("duration_ms", result.duration)

Trace Attributes

Common attributes used in gateway traces:
Attribute   | Type    | Description
server_name | string  | MCP server name
tool_name   | string  | Tool being executed
user_id     | string  | User identifier
project_id  | string  | Project identifier
custom_id   | string  | Request correlation ID
duration_ms | int     | Operation duration
success     | boolean | Operation success status
error_type  | string  | Error type if failed
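
One way to set these attributes consistently is a small wrapper around the span. The sketch below is illustrative only: traced_call is a hypothetical helper, and RecordingTracer and RecordingSpan are stand-ins so the example runs without a collector. In the gateway, the tracer returned by get_tracer() would take their place, assuming it exposes the same start_as_current_span() and set_attribute() calls used earlier.

```python
import time
from contextlib import contextmanager
from typing import Any, Callable


class RecordingSpan:
    """Stand-in span for this sketch; it only stores attributes."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.attributes: dict[str, Any] = {}

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


class RecordingTracer:
    """Stand-in tracer that keeps every span it creates."""

    def __init__(self) -> None:
        self.spans: list[RecordingSpan] = []

    @contextmanager
    def start_as_current_span(self, name: str):
        span = RecordingSpan(name)
        self.spans.append(span)
        yield span


def traced_call(tracer: Any, name: str, attributes: dict[str, Any], fn: Callable[[], Any]) -> Any:
    """Run fn inside a span and record success, duration_ms and error_type."""
    with tracer.start_as_current_span(name) as span:
        for key, value in attributes.items():
            if value is not None:  # skip unset values; None is not a valid attribute value
                span.set_attribute(key, value)
        started = time.perf_counter()
        try:
            result = fn()
        except Exception as exc:
            span.set_attribute("success", False)
            span.set_attribute("error_type", type(exc).__name__)
            raise
        else:
            span.set_attribute("success", True)
            return result
        finally:
            # Recorded on both the success and the failure path.
            span.set_attribute("duration_ms", int((time.perf_counter() - started) * 1000))


tracer = RecordingTracer()
traced_call(
    tracer,
    "tool_execution",
    {"server_name": "github_server", "tool_name": "create_issue", "project_id": None},
    lambda: "ok",
)
print(tracer.spans[0].attributes)
```

Recording error_type only on failure keeps successful spans small while still making failed ones searchable by exception class.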

Viewing Traces in Jaeger

  1. Open Jaeger UI: http://localhost:16686
  2. Select service: secure-mcp-gateway
  3. Search by:
    • Operation name (e.g., tool_execution)
    • Tags (e.g., server_name=github_server)
    • Duration (e.g., slow traces > 1s)

Trace Examples

Tool Execution Trace:
tool_execution [250ms]
├── authenticate [10ms]
│   └── cache_lookup [2ms]
├── input_guardrails [30ms]
│   ├── pii_detection [15ms]
│   └── toxicity_check [15ms]
├── forward_to_server [180ms]
│   ├── discover_tools [50ms]
│   └── call_tool [130ms]
└── output_guardrails [30ms]
    ├── relevancy_check [15ms]
    └── adherence_check [15ms]
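
To find where the time in such a trace is actually spent, it helps to compute each span's self time: its own duration minus the durations of its direct children. The sketch below applies that arithmetic to the example tree. SpanNode and self_times are hypothetical names, and the calculation assumes child spans run one after another without overlap.

```python
from dataclasses import dataclass, field


@dataclass
class SpanNode:
    name: str
    duration_ms: int
    children: list["SpanNode"] = field(default_factory=list)


def self_times(node: SpanNode) -> dict[str, int]:
    """Return each span's self time: duration minus its direct children's durations."""
    result = {node.name: node.duration_ms - sum(c.duration_ms for c in node.children)}
    for child in node.children:
        result.update(self_times(child))
    return result


# The tool execution trace shown above.
trace = SpanNode("tool_execution", 250, [
    SpanNode("authenticate", 10, [SpanNode("cache_lookup", 2)]),
    SpanNode("input_guardrails", 30, [
        SpanNode("pii_detection", 15),
        SpanNode("toxicity_check", 15),
    ]),
    SpanNode("forward_to_server", 180, [
        SpanNode("discover_tools", 50),
        SpanNode("call_tool", 130),
    ]),
    SpanNode("output_guardrails", 30, [
        SpanNode("relevancy_check", 15),
        SpanNode("adherence_check", 15),
    ]),
])

times = self_times(trace)
slowest = max(times, key=times.get)
print(slowest, times[slowest])  # call_tool 130
```

In this example the four top-level children sum to exactly 250 ms, so tool_execution has no self time and the remote call_tool span dominates the trace.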

OTLP Export

gRPC Export (Default)

Default configuration uses gRPC:
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.exporter.otlp.proto.grpc._log_exporter import OTLPLogExporter

# Traces
otlp_exporter = OTLPSpanExporter(
    endpoint="localhost:4317",
    insecure=True
)

# Metrics
metric_exporter = OTLPMetricExporter(
    endpoint="localhost:4317",
    insecure=True
)

# Logs
log_exporter = OTLPLogExporter(
    endpoint="localhost:4317",
    insecure=True
)
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py:349

HTTP Export

To use HTTP instead of gRPC, configure endpoint with port 4318:
{
  "plugins": {
    "telemetry": {
      "config": {
        "url": "http://localhost:4318"
      }
    }
  }
}
The provider automatically selects the appropriate exporter based on the port.
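
The text above states only that the choice depends on the port. A minimal sketch of what such a rule could look like follows. select_otlp_protocol is a hypothetical function, and treating every port other than 4318 as gRPC is an assumption, not the provider's confirmed logic.

```python
from urllib.parse import urlparse


def select_otlp_protocol(url: str) -> str:
    """Return "http" for port 4318 and "grpc" for anything else (assumed rule)."""
    port = urlparse(url).port  # None when the URL carries no explicit port
    return "http" if port == 4318 else "grpc"


print(select_otlp_protocol("http://localhost:4317"))
print(select_otlp_protocol("https://otel-collector.example.com:4318"))
```

In the OpenTelemetry Python packages, the HTTP exporters live under opentelemetry.exporter.otlp.proto.http, mirroring the gRPC module paths shown in the previous section.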

Resource Attributes

Resource attributes identify the telemetry source:
from opentelemetry.sdk.resources import Resource

self._resource = Resource(
    attributes={
        "service.name": "secure-mcp-gateway",
        "job": "enkryptai",
        "service.version": "2.1.2",
        "deployment.environment": "production"
    }
)
These attributes appear in all logs, traces, and metrics.
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py:321
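
The hard-coded values above correspond to the service_name and job_name configuration options. As a hedged illustration of how the attribute dictionary might be derived from the config, consider the sketch below. build_resource_attributes is a hypothetical helper, and passing the version and environment as arguments is an assumption about where those values come from.

```python
from typing import Any, Optional


def build_resource_attributes(
    config: dict[str, Any],
    version: Optional[str] = None,
    environment: Optional[str] = None,
) -> dict[str, str]:
    """Map telemetry config values onto resource attribute keys."""
    attributes = {
        "service.name": config.get("service_name", "secure-mcp-gateway"),
        "job": config.get("job_name", "enkryptai"),
    }
    # Optional attributes are added only when a value is supplied.
    if version:
        attributes["service.version"] = version
    if environment:
        attributes["deployment.environment"] = environment
    return attributes


attrs = build_resource_attributes(
    {"service_name": "secure-mcp-gateway-prod", "job_name": "production"},
    environment="production",
)
print(attrs)
```

The resulting dictionary can be passed as the attributes argument of Resource, as in the snippet above.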

Connectivity Check

Before enabling telemetry, the provider validates endpoint reachability:
# Module-level imports used by this method
import socket
from urllib.parse import urlparse

def _check_telemetry_enabled(self, config: dict[str, Any]) -> bool:
    """Check if telemetry is enabled and endpoint is reachable."""
    if not config.get("enabled", False):
        return False
    
    endpoint = config.get("url", "http://localhost:4317")
    parsed_url = urlparse(endpoint)
    hostname = parsed_url.hostname
    port = parsed_url.port
    
    try:
        # Get timeout from TimeoutManager
        from secure_mcp_gateway.services.timeout import get_timeout_manager
        timeout_manager = get_timeout_manager()
        timeout_value = timeout_manager.get_timeout("connectivity")
        
        # Test connection
        with socket.create_connection((hostname, port), timeout=timeout_value):
            logger.debug(f"OTLP endpoint {endpoint} is reachable")
            return True
    except (OSError, AttributeError, TypeError, ValueError) as e:
        logger.error(
            f"Telemetry enabled but endpoint {endpoint} unreachable. "
            f"Disabling telemetry. Error: {e}"
        )
        return False
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py:152
This ensures the gateway starts successfully even if the collector is unavailable.
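
The same reachability test can be run outside the gateway, for example from a deployment script. The following standalone sketch uses only the standard library. is_endpoint_reachable is a hypothetical name, and the fixed 2-second default replaces the TimeoutManager lookup, which is specific to the gateway.

```python
import socket
from urllib.parse import urlparse


def is_endpoint_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    try:
        parsed = urlparse(url)
        host, port = parsed.hostname, parsed.port  # .port raises ValueError if malformed
    except ValueError:
        return False
    if host is None or port is None:
        return False  # this sketch requires an explicit host and port
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or name resolution failed
        return False
```

A call such as is_endpoint_reachable("http://localhost:4317") only confirms that something accepts TCP connections on the port; it does not verify that the listener speaks OTLP.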

Metrics Export

Metrics are sent over OTLP to the collector, which exposes them to Prometheus:
from opentelemetry import metrics
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader

# Create exporter
otlp_exporter = OTLPMetricExporter(
    endpoint="localhost:4317",
    insecure=True
)

# Create reader with 5-second export interval
reader = PeriodicExportingMetricReader(
    otlp_exporter,
    export_interval_millis=5000
)

# Create meter provider
provider = MeterProvider(
    resource=self._resource,
    metric_readers=[reader]
)
metrics.set_meter_provider(provider)

# Get meter
self._meter = metrics.get_meter("enkrypt.meter")
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py:358
The collector receives OTLP metrics and exports them to Prometheus on port 8889.

Integration with Services

Grafana Integration

Datasource Configuration:
Location: infra/grafana/provisioning/datasources/datasources.yaml
apiVersion: 1

datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
    jsonData:
      maxLines: 1000
  
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    jsonData:
      exemplarTraceIdDestinations:
        - name: trace_id
          datasourceUid: jaeger
  
  - name: Jaeger
    type: jaeger
    uid: jaeger
    access: proxy
    url: http://jaeger:16686/jaeger
    jsonData:
      nodeGraph:
        enabled: true

Jaeger Integration

Jaeger receives traces via OTLP:
jaeger:
  image: jaegertracing/all-in-one:1.73.0
  ports:
    - "16686:16686"  # Web UI
    - "14250:14250"  # gRPC for collector
  environment:
    - COLLECTOR_OTLP_ENABLED=true
Location: infra/docker-compose.yml:37

Loki Integration

Loki receives logs via OTLP HTTP:
loki:
  image: grafana/loki:main-cadc824
  ports:
    - "3100:3100"
  volumes:
    - ./loki/loki-config.yaml:/etc/loki/local-config.yaml
Location: infra/docker-compose.yml:50

Troubleshooting

Telemetry Not Exporting

Check collector is running:
docker ps | grep otel-collector
curl http://localhost:4317  # gRPC port: a protocol error from curl is expected; "connection refused" means the collector is not listening
Check gateway logs:
# Look for telemetry initialization messages
grep -i "telemetry" gateway.log
Verify configuration:
cat ~/.enkrypt/enkrypt_mcp_config.json | jq '.plugins.telemetry'

Connection Refused

Error: Telemetry enabled but endpoint localhost:4317 unreachable
Solutions:
  1. Start the collector: docker-compose up -d otel-collector
  2. Check firewall rules
  3. Verify endpoint in config matches collector address

No Traces in Jaeger

  1. Check collector exports to Jaeger:
    docker logs otel-collector | grep jaeger
    
  2. Verify Jaeger OTLP is enabled:
    docker logs jaeger | grep OTLP
    
  3. Check trace sampling (if configured)

Metrics Not in Prometheus

  1. Check Prometheus scrape targets:
    http://localhost:9090/targets
    
  2. Verify collector Prometheus endpoint:
    curl http://localhost:8889/metrics
    
  3. Check Prometheus config:
    cat infra/prometheus/prometheus.yml
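
The check in step 2 can also be scripted. The sketch below parses the Prometheus text format and lists metric names carrying the otel_ prefix that the collector's namespace setting adds. metric_names and fetch_metric_names are hypothetical helpers, and the sample metric names are made up for illustration.

```python
from urllib.request import urlopen


def metric_names(exposition_text: str, prefix: str = "otel_") -> set[str]:
    """Collect metric names starting with prefix from Prometheus text format."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        # A sample line looks like `name{label="v"} 1.0` or `name 1.0`.
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(prefix):
            names.add(name)
    return names


def fetch_metric_names(url: str = "http://localhost:8889/metrics") -> set[str]:
    """Download the collector's metrics page and extract the gateway metric names."""
    with urlopen(url, timeout=5) as response:
        return metric_names(response.read().decode("utf-8"))


# Made-up sample in the Prometheus text format.
sample = """# TYPE otel_requests_total counter
otel_requests_total{service_name="secure-mcp-gateway"} 3
go_goroutines 12
"""
print(metric_names(sample))
```

An empty result from fetch_metric_names() while the endpoint responds suggests that the gateway is not sending metrics to the collector, rather than a Prometheus scrape problem.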
    

Performance Tuning

Batch Processing

Adjust batch settings in collector config:
processors:
  batch:
    timeout: 1s          # Export every 1 second
    send_batch_size: 1024  # Or when 1024 items collected
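
The interaction between the two settings can be modeled in a few lines. The class below is a simplified illustration of flushing on size or on elapsed time, whichever comes first. SizeOrTimeoutBatcher is a hypothetical name, timestamps are passed in explicitly to keep the example deterministic, and the real batch processor's timer handling differs in detail.

```python
from typing import Any, Optional


class SizeOrTimeoutBatcher:
    """Simplified model of a batch that flushes on size or on elapsed time."""

    def __init__(self, send_batch_size: int = 1024, timeout_s: float = 1.0) -> None:
        self.send_batch_size = send_batch_size
        self.timeout_s = timeout_s
        self.flushed: list[list[Any]] = []  # batches "exported" so far
        self._pending: list[Any] = []
        self._oldest_at: Optional[float] = None

    def add(self, item: Any, now: float) -> None:
        """Queue an item and flush immediately once the batch is full."""
        if not self._pending:
            self._oldest_at = now
        self._pending.append(item)
        if len(self._pending) >= self.send_batch_size:
            self._flush()

    def tick(self, now: float) -> None:
        """Flush a partial batch once its oldest item has waited timeout_s."""
        if self._pending and now - self._oldest_at >= self.timeout_s:
            self._flush()

    def _flush(self) -> None:
        self.flushed.append(self._pending)
        self._pending = []
        self._oldest_at = None


batcher = SizeOrTimeoutBatcher(send_batch_size=3, timeout_s=1.0)
for i in range(4):
    batcher.add(i, now=0.0)  # the third item triggers a size-based flush
batcher.tick(now=1.5)        # the leftover item is flushed by the timeout
print(batcher.flushed)       # [[0, 1, 2], [3]]
```

A larger send_batch_size reduces the number of export calls under load, while the timeout bounds how long a partial batch can wait when traffic is light.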

Export Interval

Adjust metric export interval:
reader = PeriodicExportingMetricReader(
    otlp_exporter,
    export_interval_millis=10000  # 10 seconds instead of 5
)

Sampling

For high-volume deployments, configure trace sampling:
processors:
  probabilistic_sampler:
    sampling_percentage: 10  # Sample 10% of traces

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [probabilistic_sampler, batch]  # sample first so dropped spans are never batched
      exporters: [otlp]
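
Percentage-based sampling is usually made deterministic per trace, so that all spans of one trace share the same decision. The sketch below illustrates the idea by comparing the low 64 bits of the trace ID against a threshold. should_sample is a hypothetical function, and this is not the collector's own algorithm, which hashes the trace ID.

```python
import random


def should_sample(trace_id: int, sampling_percentage: float) -> bool:
    """Keep a trace if the low 64 bits of its ID fall below the threshold."""
    bound = int(sampling_percentage / 100 * (1 << 64))
    return (trace_id & ((1 << 64) - 1)) < bound


# Rough check of the kept fraction over random 128-bit trace IDs.
rng = random.Random(0)
ids = [rng.getrandbits(128) for _ in range(100_000)]
kept = sum(should_sample(trace_id, 10) for trace_id in ids)
print(f"kept {kept / len(ids):.1%} of traces")
```

Because the decision depends only on the trace ID, every component that applies the same rule keeps or drops the same traces, which avoids partially sampled traces.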

Advanced Topics

Custom Exporters

Add custom exporters to the collector:
exporters:
  otlphttp/custom:
    # traces_endpoint takes the full URL; a plain "endpoint" gets /v1/traces appended
    traces_endpoint: "https://custom-backend.example.com/v1/traces"
    headers:
      Authorization: "Bearer ${env:CUSTOM_TOKEN}"

service:
  pipelines:
    traces:
      exporters: [otlp, otlphttp/custom]

TLS Configuration

For production with TLS:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        tls:
          cert_file: /path/to/cert.pem
          key_file: /path/to/key.pem

Authentication

Add authentication to OTLP export:
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

exporter = OTLPSpanExporter(
    endpoint="collector.example.com:4317",
    headers=(("authorization", "Bearer YOUR_TOKEN"),)
)
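
To avoid hard-coding the token, the header tuple can be built from an environment variable. This is a minimal sketch: otlp_auth_headers is a hypothetical helper and OTLP_AUTH_TOKEN is an assumed variable name, not one the gateway defines.

```python
import os
from typing import Optional


def otlp_auth_headers(env_var: str = "OTLP_AUTH_TOKEN") -> Optional[tuple[tuple[str, str], ...]]:
    """Build a bearer-token header tuple, or None when no token is set."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        return None  # lets the caller fall back to an unauthenticated exporter
    # gRPC metadata keys must be lowercase, hence "authorization".
    return (("authorization", f"Bearer {token}"),)


headers = otlp_auth_headers()
print("auth configured" if headers else "no token set")
```

The returned value can be passed as the headers argument of OTLPSpanExporter, as in the example above.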

Next Steps

  • Metrics: Explore available metrics and Grafana dashboards
  • Logging: Configure structured logging and log aggregation
  • Overview: Return to observability overview
  • Deployment: Deploy the full observability stack