Jaeger Distributed Tracing with Elasticsearch Backend — Docker Compose Setup and Usage Guide
Overview
This setup deploys Jaeger tracing infrastructure with Elasticsearch as the storage backend using Docker Compose. It provides persistent trace storage, querying, and management, supporting performance monitoring and troubleshooting of distributed applications.
- Jaeger: All-in-one image serving collector, query, and UI.
- Elasticsearch: Persistent storage for Jaeger trace data, configured with disk watermark thresholds and health checks.
Docker Compose Configuration (docker-compose.yml)
version: '3.8'

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.3
    container_name: elasticsearch
    environment:
      - discovery.type=single-node   # Single-node cluster mode
      - ES_JAVA_OPTS=-Xms512m -Xmx512m   # JVM heap size limits for ES
      - xpack.security.enabled=false   # Disable security for local dev
      - cluster.routing.allocation.disk.threshold_enabled=true
      - cluster.routing.allocation.disk.watermark.low=90%   # Disk watermark low threshold
      - cluster.routing.allocation.disk.watermark.high=95%   # Disk watermark high threshold
      - cluster.routing.allocation.disk.watermark.flood_stage=97%   # Flood stage to prevent writes
      - cluster.info.update.interval=1m   # Disk info update interval
    ports:
      - "9200:9200"   # Elasticsearch REST API
    volumes:
      - esdata:/usr/share/elasticsearch/data   # Persistent data storage
    networks:
      - jaeger-net
    healthcheck:
      test: curl -s http://localhost:9200/_cluster/health | grep -Eq '"status":"(green|yellow)"' || exit 1
      interval: 10s
      timeout: 5s
      retries: 5

  jaeger:
    image: jaegertracing/all-in-one:1.67.0
    container_name: jaeger
    environment:
      - SPAN_STORAGE_TYPE=elasticsearch   # Use Elasticsearch as storage backend
      - ES_SERVER_URLS=http://elasticsearch:9200   # Elasticsearch endpoint for Jaeger
      - COLLECTOR_ZIPKIN_HOST_PORT=:9411   # Zipkin-compatible collector port
    ports:
      - "16686:16686"   # Jaeger UI
      - "4317:4317"     # OTLP gRPC
      - "4318:4318"     # OTLP HTTP
      - "14250:14250"   # gRPC collector
      - "14268:14268"   # HTTP collector
      - "14269:14269"   # Admin port
      - "9411:9411"     # Zipkin collector
    depends_on:
      elasticsearch:
        condition: service_healthy
    networks:
      - jaeger-net

volumes:
  esdata:   # Named volume for Elasticsearch data persistence

networks:
  jaeger-net:
    driver: bridge   # User-defined bridge network for container communication
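Before bringing the stack up, you can validate the file: docker-compose config parses it and prints the fully resolved configuration, failing on YAML or schema errors (assuming the file is saved as docker-compose.yml in the current directory):
docker-compose config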
Setup Instructions
Prerequisites
- Docker and Docker Compose installed on the host machine.
- At least 2 CPU cores and 4GB RAM recommended for smooth Elasticsearch operation.
- Sufficient disk space (preferably >20GB free) for Elasticsearch indices.
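If Elasticsearch exits shortly after startup on a Linux host, a common cause is the kernel setting vm.max_map_count being below Elasticsearch's recommended minimum of 262144. A host-level fix (Linux assumed):
sudo sysctl -w vm.max_map_count=262144
To persist the setting across reboots, add vm.max_map_count=262144 to /etc/sysctl.conf.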
Steps
- Save the above YAML content as docker-compose.yml in your working directory.
- Start the stack:
  docker-compose up -d
- Verify services are running:
  docker ps
- Access the Jaeger UI in a browser at:
  http://localhost:16686
- Check Elasticsearch health:
  curl -X GET "http://localhost:9200/_cluster/health?pretty"
Managing Traces via Elasticsearch API
Since Jaeger stores spans and traces in Elasticsearch indices, you can query and manage them directly using Elasticsearch REST APIs.
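By default, Jaeger's Elasticsearch storage writes daily indices named jaeger-span-YYYY-MM-DD and jaeger-service-YYYY-MM-DD (as the DELETE example below illustrates). To list them along with their document counts and sizes:
curl -s "http://localhost:9200/_cat/indices/jaeger-*?v"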
Sample GET request to fetch trace data for a service
curl -X GET "http://localhost:9200/jaeger-span-*/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"process.serviceName": "dynamic-trace-test"
}
}
}'Sample DELETE request to delete trace indices for a specific date
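If you only need the number of matching spans rather than the documents themselves, the same query body can be sent to the _count API:
curl -X GET "http://localhost:9200/jaeger-span-*/_count" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "process.serviceName": "dynamic-trace-test"
    }
  }
}'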
Sample DELETE request to delete trace indices for a specific date
curl -X DELETE "http://localhost:9200/jaeger-service-2025-05-21"
Note: Deleting indices removes all data within those indices irreversibly. Use with caution.
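Because the compose file sets disk watermark thresholds, it is also worth monitoring node disk usage: once the flood-stage watermark (97% here) is exceeded, Elasticsearch marks indices read-only and Jaeger can no longer write spans. A quick per-node check:
curl -s "http://localhost:9200/_cat/allocation?v"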
Send Data to Jaeger Collector (Optional)
The following Python script uses the OpenTelemetry SDK to generate a few sample spans and export them to Jaeger's OTLP/gRPC endpoint on port 4317.
# nano app-dynamic.trace.py
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
import time
import random
# Define the resource with the service name
resource = Resource(attributes={"service.name": "dynamic-trace-test"})
# Set the tracer provider with the specified resource
provider = TracerProvider(resource=resource)
trace.set_tracer_provider(provider)
# Configure the OTLP exporter to send data to Jaeger's OTLP/gRPC endpoint
otlp_exporter = OTLPSpanExporter(endpoint="http://127.0.0.1:4317", insecure=True)
# Create a batch span processor and add it to the tracer provider
span_processor = BatchSpanProcessor(otlp_exporter)
provider.add_span_processor(span_processor)
# Acquire a tracer
tracer = trace.get_tracer(__name__)
def compute_fibonacci(n):
    with tracer.start_as_current_span("compute_fibonacci") as span:
        span.set_attribute("input.n", n)
        if n <= 1:
            return n
        return compute_fibonacci(n - 1) + compute_fibonacci(n - 2)

def process_data(data):
    with tracer.start_as_current_span("process_data") as span:
        span.set_attribute("data.length", len(data))
        time.sleep(random.uniform(0.1, 0.3))  # simulate processing delay
        result = [x * 2 for x in data]
        span.set_attribute("result.sum", sum(result))
        return result

def main():
    with tracer.start_as_current_span("main-operation") as span:
        span.add_event("Starting main operation")
        data = [random.randint(1, 10) for _ in range(5)]
        processed = process_data(data)
        span.add_event("Processed data", {"data": str(processed)})
        fib = compute_fibonacci(5)
        span.set_attribute("fibonacci.result", fib)
        print("Processed:", processed)
        print("Fibonacci(5):", fib)
if __name__ == "__main__":
    main()
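To run the script, install the OpenTelemetry SDK and the OTLP/gRPC exporter (standard PyPI package names shown below), then execute it while the stack is running:
pip install opentelemetry-sdk opentelemetry-exporter-otlp-proto-grpc
python app-dynamic.trace.py
The resulting spans should appear in the Jaeger UI under the service name dynamic-trace-test.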
Summary
This setup provides a scalable and manageable tracing backend with:
- Persistent Elasticsearch storage with disk watermark management.
- Health checks to ensure Elasticsearch readiness before Jaeger startup.
- Jaeger UI and API endpoints exposed for trace data ingestion and querying.
- Flexibility to manage trace data directly via Elasticsearch REST APIs.
If you require further automation or more advanced index management (e.g., index lifecycle policies), Elasticsearch ILM and Jaeger's es-index-cleaner tool can be integrated into this setup; see their official documentation for details.
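For example, Jaeger publishes an es-index-cleaner image that deletes indices older than a given number of days. A sketch of a manual cleanup run (the image tag and the compose network name jaeger_jaeger-net are assumptions for a project named "jaeger"; check docker network ls for the actual network):
docker run --rm --network jaeger_jaeger-net \
  jaegertracing/jaeger-es-index-cleaner:1.67.0 \
  7 http://elasticsearch:9200
Here 7 means jaeger-span-* and jaeger-service-* indices older than 7 days are removed; schedule this via cron or a similar mechanism for ongoing cleanup.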