
EKS + Grafana + Athena + S3: Automate Log Archiving with ILM

Index Lifecycle Management (ILM) · Elasticsearch Snapshot API · S3 as backup · Kibana ILM Policies

Here’s a full step-by-step plan to automate daily archiving and cleanup for all your log indices: uat-*, tmp-*, gen-*.


🧭 Goal

Feature                    | Setup
Log index format           | uat-YYYY.MM.DD, tmp-*, gen-*
Retention in Elasticsearch | 1 day
Archive to S3              | After 1 day (daily snapshot)
Delete after archive       | Yes
Restore manually           | When needed

✅ Step 1: Configure ILM Policy (in Kibana or API)

Go to Kibana → Stack Management → Index Lifecycle Policies, or via API:

PUT _ilm/policy/logs-daily-snapshot-delete
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {}
      },
      "delete": {
        "min_age": "1d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

✅ What This Does:

  • Keeps logs in Elasticsearch for 1 day (with no rollover action, the 1-day age is counted from index creation)
  • Deletes the index after that
  • (But before deletion, we’ll snapshot to S3; see the next step)
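
To confirm the policy was created, you can read it back with the ILM API (this assumes Elasticsearch is reachable on localhost:9200, as in the later examples):

curl -X GET "http://localhost:9200/_ilm/policy/logs-daily-snapshot-delete?pretty"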

✅ Step 2: Apply ILM Policy to Index Templates

Create an index template matching your log index patterns:

PUT _index_template/logs-template
{
  "index_patterns": ["uat-*", "tmp-*", "gen-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "logs-daily-snapshot-delete",
      "index.lifecycle.rollover_alias": "logs"
    }
  },
  "priority": 500
}

📌 Now, every log index created (like uat-2025.08.07) will be managed by this ILM policy.
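
To double-check that a new index actually picked up the policy, the ILM explain API shows the applied policy and current phase (the index name below is just the example above):

curl -X GET "http://localhost:9200/uat-2025.08.07/_ilm/explain?pretty"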


✅ Step 3: Automate Daily Snapshots to S3

Create a daily cronjob (on your server or k8s) to:

  1. Identify yesterday’s indices
  2. Snapshot them to the log-archive repository in S3 (repository registration sketched below)
  3. Let ILM delete them automatically after 1 day
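
The snapshot step assumes a log-archive snapshot repository already exists in the cluster. If it doesn’t, registering an S3 repository looks roughly like this; the bucket name and base_path are placeholders, and the nodes need S3 access (IAM role or access keys) plus the S3 repository plugin where it isn’t bundled:

PUT _snapshot/log-archive
{
  "type": "s3",
  "settings": {
    "bucket": "my-log-archive-bucket",
    "base_path": "elasticsearch/snapshots"
  }
}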

🛠️ Sample Bash Script (/opt/es-snapshot.sh)

#!/bin/bash
 
ES_URL="http://localhost:9200"
# Yesterday's date, formatted like the index names (e.g. 2025.08.06)
YESTERDAY=$(date +%Y.%m.%d --date="1 day ago")
SNAPSHOT_NAME="snapshot-all-${YESTERDAY}"
 
# Build a comma-separated list of yesterday's indices (any index containing that date)
INDICES=$(curl -s "${ES_URL}/_cat/indices?format=json" | jq -r ".[].index" | grep "$YESTERDAY" | tr '\n' ',' | sed 's/,$//')
 
# Nothing to snapshot? Exit before calling the snapshot API with an empty index list
if [ -z "$INDICES" ]; then
  echo "No indices found for ${YESTERDAY}, skipping snapshot"
  exit 0
fi
 
# Snapshot all matching indices; wait_for_completion surfaces failures in the cron log
curl -X PUT "${ES_URL}/_snapshot/log-archive/${SNAPSHOT_NAME}?wait_for_completion=true" -H 'Content-Type: application/json' -d"
{
  \"indices\": \"${INDICES}\",
  \"ignore_unavailable\": true,
  \"include_global_state\": false
}
"

Make it executable:

chmod +x /opt/es-snapshot.sh

📅 Add to Cron (daily at 1AM)

0 1 * * * /opt/es-snapshot.sh >> /var/log/es_snapshot.log 2>&1

🔁 Summary Flow

Stage      | Action                        | Tool
Log Ingest | Fluent Bit → Elasticsearch    | Fluent Bit
Retention  | Keep logs 1 day               | ILM Policy
Archive    | Snapshot to S3 before delete  | Cronjob + S3
Delete     | Auto-delete after 1 day       | ILM Policy
Restore    | On-demand from S3 snapshot    | Snapshot API
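
Restore stays manual: when an archived index is needed again, one way to pull it back from the log-archive repository looks like the sketch below. The snapshot and index names are just examples following the script’s naming, and the rename avoids clashing with a live index of the same name:

curl -X POST "http://localhost:9200/_snapshot/log-archive/snapshot-all-2025.08.06/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "uat-2025.08.06",
  "include_global_state": false,
  "rename_pattern": "(.+)",
  "rename_replacement": "restored-$1"
}'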

📦 Bonus: Monitor Snapshots

Use this to list recent snapshots:

curl -X GET "http://localhost:9200/_snapshot/log-archive/_all?pretty"

Or set up a Kibana Watcher/Monitor if needed.
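
If you’d rather catch failed snapshots from the cron side, a minimal check could read the latest snapshot’s state and exit non-zero when it isn’t SUCCESS (assumes jq is installed; wire the exit code into whatever alerting you already have):

#!/bin/bash
# Alert helper: fail if the most recent snapshot in log-archive did not succeed
ES_URL="http://localhost:9200"
STATE=$(curl -s "${ES_URL}/_snapshot/log-archive/_all" | jq -r '.snapshots[-1].state')
if [ "$STATE" != "SUCCESS" ]; then
  echo "Latest log-archive snapshot state: ${STATE}" >&2
  exit 1
fi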


Let me know if you want:

  • A Kubernetes CronJob YAML instead of Linux cron
  • A version of the script that logs failures
  • Snapshots every 6 hours instead of daily

You’ve got a solid production pipeline taking shape.

