EBS Snapshot Cleanup — Interactive Trainer

1. Business Scenario

Your company runs many EC2 instances for finance, warehouse automation, and real-time tracking. Nightly EBS snapshots protect data, but after months you are left with thousands of snapshots and a growing AWS bill.

You need an automated Lambda solution that keeps the latest two snapshots per volume and safely removes the older ones.

2. Cleanup Flow

Step 1 — Trigger
EventBridge runs the Lambda function daily at 2 AM UTC (see the scheduling sketch after this step list).
Step 2 — Discover Volumes
Use describe_volumes() to list all EBS volumes in the region.
Step 3 — List Snapshots
For each volume, call describe_snapshots() filtered by volume-id.
Step 4 — Guardrail
If a volume has 0–2 snapshots, skip deletion for safety.
Step 5 — Sort
Sort snapshots by StartTime (newest first).
Step 6 — Delete Old
Keep the first two snapshots and delete all others with delete_snapshot().
Step 7 — Summary
Log how many snapshots were deleted and return a JSON result.
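
The schedule from Step 1 is created once, outside the function itself. Below is a minimal boto3 sketch of that wiring, assuming the Lambda is already deployed; the function name "ebs-snapshot-cleanup", the rule name, and the statement ID are placeholders, and the function's execution role still needs ec2:DescribeVolumes, ec2:DescribeSnapshots, and ec2:DeleteSnapshot permissions.

import boto3

events = boto3.client("events")
lambda_client = boto3.client("lambda")

FUNCTION_NAME = "ebs-snapshot-cleanup"  # placeholder Lambda name
FUNCTION_ARN = lambda_client.get_function(
    FunctionName=FUNCTION_NAME
)["Configuration"]["FunctionArn"]

# Daily at 02:00 UTC (EventBridge cron: minute hour day-of-month month day-of-week year)
rule_arn = events.put_rule(
    Name="daily-ebs-snapshot-cleanup",
    ScheduleExpression="cron(0 2 * * ? *)",
    State="ENABLED",
)["RuleArn"]

# Allow EventBridge to invoke the function, then point the rule at it
lambda_client.add_permission(
    FunctionName=FUNCTION_NAME,
    StatementId="AllowEventBridgeInvoke",
    Action="lambda:InvokeFunction",
    Principal="events.amazonaws.com",
    SourceArn=rule_arn,
)
events.put_targets(
    Rule="daily-ebs-snapshot-cleanup",
    Targets=[{"Id": "cleanup-lambda", "Arn": FUNCTION_ARN}],
)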

Use this sequence as a storyboard for your CapCut animation or teaching slides.

  • Show EC2 + EBS icons with “Too many snapshots” bubble.
  • Zoom into Lambda icon with a daily clock icon.
  • Animate a loop moving across volumes and snapshots.
  • Old snapshots fade out; cost meter goes down.

3. Lambda Code (Python / Node.js / AWS CLI)

import boto3
import os
from botocore.exceptions import ClientError


def lambda_handler(event, context):
    """
    Lambda function to manage EBS snapshots.
    Keeps only the two most recent snapshots per volume.
    Can be triggered daily via EventBridge or manually.
    """

    region = os.environ.get("AWS_REGION", "ap-southeast-1")
    ec2 = boto3.client("ec2", region_name=region)

    print(f"📦 Starting snapshot cleanup in region: {region}")

    # 1️⃣ Get all EBS volumes
    try:
        volumes = ec2.describe_volumes()["Volumes"]
        print(f"✅ Found {len(volumes)} volume(s) to process.")
    except ClientError as e:
        print(f"❌ Unable to describe volumes: {e}")
        return {"status": "error", "message": str(e)}

    deleted_snapshots = []

    # 2️⃣ For each volume
    for vol in volumes:
        vol_id = vol["VolumeId"]
        print(f"\n🔍 Processing Volume: {vol_id}")

        # 3️⃣ Get snapshots for this volume
        try:
            snapshots = ec2.describe_snapshots(
                Filters=[{"Name": "volume-id", "Values": [vol_id]}],
                OwnerIds=["self"],
            )["Snapshots"]
        except ClientError as e:
            print(f"   ⚠️ Could not list snapshots for {vol_id}: {e}")
            continue

        # 4️⃣ If snapshots <= 2: skip
        if len(snapshots) <= 2:
            print(f"   ✅ Only {len(snapshots)} snapshot(s) found — no deletion needed.")
            continue

        # 5️⃣ Sort snapshots by StartTime (newest first)
        snapshots.sort(key=lambda s: s["StartTime"], reverse=True)

        # 6️⃣ Delete all but the two newest snapshots
        for snap in snapshots[2:]:
            snap_id = snap["SnapshotId"]
            try:
                ec2.delete_snapshot(SnapshotId=snap_id)
                deleted_snapshots.append(snap_id)
                print(f"   🗑️ Deleted snapshot: {snap_id}")
            except ClientError as e:
                print(f"   ❌ Error deleting {snap_id}: {e}")

    # 7️⃣ Return summary
    print(f"\n✅ Cleanup complete — {len(deleted_snapshots)} old snapshots deleted.")

    return {
        "status": "success",
        "deletedSnapshotsCount": len(deleted_snapshots),
        "deletedSnapshots": deleted_snapshots,
    }
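
The handler above fetches volumes and snapshots with single describe calls, which is fine for small accounts. In larger accounts these APIs can return paginated results, so a variant based on boto3's built-in paginators is safer. A minimal sketch of two drop-in helpers (not part of the original handler):

import boto3

ec2 = boto3.client("ec2")


def all_volumes():
    """Collect every EBS volume in the region, across result pages."""
    volumes = []
    for page in ec2.get_paginator("describe_volumes").paginate():
        volumes.extend(page["Volumes"])
    return volumes


def snapshots_for(volume_id):
    """Collect every self-owned snapshot of one volume, across result pages."""
    snapshots = []
    paginator = ec2.get_paginator("describe_snapshots")
    for page in paginator.paginate(
        OwnerIds=["self"],
        Filters=[{"Name": "volume-id", "Values": [volume_id]}],
    ):
        snapshots.extend(page["Snapshots"])
    return snapshots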

// Node.js Lambda version of snapshot cleanup
const AWS = require("aws-sdk");
const ec2 = new AWS.EC2();

exports.handler = async () => {
  const volumes = (await ec2.describeVolumes().promise()).Volumes;
  const deleted = [];

  for (const vol of volumes) {
    const snaps = (await ec2.describeSnapshots({
      Filters: [{ Name: "volume-id", Values: [vol.VolumeId] }],
      OwnerIds: ["self"]
    }).promise()).Snapshots;

    if (snaps.length <= 2) continue;

    snaps.sort((a, b) => new Date(b.StartTime) - new Date(a.StartTime));

    for (const snap of snaps.slice(2)) {
      await ec2.deleteSnapshot({ SnapshotId: snap.SnapshotId }).promise();
      deleted.push(snap.SnapshotId);
    }
  }

  return { deletedSnapshots: deleted };
};

# List all EBS volumes
aws ec2 describe-volumes --query "Volumes[].VolumeId"

# List snapshots for a given volume
aws ec2 describe-snapshots \
  --owner-ids self \
  --filters "Name=volume-id,Values=VOL_ID"

# Delete one snapshot by ID
aws ec2 delete-snapshot --snapshot-id SNAP_ID
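
Before pointing the scheduled function at a production account, it can help to rehearse deletions without removing anything. One option is the EC2 DryRun flag, which asks the API to validate the request (mainly permissions) without performing it; if the real call would have been allowed, EC2 responds with a DryRunOperation error. A minimal boto3 sketch of that check:

import boto3
from botocore.exceptions import ClientError

ec2 = boto3.client("ec2")


def would_delete(snapshot_id):
    """Return True if delete_snapshot would be allowed, without actually deleting."""
    try:
        ec2.delete_snapshot(SnapshotId=snapshot_id, DryRun=True)
    except ClientError as e:
        code = e.response["Error"]["Code"]
        if code == "DryRunOperation":
            return True   # credentials and request are valid; the real call would proceed
        if code == "UnauthorizedOperation":
            return False  # missing ec2:DeleteSnapshot permission
        raise             # any other API error: surface it
    return False          # unreachable in practice: DryRun always raises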

4. Voice Scripts (CapCut Ready)


🎙 English Voice Script — CapCut Ready

Imagine your finance, warehouse, and tracking systems all running on EC2.
Every night, EBS snapshots are taken for backup.
After a few months, thousands of snapshots pile up and your AWS bill spikes.

In this short clip, we show how a simple Lambda function can keep your backups clean.

Step one: EventBridge triggers the Lambda every night.
Step two: Lambda lists all EBS volumes and finds their snapshots.
Step three: for each volume, it sorts snapshots by creation time.
Step four: it keeps the two newest snapshots, and safely deletes the older ones.

The code uses boto3 calls like describe_volumes, describe_snapshots, and delete_snapshot.
A few lines of Python turn a manual, risky process into a repeatable, governed policy.

With this in place, your team keeps reliable restore points,
reduces storage cost, and never has to manually clean snapshots again.

🎙 Chinese Voice Script (English translation, for narration and teaching)

Imagine your finance, warehousing, and logistics-tracking systems all running on EC2.
Every night, EBS snapshots are created automatically as backups.
A few months later, your account is piled high with thousands of snapshots,
and the AWS bill quietly keeps climbing.

This short clip shows one thing:
how a simple Lambda function automates snapshot cleanup.

Step one: EventBridge triggers the Lambda on a daily schedule.
Step two: Lambda uses describe_volumes to fetch all volumes,
then uses describe_snapshots to find the snapshots belonging to each volume.
Step three: snapshots are sorted by creation time, newest first.
Step four: only the two newest snapshots are kept; the older ones are safely removed with delete_snapshot.

A few lines of Python turn a tiring, error-prone manual chore
into a fixed, auditable, reusable automation policy.

With this in place, you keep reliable restore points,
cut storage costs significantly,
and the team never has to agonize over a long list of snapshots by hand again.