Get .NET crash dumps from AWS ECS Fargate automatically: leverage Amazon EFS, Amazon S3, AWS DataSync, and AWS Lambda to make debugging easier for your dev team

How To Send .NET Crash Dumps To Slack From ECS Fargate Task

2025/12/15 21:01

Sometimes .NET applications crash in production and nobody knows why, because the logs and metrics look fine. This is frustrating and makes debugging very unpleasant. In such cases, memory dumps can simplify debugging and reduce troubleshooting time from days to minutes.

This article explains how to configure dumps for .NET applications deployed to AWS ECS Fargate and then forward them to the development team in the most convenient and secure way.

:::tip In this article, we will create AWS resources, and I will refer to AWS documentation where needed. IaC won’t be our focus. Nevertheless, if you enjoy Terraform as much as I do, you can use open-source AWS modules for each section of the article. I can recommend taking a look at two AWS Terraform module projects:

  • https://github.com/cloudposse
  • https://github.com/terraform-aws-modules

:::

Solution architecture

It’s time to take a look at our architecture. I'll start from the assumption that the dev team does not want to pull .NET dumps directly from storage such as EBS or EFS, because accessing it is cumbersome. S3 makes it much simpler for developers to fetch any type of file, and it suits our needs well.

Aside from that, receiving proactive notifications when a new .NET dump is generated would be quite valuable. In this example I'll use Slack, but other options include Teams, Mattermost, WhatsApp, and so on. To send the notification message, we will use Lambda and S3 event triggers.

One last, but no less important, note: it’s quite complicated to attach an S3 bucket natively to ECS. For that reason, we’ll create a middleware layer built on top of EFS, DataSync, and a sidecar ECS container or Lambda function. EFS will be used as intermediate file storage for all our ECS tasks, DataSync will transfer data from EFS to S3 automatically, and a sidecar container or Lambda will clean up old data from EFS.

Figure: EFS dump lifecycle and notification flow

Let's quickly review the diagram:

  1. AWS Lambda deletes old EFS files on the schedule configured in EventBridge.

  2. Alternatively, during the ECS task bootstrap phase, the janitor sidecar container removes outdated dumps from EFS and exits.

  3. When the .NET application crashes, a new dump is written to the EFS file system, and only after that is the process terminated.

  4. DataSync copies new files from EFS to S3 on its schedule.

  5. When an S3 event notification reports a newly uploaded file, AWS Lambda is triggered.

  6. AWS Lambda uses its IAM role to obtain the necessary secrets from AWS Secrets Manager.

  7. AWS Lambda sends a message to Slack via the API.

Step-by-step implementation

Create ECS Fargate task

In this section we need to create an ECS Fargate Task using a sample .NET application.

Prerequisites

Before we proceed, there are a few steps that need to be completed:

  1. Set up an ECS cluster via the AWS Console or Terraform. See the official AWS guide: Creating an Amazon ECS cluster for Fargate workloads.

  2. Create an IAM execution role for the ECS task. To do so, you can follow this AWS guide. In this article I will use the name kvendingoldo-dotnet-crash-dump-demo for the IAM execution role.

This minimal trust policy for the execution role is enough:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": "ecs-tasks.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }

Together with this minimal permissions policy:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "ecr:GetAuthorizationToken",
            "ecr:BatchCheckLayerAvailability",
            "ecr:GetDownloadUrlForLayer",
            "ecr:BatchGetImage",
            "logs:CreateLogStream",
            "logs:PutLogEvents"
          ],
          "Resource": "*"
        }
      ]
    }

Create task definition

Once all prerequisites are ready, it’s time to create a minimal Fargate task with a sample .NET app. To do so, follow the official AWS guide and use this task definition JSON file:

    {
      "containerDefinitions": [
        {
          "cpu": 0,
          "essential": true,
          "image": "mcr.microsoft.com/dotnet/samples:aspnetapp",
          "mountPoints": [],
          "name": "app",
          "portMappings": [
            {
              "containerPort": 8000,
              "hostPort": 8000,
              "protocol": "tcp"
            }
          ],
          "systemControls": [],
          "volumesFrom": []
        }
      ],
      "cpu": "256",
      "executionRoleArn": "kvendingoldo-dotnet-crash-dump-demo",
      "family": "kvendingoldo-dotnet-crash-dump-demo",
      "memory": "512",
      "networkMode": "awsvpc",
      "placementConstraints": [],
      "requiresCompatibilities": ["FARGATE"],
      "volumes": [],
      "tags": []
    }

Configure .NET dumps

By default, .NET apps do not generate any dumps. To enable them, we must set the following environment variables:

    # Forces the runtime to generate a stack dump on unhandled exceptions.
    COMPlus_StackDumpOnUnhandledException=1
    # Enable mini dump generation on crash
    COMPlus_DbgEnableMiniDump=1
    # Choose dump type:
    # 1 = Mini (small: modules, threads, stacks, exception info)
    # 2 = Heap (large: everything in Mini plus most memory; the runtime default)
    # 3 = Triage (same as Mini, with personal user information removed)
    # 4 = Full (largest: all memory, including module images; use carefully)
    COMPlus_DbgMiniDumpType=2
    # Target path for dump file (EFS is mounted here)
    COMPlus_DbgMiniDumpName=/dumps/dump-%e-%p-%t.dmp

These variables can be added directly to the Dockerfile or defined as environment variables in the ECS Task Definition json.

In our example, let's inject them into the ECS task specification. To accomplish this, we'll add them to the containerDefinitions[0].environment, as shown below:

    "environment": [
      {
        "name": "COMPlus_StackDumpOnUnhandledException",
        "value": "1"
      },
      {
        "name": "COMPlus_DbgMiniDumpType",
        "value": "4"
      },
      {
        "name": "COMPlus_DbgEnableMiniDump",
        "value": "1"
      },
      {
        "name": "COMPlus_DbgMiniDumpName",
        "value": "/dumps/%t-kvendingoldo-dotnet-demo-crash.dmp"
      }
    ]

:::tip As you can see, I use a placeholder in COMPlus_DbgMiniDumpName. The .NET runtime automatically expands the following placeholders in the dump file name:

  • %e - executable name
  • %p - process ID
  • %t - timestamp (time of the dump, in seconds since the Unix epoch)

See these two links for further information on collecting and analyzing .NET crash dumps:

  • Collect .NET Crash Dumps (Microsoft Learn)
  • Debugging .NET Core memory issues (on Linux) with dotnet dump

:::
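To get a feel for the file names such a template produces, the placeholder expansion can be imitated in a few lines of Python. This is only an illustration, not the runtime's implementation: `expand_dump_name` is a hypothetical helper, and the executable name and process ID below are made-up sample values.

```python
import re


def expand_dump_name(template: str, exe: str, pid: int, timestamp: int) -> str:
    """Imitate the %e, %p and %t placeholders (hypothetical helper)."""
    values = {"e": exe, "p": str(pid), "t": str(timestamp)}
    # Replace every known %x token in one pass; other text is left untouched.
    return re.sub(r"%([ept])", lambda match: values[match.group(1)], template)


# Sample values only: executable "aspnetapp", PID 1, epoch timestamp 1739104252.
print(expand_dump_name("/dumps/%t-kvendingoldo-dotnet-demo-crash.dmp", "aspnetapp", 1, 1739104252))
# prints: /dumps/1739104252-kvendingoldo-dotnet-demo-crash.dmp
```

Putting `%t` (and, if several processes share the volume, `%p`) into the name keeps dumps from different crashes from overwriting each other on the shared EFS volume.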

Create EFS storage and mount it to the ECS Fargate Task

As I mentioned at the beginning of this article, attaching an S3 bucket to an ECS task is quite difficult; instead, we will use Amazon EFS (Elastic File System) as intermediate storage for .NET dump files, since it can easily be mounted to a set of ECS tasks.

:::tip To create EFS storage, follow the official AWS guide: Amazon ECS Tutorial: Using Amazon EFS File Systems

:::

There’s nothing special to add to the official documentation. Just make sure that:

  • EFS and ECS Cluster are in the same VPC
  • EFS can be accessed by ECS tasks over NFS (port 2049/tcp). Allow inbound access to NFS ports in the EFS security group to do this.

To mount the EFS file system into the ECS task, we must grant the necessary permissions to the kvendingoldo-dotnet-crash-dump-demo IAM role (replace the placeholders with your own values):

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "AllowEFSAccess",
          "Effect": "Allow",
          "Action": [
            "elasticfilesystem:ClientMount",
            "elasticfilesystem:ClientWrite",
            "elasticfilesystem:ClientRootAccess"
          ],
          "Resource": "arn:aws:elasticfilesystem:<region>:<account-id>:file-system/<filesystem-id>"
        }
      ]
    }

As a final step, define the EFS volumes and mount points in your ECS task definition (replace the fileSystemId value fs-xxxxxx with your real file system ID after bootstrapping):

    "volumes": [
      {
        "name": "dotnet-dumps",
        "efsVolumeConfiguration": {
          "fileSystemId": "fs-xxxxxx",
          "rootDirectory": "/"
        }
      }
    ]

    "mountPoints": [
      {
        "containerPath": "/dumps",
        "readOnly": false,
        "sourceVolume": "dotnet-dumps"
      }
    ]
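If you prefer to patch the task definition programmatically instead of editing the JSON by hand, the dump variables, the volume, and the mount point can be merged in one step. The following is a minimal sketch that works on a plain dictionary; `with_dump_settings` is a hypothetical helper, not part of any AWS SDK, and the values are the ones used in this article.

```python
import copy

# Environment variables from the "Configure .NET dumps" section.
DUMP_ENV = {
    "COMPlus_StackDumpOnUnhandledException": "1",
    "COMPlus_DbgEnableMiniDump": "1",
    "COMPlus_DbgMiniDumpType": "4",
    "COMPlus_DbgMiniDumpName": "/dumps/%t-kvendingoldo-dotnet-demo-crash.dmp",
}


def with_dump_settings(task_def: dict, file_system_id: str, container: str = "app") -> dict:
    """Return a copy of task_def with the dump env vars and the EFS mount added."""
    patched = copy.deepcopy(task_def)
    patched.setdefault("volumes", []).append({
        "name": "dotnet-dumps",
        "efsVolumeConfiguration": {"fileSystemId": file_system_id, "rootDirectory": "/"},
    })
    for definition in patched["containerDefinitions"]:
        if definition["name"] != container:
            continue  # only the application container writes dumps
        env = definition.setdefault("environment", [])
        env.extend({"name": name, "value": value} for name, value in DUMP_ENV.items())
        definition.setdefault("mountPoints", []).append(
            {"containerPath": "/dumps", "readOnly": False, "sourceVolume": "dotnet-dumps"}
        )
    return patched


task_def = {"family": "kvendingoldo-dotnet-crash-dump-demo", "containerDefinitions": [{"name": "app"}]}
patched = with_dump_settings(task_def, "fs-xxxxxx")
print(patched["containerDefinitions"][0]["mountPoints"][0]["containerPath"])  # prints: /dumps
```

The function returns a copy, so the original dictionary stays usable for a version of the task without dumps.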

Configure AWS DataSync to transfer EFS files to S3

AWS DataSync is a standard AWS service for transferring data between different types of storage. In our case, it will help us copy .NET dumps from EFS to S3.

To reach our goal, we have to:

  • Create an S3 bucket to store our .NET dumps. Further in this article I’ll use the bucket name kvendingoldo-dotnet-demo-crash. Use this official doc to create a bucket.
  • Create a DataSync task. Use this official doc to create it.
  • Some of the task parameters I'll be using:
    • Source: EFS
    • Destination: S3 bucket (e.g., s3://kvendingoldo-dotnet-demo-crash/)
    • Include path filters like /dumps/*
    • Schedule: as often as DataSync allows. At the time of writing, the built-in task schedule cannot run more often than once an hour, so faster delivery requires starting the task by other means.
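If the built-in schedule is not frequent enough, a task execution can also be started on demand, for example from a scheduled Lambda function. The sketch below assumes the DataSync task already exists; `start_dump_sync` is a hypothetical helper, and the client is passed in as an argument so the function can be exercised without AWS access.

```python
def start_dump_sync(datasync_client, task_arn: str, pattern: str = "/dumps/*") -> str:
    """Start one execution of an existing DataSync task, limited to dump files."""
    response = datasync_client.start_task_execution(
        TaskArn=task_arn,
        # Same include filter as in the task settings above.
        Includes=[{"FilterType": "SIMPLE_PATTERN", "Value": pattern}],
    )
    return response["TaskExecutionArn"]


# Usage (requires boto3 and AWS credentials; the ARN is a placeholder):
# import boto3
# start_dump_sync(boto3.client("datasync"),
#                 "arn:aws:datasync:<region>:<account-id>:task/<task-id>")
```

Keep in mind that each execution takes some time to prepare and transfer files, so very short intervals may lead to overlapping start requests.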

Create slack alerts based on AWS Lambda

As mentioned earlier, alerts about new .NET dumps are extremely helpful for the development team.

From the architecture viewpoint, alerts can be built in different ways:

  1. A simple Lambda function that is triggered by S3 events and sends messages to Slack via the API.
  2. S3 event notifications publish messages to an SNS topic, which then triggers a Lambda function that sends the events to Slack.

Since we don't expect a high load, the first option is better for us. If you want to implement the second option, use these two links:

  • Terraform module for deploying SNS and Lambda stack
  • A guide for configuring S3 events to SNS

:::tip We use Python to send messages to Slack. In this article we’ll send only a link to the S3 file, but in some cases you need to send the entire file. The Slack API changed some time ago, and sending files can be a little complicated. If you want to know more, please see the “Uploading files to Slack with Python” article.

:::

OK, let’s build the alerting step by step:

1. Create Slack secret

Create an AWS Secrets Manager secret named kvendingoldo-dotnet-crash-dump-demo with one field: slack_webhook_url. This field should contain your Slack webhook URL (to learn more about Slack webhooks, check the official guide).

2. Configure AWS Lambda

We won't go into depth about the creation of AWS Lambda, but we will highlight some key points. To get more fundamental information about AWS Lambda setup, see the official guide.

2.1. Make sure that the Lambda IAM role has permission to read from S3:

    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::kvendingoldo-dotnet-demo-crash/*"
    }

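2.2: To read data from AWS Secrets Manager, specify the secret name in an environment variable in the AWS Lambda configuration: SECRET_NAME=kvendingoldo-dotnet-crash-dump-demo (the secret created in step 1). The Lambda IAM role also needs the secretsmanager:GetSecretValue permission for this secret.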
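The Lambda role has to read both the bucket and the secret, so it can be convenient to generate one policy document for both. This is a minimal sketch rather than a definitive policy: `build_lambda_policy` is a hypothetical helper, the account ID is a placeholder, and the `-??????` suffix is an assumption based on the six random characters that Secrets Manager appends to secret ARNs.

```python
import json


def build_lambda_policy(bucket: str, secret_name: str, region: str, account_id: str) -> dict:
    """Build an IAM policy document for the Slack notification Lambda."""
    # Secret ARNs end with a hyphen and six random characters, hence the wildcard.
    secret_arn = f"arn:aws:secretsmanager:{region}:{account_id}:secret:{secret_name}-??????"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:GetObject", "Resource": f"arn:aws:s3:::{bucket}/*"},
            {"Effect": "Allow", "Action": "secretsmanager:GetSecretValue", "Resource": secret_arn},
        ],
    }


policy = build_lambda_policy(
    bucket="kvendingoldo-dotnet-demo-crash",
    secret_name="kvendingoldo-dotnet-crash-dump-demo",
    region="us-east-2",
    account_id="123456789012",  # placeholder account ID
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the inline policy editor of the Lambda execution role.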

2.3: Upload Python code to Lambda

    import json
    import os

    import boto3
    import urllib3

    # Reuse one connection pool across invocations
    http = urllib3.PoolManager()


    def get_secret(secret_name):
        client = boto3.client("secretsmanager")
        try:
            response = client.get_secret_value(SecretId=secret_name)
            if "SecretString" in response:
                secret = response["SecretString"]
                try:
                    return json.loads(secret)
                except json.JSONDecodeError:
                    return secret
            else:
                return response["SecretBinary"]
        except Exception as e:
            print(f"Error retrieving secret: {e}")
            return None


    def lambda_handler(event, context):
        print("Event received:", json.dumps(event))

        secret_name = os.environ.get('SECRET_NAME', '')
        if secret_name == "":
            return {
                'statusCode': 500,
                'body': json.dumps("SECRET_NAME env variable is empty")
            }

        secret = get_secret(secret_name)
        # get_secret returns None on failure and a plain string for non-JSON secrets
        if not isinstance(secret, dict) or "slack_webhook_url" not in secret:
            return {
                'statusCode': 500,
                'body': json.dumps("slack_webhook_url was not found in the secret")
            }
        slack_webhook_url = secret["slack_webhook_url"]

        for record in event['Records']:
            bucket_name = record['s3']['bucket']['name']
            file_name = record['s3']['object']['key']
            region = record['awsRegion']

            if ".aws" in file_name:
                print(f"Skipping internal file: {file_name}")
                continue

            message = (
                f":package: *New .NET dump is uploaded!*\n\n"
                f":cloud: Bucket: `{bucket_name}`\n"
                f":floppy_disk: File: `{file_name}`\n"
                f":link: Link: https://{bucket_name}.s3.{region}.amazonaws.com/{file_name}"
            )

            slack_resp = http.request(
                "POST",
                slack_webhook_url,
                body=json.dumps({"text": message}),
                headers={"Content-Type": "application/json"}
            )
            if slack_resp.status != 200:
                raise Exception(
                    f"Slack webhook request failed with status {slack_resp.status}: {slack_resp.data.decode('utf-8')}")

        return {
            "statusCode": 200,
            "body": json.dumps("Message has been sent successfully!")
        }

2.4: Configure S3 Event Notifications for your S3 bucket. To do this, go to the bucket -> properties -> Event notifications and select "Create event notification". Configure the event using the following options:

  • Event name: kvendingoldo-dotnet-demo-crash

  • Prefix: dumps/

  • Event type: s3:ObjectCreated:*

  • Target: <Your Lambda function Name>
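The same notification settings can also be applied through the S3 API instead of the console. Here is a minimal sketch that builds the configuration from the options listed above; `build_notification_config` is a hypothetical helper and the Lambda ARN is a placeholder.

```python
def build_notification_config(lambda_arn: str, prefix: str = "dumps/") -> dict:
    """Build an S3 notification configuration that invokes a Lambda for new objects."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "Id": "kvendingoldo-dotnet-demo-crash",
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {"Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}},
            }
        ]
    }


# Usage (requires boto3 and AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_notification_configuration(
#     Bucket="kvendingoldo-dotnet-demo-crash",
#     NotificationConfiguration=build_notification_config("<lambda-function-arn>"),
# )
```

Note that, unlike the console flow, the API call does not grant S3 permission to invoke the function; that permission has to be added to the Lambda resource policy separately.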

Configure EFS storage clean-up

Perfect, the .NET dump delivery chain is ready, but what about old dumps? EFS lifecycle policies cannot delete old files; they can only move them to the Infrequent Access storage class, which is not enough if we do not want to pay for unnecessary space.

To solve this issue, there are two options:

  1. Create a Lambda function (or an ECS task) that mounts EFS and cleans up old files on a CRON schedule.
  2. Create an ECS sidecar container that cleans up old EFS files at the initialization phase.

Let’s check both of them.

Option 1: AWS Lambda

This is the best solution because it is unaffected by the lifecycle of ECS tasks and other factors. To implement this strategy, you need to create a Lambda function with a mounted EFS storage (learn more about mounting a filesystem to Lambda from the official doc) and the following Python code:

    import json
    import os
    import time


    def lambda_handler(event, context):
        # Note: you can only mount the filesystem to the /mnt/ directory.
        directory = '/mnt/dumps'
        # File pattern to match
        pattern = 'crash.dmp'
        # Time in minutes (by default 1d)
        minutes_old = 1440
        # Convert minutes to seconds
        age_seconds = minutes_old * 60
        # Current time
        now = time.time()

        for root, dirs, files in os.walk(directory):
            for file in files:
                if pattern in file:
                    file_path = os.path.join(root, file)
                    file_mtime = os.path.getmtime(file_path)
                    if now - file_mtime > age_seconds:
                        print(f"Found a file that is older than {minutes_old} minutes: {file_path}")
                        try:
                            os.remove(file_path)
                        except Exception as e:
                            print(f"Failed to delete {file_path}: {e}")

        return {
            "statusCode": 200,
            "body": json.dumps("EFS clean-up completed successfully!")
        }

As you can see, this is simple code that deletes files older than one day from the mounted storage.
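Before pointing the function at a real file system, it can help to check the age filter locally. The sketch below extracts the selection logic into a hypothetical `find_stale_files` helper (it only lists files and deletes nothing) and runs it against a temporary directory with one backdated file.

```python
import os
import tempfile
import time


def find_stale_files(directory: str, pattern: str, max_age_seconds: float, now: float) -> list:
    """Return sorted paths whose name contains pattern and whose mtime is older than the limit."""
    stale = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            if pattern in name and now - os.path.getmtime(path) > max_age_seconds:
                stale.append(path)
    return sorted(stale)


with tempfile.TemporaryDirectory() as tmp:
    old = os.path.join(tmp, "1-demo-crash.dmp")
    new = os.path.join(tmp, "2-demo-crash.dmp")
    for path in (old, new):
        open(path, "w").close()
    now = time.time()
    # Backdate the first file by two days so that it exceeds the one-day limit.
    os.utime(old, (now - 2 * 86400, now - 2 * 86400))
    print([os.path.basename(p) for p in find_stale_files(tmp, "crash.dmp", 86400, now)])
    # prints: ['1-demo-crash.dmp']
```

Passing `now` in as an argument keeps the check deterministic, which makes the age threshold easy to test.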

When your Lambda is ready, we also need to configure a CRON trigger to run the function periodically. It can be created using a CloudWatch Events (Amazon EventBridge) rule.

That’s it, after all of these steps your EFS storage will be cleaned up automatically by your CRON schedule.
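The scheduled trigger can be created from code as well. The following is a minimal sketch under a few assumptions: the rule name and the `rate(1 day)` expression are illustrative, `schedule_cleanup` is a hypothetical helper, and both clients are passed in so the function can be tried with stand-ins before touching a real account.

```python
def schedule_cleanup(events_client, lambda_client, function_arn: str,
                     rule_name: str = "efs-dump-cleanup",
                     expression: str = "rate(1 day)") -> str:
    """Create a schedule rule and point it at the clean-up Lambda."""
    rule_arn = events_client.put_rule(
        Name=rule_name, ScheduleExpression=expression, State="ENABLED"
    )["RuleArn"]
    # Allow the rule to invoke the function.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )
    events_client.put_targets(
        Rule=rule_name, Targets=[{"Id": "cleanup-lambda", "Arn": function_arn}]
    )
    return rule_arn


# Usage (requires boto3 and AWS credentials):
# import boto3
# schedule_cleanup(boto3.client("events"), boto3.client("lambda"), "<lambda-function-arn>")
```

A daily run matches the one-day age limit used in the clean-up code; a `cron(...)` expression works in the same parameter if a fixed time of day is preferred.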

Option 2: ECS sidecar container

To implement this option, we have to add a new container to our task definition:

    {
      "essential": false,
      "name": "janitor",
      "image": "public.ecr.aws/amazonlinux/amazonlinux:2",
      "command": [
        "bash",
        "-lc",
        "find /dumps -name '*crash.dmp*' -type f -mmin +10080 -print -delete"
      ],
      "mountPoints": [
        {
          "containerPath": "/dumps",
          "readOnly": false,
          "sourceVolume": "dotnet-dumps"
        }
      ],
      "linuxParameters": {
        "initProcessEnabled": true
      }
    }

The logic behind this task:

  • Initialize a new ECS task with two containers: app and janitor
  • Clean up outdated EFS files in the janitor container and exit. The task is not interrupted or stopped when the janitor exits, thanks to the ECS option "essential": false.

As you can see, this technique is quite straightforward and relies on find command, which you can customize. In the example, it deletes files that are older than 10080 minutes (7 days). Of course, this strategy is less desirable than the first when dealing with long-lived ECS tasks, but it may be more convenient for short-lived ECS tasks or prototyping.
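By default, the containers of a task start in parallel, so the janitor may still be running while the app is already up. If the clean-up should finish first, an ECS container dependency can express that. The sketch below is an optional extension, not part of the setup above: `add_janitor` is a hypothetical helper that appends the janitor container and makes the app wait for it.

```python
import copy

JANITOR = {
    "essential": False,
    "name": "janitor",
    "image": "public.ecr.aws/amazonlinux/amazonlinux:2",
    "command": ["bash", "-lc", "find /dumps -name '*crash.dmp*' -type f -mmin +10080 -print -delete"],
    "mountPoints": [{"containerPath": "/dumps", "readOnly": False, "sourceVolume": "dotnet-dumps"}],
    "linuxParameters": {"initProcessEnabled": True},
}


def add_janitor(task_def: dict, app_name: str = "app") -> dict:
    """Return a copy of task_def with the janitor sidecar and a start-up dependency."""
    patched = copy.deepcopy(task_def)
    for definition in patched["containerDefinitions"]:
        if definition["name"] == app_name:
            # Assumption: the app should start only after the janitor exited with code 0.
            definition.setdefault("dependsOn", []).append(
                {"containerName": "janitor", "condition": "SUCCESS"}
            )
    patched["containerDefinitions"].append(copy.deepcopy(JANITOR))
    return patched


task_def = {"containerDefinitions": [{"name": "app"}]}
print([c["name"] for c in add_janitor(task_def)["containerDefinitions"]])  # prints: ['app', 'janitor']
```

With the `SUCCESS` condition, a failing clean-up prevents the app from starting; the `COMPLETE` condition is the more forgiving choice if the app should start regardless of the janitor's exit code.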

Testing time

In this section, we won't do a deep dive into building the .NET application. For testing purposes, you can modify the sample aspnetapp that we used in the beginning.

The simplest way to crash a .NET application is Environment.FailFast(). This method is commonly used to simulate hard crashes.

Let’s simulate the crash:

  1. Add the line Environment.FailFast("kvendingoldo-dotnet-demo-crash .NET example crash"); to the dotnet-docker/samples/aspnetapp/aspnetapp/Program.cs file.
  2. Build a new Docker image and re-create the ECS task.
  3. The ECS task will terminate, but first it will generate a .NET crash dump, which will be available on S3 once DataSync has transferred it.
  4. Finally, you'll receive a Slack message like this:

📦 New .NET dump is uploaded!

☁️ Bucket: kvendingoldo-dotnet-demo-crash
💾 File: 1739104252-kvendingoldo-dotnet-demo-crash.dmp
🔗 Link: https://kvendingoldo-dotnet-demo-crash.s3.us-east-2.amazonaws.com/1739104252-kvendingoldo-dotnet-demo-crash.dmp

Possible improvements

Before wrapping up the article, I'd like to mention a few potential improvements:

  1. It will be a good idea to generate pre-signed URLs for S3 objects
  2. Set lifecycle policies for S3 bucket to delete old dumps automatically from the bucket
  3. Use SNS to send notifications about new S3 objects to multiple destinations
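For the first improvement, the plain object link in the Slack message could be replaced with a time-limited pre-signed URL, so that the bucket can stay private. Here is a minimal sketch, assuming a boto3 S3 client is passed in; `presigned_dump_url` is a hypothetical helper and the one-hour lifetime is an arbitrary choice.

```python
def presigned_dump_url(s3_client, bucket: str, key: str, expires_in: int = 3600) -> str:
    """Return a temporary download link for one dump object."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the link in seconds
    )


# Usage inside the notification Lambda (requires boto3 and AWS credentials):
# import boto3
# link = presigned_dump_url(boto3.client("s3"), bucket_name, file_name)
```

The link is signed with the credentials of the Lambda role, so that role needs s3:GetObject on the bucket, and the link stops working when those temporary credentials expire even if `expires_in` is longer.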

Conclusion

In production environments, quick visibility into faults is critical. Automating dump delivery reduces MTTR (Mean Time To Resolution) and improves incident response.

As you can see, implementing this procedure is not as difficult as you might expect. Yes, we used many AWS services to accomplish these tasks, but when we look deeper, they are all important.

I hope this article helped you build your own dump delivery chain and made your development team happier.

Feel free to modify the proposed approach, and please contact me anytime if you have any questions.

Happy coding!
