You’re going to build a complete alerting pipeline from scratch—an SNS topic, an email subscription, and two CloudWatch alarms that watch your Lambda function’s error rate and duration. Then you’ll intentionally trigger the alarms to prove the pipeline works end to end.
If you want AWS’s version of the alarm flow open while you work, keep the CloudWatch overview nearby. It is the fastest way to sanity-check console labels and alarm terminology when the UI drifts.
This is the same monitoring setup you’d use for any production Lambda function. Get it working here, and you can replicate it for every function you deploy.
Why It Matters
Dashboards require you to look at them. Logs require you to search them. Neither one wakes you up when something breaks at 2 AM. Alarms are the piece that closes the loop—they watch your metrics continuously and notify you the moment something crosses a threshold. Without alarms, the gap between “something broke” and “someone noticed” is however long it takes a user to file a complaint. I’ve been on both sides of that gap, and the alarm side is definitively better.
Your Task
Set up a monitoring pipeline for my-frontend-app-api that:
- Sends email notifications through an SNS topic
- Fires an alarm when the Lambda error count exceeds 3 in a 5-minute period
- Fires an alarm when the average Lambda duration exceeds 2 seconds
- Can be tested by intentionally triggering each alarm
Use the account ID 123456789012 and region us-east-1.
Create an SNS Topic
Create an SNS topic named my-frontend-app-alerts that will serve as the notification channel for your alarms.
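One way to create it from the CLI (a sketch, assuming your credentials are already configured for account 123456789012):

```shell
# Create the SNS topic that both alarms will publish to.
aws sns create-topic \
  --name my-frontend-app-alerts \
  --region us-east-1
```

The command returns the topic's `TopicArn`, which you'll need for the subscription and alarm actions below.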
Checkpoint
`aws sns list-topics --region us-east-1 --output json` shows a topic with the ARN `arn:aws:sns:us-east-1:123456789012:my-frontend-app-alerts`.
Subscribe Your Email
Subscribe your email address to the topic. After running the subscribe command, check your inbox for the confirmation email and click the confirmation link.
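A possible subscribe call, with `you@example.com` as a placeholder for your real address:

```shell
# Subscribe an email endpoint; SNS emails a confirmation link to the address.
aws sns subscribe \
  --topic-arn arn:aws:sns:us-east-1:123456789012:my-frontend-app-alerts \
  --protocol email \
  --notification-endpoint you@example.com \
  --region us-east-1
```

Until you click the confirmation link, the subscription ARN shows as pending and no notifications are delivered.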
Checkpoint
`aws sns list-subscriptions-by-topic --topic-arn arn:aws:sns:us-east-1:123456789012:my-frontend-app-alerts --region us-east-1 --output json` shows your subscription with a real ARN (not pending confirmation).
Create the Error Count Alarm
Create a CloudWatch alarm named my-frontend-app-api-error-count that:
- Watches the `Errors` metric in the `AWS/Lambda` namespace
- Uses the `FunctionName` dimension set to `my-frontend-app-api`
- Uses the `Sum` statistic
- Fires when the error count is greater than 3 in a single 5-minute period (300 seconds)
- Requires 2 consecutive evaluation periods before triggering
- Sends notifications to your SNS topic on both `ALARM` and `OK` transitions
Think about which `--comparison-operator` you need. “Greater than 3” isn’t the same as “greater than or equal to 3.”
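One possible invocation that satisfies the requirements above (a sketch, assuming the topic ARN from the earlier step):

```shell
# "Greater than 3" means strictly above the threshold: GreaterThanThreshold.
aws cloudwatch put-metric-alarm \
  --alarm-name my-frontend-app-api-error-count \
  --namespace AWS/Lambda \
  --metric-name Errors \
  --dimensions Name=FunctionName,Value=my-frontend-app-api \
  --statistic Sum \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 3 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:my-frontend-app-alerts \
  --ok-actions arn:aws:sns:us-east-1:123456789012:my-frontend-app-alerts \
  --region us-east-1
```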
Checkpoint
`aws cloudwatch describe-alarms --alarm-names my-frontend-app-api-error-count --region us-east-1 --output json` returns the alarm with:
- `StateValue` of `INSUFFICIENT_DATA` (normal for a new alarm)
- `Threshold` of `3.0`
- `ComparisonOperator` of `GreaterThanThreshold`
- `Period` of `300`
- `EvaluationPeriods` of `2`
Create the Duration Alarm
Create a CloudWatch alarm named my-frontend-app-api-high-duration that:
- Watches the `Duration` metric in the `AWS/Lambda` namespace
- Uses the `FunctionName` dimension set to `my-frontend-app-api`
- Uses the `Average` statistic
- Fires when the average duration is greater than 2000 milliseconds
- Uses a period of 300 seconds and 2 evaluation periods
- Sends notifications to your SNS topic when transitioning to `ALARM`
Remember: Lambda Duration is measured in milliseconds.
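A sketch of the matching `put-metric-alarm` call, assuming the same topic ARN as before:

```shell
# Threshold is in milliseconds because Lambda's Duration metric is reported in ms.
aws cloudwatch put-metric-alarm \
  --alarm-name my-frontend-app-api-high-duration \
  --namespace AWS/Lambda \
  --metric-name Duration \
  --dimensions Name=FunctionName,Value=my-frontend-app-api \
  --statistic Average \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 2000 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:my-frontend-app-alerts \
  --region us-east-1
```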
Checkpoint
`aws cloudwatch describe-alarms --alarm-names my-frontend-app-api-high-duration --region us-east-1 --output json` returns the alarm with:
- `MetricName` of `Duration`
- `Statistic` of `Average`
- `Threshold` of `2000.0`
Test the Error Alarm
You can test the notification pipeline without waiting for real errors by manually setting the alarm state:
- Use `aws cloudwatch set-alarm-state` to force `my-frontend-app-api-error-count` into the `ALARM` state with a reason of `"Testing alarm notification pipeline"`
- Check your email for the SNS notification
- Wait a few minutes; the alarm will return to its actual state on the next evaluation
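The manual state change from the steps above can be sketched as:

```shell
# Force the alarm into ALARM to exercise the SNS notification path.
aws cloudwatch set-alarm-state \
  --alarm-name my-frontend-app-api-error-count \
  --state-value ALARM \
  --state-reason "Testing alarm notification pipeline" \
  --region us-east-1
```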
Checkpoint
- You received an email from SNS with the alarm details
- The email includes the alarm name, reason, and the state transition
Trigger a Real Alarm (Optional)
If you want to see the alarm fire from actual metrics instead of a manual state change, you can intentionally break your Lambda function:
- Update your Lambda function code to throw an error on every invocation
- Invoke the function more than 3 times using the CLI
- Wait for two 5-minute evaluation periods (up to 10 minutes)
- Check your email for the alarm notification
To create a failing function, deploy a handler that always throws:
```typescript
import type { APIGatewayProxyHandlerV2 } from 'aws-lambda';

export const handler: APIGatewayProxyHandlerV2 = async () => {
  throw new Error('Intentional error to trigger CloudWatch alarm');
};
```

After verifying the alarm fires, redeploy your working handler. Don’t leave a broken function deployed.
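With the broken handler deployed, a simple loop covers the invocation step (a sketch; `/dev/null` is used as the required output file since the response doesn't matter here):

```shell
# Invoke 5 times so Sum(Errors) over the period exceeds the threshold of 3.
for i in 1 2 3 4 5; do
  aws lambda invoke \
    --function-name my-frontend-app-api \
    --region us-east-1 \
    /dev/null
done
```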
Checkpoint
- The alarm transitioned from `INSUFFICIENT_DATA` (or `OK`) to `ALARM`
- You received an SNS email notification
- After redeploying the working handler, the alarm returns to `OK` (this may take up to 10 minutes)
List and Verify All Alarms
List all your alarms with the my-frontend-app prefix and confirm both are configured correctly:
```shell
aws cloudwatch describe-alarms \
  --alarm-name-prefix my-frontend-app \
  --region us-east-1 \
  --output json
```

Checkpoint
Two alarms are listed, both pointing to the same SNS topic, with the correct metrics, thresholds, and evaluation periods.
Checkpoints Summary
- SNS topic `my-frontend-app-alerts` exists
- Email subscription is confirmed (not pending confirmation)
- Error count alarm exists with threshold of 3, Sum statistic, period of 300s, 2 evaluation periods
- Duration alarm exists with threshold of 2000ms, Average statistic, period of 300s, 2 evaluation periods
- Manual alarm state test produced an email notification
- Both alarms show in `describe-alarms` with correct configuration
Failure Diagnosis
- The alarm exists but you never receive email: The SNS subscription is still `PendingConfirmation`, or the email landed in spam and was never confirmed.
- The alarm never moves out of `INSUFFICIENT_DATA`: The metric namespace, dimensions, or statistic do not match the resource that is actually emitting data.
- The alarm stays in `ALARM` after you fix the problem: CloudWatch waits for the next evaluation windows. Give it enough time to see fresh healthy datapoints before assuming the configuration is wrong.
Stretch Goals
- Add a 5XX alarm for API Gateway. Create a third alarm that watches the `5XXError` metric in the `AWS/ApiGateway` namespace. Use a threshold of 0 and a single evaluation period; any server error should trigger immediately.
- Add an OK action to the duration alarm. Update the duration alarm to also notify you when it transitions back to `OK`. Use `aws cloudwatch put-metric-alarm` with the same parameters plus `--ok-actions`.
- Check alarm history. Run `aws cloudwatch describe-alarm-history --alarm-name my-frontend-app-api-error-count --region us-east-1 --output json` to see the history of state transitions for your alarm.
When you’re ready, check your work against the Solution: Set Up Alarms for Your Lambda Functions.