AWS Lambda, Step Functions, DynamoDB, and CDK!
Ever wonder how to invoke the execution of an AWS Step Function from a Lambda? This is exactly what I implemented on a recent project. I’m excited to share what I learned along the way.
In this blog, we will use the AWS CDK, an Infrastructure as Code framework, to deploy a DynamoDB table, a Lambda function, and a Step Function. By the way, there are other services that can invoke a Step Function. You can read about them here.
The code shown in this blog can be found in my GitHub repository.
What are we going to build
To make this tutorial more realistic, we will build out a solution for the following scenario. Imagine that we have a queue of tasks waiting to be worked on. A function gets called for each task in the queue. As the task moves through a workflow, the status of the task is updated in a database table.
To keep things simple, we will only define 3 steps in the Step Functions state machine. The first step creates a task item in the Tasks DynamoDB table. The task item contains a Status attribute initially set to STARTED. The second step simulates work being performed to complete the task. The last step updates the task item in the DynamoDB table by setting the Status to DONE.
Below is a visualization of what we are going to build and deploy.
Build & Deploy
To build and deploy the project to your own AWS account, you will need to first install the following.
- Node.js
- TypeScript
- AWS CLI
- AWS CDK
Directions on how to install the above prerequisites can be found here.
Once you have your development environment set up, building and deploying the solution is accomplished by running the following commands.
- Clone the GitHub repository
git clone https://github.com/nickdala/aws-lambda-step-function-dynamodb.git
- Change directory
cd aws-lambda-step-function-dynamodb
- Install npm modules
npm install
- Compile TypeScript to JavaScript
npm run build
- Deploy this stack to your default AWS account/region
cdk deploy
Let’s now go over each of the components in detail in the next couple of sections.
DynamoDB
The following code in aws-stepfunction-status-stack.ts defines the DynamoDB Tasks table.
const dynamoTable = new Table(this, 'Tasks', {
  partitionKey: {
    name: 'taskId',
    type: AttributeType.STRING
  },
  sortKey: {
    name: 'timestamp',
    type: AttributeType.NUMBER
  },
  tableName: 'Tasks',
});
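The partition key for the Tasks table is the taskId. The sort key is the timestamp, the Unix time represented as a number. Together they form the table's composite primary key, so two items with the same taskId but different timestamps are stored as separate items. The above code will produce the following DynamoDB table.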
Step Function
The step function consists of a combination of the following.
When assembled together in aws-stepfunction-status-stack.ts, the step function looks like the following.
The state machine above contains the following steps.
- Create Dynamo Task Item
A new Task item is created in the Tasks DynamoDB table. The Status attribute for that task is initially set to STARTED.
- Execute long running task…wait 30 seconds
In this task, we simulate a long running task by waiting 30 seconds. The Tasks table is not updated at this time.
- Update Dynamo Task Item
The task is now complete. The last step is to set the Status for the task to DONE.
Lambda
The Lambda function is written in Go and is responsible for starting the execution of the Step Function (see main.go). The HandleRequest handler processes incoming events. Events are passed to this handler with the following JSON structure. Remember from our scenario that tasks are pushed to this function.
{
"taskId": <id>
}
When the function is invoked, the Lambda runs the handler method. The handler method prepares the following JSON to be passed as the initial state to the Step Function.
{
"taskId": <id>,
"timestamp": <unix time stamp>
}
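The transformation from the incoming event to the initial state can be sketched with nothing but the Go standard library. The two struct types match the ones used in the Lambda, while buildInitialState is a hypothetical helper written for this example and is not taken from main.go. The empty taskId check is also an assumption about what a reasonable handler might do. The timestamp is passed in as a parameter so the output is predictable.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// TaskEvent is the shape of the event passed to the Lambda.
type TaskEvent struct {
	TaskId string `json:"taskId"`
}

// StepFunctionInput is the shape of the initial state for the Step Function.
type StepFunctionInput struct {
	TaskId    string `json:"taskId"`
	Timestamp int64  `json:"timestamp"`
}

// buildInitialState is a hypothetical helper: it decodes the raw event,
// adds the timestamp, and returns the initial state as a JSON string.
func buildInitialState(rawEvent []byte, unixTime int64) (string, error) {
	var event TaskEvent
	if err := json.Unmarshal(rawEvent, &event); err != nil {
		return "", fmt.Errorf("decode event: %w", err)
	}
	// Assumption: a task without an id is rejected instead of being started.
	if event.TaskId == "" {
		return "", errors.New("taskId is required")
	}

	state := StepFunctionInput{TaskId: event.TaskId, Timestamp: unixTime}
	out, err := json.Marshal(state)
	if err != nil {
		return "", fmt.Errorf("encode initial state: %w", err)
	}
	return string(out), nil
}

func main() {
	state, err := buildInitialState([]byte(`{"taskId": "Breakfast"}`), 1700000000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(state) // prints {"taskId":"Breakfast","timestamp":1700000000}
}
```

In the real handler the Lambda runtime decodes the event into TaskEvent for you, so only the marshalling half would be needed there.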
Below is the Go code for the Lambda function. The TaskEvent is passed to HandleRequest. A StepFunctionInput object is created using the task id and the current Unix time, i.e. the number of seconds elapsed since January 1, 1970 UTC. The Step Function is started using StartExecution.
type TaskEvent struct {
    TaskId string `json:"taskId"`
}

type StepFunctionInput struct {
    TaskId    string `json:"taskId"`
    Timestamp int64  `json:"timestamp"`
}

func HandleRequest(ctx context.Context, myEvent TaskEvent) (string, error) {
    taskId := myEvent.TaskId
    unixTime := time.Now().Unix()
    initialState := &StepFunctionInput{
        TaskId:    taskId,
        Timestamp: unixTime,
    }
    ...
    input := &sfn.StartExecutionInput{
        StateMachineArn: &stateMachineArn,
        Input:           &initialStateAsString,
        Name:            &stateMachineExcutionName,
    }
    result, err := client.StartExecution(ctx, input)
    ...
}
Invoking the Lambda
To invoke the Lambda, open the AWS console in the browser. Navigate to the Lambda service and select the InvokeTaskStepFunction function.
The next step is to open the Test tab and add the following JSON in the Test Event panel.
{
"taskId": "Breakfast"
}
Finally, click on the Test button. Expand the Execution result details panel to view the log messages from the Lambda.
The state machine defined in the Step Function has now been invoked asynchronously. Let’s now check the results.
- Verify that the Step Function has started.
- Verify the Breakfast task in the Tasks DynamoDB table. The Status attribute is initially set to STARTED.
- Go back to the execution of the Step Function and wait for it to complete.
- Once the Step Function completes, let’s do a final check of the Breakfast task. Verify that the Status attribute is set to DONE.
Clean up
Don’t forget to clean up the AWS resources by running the following command.
cdk destroy
Summary
Wrapping it all up, we built a serverless solution where a Lambda function starts the execution of a Step Function. We also threw DynamoDB, a fully managed, serverless, key-value NoSQL database, into the mix.