To run a function on AWS Lambda you create a deployment package that contains the function code and its dependencies. In this post, I show how to use webpack to ensure that the deployment package for a Node.js Lambda function is as small as possible.
Naive packaging
The simplest way to create a deployment package is to ZIP the function code alongside its runtime dependencies. The directory structure to package for a typical Node.js Lambda function looks like:
my-function
├── dist
│   └── index.js
└── node_modules
    ├── aws-sdk
    ├── pg
    ...
Here, the node_modules directory contains the dependencies installed by a command like npm ci --only=production, and the dist directory contains the function code built for distribution, perhaps transpiled by Babel.
To create the deployment package, enter the my-function
directory and run:
$ zip -r my-function.zip dist node_modules
This naive packaging approach works, but the deployment package is much larger than necessary. The package contains the full contents of all the dependencies including assets, documentation, tests, and TypeScript files.
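To get a feel for how much of that content is dead weight, here is a small sketch of a helper that flags files commonly shipped inside npm packages that a Lambda function never needs at runtime. The helper name and the pattern list are illustrative, not exhaustive:

```javascript
// Hypothetical helper: flag dependency files that a Lambda function
// never needs at runtime but that a naive ZIP of node_modules includes.
const UNNEEDED_PATTERNS = [
  /\.md$/i, // documentation (README.md, CHANGELOG.md, ...)
  /\.d\.ts$/, // TypeScript declaration files
  /\.ts$/, // TypeScript sources
  /(^|\/)(test|tests|__tests__)\//, // test suites
  /\.map$/, // source maps shipped with dependency code
];

// Returns true if a file (path relative to node_modules) is needed at runtime.
const isRuntimeNeeded = (relativePath) =>
  !UNNEEDED_PATTERNS.some((pattern) => pattern.test(relativePath));
```

Walking node_modules with a filter like this shows how much of a typical dependency tree is documentation, tests, and TypeScript sources rather than runnable code.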
Lambda limits
AWS Lambda function deployment packages are limited to 250 MB unzipped, including layers. This limit provides motivation to keep the deployment package as small as possible.
Ideally, the deployment package contains only:
- the function code
- the dependency modules used by the function code
We can create such a deployment package using a bundler like webpack.
Bundling with webpack
Webpack enables creating a bundle that contains only the modules necessary to run the function. This section introduces an example Lambda function and walks through the steps to create a minimal deployment package using webpack.
Example function
Here’s a contrived Node.js v12 function that:
- Accepts a dog’s name as input.
- Queries a PostgreSQL database for the IDs of dogs with that name.
- Queries a DynamoDB table to get the dogs’ bark counts: the number of times each dog has barked.
The function requires two libraries: node-postgres and aws-sdk.
Note that the PGDATABASE, PGHOST, PGUSER, PGPASSWORD, and PGPORT environment variables configure access to the SQL database using node-postgres. DYNAMODB_TABLE_NAME and DYNAMODB_TABLE_HASH_KEY configure the DynamoDB table information; the ID is the table's hash key.
index.js:
import { Client } from 'pg';
import DynamoDB from 'aws-sdk/clients/dynamodb';

const DYNAMODB_TABLE_NAME = process.env.DYNAMODB_TABLE_NAME;
const DYNAMODB_TABLE_HASH_KEY = process.env.DYNAMODB_TABLE_HASH_KEY;

/**
 * Query the PostgreSQL database for the IDs of dogs with the specified name.
 *
 * @param {string} name The name.
 * @return Array containing the IDs of dogs with the specified name.
 */
const getDogIds = async (name) => {
  const client = new Client();
  await client.connect();
  const query = 'SELECT id FROM dogs WHERE name=$1';
  const values = [name];
  const queryResult = await client.query(query, values);
  await client.end();
  let ids = [];
  if (queryResult.rowCount > 0) {
    ids = queryResult.rows.map((row) => row.id);
  }
  return ids;
};

/**
 * Query the DynamoDB table for each dog's bark count.
 *
 * @param {number[]} ids The dogs' IDs.
 * @return The bark count for each ID.
 */
const getBarkCounts = async (ids) => {
  let results = [];
  if (ids.length > 0) {
    const config = {
      region: 'us-east-1',
    };
    const documentClient = new DynamoDB.DocumentClient(config);
    const keys = ids.map((id) => {
      return {
        [DYNAMODB_TABLE_HASH_KEY]: id,
      };
    });
    const params = {
      RequestItems: {
        [DYNAMODB_TABLE_NAME]: {
          Keys: keys,
          // Project the hash key too, so results can be matched back to IDs
          ProjectionExpression: `${DYNAMODB_TABLE_HASH_KEY}, barkCount`,
        },
      },
    };
    const data = await documentClient.batchGet(params).promise();
    results = data.Responses[DYNAMODB_TABLE_NAME];
  }
  return ids.map((id) => {
    const item = results.find(
      (result) => result[DYNAMODB_TABLE_HASH_KEY] === id
    );
    return {
      id,
      barkCount: item ? item.barkCount : 0,
    };
  });
};

/**
 * Lambda function handler to retrieve how many times dogs have barked.
 *
 * @param {string} event.name The dog's name.
 */
export const handler = async (event) => {
  try {
    // Get IDs of dogs with the given name from the PostgreSQL database
    const dogIds = await getDogIds(event.name);
    // Get bark counts from the DynamoDB table
    const barkCounts = await getBarkCounts(dogIds);
    return barkCounts;
  } catch (err) {
    // Insert additional error handling/reporting here
    throw err;
  }
};
With the naive packaging approach, the deployment package for this function contains the complete AWS SDK, even though the function uses only the DynamoDB service.
To configure webpack to bundle the function, we can treat the function like a library and follow webpack’s Authoring Libraries documentation.
Install webpack
To install webpack into the project, run:
npm install --save-dev webpack webpack-cli
Configure webpack
To configure webpack, create webpack.config.js:
const path = require('path');
const webpack = require('webpack');

module.exports = {
  entry: './src/index.js',
  output: {
    path: path.resolve(__dirname, 'dist'),
    filename: '[name].js',
    library: '[name]',
    libraryTarget: 'commonjs2',
  },
  target: 'node',
  devtool: 'source-map',
  // Work around "Error: Can't resolve 'pg-native'"
  plugins: [
    new webpack.IgnorePlugin(/^pg-native$/),
  ],
};
This configuration:
- Defines the function’s entrypoint. The default name placeholder is “main”; see https://webpack.js.org/configuration/entry-context/#entry.
- Outputs to the dist directory.
- Sets the libraryTarget to use CommonJS: the “return value of your entry point will be assigned to the module.exports.”
- Targets a Node.js environment: require is called to load chunks.
- Enables source map generation.
- Works around an error using node-postgres with webpack.
Build with webpack
To build the project, run webpack. You can also add scripts to package.json to provide shortcuts to build in development or production mode:
{
  "scripts": {
    "build": "webpack --mode=production",
    "build:dev": "webpack --mode=development"
  }
}
Building in production mode minifies the bundle and performs tree shaking to eliminate dead code. For more complicated projects you can write separate webpack configurations for development and production.
Building creates dist/main.js and dist/main.js.map: the bundled function code and its source map. The deployment package should contain only these two files.
Results
Using webpack to bundle the example Lambda function creates an uncompressed deployment package that’s 33 times smaller than the uncompressed deployment package containing node_modules.
| Packaging Approach | Zipped Size | Uncompressed Size |
|---|---|---|
| Naive (copy node_modules) | 7.3 MB | 56 MB |
| webpack (production) | 440 KB | 1.7 MB |
Most of the savings are from bundling only the necessary modules from the AWS SDK; the example function uses only the DynamoDB service. For this to work you must import the AWS service clients directly. For example, instead of:
import AWS from 'aws-sdk';
write:
import DynamoDB from 'aws-sdk/clients/dynamodb';
See https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/webpack.html for more information.
Also note that the AWS SDK is preinstalled in the Node.js Lambda runtime, so it’s not strictly necessary to package it yourself. I prefer not to use the preinstalled library to gain control over the version, to be consistent with other dependencies, and to simplify local development.
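If you do prefer to rely on the runtime’s preinstalled SDK, webpack can be told to leave it out of the bundle by marking it external. A minimal sketch of that alternative configuration (the regex form is used so deep imports like aws-sdk/clients/dynamodb are also excluded):

```javascript
// webpack.config.js fragment (sketch): exclude the AWS SDK from the bundle
// and resolve it at runtime from the copy preinstalled in the Lambda runtime.
module.exports = {
  // ...rest of the configuration shown above...
  externals: [/^aws-sdk(\/.+)?$/],
};
```

With this in place the bundle shrinks further, at the cost of depending on whatever SDK version the runtime happens to ship.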
Deploying using Terraform
Bundling using webpack also simplifies deploying the Lambda function using Terraform.
With the naive packaging approach you might write a manual build step to install the production dependencies and create the ZIP file.
With the bundling approach, we can use the archive_file data source to create a ZIP file. The Terraform configuration looks like:
data "archive_file" "my_function_archive" {
  type        = "zip"
  output_path = "${path.module}/my_function.zip"
  source_dir  = "${path.module}/my_function/dist"
}

resource "aws_lambda_function" "my_function_lambda" {
  filename         = data.archive_file.my_function_archive.output_path
  function_name    = "my_function"
  handler          = "main.handler"
  source_code_hash = data.archive_file.my_function_archive.output_base64sha256
  role             = aws_iam_role.my_lambda_role.arn
  runtime          = "nodejs12.x"
}