To run a function on AWS Lambda you create a deployment package that contains the function code and its dependencies. In this post, I show how to use webpack to ensure that the deployment package for a Node.js Lambda function is as small as possible.

Naive packaging

The simplest way to create a deployment package is to ZIP the function code alongside its runtime dependencies. The directory structure to package for a typical Node.js Lambda function looks like:

my-function
├── dist
│   └── index.js
└── node_modules
    ├── aws-sdk
    ├── pg
    ...

Here, the node_modules directory contains the dependencies installed by a command like npm ci --only=production and the dist directory contains the function code built for distribution, perhaps transpiled by Babel.

To create the deployment package, enter the my-function directory and run:

$ zip -r my-function.zip dist node_modules

This naive packaging approach works, but the deployment package is much larger than necessary. The package contains the full contents of all the dependencies including assets, documentation, tests, and TypeScript files.

Lambda limits

AWS Lambda function deployment packages are limited to 250 MB unzipped, including layers. This limit is a good reason to keep the deployment package as small as possible.
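To see how close a package directory is to this limit, a small Node.js script can total the file sizes. This is a minimal sketch: directorySize and fitsUnzippedLimit are hypothetical helper names, not part of any AWS tooling, and the sketch assumes the limit is counted as 250 × 1024 × 1024 bytes.

```javascript
const fs = require('fs');
const path = require('path');

// Assumed byte value of Lambda's 250 MB unzipped limit (including layers).
const UNZIPPED_LIMIT_BYTES = 250 * 1024 * 1024;

/**
 * Recursively sum the sizes of all regular files under a directory.
 * Symbolic links are skipped rather than followed.
 */
const directorySize = (dir) =>
  fs.readdirSync(dir, { withFileTypes: true }).reduce((total, entry) => {
    const entryPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      return total + directorySize(entryPath);
    }
    return entry.isFile() ? total + fs.statSync(entryPath).size : total;
  }, 0);

/** Report whether the directory's contents fit within the unzipped limit. */
const fitsUnzippedLimit = (dir) => directorySize(dir) <= UNZIPPED_LIMIT_BYTES;
```

For example, fitsUnzippedLimit('./my-function') would check the directory that the naive approach packages.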

Ideally, the deployment package contains only:

  • the function code
  • the dependency modules used by the function code

We can create such a deployment package using a bundler like webpack.

Bundling with webpack

Webpack enables creating a bundle that contains only the modules necessary to run the function. This section introduces an example Lambda function and walks through the steps to create a minimal deployment package using webpack.

Example function

Here’s a contrived Node.js v12 function that:

  • Accepts a dog’s name as input.
  • Queries a PostgreSQL database for the IDs of dogs with that name.
  • Queries a DynamoDB table to get the dogs’ bark counts: the number of times each dog has barked.

The function requires two libraries: node-postgres and aws-sdk.

Note that the PGDATABASE, PGHOST, PGUSER, PGPASSWORD, and PGPORT environment variables configure access to the SQL database using node-postgres. DYNAMODB_TABLE_NAME and DYNAMODB_TABLE_HASH_KEY configure the DynamoDB table information; the ID is the table’s hash key.
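Because the function depends on these variables, it can help to fail fast when one is missing. The following is a minimal sketch, not part of the original function: findMissingEnvVars and assertEnvConfigured are hypothetical helpers, and treating every variable as required is an assumption, since node-postgres can fall back to defaults for some of the PG* variables.

```javascript
// Variable names taken from the post; the helpers are hypothetical additions.
const REQUIRED_ENV_VARS = [
  'PGDATABASE',
  'PGHOST',
  'PGUSER',
  'PGPASSWORD',
  'PGPORT',
  'DYNAMODB_TABLE_NAME',
  'DYNAMODB_TABLE_HASH_KEY',
];

/** Return the names that are unset or empty in the given environment. */
const findMissingEnvVars = (env, names = REQUIRED_ENV_VARS) =>
  names.filter((name) => env[name] === undefined || env[name] === '');

/** Throw a descriptive error if any required variable is missing. */
const assertEnvConfigured = (env = process.env) => {
  const missing = findMissingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
};
```

Calling assertEnvConfigured() at the top of the handler would surface a configuration mistake before any database connection is attempted.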

index.js:

import { Client } from 'pg';
import DynamoDB from 'aws-sdk/clients/dynamodb';

const DYNAMODB_TABLE_NAME = process.env.DYNAMODB_TABLE_NAME;
const DYNAMODB_TABLE_HASH_KEY = process.env.DYNAMODB_TABLE_HASH_KEY;

/**
 * Query the PostgreSQL database for the IDs of dogs with the specified name.
 *
 * @param {string} name The name.
 * @return Array containing the IDs of dogs with the specified name.
 */
const getDogId = async (name) => {
  const client = new Client();
  await client.connect();

  const query = 'SELECT id FROM dogs WHERE name=$1';
  const values = [name];
  const queryResult = await client.query(query, values);

  await client.end();

  let ids = [];

  if (queryResult.rowCount > 0) {
    ids = queryResult.rows.map((row) => row.id);
  }

  return ids;
};

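/**
 * Query the DynamoDB table for each dog's bark count.
 *
 * @param {number[]} ids The dogs' IDs.
 * @return The bark count for each ID; 0 if the table has no item for an ID.
 */
const getBarkCounts = async (ids) => {
  // Items returned by DynamoDB; stays empty when there are no IDs to look up
  let items = [];

  if (ids.length > 0) {
    const config = {
      region: 'us-east-1',
    };
    const documentClient = new DynamoDB.DocumentClient(config);

    const keys = ids.map((id) => {
      return {
        [DYNAMODB_TABLE_HASH_KEY]: id,
      };
    });

    const params = {
      RequestItems: {
        [DYNAMODB_TABLE_NAME]: {
          Keys: keys,
          // Project the hash key too so each item can be matched to its ID
          ProjectionExpression: '#hashKey, barkCount',
          ExpressionAttributeNames: {
            '#hashKey': DYNAMODB_TABLE_HASH_KEY,
          },
        },
      },
    };

    const data = await documentClient.batchGet(params).promise();
    items = data.Responses[DYNAMODB_TABLE_NAME];
  }

  // batchGet returns an unordered array, so index the bark counts by ID
  const barkCountsById = new Map(
    items.map((item) => [item[DYNAMODB_TABLE_HASH_KEY], item.barkCount])
  );

  return ids.map((id) => {
    return {
      id,
      barkCount: barkCountsById.get(id) || 0,
    };
  });
};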

/**
 * Lambda function handler to retrieve how many times dogs have barked.
 *
 * @param {string} event.name The dog's name.
 */
export const handler = async (event) => {
  try {
    // Get IDs of dogs with the given name from PostgreSQL database
    const dogIds = await getDogId(event.name);

    // Get bark counts from DynamoDB table
    const barkCounts = await getBarkCounts(dogIds);

    return barkCounts;
  } catch (err) {
    // Insert additional error handling/reporting here
    throw err;
  }
};
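One caveat with the example: DynamoDB's BatchGetItem operation accepts at most 100 keys per request, so a common dog name could exceed what a single batchGet call allows. A minimal sketch of splitting the keys into request-sized groups follows; chunk is a hypothetical helper that the original function does not include.

```javascript
// BatchGetItem accepts at most 100 keys per request.
const BATCH_GET_MAX_KEYS = 100;

/**
 * Split an array into consecutive groups of at most `size` items.
 * (Hypothetical helper for illustration.)
 */
const chunk = (items, size = BATCH_GET_MAX_KEYS) => {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
};
```

Each group would then be sent in its own batchGet call and the responses concatenated. A fuller version would also retry any UnprocessedKeys that DynamoDB returns.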

With the naive packaging approach the deployment package for this function contains the complete AWS SDK even though the function uses only the DynamoDB service.

To configure webpack to bundle the function, we can treat the function like a library and follow webpack’s Authoring Libraries documentation.

Install webpack

To install webpack into the project, run:

npm install --save-dev webpack webpack-cli

Configure webpack

To configure webpack, create webpack.config.js:

const path = require('path');
const webpack = require('webpack');

module.exports = {
  entry: './src/index.js',
  output: {
    path: path.resolve(__dirname, 'dist'),
    filename: '[name].js',
    library: '[name]',
    libraryTarget: 'commonjs2',
  },
  target: 'node',
  devtool: 'source-map',
  // Work around "Error: Can't resolve 'pg-native'".
  // webpack 5 requires the options-object form of IgnorePlugin.
  plugins: [
    new webpack.IgnorePlugin({ resourceRegExp: /^pg-native$/ }),
  ],
};

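This configuration:

  • Bundles the function starting from the entry point src/index.js.
  • Writes the bundle to the dist directory as a CommonJS module (libraryTarget: 'commonjs2') so that the Lambda runtime can load the exported handler.
  • Targets Node.js, so built-in modules such as fs and path are not bundled.
  • Generates a source map alongside the bundle.
  • Ignores the optional pg-native module, which node-postgres references but this function does not use.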

Build with webpack

To build the project, run webpack (for example, with npx webpack). You can also add scripts to package.json to provide shortcuts to build in development or production mode:

{
  "scripts": {
    "build": "webpack --mode=production",
    "build:dev": "webpack --mode=development"
  }
}

Building in production mode minifies the bundle and performs tree shaking to eliminate dead code. For more complicated projects you can write separate webpack configurations for development and production.

Building creates dist/main.js and dist/main.js.map: the bundled function code and its source map. The deployment package should contain only these two files.
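Before packaging, it can be worth confirming that the bundle actually exposes the handler. Here is a minimal smoke check, assuming the bundle is a CommonJS module as configured above; exportsHandler is a hypothetical helper name.

```javascript
const path = require('path');

/**
 * Load a built bundle and report whether it exports a `handler` function.
 * Note that requiring the bundle runs its top-level code.
 */
const exportsHandler = (bundlePath) => {
  const bundle = require(path.resolve(bundlePath));
  return typeof bundle.handler === 'function';
};
```

After a build, exportsHandler('dist/main.js') should return true if the configuration is working as intended.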

Results

Using webpack to bundle the example Lambda function creates an uncompressed deployment package that’s 33 times smaller than the uncompressed deployment package containing node_modules.

Packaging Approach          Zipped Size   Uncompressed Size
Naive (copy node_modules)   7.3 MB        56 MB
webpack (production)        440 KB        1.7 MB
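The size ratios follow directly from the table. As a quick check of the arithmetic, assuming 440 KB is 0.44 MB:

```javascript
// Sizes from the results table, in MB (assuming 1 MB = 1000 KB).
const naive = { zippedMb: 7.3, uncompressedMb: 56 };
const bundled = { zippedMb: 0.44, uncompressedMb: 1.7 };

// How many times smaller the bundled package is than the naive one.
const uncompressedRatio = naive.uncompressedMb / bundled.uncompressedMb;
const zippedRatio = naive.zippedMb / bundled.zippedMb;

console.log(uncompressedRatio.toFixed(1)); // 32.9
console.log(zippedRatio.toFixed(1)); // 16.6
```

So the uncompressed package is roughly 33 times smaller and the zipped package roughly 17 times smaller.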

Most of the savings are from bundling only the necessary modules from the AWS SDK; the example function uses only the DynamoDB service. For this to work you must import the AWS service clients directly. For example, instead of:

import AWS from 'aws-sdk';

write:

import DynamoDB from 'aws-sdk/clients/dynamodb';

See https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/webpack.html for more information.

Also note that the AWS SDK is preinstalled in the Node.js Lambda runtime, so it’s not strictly necessary to package it yourself. I prefer not to use the preinstalled library to gain control over the version, to be consistent with other dependencies, and to simplify local development.
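If you do choose to rely on the preinstalled library, webpack's externals option can leave the SDK out of the bundle. This is a hedged sketch rather than the setup used in this post: AWS_SDK_PATTERN and externalsFragment are hypothetical names, and the approach assumes the runtime you deploy to actually provides the aws-sdk (v2) package.

```javascript
// Matches 'aws-sdk' itself and any subpath such as 'aws-sdk/clients/dynamodb',
// but not unrelated packages whose names merely start with 'aws-sdk'.
const AWS_SDK_PATTERN = /^aws-sdk(\/.*)?$/;

// Hypothetical fragment to merge into the object exported by webpack.config.js.
// Imports matching the pattern are resolved at runtime instead of being bundled.
const externalsFragment = {
  externals: [AWS_SDK_PATTERN],
};
```

With this in place the bundle would contain only the function code and node-postgres.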

Deploying using Terraform

Bundling using webpack also simplifies deploying the Lambda function using Terraform.

With the naive packaging approach you might write a manual build step to install the production dependencies and create the ZIP file.

With the bundling approach, we can use the archive_file data source to create a ZIP file. The Terraform configuration looks like:

data "archive_file" "my_function_archive" {
  type        = "zip"
  output_path = "${path.module}/my_function.zip"
  source_dir  = "${path.module}/my_function/dist"
}

resource "aws_lambda_function" "my_function_lambda" {
  filename         = data.archive_file.my_function_archive.output_path
  function_name    = "my_function"
  # The archive contains the contents of dist, so main.js is at the ZIP root
  handler          = "main.handler"
  source_code_hash = data.archive_file.my_function_archive.output_base64sha256
  role             = aws_iam_role.my_lambda_role.arn
  runtime          = "nodejs12.x"
}