Automated PDF generation in AWS Lambda

Around 7 years ago, my dad, who’s a lorry driver, started working for a new company and they needed him to submit an invoice every week. Being completely hopeless on a computer, he turned to me to set something up for him, so I created a simple template in Microsoft Word. I showed him how to enter that week’s data into the text fields and how to attach the doc to an email - and he took diligent notes. Eventually he passed the job on to my mum, who then needed more instructions from me. Recently they went on holiday and were getting stressed out about having to put the template on her iPad and fill it in whilst they were away. This got me thinking - I’m a software developer now, surely I could automate this for them so they won’t have to worry about it again.

The process

The workflow here is fairly straightforward:

my dad receives an email that contains a ‘movement confirmation’ PDF that has the data for his last week’s work

that data is put into the invoice template

the completed template is sent back in an email

So with that in mind, I broke the solution down into the following steps

That’s a pretty concise list of problems to tackle, some of which are easier than others, so let’s take a look at them and I’ll explain the solutions I considered and ultimately decided on implementing.

Parsing the data out of the movement confirmation

Starting with the most boring problem. I found a library called pdf-parse that could read the pdf file into a plain string. Then, armed with years of experience doing string-parsing katas, it’s easy enough to find the data you need with some regular expressions. Whilst this is the most boring problem, it’s also the most fickle part of the app - if the format of the movement confirmation is changed even slightly from what is expected, then the parsing function won’t be able to find the data we need and the whole lambda function will fail.
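To make that fragility a little easier to live with, the parsing step can at least fail loudly and say which field it couldn't find. Below is a minimal sketch of that idea, not my actual parser: the field names and label patterns (Confirmation No, Week Ending, Total) are made up for illustration, and the text argument stands in for the plain string that pdf-parse produces.

```typescript
// Hypothetical shape of the data pulled out of the movement confirmation
interface MovementConfirmation {
    reference: string
    weekEnding: string
    totalNet: number
}

// Hypothetical labels: the wording in a real document will differ
const patterns = {
    reference: /Confirmation No:\s*(\S+)/,
    weekEnding: /Week Ending:\s*(\d{2}\/\d{2}\/\d{4})/,
    totalNet: /Total:\s*£([\d,]+\.\d{2})/,
}

// Returns the first capture group, or throws an error naming the missing field
function extract(text: string, field: keyof typeof patterns): string {
    const match = patterns[field].exec(text)
    if (!match) {
        throw new Error(`Could not find "${field}" in the movement confirmation`)
    }
    return match[1]
}

// `text` is the plain string extracted from the PDF
function parseMovementConfirmation(text: string): MovementConfirmation {
    return {
        reference: extract(text, 'reference'),
        weekEnding: extract(text, 'weekEnding'),
        // Strip thousands separators before converting to a number
        totalNet: Number(extract(text, 'totalNet').replace(/,/g, '')),
    }
}
```

If the format ever changes, the error message then points at the field that went missing rather than at an undefined value further down the line.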

Generating an invoice

In a previous role I’ve used Gotenberg, which uses headless Chrome to render HTML and then converts that into a PDF file. At the time we were working in .NET so we used Razor templating to set up our HTML files. As I was writing the lambda function in TypeScript, I had a look around for a Node-based templating solution, similar to what I’d done with Razor, and came across Pug.

HTML templating

The basic concept with Pug is to put together a .pug file that contains your template, load that template with an object that contains your data to use and then render that out into an html string:


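import * as pug from 'pug'

// Compile the template once; the result is a function of the template data
const compiledFunc = pug.compileFile('./src/invoice-template.pug')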

const templateParams = {
    totalNet: 500,
    totalVat: 100
}

const htmlString = compiledFunc(templateParams)

And in the pug template we can use those values as such:


div(class="charges")
    p £#{totalNet.toFixed(2)}
    p £#{totalVat.toFixed(2)}

Which will generate a div with a class of charges that contains two paragraphs, the first of which will show ‘£500.00’ and the second ‘£100.00’. The rest of the template went together in a similar fashion and the css file was referenced using the same syntax:


doctype html
html(lang="en")
    head
        link(rel="stylesheet" href="/styles.css")
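The templateParams object above has to be built from the parsed data before it reaches the template. Here is a rough sketch of how that could look, assuming a VAT rate of 20% (which matches the 500 / 100 example values, but is an assumption) and a hypothetical LineItem shape; the totalGross field is also an illustrative extra that the template above doesn't use.

```typescript
// Hypothetical line item, e.g. one job listed on the movement confirmation
interface LineItem {
    description: string
    net: number
}

// Assumed VAT rate: 20% is consistent with the 500 / 100 example values
const VAT_RATE = 0.2

// Round to whole pence so floating point drift doesn't leak into the totals
const toPence = (amount: number): number => Math.round(amount * 100) / 100

function buildTemplateParams(items: LineItem[]) {
    const totalNet = toPence(items.reduce((sum, item) => sum + item.net, 0))
    const totalVat = toPence(totalNet * VAT_RATE)
    return {
        totalNet,
        totalVat,
        // Illustrative extra field, not used by the template shown above
        totalGross: toPence(totalNet + totalVat),
    }
}
```

The result can be passed straight to the compiled template function, which then only has to worry about formatting with toFixed(2).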

Rendering the HTML and producing a PDF

The obvious choice for headless chrome in node is Puppeteer and when testing locally this worked perfectly. The problem is that Puppeteer expects to be able to find a chrome executable on the system, which my development machine has but the default lambda container does not. Thankfully, I found this package which bundles up everything that we need in order to run chrome in lambda.

Installing the package and following the instructions on the repo homepage had it working perfectly, so I’m now able to generate an HTML string, load that into Chrome, render it as a webpage and then convert that page into a PDF:


const htmlString = compiledFunc(templateParams) // From above

const executablePath = await chromium.executablePath

// Get a browser object
const browser = await chromium.puppeteer.launch({
    args: chromium.args,
    defaultViewport: chromium.defaultViewport,
    executablePath: executablePath,
    headless: chromium.headless,
    ignoreHTTPSErrors: true,
})

// Load a new page and set the content to our HTML
const page = await browser.newPage()
await page.setContent(htmlString)
// Load in the style sheet so that the HTML reference works
await page.addStyleTag({ path: './src/styles.css' })

// Capture the page as a pdf. I crafted my template to
// only take up 1 page, but for some reason it always generates
// 2. So the pageRanges option ensures I only capture the
// first page
const pdfFile = await page.pdf({ format: "A4", pageRanges: '1' })

await browser.close()
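One thing the snippet above doesn't handle is an error part-way through: if setContent or pdf throws, browser.close() is never reached and the Chrome process is left hanging around in the container. A small wrapper with try/finally is one way around that. This is a sketch rather than what my function actually does, and withBrowser is a hypothetical helper name.

```typescript
// Minimal shape we rely on; Puppeteer's Browser object satisfies it
interface Closable {
    close(): Promise<void>
}

// Hypothetical helper: launches the browser, runs the work and always
// closes the browser afterwards, whether the work succeeded or threw
async function withBrowser<B extends Closable, T>(
    launch: () => Promise<B>,
    work: (browser: B) => Promise<T>
): Promise<T> {
    const browser = await launch()
    try {
        return await work(browser)
    } finally {
        await browser.close()
    }
}

// Usage with the code above (not run here):
// const pdfFile = await withBrowser(
//     () => chromium.puppeteer.launch({ /* same options as above */ }),
//     async (browser) => {
//         const page = await browser.newPage()
//         await page.setContent(htmlString)
//         return page.pdf({ format: 'A4', pageRanges: '1' })
//     }
// )
```

Taking the launch function as a parameter keeps the helper independent of any particular Chrome package, which also makes it easy to exercise with a fake browser.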

Getting the movement confirmation from Outlook

This initially seemed to be the most complex part of the app and I explored several different solutions before eventually landing on the final implementation.

Set up a trigger from Azure

At first I thought I would be able to invoke my lambda function via an API on a trigger from something inside Azure. This definitely seems like a viable solution, but my knowledge of Azure is fairly limited, which ultimately made me look elsewhere.

Logic Apps has a sort of drag-and-drop UI that would allow me to select ‘Receive an email in Outlook’ as a trigger and then I’m sure there would be some way of posting the email to an api as an action. However I found this quite confusing to set up in the portal and after playing around with it for a couple of hours still didn’t have any luck, plus I think it would have been relatively expensive (relative to the final solution).

Event Grid also offers the possibility to use Outlook as an event provider. So in a similar way to above, I could set something up to listen to the events that I was interested in and then set an action to call my lambda through an api.

Graph API

I also looked into using the Graph API to poll the inbox for new emails that need processing. This solution would have required me to first get permission to run on a user’s behalf and store the refresh token somewhere. Then I would have to set up the lambda to run on a schedule, most likely using CloudWatch Events, to retrieve the refresh token, swap it for an access token, retrieve the emails in the user’s inbox and then find any email that requires processing. This solution wouldn’t have required me to set anything up in Azure, which suits my skillset better, but I struggled to get the permissions aspect right and ended up abandoning this idea.

Zapier

Zapier is a ‘no-code’ solution for setting up integrations between different apps, including Outlook. It’s extremely simple to get a new Zap set up, and in very little time I had one up and running.

The Zap has a single trigger that fires whenever a new email is received into the user’s inbox, followed by an action that filters out emails that we’re not interested in. The email data is then sent into the second action, which is some custom JavaScript code that runs in a node environment via the Code by Zapier integration and calls the lambda function. Finally, once the lambda function has done its thing and returned the generated invoice, we use the Outlook integration again, but this time to send the invoice on to its target.

This worked flawlessly and I thought that I was done. However I then discovered that when I signed up to Zapier, I had automatically been put into a free trial of their ‘Premium’ tier and that their free tier is way more limiting.

The free tier

The first problem I saw was that you can only run single-stage zaps on the free tier, which means that your trigger can only feed into one action, whereas I have three. Okay no problem, I thought. I can move the filtering into the custom JavaScript to cut out the first action and I suppose I can set something up using Amazon SES to send the invoice directly from the lambda function.

Fine. However the far larger limitation that I discovered later is that you are limited to just 100 executions per month! This means that if my dad were to receive more than 100 emails in a month then the app would stop running, which is a likely scenario. The next tier in Zapier costs almost £20/month which I was certainly not interested in paying, so Zapier became a no.

Amazon SES

Wracked with despair at the prospect of having to figure out how to get this working in Azure, I remembered that I had set up a domain in SES a little while ago to play around and do some testing. I had the idea that I should be able to forward the movement confirmation emails from my dad’s Outlook account into my SES account and then trigger my lambda from there.

Super simple and reliable. I set up a rule in Outlook to forward emails from the expected sender to my SES address, and configured the SES ruleset to trigger my lambda function whenever one arrives. The function already knows how to parse the movement confirmation and generate an invoice, so all I needed to do was add the functionality to send the invoice in an email. I did that using the MailComposer module of Nodemailer and then used the sendRawEmail function in SES, as the sendEmail function has no way of sending attachments:


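// Assumed imports (AWS SDK v3 and Nodemailer, with esModuleInterop enabled)
import { SES } from '@aws-sdk/client-ses'
import MailComposer from 'nodemailer/lib/mail-composer'
import MimeNode from 'nodemailer/lib/mime-node'

const sesClient = new SES({})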

const mail = new MailComposer({
    from: "Jack O'Hara <jack@my-domain.com>",
    replyTo: 'Reply To Me <some-other-email@domain.com>',
    to: process.env.DESTINATION_EMAIL_ADDRESS,
    bcc: getBccAddresses(),
    // The in-reply-to header helps email clients to recognise
    // which emails should be linked together in a chain
    inReplyTo,
    subject: `RE: ${originalSubject}`,
    text: `Hi,\n\nPlease find invoice ${invoiceNumber} attached, with regards to movement confirmation ${movementConfirmation}.\n\nThanks`,
    attachments: [
        {
            filename: `Invoice ${invoiceNumber}.pdf`,
            content: invoice
        }
    ],
    // As with in-reply-to, references are added to help
    // email clients know what this email is linked to
    references: getReferences(references, inReplyTo)
})

// This is necessary if you want to have bcc recipients
const compiledMail = mail.compile() as MimeNode & {keepBcc: boolean}
compiledMail.keepBcc = true

const builtMail = await compiledMail.build()

await sesClient.sendRawEmail({
    RawMessage: { Data: builtMail }
})
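The inReplyTo and references values in the snippet above come from the headers of the forwarded email. The sketch below shows one way they could be derived; it is not necessarily how my helpers are written. It assumes the headers are available as a list of name/value pairs (the shape of the mail.headers list in the SES receipt event), and the getReferences body is a guess based on the usual convention that a reply’s References is the parent’s References followed by the parent’s Message-ID.

```typescript
// Minimal slice of an email header as a name/value pair
interface EmailHeader {
    name: string
    value: string
}

// Header names are case-insensitive, so compare them in lower case
function findHeader(headers: EmailHeader[], name: string): string | undefined {
    const wanted = name.toLowerCase()
    return headers.find((h) => h.name.toLowerCase() === wanted)?.value
}

// Hypothetical implementation: the parent's References followed by the
// parent's Message-ID, without adding the ID twice
function getReferences(
    references: string | undefined,
    inReplyTo: string | undefined
): string[] {
    const existing = references ? references.trim().split(/\s+/) : []
    if (!inReplyTo) return existing
    return existing.includes(inReplyTo) ? existing : [...existing, inReplyTo]
}
```

With this, inReplyTo would be findHeader(headers, 'Message-ID') and the existing references would be findHeader(headers, 'References'). MailComposer accepts the references option as either an array or a space-separated string, so the array can be passed as-is.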

Something to note…

The Puppeteer part of the lambda function failed to work for me when I was running the function code from a zip file, so I had to convert it over to a container image. I imagine this is because there’s some install script which makes sure that all the underlying Chrome libraries are installed on the OS. My Dockerfile is doing nothing fancy:


FROM public.ecr.aws/lambda/nodejs:16

COPY . ${LAMBDA_TASK_ROOT}

RUN npm ci

# Runs tsc
RUN npm run build 

CMD ["app.handler"]

Final thoughts

I’ve spent most of my evenings and weekends over the last two or three weeks developing this and, to be honest, it probably isn’t worth it. My dad is planning on retiring in the next couple of years so the amount of time it will save them is most likely less than the amount of time I spent developing it. But hopefully it relieves a little bit of stress in the time that it is running and I’ve also had fun and learned a lot which is usually the biggest reason I decide to take on side projects, anyway.

Something else to note is that this is basically free to run; the lambda invocations should be so low that I never incur a charge there; I’m only sending 4 or 5 emails a month so SES shouldn’t cost anything; the only charges that I will incur are storage costs for the emails in S3 and my container image in ECR.

Finally, I haven’t made the code for this repo public because some of my dad’s personal info is included in the pug template. I do plan on removing that and cleaning up the git history at some point in the future so that you can all see the code alongside this article - I’ll make sure to come back and edit this to include a link once I do!