Serverless, on-demand parameterised R Markdown reports with Google Cloud Run

This use case is inspired by this post by David Neuzerling, who showed how it can be done with library(lambdr) on AWS Lambda, a serverless service similar to Cloud Run but on AWS.

In this use case we would like an Rmd file to be rendered each time it is requested over HTTP. This is in contrast to scheduled renders, which can host the HTML on Cloud Storage or use Cloud Run as an nginx server to host that HTML (detailed in this use case: Build an Rmd on a schedule and host its HTML).

Since each HTTP request executes code, this approach is more expensive in compute cost. Also bear in mind that although Cloud Run allows execution times of up to 60 minutes, the web browser requesting the page will time out in at most a minute, so the Rmd you use should not contain expensive computations. If it does, schedule a build of the HTML instead.
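One way to check whether a report is light enough for on-demand rendering is to time a render locally before deploying. The sketch below is not part of googleCloudRunner: render_within_budget() is a hypothetical helper, and the default budget of 60 seconds is an assumption based on the browser timeout mentioned above.

```r
# Hypothetical helper: time any rendering function and compare it to a budget.
# `render_fn` is a zero-argument function that performs the render.
render_within_budget <- function(render_fn, budget_secs = 60) {
  elapsed <- system.time(render_fn())[["elapsed"]]
  list(elapsed_secs = elapsed, ok = elapsed < budget_secs)
}

# Stand-in for the real render call; replace it with something like
# function() rmarkdown::render("mtcars.Rmd", params = list(cyl = 6))
check <- render_within_budget(function() Sys.sleep(0.1))
check$ok
```

A local render is only a rough guide, since a Cloud Run container may be slower than your machine, so it is worth leaving some headroom.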

You can see an example of this use case running at https://cloudrun-rmarkdown-ewjogewawq-ew.a.run.app/, which you can also deploy yourself via the following code:

library(googleCloudRunner)

deploy_folder <- system.file("cloudrun_rmarkdown", package = "googleCloudRunner")

cr_deploy_plumber(deploy_folder)

Strategy

To enable this use case we will need:

  • A plumber API to receive the HTTP request, render the Rmd and serve up the result
  • An R Markdown (Rmd) to render into HTML
  • A Dockerfile containing plumber, rmarkdown, any libraries you need for your Rmd to run, plus any system dependencies.

plumber API

The plumber API only needs to render the R Markdown. I also include a separate server.R file for launching plumber within the Cloud Run container.

The api.R file receives the parameter from the URL and passes it into the rmarkdown render, returning the HTML.

#' Plot out data from the mtcars
#' @param cyl If provided, passed into Rmd rendering parameters
#' @get /
#' @serializer html
function(cyl = NULL){

  # a timestamp is added to the file name to avoid caching
  outfile <- sprintf("mtcars-%s.html", gsub("[^0-9]","",Sys.time()))

  # render markdown to the file
  rmarkdown::render(
    "mtcars.Rmd",
    params = list(cyl = cyl),
    envir = new.env(),
    output_file = outfile
  )

  # read html of file back in and use as response for plumber
  readChar(outfile, file.info(outfile)$size)
}
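To request a filtered report, the cyl value is supplied as a query parameter on the service URL. The sketch below only builds such URLs; report_url() is a hypothetical helper that is not part of the example app, and the base URL should be whatever Cloud Run assigns to your own deployment.

```r
# Hypothetical helper: build the request URL for a given number of cylinders.
# `base_url` is the URL Cloud Run assigns to your deployed service.
report_url <- function(base_url, cyl = NULL) {
  base_url <- sub("/+$", "", base_url)  # drop any trailing slash
  if (is.null(cyl)) {
    return(paste0(base_url, "/"))
  }
  paste0(base_url, "/?cyl=", utils::URLencode(as.character(cyl), reserved = TRUE))
}

# The example service URL from the top of this page
base_url <- "https://cloudrun-rmarkdown-ewjogewawq-ew.a.run.app/"
report_url(base_url)           # all cars
report_url(base_url, cyl = 6)  # only 6-cylinder cars
# utils::browseURL(report_url(base_url, cyl = 6))  # open in a browser
```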

The server.R file is generic boilerplate that launches the plumber API within the Cloud Run container:

pr <- plumber::plumb("api.R")
pr$run(host='0.0.0.0', port=as.numeric(Sys.getenv('PORT')), swagger=TRUE)
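Cloud Run supplies the PORT environment variable inside the container, but it is usually unset when you run server.R on your own machine, in which case as.numeric(Sys.getenv('PORT')) evaluates to NA. A minimal sketch of a fallback for local testing follows; get_port() is a hypothetical helper and the default of 8080 is an assumption, not something the original server.R does.

```r
# Hypothetical helper: read the port from the PORT environment variable,
# falling back to a default when it is unset or not a number.
get_port <- function(default = 8080) {
  port <- suppressWarnings(as.integer(Sys.getenv("PORT", unset = "")))
  if (is.na(port)) default else port
}

get_port()
# In server.R this could stand in for as.numeric(Sys.getenv('PORT')):
# pr$run(host = "0.0.0.0", port = get_port())
```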

R Markdown

Key parts of the example R Markdown are shown below. You are expected to modify this into something more exciting:

---
title: "Parameterised Cars on Cloud Run"
author: "Mark Edmondson"
date: "11/26/2021"
output: html_document
params:
  cyl: 6
---

The above sets up the parameter you will use later in the document via params$cyl. See below for its use in an R chunk within the R Markdown document:

# parameters are blank on first load
if(is.null(params$cyl)){
  the_data <- mtcars
} else {
  the_data <- mtcars[mtcars$cyl == params$cyl, ]
}

knitr::kable(the_data)
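Values taken from the URL typically reach R as character strings (for example "6"), which the comparison above handles through R's implicit coercion. A slightly more defensive variant is sketched here as a hypothetical filter_cars() helper, which is not in the original Rmd: it converts the value explicitly and falls back to the full data set when the value is missing or not a number.

```r
# Hypothetical helper mirroring the chunk above, with explicit conversion
filter_cars <- function(cyl = NULL) {
  if (is.null(cyl)) {
    return(mtcars)
  }
  cyl_num <- suppressWarnings(as.numeric(cyl[1]))
  if (is.na(cyl_num)) {
    return(mtcars)  # unparseable value: show everything
  }
  mtcars[mtcars$cyl == cyl_num, ]
}

nrow(filter_cars())     # all 32 cars
nrow(filter_cars("6"))  # the 7 six-cylinder cars
```

Inside the Rmd the same logic could be used as the_data <- filter_cars(params$cyl).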

See the actual website which is rendered with the code for the full version.

Dockerfile

The Dockerfile needs the libraries plumber and rmarkdown, plus anything else you are using in your Rmd. I have an existing image with plumber installed that I build from here. The Dockerfile is put in the same folder as the plumber and Rmd scripts.

FROM gcr.io/gcer-public/googlecloudrunner:master
RUN export DEBIAN_FRONTEND=noninteractive; apt-get -y update \
  && apt-get install -y git-core \
    libcurl4-openssl-dev \
    libssl-dev \
    make \
    pandoc \
    pandoc-citeproc \
    zlib1g-dev
RUN ["install2.r", "rmarkdown"]

COPY ["./", "./"]
ENTRYPOINT ["Rscript", "server.R"]

Deployment

Once all the files are in a folder, you can point the deployment function at it to deploy it.

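The deployment function will build the Docker image from the folder, push it to your container registry and deploy it to Cloud Run.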

You can customise the steps by using the specific buildstep functions directly.

library(googleCloudRunner)

deploy_folder <- "your-folder"

cr_deploy_plumber(deploy_folder)

You may also want to set up a build trigger so that when you commit code it updates the deployed Cloud Run service.