Cloud Scheduler is a scheduler service on Google Cloud that uses cron-like syntax to schedule tasks. It can trigger HTTP or Pub/Sub jobs via cr_schedule()
googleCloudRunner uses Cloud Scheduler to help schedule Cloud Builds, but Cloud Scheduler can schedule HTTP requests to any endpoint:
```r
cr_schedule("14 5 * * *", name = "my-webhook",
            httpTarget = HttpTarget(httpMethod = "GET",
                                    uri = "https://mywebhook.com"))
```
Since Cloud Build can run any code in a container, scheduling builds becomes a powerful way to set up batched data flows.
A demo below shows how to set up a Cloud Build on a schedule from R:
```r
build1 <- cr_build_make("cloudbuild.yaml")

cr_schedule("15 5 * * *", name = "cloud-build-test1",
            httpTarget = cr_build_schedule_http(build1))
```
We use cr_build_make() and cr_build_schedule_http() to create the Cloud Build API request, and then send that to the Cloud Scheduler API via its httpTarget parameter.
Update a schedule by specifying the same name and the overwrite=TRUE flag. You then need to supply only what you want to change; everything else will remain as previously configured.
cr_schedule("my-webhook", "12 6 * * *", overwrite=TRUE)
A common use case is scheduling an R script. This is provided by cr_deploy_r()
A minimal example is:
```r
# create an R script that will echo the time
the_build <- cr_build_yaml(cr_buildstep_r("cat(Sys.time())"))

# construct a Cloud Build API call to run that build
build_call <- cr_build_schedule_http(the_build)

# schedule the API call for every minute
cr_schedule("* * * * *", name = "test1", httpTarget = build_call)

# you should get back a scheduler object
test_schedule <- cr_schedule_get("test1")

# once finished, delete the schedule
cr_schedule_delete("test1")
```
After it triggers, you should see a “SUCCESS” in the Cloud Scheduler console and the associated builds in the Cloud Build web UI.
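You can also inspect the schedules in your project from R via cr_schedule_list():

```r
# list the Cloud Scheduler jobs in the project to check their status
cr_schedule_list()
```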
The above assumes you have followed the recommended authentication setup using cr_setup(), and that the tests from cr_setup_tests() all work.
In particular, you can check the email that the API call will run under on Cloud Scheduler in test_schedule$httpTarget$oauthToken$serviceAccountEmail.
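For example, reusing the test_schedule object fetched in the demo above:

```r
# the service account email the scheduled HTTP call authenticates as
test_schedule$httpTarget$oauthToken$serviceAccountEmail
```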
This example shows running R scripts from a source such as GitHub or Cloud Source Repositories. This is used for builds such as package checks and website builds. It uses the helper deployment function cr_deploy_r(), which is also available as an RStudio gadget.
```r
# this can be an R filepath or lines of R read in from a script
r_lines <- c("list.files()",
             "library(dplyr)",
             "mtcars %>% select(mpg)",
             "sessionInfo()")

# example code runs against a source that is a mirrored GitHub repo
source <- cr_build_source(RepoSource("googleCloudStorageR",
                                     branchName = "master"))

# check the script runs ok
cr_deploy_r(r_lines, source = source)

# schedule the script once it's working
cr_deploy_r(r_lines, schedule = "15 21 * * *", source = source)
```
The examples above all use the default of rocker/r-base for the R environment. If you have package dependencies for your script, you would need to install them within the script.
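For instance, a minimal sketch that installs a dependency at runtime (the dplyr dependency here is just illustrative):

```r
# rocker/r-base ships without extra packages, so install within the script
r_lines <- c(
  "install.packages('dplyr')",
  "library(dplyr)",
  "mtcars %>% select(mpg)"
)
```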
An alternative is to customise the Docker image so it includes the R packages you need. For instance, rocker/tidyverse would load the Tidyverse packages.
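A sketch of that approach, assuming your script only needs tidyverse packages, swaps the image via the r_image argument also used in the custom image example below:

```r
# run the script on rocker/tidyverse so the packages are pre-installed
cr_deploy_r(r_lines, schedule = "15 21 * * *", source = source,
            r_image = "rocker/tidyverse")
```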
You may also want to customise the R Docker image further - in this case you can build your Docker image first with your R libraries installed, then specify that image in your R deployment.
Once you have your R Docker file, supply it to cr_deploy_r() via its r_image argument.
cr_deploy_docker("my_folder_with_dockerfile", image_name = "gcr.io/my-project/my-image", tag = "dev") cr_deploy_r(r_lines, schedule = "15 21 * * *", source = source, r_image = "gcr.io/my-project/my-image:dev")
The logs of the scheduled scripts are in the history section of Cloud Build - each scheduled run creates a new Cloud Build.
If you are using RStudio, installing the library will enable an RStudio Addin that can be called after you have set up the library as per the setup page.
It includes a Shiny gadget that you can call via the Addin menu in RStudio, via googleCloudRunner::cr_deploy_gadget() or assigned to a hotkey (I use CTRL+SHIFT+D).
This sets up a Shiny UI to help smooth out deployments as pictured:

If you want to customise deployments, then the steps covered by cr_deploy_r() are covered below.
To schedule an R script, the steps are outlined below.
The R script can hold anything, but make sure it is self-contained with auth files, data files etc. All paths should be relative to the script and available in the source you choose to build with (e.g. GCS or a git repo) or within the Docker image executing R.
Uploading auth files within Dockerfiles is not recommended security-wise. The recommended way to download auth files is to use the Google Key Management Service (KMS), which is available as a build step macro via cr_buildstep_decrypt()
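A sketch of such a buildstep, assuming auth.json.enc was encrypted beforehand via Cloud KMS (the keyring and key names here are hypothetical):

```r
# decrypt the auth file into the build workspace before the R step runs
cr_buildstep_decrypt(cipher = "auth.json.enc",
                     plain = "auth.json",
                     keyring = "my-keyring",
                     key = "my-key")
```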
You may only need vanilla R or the tidyverse, in which case select the presets “rocker/r-ver” or “rocker/verse”.
You can also create your own Docker image - point it at the folder with your script and a Dockerfile (perhaps created with cr_dockerfile())
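For example, a sketch assuming cr_dockerfile() takes the folder holding your script:

```r
# write a default Dockerfile into the deployment folder
cr_dockerfile("my-scripts/")
```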
Once you have your R script and Dockerfile in the same folder, you need to build the image.
This can be automated via the cr_deploy_docker() function supplying the folder containing the Dockerfile:
cr_deploy_docker("my-scripts/", "gcr.io/your-project/your-name")
Once the image is built successfully, you do not need to build it again for the scheduled calls - you could set up a rebuild only when the R code changes.
You may want your R code to operate on data in Google Cloud Storage or a git repo. Specify that source in your build, then make the build object:
Use this if you have your code files within a repo like GitHub, mirrored to Cloud Source Repositories.
```r
schedule_me <- cr_build_yaml(
  steps = cr_buildstep("your-r-image",
                       "R -e my_r_script.R",
                       prefix = "gcr.io/your-project")
)

# maybe you want a repo source
repo_source <- cr_build_source(
  RepoSource("MarkEdmondson1234/googleCloudRunner",
             branchName = "master"))

my_build <- cr_build_make(schedule_me, source = repo_source)
```
This keeps your R code source in a Cloud Storage bucket.
The first method uses a tar.gz archive of a folder of files that you upload:
```r
schedule_me <- cr_build_yaml(
  steps = cr_buildstep("your-r-image",
                       "R -e my_r_script.R",
                       prefix = "gcr.io/your-project")
)

# upload a tar.gz of the files to use as a source:
gcs_source <- cr_build_upload_gcs("local_folder_with_r_script")

my_build <- cr_build_make(schedule_me, source = gcs_source)
```
If you only have a few files, it may be easiest to skip a source altogether and instead download the R file from your bucket into /workspace/ via a buildstep using gsutil:
```r
schedule_me <- cr_build_yaml(
  steps = c(
    cr_buildstep(
      id = "download R file",
      name = "gsutil",
      args = c("cp",
               "gs://mark-edmondson-public-read/my_r_script.R",
               "/workspace/my_r_script.R")
    ),
    cr_buildstep("your-r-image",
                 "R -e /workspace/my_r_script.R",
                 prefix = "gcr.io/your-project")
  )
)

my_build <- cr_build_make(schedule_me)
```
You may want to test that the build works with a one-off build first:
```r
# test your build works
schedule_build <- cr_build(my_build)
```
Once you have a working build, schedule that build object by passing it to the cr_build_schedule_http() function, which constructs the Cloud Build API call for Cloud Scheduler to call at its scheduled times.
```r
# create a scheduler http endpoint that will trigger your build
cloud_build_target <- cr_build_schedule_http(my_build)

# schedule it
cr_schedule("15 5 * * *", name = "scheduled_r",
            httpTarget = cloud_build_target)
```
Your R script should now be scheduled and running in its own environment.
You can update the script, the Docker container, or the schedule separately by redoing the relevant step above, perhaps adding a build trigger to automate it.