Cloud Build uses Docker containers to run everything. This means it can run almost any language, program or application, including R. Having an easy way to create and trigger these builds from R means R can serve as a UI or gateway to any other program - e.g. R can trigger a Cloud Build using gcloud to deploy Cloud Run applications.
The first 120 mins per day are free. See here for more pricing info.
If you want to run scripts that can be triggered one time, batch style, and set them up to trigger on GitHub events or pub/sub, or schedule them using Cloud Scheduler, then Cloud Build is suited to your use case.
If you would also like your R code to react in real time to events such as HTTP requests to a website or API endpoint, consider Cloud Run.
PubSub events can also trigger builds via Build Triggers, which is good for long-running tasks run in reaction to events such as a Cloud Storage file being uploaded, various events around Google Cloud such as billing alerts, and custom metrics set up in Cloud Logging for practically any logged event on Google Cloud.
An overview of the various functions, and how they combine with each other and with later functions, is shown in this diagram:
Cloud Build is centered around the cloudbuild.yaml format - you can use existing cloudbuild.yaml files or create your own in R using the cloudbuild.yaml helper functions.
An example cloudbuild.yaml is shown below - this outputs the Docker version, echoes "Hello Cloud Build" and calls an R function:
steps:
- name: 'gcr.io/cloud-builders/docker'
id: Docker Version
args: ["version"]
- name: 'alpine'
id: Hello Cloud Build
args: ["echo", "Hello Cloud Build"]
- name: 'rocker/r-base'
id: Hello R
args: ["R", "-e", "paste0('1 + 1 = ', 1+1)"]
This cloudbuild.yaml file can be built directly via the cr_build() function. By default the build will open the build logs webpage in your browser.
b1 <- cr_build("cloudbuild.yaml")
Or you can choose to wait in R for the build to finish, like below:
b2 <- cr_build("cloudbuild.yaml", launch_browser = FALSE)
b3 <- cr_build_wait(b2)
# Waiting for build to finish:
# |===||
# Build finished
# ==CloudBuildObject==
# buildId: c673143a-794d-4c69-8ad4-e777d068c066
# status: SUCCESS
# logUrl: https://console.cloud.google.com/gcr/builds/c673143a-794d-4c69-8ad4-e777d068c066?project=1080525199262
# ...
You can export an existing build into a cloudbuild.yaml file, for instance to run in another project not using R.
cr_build_write(b3, file = "cloudbuild.yml")
There are several layers to creating Cloud Build objects in R:

- cr_build() triggers the API with Cloud Build objects and cloudbuild.yaml files
- cr_build_yaml() lets you create the Cloud Build objects in R and can write to .yaml files
- cr_build_make() creates Cloud Build R objects directly from .yaml files
- cr_buildstep() lets you create specific steps within the Cloud Build objects. There are helper templates for common tasks such as cr_buildstep_r()
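For example, a minimal sketch of how these layers feed into one another, reusing functions shown throughout this vignette:

# a minimal sketch of how the layers fit together
step <- cr_buildstep("gcloud", "version")             # one buildstep
build_yaml <- cr_build_yaml(steps = step)             # wrap buildsteps into a build object
cr_build_write(build_yaml, file = "cloudbuild.yaml")  # optionally write it out to yaml
build_obj <- cr_build_make("cloudbuild.yaml")         # read a yaml file back into a Build object
built <- cr_build(build_obj)                          # send it to the Cloud Build API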
For common use cases, the cr_deploy_* functions use the above functions with some sensible defaults to quickly run those use cases for you:
- cr_deploy_docker() - builds from a Dockerfile then pushes the Docker image to Google Container Registry
- cr_deploy_docker_trigger() - as above, plus creates a build trigger so the build will run upon each git commit - see the Docker image use case for both
- cr_deploy_packagetests() - creates a build that will run R package unit tests and a build trigger to run them upon each commit - see the package tests and code coverage use case
- cr_deploy_pkgdown() - creates a build that will deploy an R package website and a build trigger to create it upon each commit - see the pkgdown website use case
- cr_deploy_run_website() - takes HTML you have in a folder (perhaps created from an RMarkdown file) and deploys it to an nginx website on Cloud Run
- cr_deploy_r() - takes the R script you supply and runs it within a Cloud Build. Optionally also schedules that R script - see the R code on a schedule use case
- cr_deploy_run() - takes a local folder, builds the Docker image suitable for Cloud Run and deploys it to Cloud Run. Variants include cr_deploy_plumber() for R plumber APIs and cr_deploy_html() for nginx websites. See the R micro-services, trigger R functions from pub/sub and creating a Slackbot use cases

Some of the above are also available in an RStudio gadget.

The deploy functions call the underlying Cloud Build functions, so if they don't suit your needs you can also customise by using those directly. An overview of what the deploy functions use is shown below:
Any utility that has a Docker image can be used within Cloud Build steps.
Official Google images for gcloud, bq and gsutil are here: https://github.com/GoogleCloudPlatform/cloud-sdk-docker
Some community contributed Cloud Build images are listed here, including hugo, make, and tar, or you can configure your own Dockerfile and build what image you need yourself, perhaps by using a previous Cloud Build and cr_deploy_docker().
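For example, a hedged sketch of building such a custom image (the folder name is illustrative):

# assumes "my-buildstep-image/" is a local folder containing the Dockerfile for your utility image
cr_deploy_docker("my-buildstep-image/")
# the pushed image can then be referenced via the name argument of cr_buildstep()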
Cloud Builds sometimes need code or data to work on to be useful. All Cloud Builds are launched in a serverless environment with a default directory of /workspace/. The Source is copied into this workspace before the build steps execute, so steps can share state and files. Cloud Build sources are specified by the source argument.
A source can be a Cloud Source Repository (perhaps mirrored from GitHub) or a Cloud Storage bucket containing the code/data you want to operate on. An example of specifying both is below:
gcs_source <- cr_build_source(
StorageSource("gs://my-bucket", "my_code.tar.gz"))
repo_source <- cr_build_source(
RepoSource("github_markedmondson1234_googlecloudrunner",
branchName="master"))
build1 <- cr_build("cloudbuild.yaml", source = gcs_source)
build2 <- cr_build("cloudbuild.yaml", source = repo_source)
cr_build_upload_gcs() is a helper function for automating creation of a Google Cloud Storage source - it uses googleCloudStorageR to tar your local source code and upload it to your bucket, making it available to your build. It returns a Source object that can be used in build functions:
storage <- cr_build_upload_gcs("my_folder")
cr_build(my_yaml, source = storage)
By default this will place your local folder's contents in the /workspace/deploy/ folder. For buildsteps to access those files you may want to add dir = "deploy" to them so their working directory starts from there. To be more consistent with Build Triggers (which always place sources in the root /workspace directory), a helpful build step can move that deploy folder into the root - see cr_buildstep_source_move(), which is intended to be the first buildstep of a build when you want to move deploy folders.
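A sketch of that pattern is below; the folder name is illustrative, and passing "deploy" as the folder to move is an assumption to check against the cr_buildstep_source_move() help page:

storage <- cr_build_upload_gcs("my_folder")   # tars and uploads ./my_folder as the build source
build <- cr_build_yaml(
  steps = c(
    cr_buildstep_source_move("deploy"),       # assumed argument: the deploy folder to move into /workspace
    cr_buildstep_r("list.files()")            # later steps now see the files at the workspace root
  )
)
cr_build(build, source = storage)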
Cloud Builds can use reserved default substitutions to help with deployments in a continuous development situation. For instance, files can be named according to the Git branch they are committed from. These are listed in the Google docs here and reproduced below:
There are also some GitHub specific substitutions, used mostly in the Build Triggers we detail later on.
Custom macros can also be configured, starting with $_ e.g. $_MY_CUSTOM_MACRO - see the Google help files on how to handle those.
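As a small sketch, a reserved substitution such as $BRANCH_NAME can be referenced directly in a buildstep; it is filled in when the build runs from a Build Trigger (custom $_ macros can be supplied via the substitutions argument of cr_buildtrigger(), shown later in this vignette):

# echo the branch a triggered build came from
cr_build_yaml(
  steps = cr_buildstep_bash("echo Building from branch: $BRANCH_NAME")
)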
Instead of using separate cloudbuild.yml files, you can also choose to make your own cloudbuild.yaml files in R via cr_build_yaml() and cr_buildstep().
Let's say you don't want to write a cloudbuild.yaml file manually - instead you can create all the features of the yaml files in R. Refer to the cloudbuild.yml config spec for what is expected in the files or functions.
The example below recreates a simple cloudbuild.yml file. If you print it to the console, it will output what the build would look like if it were in a yaml file:
cr_build_yaml(steps = cr_buildstep( "gcloud","version"))
#==cloudRunnerYaml==
#steps:
#- name: gcr.io/cloud-builders/gcloud
# args: version
You can write this back out into your own cloudbuild.yml file:
my_yaml <- cr_build_yaml(steps = cr_buildstep( "gcloud","version"))
cr_build_write(my_yaml, file = "cloudbuild.yaml")
You can also edit or extract steps from existing cloudbuild.yml files via cr_buildstep_edit() and cr_buildstep_extract().
This allows you to programmatically create cloudbuild yaml files for other languages and triggers. See more at this article on creating custom build steps with your own Docker images.
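A hedged sketch of the editing helpers is below; the step argument (an index into the buildsteps) is an assumption here, so check the function help for the exact signatures:

my_yaml <- cr_build_yaml(steps = c(
  cr_buildstep("gcloud", "version"),
  cr_buildstep_docker("my-image", tag = "dev")
))
# pull out the second buildstep to reuse elsewhere (assumed signature)
docker_step <- cr_buildstep_extract(my_yaml, step = 2)
# or tweak a step in place, e.g. point it at a different image (assumed signature)
edited_yaml <- cr_buildstep_edit(my_yaml, step = 2, name = "gcr.io/my-project/my-builder")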
Using the above buildstep functions, some helpful build steps you can use in your own builds have been included in the package:
- cr_buildstep_gcloud() - an optimised Docker image for gcloud, bq, gsutil or kubectl commands
- cr_buildstep_bash() - for including bash scripts
- cr_buildstep_docker() - for building and pushing Docker images
- cr_buildstep_secret() - for storing secrets in the cloud and decrypting them
- cr_buildstep_decrypt() - for using Google Key Management Service to decrypt auth files
- cr_buildstep_git() - for setting up and running git commands
- cr_buildstep_mailgun() - send an email with Mailgun.org
- cr_buildstep_nginx_setup() - set up hosting HTML files with nginx on Cloud Run
- cr_buildstep_pkgdown() - for setting up pkgdown documentation of an R package
- cr_buildstep_r() - for running R code
- cr_buildstep_slack() - send a Slack webhook message

If you have any requests for others, please raise an issue on GitHub.
Combine buildsteps with c(), e.g.
cr_build_yaml(
steps = c(
cr_buildstep("ubuntu", "echo hello"),
cr_buildstep_gcloud("gcloud","version"),
cr_buildstep_docker("my-image", tag="dev"),
cr_buildstep_secret("my_secret","auth.json"),
cr_buildstep_r("sessionInfo()")),
images = "gcr.io/my-project/my-image")
The cr_buildstep_* functions all call cr_buildstep() with helpful defaults; for example, these are equivalent:
cr_buildstep_bash("echo hello world")
#[[1]]
#==cloudRunnerBuildStep==
#name: ubuntu
#args:
#- bash
#- -c
#- echo hello world
cr_buildstep("ubuntu", args = c("bash", "-c", "echo hello world"), prefix = "")
#[[1]]
#==cloudRunnerBuildStep==
#name: ubuntu
#args:
#- bash
#- -c
#- echo hello world
To further aid deployments, the cr_deploy_* functions take some common tasks and use the buildsteps above, in some cases also creating a build trigger or schedule.
You may have some useful files or data after your buildsteps run that you want to use later. You can specify these files as artifacts that will be uploaded to a Google Cloud Storage bucket after the build finishes. A helper function, cr_build_artifacts(), will take your build object and download the files to your local directory via googleCloudStorageR:
r <- "write.csv(mtcars,file = 'artifact.csv')"
ba <- cr_build_yaml(
steps = cr_buildstep_r(r),
artifacts = cr_build_yaml_artifact('artifact.csv')
)
build <- cr_build(ba)
built <- cr_build_wait(build)
cr_build_artifacts(built)
# 2019-12-22 12:36:10 -- Saved artifact.csv to artifact.csv (1.7 Kb)
read.csv("artifact.csv")
# X mpg cyl disp hp drat wt qsec vs am gear carb
#1 Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
#2 Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
#3 Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
#4 Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
# ... etc ...
Other common artifacts are Docker images that your build has created. These are usually uploaded automatically to Container Registry, for example when you are using cr_deploy_docker().
You can view the logs of the build locally in R using cr_build_logs() if you pass it a build object, such as one returned from cr_build_wait() (which makes R wait for the build to finish) or cr_build_status() (which checks on the build status):
s_yaml <- cr_build_yaml(steps = cr_buildstep( "gcloud","version"))
build <- cr_build_make(s_yaml)
built <- cr_build(build)
the_build <- cr_build_wait(built)
cr_build_logs(the_build)
# [1] "starting build \"6ce86e05-b0b1-4070-a849-05ec9020fd3b\""
# [2] ""
# [3] "FETCHSOURCE"
# [4] "BUILD"
# [5] "Already have image (with digest): gcr.io/cloud-builders/gcloud"
# [6] "Google Cloud SDK 325.0.0"
# [7] "alpha 2021.01.22"
# [8] "app-engine-go 1.9.71"
# ...
If you are using RStudio, installing the library will enable an RStudio Addin that can be called after you have set up the library as per the setup page.
It includes a Shiny gadget that you can call via the Addin menu in RStudio, via googleCloudRunner::cr_deploy_gadget() or assigned to a hotkey (I use CTRL+SHIFT+D).
This sets up a Shiny UI to help smooth out deployments as pictured:
Focusing on one buildstep in particular, since this is an R package - you can send R code into a build using cr_buildstep_r().
It accepts either inline R code or a file location. The R code is executed in the R environment specified by the name argument - this defaults to the R images provided by Rocker (rocker-project.org).
If you want to build your own images (perhaps in another Cloud Build using cr_deploy_docker()), you can use your own R images with custom R packages and resources.
Some useful R images have been made that you could use or refer to their Dockerfiles for:

- gcr.io/gcer-public/packagetools - installs: devtools covr rhub pkgdown goodpractice httr plumber rmarkdown
- gcr.io/gcer-public/render_rmd - installs: pkgdown rmarkdown flexdashboard blogdown bookdown
- gcr.io/gcer-public/googlecloudrunner - installs: containerit, googleCloudStorageR, plumber, googleCloudRunner

The R code can be created within the Build at build time, or you can refer to an existing R script within the Source.
# create an R buildstep inline
cr_buildstep_r(c("paste('1+1=', 1+1)", "sessionInfo()"))
# create an R buildstep from a local file
cr_buildstep_r("my-r-file.R")
# create an R buildstep from a file within the source of the Build
cr_buildstep_r("inst/schedule/schedule.R", r_source = "runtime")
# use a different Rocker image e.g. rocker/verse
cr_buildstep_r(c("library(dplyr)", "mtcars %>% select(mpg)", "sessionInfo"),
name = "verse")
# use your own R image with custom R
my_r <- c("devtools::install()", "pkgdown::build_site()")
br <- cr_buildstep_r(my_r, name= "gcr.io/gcer-public/packagetools:latest")
# send it for building
cr_build(cr_build_yaml(steps=br))
Once you have buildsteps and possibly a source created, you can either set these up to run on a schedule via cr_schedule(), or you can use triggers that will run upon certain events.
The quickest way to get going is to use the web UI for Build Triggers (https://console.cloud.google.com/cloud-build/triggers):
- Create your build, for example via cr_build_yaml() etc.
- Save the cloudbuild.yml file in your repository (by default the root directory is checked)
- Set up the Build Trigger in the web UI (https://console.cloud.google.com/cloud-build/triggers)
- Check your builds are running in the console (https://console.cloud.google.com/cloud-build/builds)
- Edit the cloudbuild.yml file as you need, recommit to trigger a new build.

Learn how to set them up at this Google article on creating and managing build triggers.
From there you get the option of connecting a repository either by mirroring it from GitHub/Bitbucket or using Cloud Source Repositories.
You can set up either Docker builds by just providing the location of the Dockerfile, or have more control by providing a cloudbuild.yml - by default these are looked for in the root directory.
Here are some examples for this package's GitHub repo:

- the gcr.io/gcer-public/packagetools image
- the gcr.io/gcer-public/googlecloudrunner image

Once you know how to make Build Triggers in the Web UI, you may want to move on to creating them within R. This lets you scale applications quickly, and is used in some of the cr_deploy_*() functions such as cr_deploy_docker_trigger().
Build Triggers can be made via R code using the cr_buildtrigger() function. They can execute upon several useful events, such as git commits and PubSub messages.
The example below shows how to set up some of the builds used in the examples above:
# get an existing cloudbuild.yaml file
cloudbuild <- system.file("cloudbuild/cloudbuild.yaml",
package = "googleCloudRunner")
# import the build into R
bb <- cr_build_make(cloudbuild, projectId = "test-project")
# specify the GitHub repo the trigger will react to
github <- cr_buildtrigger_repo("MarkEdmondson1234/googleCloudRunner", branch = "master")
# create the build trigger
cr_buildtrigger(bb, name = "trig1", trigger = github)
This build trigger will then build upon each git commit.
Build substitutions can also be configured by passing them into the substitutions argument:
# creates a trigger with named substitutions
ss <- list(`$_MYVAR` = "TEST1", `$_GITHUB` = "MarkEdmondson1234/googleCloudRunner")
# a buildtrigger with parametised substitutions
cr_buildtrigger("trig2", trigger = github, build = bb, substitutions = ss)
The above examples will create the buildsteps in-line within the trigger, but you can also point at a cloudbuild.yaml file by specifying the path to where in the repository the build file will be:
# create a trigger that will build from the file in the repo
cr_buildtrigger("cloudbuild.yaml", name = "trig3", trigger = github)
This way you can commit changes to the cloudbuild.yaml file, and they will be reflected in the Build Trigger.
You can connect via the Source Repository mirroring service or via the Google Cloud Build GitHub app - see the git setup page for more details.
It may be that you want to combine buildsteps from across build triggers. For example, you have a Docker build trigger and want to add some steps that use that Docker image after it has been built.
For example, say you have deployed a Dockerfile trigger:
repo <- cr_buildtrigger_repo("your-github/your-repo")
# first time - by default will be called "docker-{image}"
cr_deploy_docker_trigger(repo, image = "my-build")
If you want to extract the Docker buildsteps you can find them in dock_build$build$steps below:
# get the buildtrigger details
dock_build <- cr_buildtrigger_get("docker-my-build")
# contains the buildsteps from the deployment
dock_build$build$steps
You can then combine those buildsteps in the usual way with other buildsteps. The example below assumes you’ve made an R docker image with some of your custom dependencies that you then want to immediately run your R code within:
# uses the docker image previously created
my_r_step <- cr_buildstep_r(r = "my_r_code.R", name = "docker-my-build")
# combine the buildsteps
new_buildsteps <- c(dock_build$build$steps, my_r_step)
# remake the build yaml
yml <- cr_build_yaml(new_buildsteps)
You can then reconfigure the build trigger with your new build yaml. The below writes the yaml to a file to be read by the build trigger:
# write it out to the git repo we are in
cr_build_write(yml, "new_buildsteps.yml")
repo <- cr_buildtrigger_repo("your-github/your-repo")
# overwrite the deployed build trigger to read the yml
cr_buildtrigger(
"new_buildsteps.yml",
name = "docker-my-build",
trigger = repo,
overwrite = TRUE
)
You can configure your build triggers to run from PubSub events. These events are generated from a lot of different services around GCP, such as when a file hits Cloud Storage, when certain events occur in Cloud Logging, or when your budget alerts fire. Triggering a Cloud Build can allow complex workflows responding to these events, including picking up data from the PubSub message to parametrise your code via Substitutions.
See the Creating Pub/Sub triggers documentation on how to do this in the WebUI and gcloud.
Below is an example that uses googlePubsubR to create and send a message to PubSub, but any other service that publishes to PubSub could be used.
For this example, the substitution variables will take their data from the PubSub message content. It will look within the body.message.data.x variable, where x is the field sent in the PubSub message. See how to use these payload bindings in other ways in the Google documentation.
The cloud build yaml looks like:
steps:
- name: 'alpine'
id: Hello Cloud Build
args: ["echo", "Hello Cloud Build"]
- name: 'rocker/r-base'
id: Hello R
args: ["Rscript", "-e", "paste0('From PubSub message field1:', '${_VAR1}')"]
substitutions:
_VAR1: '$(body.message.data.var1)'
Note _VAR1, which will be filled in with the var1 field coming from the PubSub message.
# create build object
cloudbuild <- system.file("cloudbuild/cloudbuild_substitutions.yml",
package = "googleCloudRunner")
the_build <- cr_build_make(cloudbuild)
the_build
We then create the topic if it does not exist already:
# create pub/sub topic if needed
library(googlePubsubR)
pubsub_auth()
topics_create("test-topic")
We now use the function cr_buildtrigger_pubsub() to construct the trigger for the Cloud Build.
# create build trigger that will work from pub/subscription
pubsub_trigger <- cr_buildtrigger_pubsub("test-topic")
cr_buildtrigger(the_build, name = "pubsub-triggered-subs", trigger = pubsub_trigger)
#==CloudBuildTriggerResponse==
#id: b7fb7d8d-534d-4408-9e0c-154be6577a7f
#name: pubsub-triggered-subs
#createdTime: 2021-11-26T21:39:23.714950565Z
#==CloudBuildObject==
#substitutions:
#_VAR1 : $(body.message.data.var1)
#steps:
#- name: alpine
# args:
# - echo
# - Hello Cloud Build
# id: Hello Cloud Build
#- name: rocker/r-base
# args:
# - Rscript
# - -e
# - paste0('From PubSub message field1:', '${_VAR1}')
# id: Hello R
The build trigger should now be deployed and available in your Cloud Console.
We now simulate a PubSub message coming in with the var1 field filled. In real applications this PubSub trigger could come from another system, Cloud Storage files, or other R scripts.
PubSub requires the message to be JSON and base64 encoded, so we use library(jsonlite) to do this and library(googlePubsubR) again to send the message:
# make base64 encoded json
library(jsonlite)
message <- toJSON(list(var1 = "hello mum"))
send_me <- msg_encode(message)
# send the message!
topics_publish(PubsubMessage(send_me), "test-topic")
You should now see the Cloud Build start in your GCP console. You can check the logs for the build after it has run to see if it worked:
# did it work?
cr_buildtrigger_logs("pubsub-triggered-subs")
# yep!
#...
#[55] "Step #1 - \"Hello R\": Status: Downloaded newer image for rocker/r-base:latest"
#[56] "Step #1 - \"Hello R\": docker.io/rocker/r-base:latest"
#[57] "Step #1 - \"Hello R\": [1] \"From PubSub message field1:[\\\"hello mum\\\"]\""
#[58] "Finished Step #1 - \"Hello R\""
#[59] "PUSH"
#[60] "DONE"
See also cr_schedule_pubsub(), which can use PubSub to trigger scheduled builds. There is a guide to using that within Cloud Scheduler to trigger Build Triggers via PubSub.
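For instance, a sketch mirroring the targets scheduling example later in this vignette - a scheduled PubSub message that a Build Trigger can subscribe to (the topic name and cron schedule are illustrative):

library(googlePubsubR)
topic <- topics_create("my-scheduled-topic")
# hit the topic every day at 06:00
cr_schedule("my-scheduled-topic",
            schedule = "0 6 * * *",
            pubsubTarget = cr_schedule_pubsub(topic))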
The targets package is a data pipeline toolkit for R which allows for reproducible workflows without having to run the entire pipeline each time. You can use it to create a graph of dependencies (a directed acyclic graph, or DAG).
It is convenient to have these pipelines run in the cloud, in particular in a serverless manner, which googleCloudRunner facilitates via the cr_build_targets() function. Since the pipeline runs in a Cloud Build, you do not need to keep your local machine on to run the pipeline. Cloud Build allows timeout values of up to 24hrs for a build, and costs are much lower than having a dedicated computing cluster, as you are only charged for the time the job is running.
The integration looks through your existing targets pipeline and creates the appropriate buildsteps for your build, reusing the functions and target metadata in your local folder. When you select to run on Cloud Build, the local targets metadata folder is uploaded to Cloud Storage and used to coordinate the different target pipeline steps (targets), and the results are then downloaded back to your local session.
The cr_buildstep_targets() function is the base buildstep you can use to create your workflows. It's more likely you will use the cr_buildstep_targets_multi() function, which uses these buildsteps to create your pipeline, trying to optimise the running time as much as possible by setting up concurrent execution where possible.
In general the integration takes care of these steps:

- uploading your local targets metadata (_targets/meta) to a Cloud Storage bucket
- running the pipeline targets as Cloud Build buildsteps
- downloading the results back to your local session

You will need to create an R based Dockerfile to run the target functions within, including at least targets and any other libraries you need for your functions to execute. By default it will use gcr.io/gcer-public/targets, which is an image containing targets only.
An overview of the targets functions is shown below:
The below example shows a minimal use case of running the pipeline immediately:
write.csv(mtcars, file = "mtcars.csv", row.names = FALSE)
targets::tar_script(
list(
targets::tar_target(file1,
"mtcars.csv", format = "file"),
targets::tar_target(input1,
read.csv(file1)),
targets::tar_target(result1,
sum(input1$mpg)),
targets::tar_target(result2,
mean(input1$mpg)),
targets::tar_target(result3,
max(input1$mpg)),
targets::tar_target(result4,
min(input1$mpg)),
targets::tar_target(merge1,
paste(result1, result2, result3, result4))
),
ask = FALSE)
bs <- cr_buildstep_targets_multi()
# run it immediately in cloud
cr_build_targets(bs, execute="now")
The target pipeline is represented by this DAG (generated by targets::tar_visnetwork()):
When you issue cr_buildstep_targets_multi(), the targets DAG is analysed and used to create the buildsteps:
bs <- cr_buildstep_targets_multi()
#ℹ 2022-01-20 15:26:56 > targets cloud location: gs://your-bucket/your-project
#ℹ 2022-01-20 15:26:56 > Resolving targets::tar_manifest()
#── # Building DAG: ───────────────────────────────────────────────────────────
#• [get previous _targets metadata] -> [file1]
#• [file1] -> [input1]
#• [input1] -> [result1]
#• [input1] -> [result2]
#• [input1] -> [result3]
#• [input1] -> [result4]
#• [result1, result2, result3, result4] -> [merge1]
#• [merge1] -> [ Upload Artifacts ]
The [get previous _targets metadata] buildstep is boilerplate generated by cr_buildstep_targets_setup(), whilst [ Upload Artifacts ] is the one generated by cr_buildstep_targets_teardown().
The DAG is created by populating the id and waitFor arguments within a cr_buildstep(). These buildsteps could be used directly in cr_build(), but you would need to handle uploading the data sources etc., which is what cr_build_targets() helps with.
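As a minimal sketch of the same mechanism in raw buildsteps (the image and ids are illustrative), the second step only runs once the first has finished because of its waitFor value:

cr_build_yaml(steps = c(
  cr_buildstep("ubuntu", args = c("echo", "first"), id = "first", prefix = ""),
  cr_buildstep("ubuntu", args = c("echo", "second"), id = "second",
               waitFor = "first", prefix = "")
))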
Passing these buildsteps into cr_build_targets() allows you to select execute="now", which will run the pipeline immediately. execute="trigger" will only create the YAML for you to use when constructing a build trigger.
cr_build_targets(bs, execute="now") will start off a build like the below example:
cr_build_targets(bs, execute="now")
# Uploading . to gs://your-bucket/.20220120152659.tar.gz ───────────────────────
── 2022-01-20 15:27:00 > Tarring files in tmpdir:
ℹ
• _pkgdown.yml
• _targets.R
• etc ...
...
- logs:
ℹ Cloud Build started <https://console.cloud.google.com/cloud-build/builds/17ce8d71-593c-44f0-8990-9a3192016b0f>
...
2022-01-20 15:28:42 > targets cloud location: gs://your-bucket//your-local
ℹ 2022-01-20 15:28:42 > Downloading to download_folder: /your-local/_targets
ℹ /your-local/_targets/buildtime.txt to _targets/buildtime.txt ( 29 bytes )
✓ Saved /your-local/_targets/meta/meta to _targets/meta/meta ( 1 Kb )
✓ Saved /your-local/_targets/meta/process to _targets/meta/process ( 56 bytes )
✓ Saved /your-local/_targets/meta/progress to _targets/meta/progress ( 271 bytes )
✓ Saved /your-local/_targets/objects/input1 to _targets/objects/input1 ( 861 bytes )
✓ Saved /your-local/_targets/objects/merge1 to _targets/objects/merge1 ( 77 bytes )
✓ Saved /your-local/_targets/objects/result1 to _targets/objects/result1 ( 52 bytes )
✓ Saved /your-local/_targets/objects/result2 to _targets/objects/result2 ( 52 bytes )
✓ Saved /your-local//_targets/objects/result3 to _targets/objects/result3 ( 52 bytes )
✓ Saved /your-local/_targets/objects/result4 to _targets/objects/result4 ( 52 bytes )
✓ Saved
# Built targets on Cloud Build with status: SUCCESS ─────────────────────────
── 2022-01-20 15:28:44 > Build artifacts downloaded to /your-local/_targets ℹ
Inspecting your local _targets metadata (via targets::tar_meta() or otherwise) should confirm that the steps have been updated by the Cloud Build run.
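For example, a quick local check (assuming the example pipeline above) that the results came back from the Cloud Build run:

targets::tar_meta()        # metadata now reflects the cloud run
targets::tar_read(merge1)  # read a result that was computed on Cloud Build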
As the targets are running within a Cloud Build, all the usual methods of triggering a build are available via Build Triggers. This includes executing in response to events such as a Cloud Storage bucket being updated with a file, a GitHub commit, a PubSub event or a scheduled PubSub call.
Since pipelines are normally waiting for Cloud Storage files or, say, BigQuery table uploads, the new PubSub triggers for Build Triggers are particularly convenient, since they can trigger on pretty much any event within GCP via Cloud Logging sinks.
The example below will use a GitHub source and trigger upon a scheduled PubSub event.
The Cloud Build will be the same as the example above, only this time we select execute="trigger":
cr_build_targets(bs, execute="trigger")
2022-01-20 16:18:28 > Writing to cloudbuild_targets.yaml ℹ
The cloudbuild_targets.yaml file will sit in the same GitHub repo as the target pipeline, and all of it should be committed to GitHub.
The PubSub topic can be an existing one, but for this example we create our own via googlePubsubR and schedule a hit to it every 24hrs at 04:15.
library(googlePubsubR)
topic <- topics_create("target_pubsub_schedule")
schedule_me <- cr_schedule_pubsub(topic)
cr_schedule("target_pubsub_schedule",
schedule = "15 4 * * *",
pubsubTarget = schedule_me)
Now that the PubSub topic is being triggered on a schedule, we create a Build Trigger that will respond to it.
The build trigger for a GitHub source is shown below - it uses the sourceToBuild parameter to specify which git repo to build upon:
# no regex allowed for sourceToBuild repo objects
gh <- cr_buildtrigger_repo("MarkEdmondson1234/googleCloudRunner", branch = "master")
pubsub_sub <- cr_buildtrigger_pubsub(topic)
cr_buildtrigger("cloudbuild_targets.yaml",
name = "targets-scheduled-demo",
sourceToBuild = gh,
trigger = pubsub_sub)
In summary, the GCP resources you now have should be a PubSub topic, a Cloud Scheduler job and a Build Trigger, which you can inspect via:

googlePubsubR::topics_get("target_pubsub_schedule")

cr_schedule_get("target_pubsub_schedule")
==CloudScheduleJob==
name: projects/your-project/locations/europe-west1/jobs/target_pubsub_schedule
state: ENABLED
pubsubTarget.topicName: projects/your-project/topics/target_pubsub_schedule
pubsubTarget.data: InByb2plY3RzL21hcmstZWRtb25kc29uLWdkZS90b3BpY3MvdGFyZ2V0X3B1YnN1Yl9zY2hlZHVsZSI=
pubsubTarget.data (unencoded): "projects/your-project/topics/target_pubsub_schedule"
userUpdateTime: 2022-01-20T15:55:28Z
schedule: 15 4 * * *
scheduleTime: 2022-01-26T03:15:00.867249Z
timeZone: Europe/Copenhagen
cr_buildtrigger_get("targets-scheduled-demo")
==CloudBuildTriggerResponse==
id: 8c4f3447-b718-4252-9b6a-45a504730de2
name: targets-scheduled-demo
createdTime: 2022-01-20T21:32:00.603354443Z
filename: cloudbuild_targets.yaml
sourceToBuild.uri: https://github.com/MarkEdmondson1234/googleCloudRunner
sourceToBuild.ref: refs/heads/master
sourceToBuild.repoType: GITHUB
==CloudBuildTriggerPubSubConfig==
topic: projects/your-project/topics/target_pubsub_schedule
subscription: projects/123434564/subscriptions/gcb-targets-scheduled-demo
You can modify each of these moving parts independently, for example swapping the PubSub topic for a Cloud Storage event that triggers many builds, or parametrising your PubSub trigger to trigger different behaviour. It is recommended for a first attempt to keep it simple, with a one-to-one relationship between builds, PubSub topics and triggers.
Check it is working via cr_buildtrigger_logs(), which downloads the logs of the last build for the given build trigger:

cr_buildtrigger_logs("targets-scheduled-demo")
1] "starting build \"b3a2c366-3280-4ecb-a7c3-9916233e2dc7\""
[2] ""
[
...96] "Step #1 - \"file1\": gcr.io/gcer-public/targets:latest"
[97] "Step #1 - \"file1\": ✔ skip target file1"
[98] "Step #1 - \"file1\": ✔ skip pipeline"
[99] "Finished Step #1 - \"file1\""
[100] "Starting Step #2 - \"input1\""
[101] "Step #2 - \"input1\": Already have image (with digest): gcr.io/gcer-public/targets"
[102] "Step #2 - \"input1\": ✔ skip target file1"
[103] "Step #2 - \"input1\": ✔ skip target input1"
[104] "Step #2 - \"input1\": ✔ skip pipeline"
[105] "Finished Step #2 - \"input1\""
[106] "Starting Step #3 - \"result1\""
[107] "Starting Step #6 - \"result4\""
[108] "Starting Step #5 - \"result3\""
[109] "Starting Step #4 - \"result2\""
[110] "Step #4 - \"result2\": Already have image (with digest): gcr.io/gcer-public/targets"
[111] "Step #6 - \"result4\": Already have image (with digest): gcr.io/gcer-public/targets"
[112] "Step #5 - \"result3\": Already have image (with digest): gcr.io/gcer-public/targets"
[113] "Step #3 - \"result1\": Already have image (with digest): gcr.io/gcer-public/targets"
[114] "Step #5 - \"result3\": ✔ skip target file1"
[115] "Step #5 - \"result3\": ✔ skip target input1"
[116] "Step #5 - \"result3\": ✔ skip target result3"
[117] "Step #5 - \"result3\": ✔ skip pipeline"
[118] "Finished Step #5 - \"result3\""
[119] "Step #3 - \"result1\": ✔ skip target file1"
[120] "Step #3 - \"result1\": ✔ skip target input1"
[121] "Step #3 - \"result1\": ✔ skip target result1"
[122] "Step #3 - \"result1\": ✔ skip pipeline"
[123] "Step #4 - \"result2\": ✔ skip target file1"
[124] "Step #4 - \"result2\": ✔ skip target input1"
[125] "Step #4 - \"result2\": ✔ skip target result2"
[126] "Step #4 - \"result2\": ✔ skip pipeline"
[127] "Finished Step #3 - \"result1\""
[128] "Step #6 - \"result4\": ✔ skip target file1"
[129] "Step #6 - \"result4\": ✔ skip target input1"
[130] "Step #6 - \"result4\": ✔ skip target result4"
[131] "Step #6 - \"result4\": ✔ skip pipeline"
[132] "Finished Step #4 - \"result2\""
[133] "Finished Step #6 - \"result4\""
[134] "Starting Step #7 - \"merge1\""
[135] "Step #7 - \"merge1\": Already have image (with digest): gcr.io/gcer-public/targets"
[136] "Step #7 - \"merge1\": ✔ skip target file1"
[137] "Step #7 - \"merge1\": ✔ skip target input1"
[138] "Step #7 - \"merge1\": ✔ skip target result1"
[139] "Step #7 - \"merge1\": ✔ skip target result2"
[140] "Step #7 - \"merge1\": ✔ skip target result3"
[141] "Step #7 - \"merge1\": ✔ skip target result4"
[142] "Step #7 - \"merge1\": ✔ skip target merge1"
[143] "Step #7 - \"merge1\": ✔ skip pipeline"
[144] "Finished Step #7 - \"merge1\""
[145] "Starting Step #8 - \"Upload Artifacts\""
[146] "Step #8 - \"Upload Artifacts\": Already have image (with digest): gcr.io/google.com/cloudsdktool/cloud-sdk:alpine"
[147] "Step #8 - \"Upload Artifacts\": Copying file://buildtime.txt [Content-Type=text/plain]..."
[148] "Step #8 - \"Upload Artifacts\": / [0 files][ 0.0 B/ 29.0 B] "
[149] "/ [1 files][ 29.0 B/ 29.0 B] "
[150] "Step #8 - \"Upload Artifacts\": Operation completed over 1 objects/29.0 B. "
[151] "Step #8 - \"Upload Artifacts\": Copying file:///workspace/_targets/buildtime.txt [Content-Type=text/plain]..."
[152] "Step #8 - \"Upload Artifacts\": / [0 files][ 0.0 B/ 29.0 B] "
[153] "Copying file:///workspace/_targets/meta/progress [Content-Type=application/octet-stream]..."
[154] "Step #8 - \"Upload Artifacts\": Copying file:///workspace/_targets/meta/meta [Content-Type=application/octet-stream]..."
[155] "Step #8 - \"Upload Artifacts\": / [0 files][ 0.0 B/ 273.0 B] "
[156] "/ [0 files][ 0.0 B/ 1.3 KiB] "
[157] "Copying file:///workspace/_targets/objects/result3 [Content-Type=application/octet-stream]..."
[158] "Step #8 - \"Upload Artifacts\": Copying file:///workspace/_targets/meta/process [Content-Type=application/octet-stream]..."
[159] "Step #8 - \"Upload Artifacts\": Copying file:///workspace/_targets/objects/result4 [Content-Type=application/octet-stream]..."
[160] "Step #8 - \"Upload Artifacts\": Copying file:///workspace/_targets/objects/input1 [Content-Type=application/octet-stream]..."
[161] "Step #8 - \"Upload Artifacts\": / [0 files][ 0.0 B/ 1.4 KiB] "
...
[200] "Finished Step #8 - \"Upload Artifacts\""
[201] "PUSH"
[202] "DONE"