googleAuthR
https://app.iihnordic.dk/ga-effect/
searchConsoleR
googleAuthR
googleAnalyticsR
googleComputeEngineR (cloudyr)
bigQueryR (cloudyr)
googleCloudStorageR (cloudyr)
googleLanguageR (rOpenSci)
googleCloudRunner (NEW!)
Slack group to talk about the packages: #googleAuthRverse
data.frame objects
https://www.rocker-project.org/
Maintain useful R images
rocker/r-ver
rocker/rstudio
rocker/tidyverse
rocker/shiny
rocker/ml-gpu
FROM rocker/tidyverse
MAINTAINER Mark Edmondson (r@sunholo.com)
# install R package dependencies
RUN apt-get update && apt-get install -y \
    libssl-dev 
## Install packages from CRAN
RUN install2.r --error \ 
    -r 'http://cran.rstudio.com' \
    googleAuthR \ 
    googleComputeEngineR \ 
    googleAnalyticsR \ 
    searchConsoleR \ 
    googleCloudStorageR \
    bigQueryR \ 
    ## install Github packages
    && installGithub.r MarkEdmondson1234/youtubeAnalyticsR
Flexible: no need to ask IT to install R in different places, just use docker run; works across cloud platforms; ascendant tech
Version controlled: no worries that new package releases will break code
Scalable: run multiple Docker containers at once; fits into an event-driven, stateless, serverless future
Good for R APIs
Pros
Auto-scaling
Scale from 0
Simple to deploy
https / authentication embedded
Cons
Needs stateless, idempotent workflows
Limited support for Shiny
Make an API out of your script:
#' Echo the parameter that was sent in
#' @param msg The message to echo back.
#' @get /echo
function(msg=""){
  list(msg = paste0("The message is: '", msg, "'"))
}
#' Plot out data from the iris dataset
#' @param spec If provided, filter the data to only this species (e.g. 'setosa')
#' @get /plot
#' @png
function(spec){
  myData <- iris
  title <- "All Species"
  # Filter if the species was specified
  if (!missing(spec)){
    title <- paste0("Only the '", spec, "' Species")
    myData <- subset(iris, Species == spec)
  }
  plot(myData$Sepal.Length, myData$Petal.Length,
       main=title, xlab="Sepal Length", ylab="Petal Length")
}
Based on:
FROM trestletech/plumber
COPY [".", "./"]
ENTRYPOINT ["R", "-e", "pr <- plumber::plumb(commandArgs()[4]); pr$run(host='0.0.0.0', port=as.numeric(Sys.getenv('PORT')))"]
CMD ["api.R"]
Can scale to a billion, and be available for other languages.
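The ENTRYPOINT above passes as.numeric(Sys.getenv('PORT')) to pr$run(). Platforms such as Cloud Run supply the port through the PORT environment variable, but when the same image is run somewhere that does not set it, as.numeric("") evaluates to NA. A minimal sketch of a fallback follows; the helper name get_port and the default of 8080 are assumptions for illustration and are not part of plumber or the original Dockerfile.

```r
# Hypothetical helper (not part of plumber): read the port from the PORT
# environment variable, falling back to a default when it is unset or invalid.
get_port <- function(default = 8080) {
  port <- suppressWarnings(as.integer(Sys.getenv("PORT", unset = "")))
  if (is.na(port)) default else port
}

# Possible usage with the API file from the Dockerfile above (needs plumber):
# pr <- plumber::plumb("api.R")
# pr$run(host = "0.0.0.0", port = get_port())
```

With a fallback like this the same script could be started locally with Rscript and in a container without changing the code.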
steps:
- name: gcr.io/gcer-public/gago:master
  args:
  - reports
  - --view=81416156
  - --dims=ga:date,ga:medium
  - --mets=ga:sessions
  - --start=2014-01-01
  - --end=2019-11-30
  - --antisample
  - --max=-1
  - -o=google_analytics.csv
  id: download google analytics
  dir: build
  env:
  - GAGO_AUTH=/workspace/auth.json
- name: gcr.io/cloud-builders/gsutil
  args:
  - cp
  - gs://mark-edmondson-public-read/polygot.Rmd
  - /workspace/build/polygot.Rmd
  id: download Rmd template
- name: gcr.io/gcer-public/packagetools:master
  args:
  - Rscript
  - -e
  - |-
    lapply(list.files('.', pattern = '.Rmd', full.names = TRUE),
                 rmarkdown::render, output_format = 'html_document')
  id: render rmd
  dir: build
Set up a build trigger for the GitHub repo you commit the Dockerfile to:
A 40-minute talk at Google Next '19 with lots of new things to try!
https://www.youtube.com/watch?v=XpNVixSN-Mg&feature=youtu.be
A great video that goes further into Spark clusters, Jupyter notebooks, training with ML Engine, and scaling with Seldon on Kubernetes, which I haven't tried yet
Use dplyr R code across datasets including BigQuery (from https://rpubs.com/shivanandiyer/BigRQuery)
library(bigrquery) # R Interface to Google BigQuery API  
library(dplyr) # Grammar for data manipulation  
library(DBI) # Interface definition to connect to databases 
bq_conn <-  dbConnect(bigquery(), 
                      project = "project-id",
                      dataset = "dataset-id", 
                      use_legacy_sql = FALSE)
                      
bq_table <- dplyr::tbl(bq_conn, "my-table")
See library(googleCloudRunner) for the latest thinking: https://code.markedmondson.me/googleCloudRunner/