- I'm Mark Edmondson (@HoloMarkeD)
- Englishman living in Copenhagen since 2010
- Data Insight Developer for IIH Nordic
- Google Developer Expert for Google Analytics
- RStudio Shiny Advocate
EARL London - 13th-15th September 2016
Source: oreilly.com
One reason is that Google Chrome uses prefetch.
The prefetch tag makes Chrome (and websites) faster by loading the next page before the user asks for it.
<link rel="dns-prefetch" href="//widget.com">
<link rel="preconnect" href="//cdn.example.com">
<link rel="prefetch" href="//example.com/next-page.html">
<link rel="prerender" href="//example.com/thankyou.html">
We can use it too, by inserting the hint dynamically:
var hint = document.createElement("link");
hint.setAttribute("rel", "prerender");
hint.setAttribute("href", "next-page.html");
document.getElementsByTagName("head")[0].appendChild(hint);
Can we predict quickly enough to add it to the page dynamically before the user clicks?
OpenCPU and Google Tag Manager
Extracting data per user using googleAnalyticsR
library(googleAnalyticsR)
ga_auth()

gaId <- xxxx # Your View ID

## In this case, dimension3 contains userId in format:
## u={cid}&t={hit-timestamp}
raw <- google_analytics_4(gaId,
                          date_range = c("2016-02-01","2016-02-01"),
                          metrics = c("pageviews"),
                          dimensions = c("dimension3", "pagePath"),
                          max = -1)
Or, if you have Google Analytics 360, extract the data from BigQuery via google_analytics_bq()
dimension3 | pagePath | pageviews |
---|---|---|
u=100116318.1454322382&t=1454322382033 | /example/809 | 1 |
u=100116318.1454322382&t=1454322412130 | /example/1212 | 1 |
u=100116318.1454322382&t=1454322431492 | /example/339 | 1 |
u=100116318.1454322382&t=1454322441120 | /example/1494 | 1 |
u=100116318.1454322382&t=1454322450156 | /example/339 | 1 |
u=100116318.1454322382&t=1454322461871 | /example/1703 | 1 |
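
The raw export packs the client ID and the hit timestamp into a single custom dimension, so the two have to be separated before a user's pageviews can be put in order. The slides do not show that processing step (it was presumably done in R), so the following is only a minimal JavaScript sketch of the split. It assumes the `u={cid}&t={hit-timestamp}` format shown above and a timestamp in milliseconds since the epoch; `parseUserHit` is a hypothetical name.

```javascript
// Hypothetical helper: split a "u={cid}&t={hit-timestamp}" value into its parts.
function parseUserHit(dimension3) {
  const parts = {};
  for (const pair of dimension3.split("&")) {
    const idx = pair.indexOf("=");
    if (idx > 0) parts[pair.slice(0, idx)] = pair.slice(idx + 1);
  }
  return {
    // Client ID, kept as a string so the dot-separated form is preserved.
    cid: parts.u,
    // Hit time; assumed to be milliseconds since the Unix epoch.
    timestamp: new Date(Number(parts.t)),
  };
}

const hit = parseUserHit("u=100116318.1454322382&t=1454322382033");
console.log(hit.cid, hit.timestamp.toISOString());
```
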
cid | sessionLen | timestamp | pagePath | pageviews |
---|---|---|---|---|
1005103157.1454327958 | 2 | 2016-02-01 12:59:18 | /example/1 | 1 |
1005103157.1454327958 | 2 | 2016-02-01 13:02:42 | /example/155 | 1 |
1010303050.1454327644 | 2 | 2016-02-01 12:54:03 | /example/144 | 1 |
1010303050.1454327644 | 2 | 2016-02-01 13:00:03 | /example/80 | 1 |
1011007665.1454333263 | 2 | 2016-02-01 14:27:43 | /example/1359 | 1 |
Our model library, markovchain, needs a vector of sequential pageviews per userId.
library(dplyr)
library(tidyr)

## for each cid, split pagePath in timestamp order
sequenceVD <- processed %>%
  select(cid, timestamp, pagePath) %>%
  group_by(cid) %>%
  arrange(timestamp) %>%
  distinct(pagePath) %>%
  mutate(step = row_number(), n = n()) %>%
  arrange(cid) %>%
  filter(n > 1) %>%
  select(-n) %>%
  spread(step, pagePath)
cid | 1 | 2 | 3 | 4 |
---|---|---|---|---|
1000641120.1465683551 | /da/a-z/6236/2665 | /da/a-z/6236/2670 | NA | NA |
1001334948.1469706364 | /da/a-z/6236/2660 | /da/a-z/6236 | NA | NA |
1003589990.1471286236 | /da/a-z/6236/2707 | /da/a-z/6236/2660 | NA | NA |
1003723352.1470269948 | /da/a-z/6236/2707 | /da/a-z/6236/2660 | NA | NA |
1004139521.1469437411 | /da/a-z/6236/2660 | /da/a-z/6236/2707 | NA | NA |
1004647640.1468402554 | /da/a-z/6236/2678 | /da/a-z/6236/2714 | /da/a-z/6236/2670 | NA |
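
The reshaping above (order by timestamp, keep the first visit to each page, drop single-page users) can be sketched outside R as well. The JavaScript below is an illustration under assumptions, not the code used for the talk: `buildSequences` is a hypothetical helper, the rows are made-up sample data, and timestamps are plain numbers for brevity.

```javascript
// Hypothetical helper: turn hit rows into one ordered page sequence per client ID.
function buildSequences(rows) {
  const byCid = new Map();
  // Sort a copy by timestamp so each user's pages are appended in visit order.
  const sorted = [...rows].sort((a, b) => a.timestamp - b.timestamp);
  for (const { cid, pagePath } of sorted) {
    if (!byCid.has(cid)) byCid.set(cid, []);
    const seq = byCid.get(cid);
    // Keep only the first visit to each page, mirroring distinct(pagePath).
    if (!seq.includes(pagePath)) seq.push(pagePath);
  }
  // Drop users who saw a single page, mirroring filter(n > 1).
  return new Map([...byCid].filter(([, seq]) => seq.length > 1));
}

// Made-up sample rows (not data from the talk).
const rows = [
  { cid: "a", timestamp: 3, pagePath: "/example/339" },
  { cid: "a", timestamp: 1, pagePath: "/example/809" },
  { cid: "a", timestamp: 2, pagePath: "/example/1212" },
  { cid: "a", timestamp: 4, pagePath: "/example/1212" }, // repeat visit, ignored
  { cid: "b", timestamp: 1, pagePath: "/example/1" },    // single page, dropped
];

const sequences = buildSequences(rows);
console.log(sequences.get("a"));
```
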
Create a Markov chain model of first order
library(markovchain)

model <- markovchainListFit(sequenceVD[,2:10], name = "seq")

## save model for use on OpenCPU
save(model, file="./data/model.rda")
Now that we have built the model object, we can make predictions.
library(markovchain)

## make predictions
predict(model$estimate, newdata = "/da/a-z/6236/2665")

## prediction output
## Sequence: /example/251
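
To make the idea behind a first-order model concrete: the next page is predicted from the current page alone, using how often each transition was observed. The JavaScript below is a simplified sketch of that idea, not what the markovchain package does internally. In particular, it pools every step into one transition table, whereas markovchainListFit fits a separate chain per step. `fitTransitions` and `predictNext` are hypothetical names and the sequences are made-up.

```javascript
// Count page-to-page transitions across all sequences (first order:
// the next page depends only on the current page).
function fitTransitions(sequences) {
  const counts = new Map();
  for (const seq of sequences) {
    for (let i = 0; i + 1 < seq.length; i++) {
      const from = seq[i];
      const to = seq[i + 1];
      if (!counts.has(from)) counts.set(from, new Map());
      const row = counts.get(from);
      row.set(to, (row.get(to) || 0) + 1);
    }
  }
  return counts;
}

// Return the most frequently observed next page, or "None" for an unseen page.
function predictNext(counts, currentPage) {
  const row = counts.get(currentPage);
  if (!row) return "None";
  let best = "None";
  let bestCount = 0;
  for (const [page, n] of row) {
    if (n > bestCount) {
      best = page;
      bestCount = n;
    }
  }
  return best;
}

// Made-up sequences: "/a" is followed by "/b" twice and by "/c" once.
const counts = fitTransitions([
  ["/a", "/b", "/c"],
  ["/a", "/b"],
  ["/a", "/c"],
]);
console.log(predictNext(counts, "/a")); // "/b"
```

Returning "None" for an unseen page mirrors the fallback used in the prediction function served from OpenCPU.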
OpenCPU supports GitHub webhooks, so the model is updated every time you push to GitHub
Create a small custom package with the model data and the function to predict pageviews
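predictNextPage <- function(current_url){
  ## model is the object saved in model.rda; as in the prediction
  ## example, the fitted chain is in its $estimate element
  out <- try(predict(model$estimate, newdata = current_url), silent = TRUE)
  ## fall back to "None" if the prediction fails
  if(inherits(out, "try-error")){
    out <- "None"
  }
  out
}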
An example library using the predict function and the model data is here:
https://github.com/MarkEdmondson1234/predictClickOpenCPU
This library is called on the OpenCPU server from Google Tag Manager
Test your OpenCPU calls here: https://public.opencpu.org/ocpu/test/
Google Tag Manager lets you deploy JavaScript to a website without needing IT resources.
Log in via a web interface: https://www.google.com/analytics/tag-manager/
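
Putting the pieces together, a tag deployed through Google Tag Manager could ask OpenCPU for the predicted next page and then add a prerender hint for it. The tag used in the talk is not shown in these slides, so this is a rough sketch under assumptions. The server hostname is a placeholder. The endpoint follows OpenCPU's `/ocpu/library/{package}/R/{function}/json` pattern, with the package and function names taken from the example above. `fetchPrediction` and `addPrerenderHint` are hypothetical names.

```javascript
// Placeholder URL: replace the host with your own OpenCPU server.
const OCPU_URL =
  "https://your-opencpu-server.example/ocpu/library/predictClickOpenCPU/R/predictNextPage/json";

// Ask the server for the predicted next page for the given path.
// fetchFn is injectable so the function can be exercised without a network.
async function fetchPrediction(currentPath, fetchFn = fetch) {
  const response = await fetchFn(OCPU_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ current_url: currentPath }),
  });
  if (!response.ok) return null;
  const result = await response.json();
  // The /json endpoint returns R vectors as JSON arrays, so take the first element.
  const page = Array.isArray(result) ? result[0] : result;
  // "None" is the fallback value of predictNextPage; treat it as no prediction.
  return page && page !== "None" ? page : null;
}

// Add a prerender hint for the predicted page to the document head.
function addPrerenderHint(doc, href) {
  const hint = doc.createElement("link");
  hint.setAttribute("rel", "prerender");
  hint.setAttribute("href", href);
  doc.getElementsByTagName("head")[0].appendChild(hint);
  return hint;
}

// In a browser (for example inside a GTM Custom HTML tag) this could run on page view:
// fetchPrediction(location.pathname).then(function (page) {
//   if (page) addPrerenderHint(document, page);
// });
```

Keeping the network call and the DOM change in separate functions makes each easy to test on its own. Depending on the browsers you support, the tag may need to be written in older JavaScript syntax.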
http://code.markedmondson.me/predictClickOpenCPU/example/107.html
In Chrome, go to chrome://net-internals#prerender to inspect prerender activity
Questions?