The User Activity API lets you query an individual user’s movement through your website, by sending in the individual clientId or userId. It is accessed via the ga_clientid_activity() function.

User Activity is available in googleAnalyticsR >= 0.6.9000 and needs googleAuthR >= 0.7.0.9000, so install both from GitHub via:

remotes::install_github("MarkEdmondson1234/googleAuthR")
remotes::install_github("MarkEdmondson1234/googleAnalyticsR")

User Activity API example

You first need to have a clientId or userId to query. You can now get this from the reporting API via the dimension ga:clientId.

It's also available via the User Explorer report in the Web UI or via a BigQuery export, or you may be capturing the ID in a custom dimension.

Once you have an ID, specify the Google Analytics view that user was browsing and the date range of the activity you want to query.
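A single-ID query might look like the sketch below. It reuses the first client ID, the view ID and the date range from the multiple-ID example on this page, so substitute your own values. It assumes you have already authenticated in this session (for example with ga_auth()), and the result name a_user matches the object used in the unnesting examples further down.

```r
library(googleAnalyticsR)

# Assumes authentication has already been done in this session, e.g. via ga_auth()
# The client ID, viewId and dates are the example values used elsewhere on this page
a_user <- ga_clientid_activity("1106980347.1461227730",
                               viewId = 81416156,
                               date_range = c("2019-01-01", "2019-02-01"))
```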

Multiple IDs

You can send in multiple IDs of the same type in a vector:

two_clientIds <- c("1106980347.1461227730", "476443645.1541099566")
two_users <- ga_clientid_activity(two_clientIds,
                                  viewId = 81416156, 
                                  date_range = c("2019-01-01","2019-02-01"))

Return format

The API returns two types of data: session level and activity hit level. Access them via $sessions and $hits.

The activity data returned is rich, although some columns will be empty for rows where they do not apply.

The data.frames returned include the ID you sent in as the $id column so you can distinguish between users.

Nested columns

The output uses nested columns for some values so you may want to get familiar with the tidyr::unnest() function when working with the data.

The nested columns are hits$customDimension, hits$ecommerce and hits$goals.

The nesting is necessary because a single hit can carry several of these values, and expanding them in the response would produce a very large data.frame to work with.

An example on how to unnest goals is shown below:
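A minimal sketch of the idea, run on mock data rather than a real API response. It assumes that each entry of the goals column is either a data frame of goal completions or NULL; the mock_hits object and the column names inside goals are illustrative assumptions, so check them against your own a_user$hits before reusing the code.

```r
library(dplyr)
library(tidyr)

# Mock stand-in for a_user$hits: `goals` is a list-column holding one
# data frame of goal completions per hit, or NULL when the hit has none.
# The column names inside `goals` are illustrative assumptions.
mock_hits <- tibble(
  id = "1106980347.1461227730",
  sessionId = c("s1", "s1", "s2"),
  activityType = c("PAGEVIEW", "GOAL", "GOAL"),
  goals = list(
    NULL,
    tibble(goalIndex = 1L, goalName = "Signup"),
    tibble(goalIndex = c(1L, 2L), goalName = c("Signup", "Purchase"))
  )
)

goal_rows <- mock_hits %>%
  filter(activityType == "GOAL") %>%   # keep only goal hits
  select(id, sessionId, goals) %>%
  unnest(cols = goals)                 # one row per goal completion
```

On real data you would start from a_user$hits instead of mock_hits. By default unnest() drops rows whose nested entry is NULL, so the filter() mainly documents intent.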

To unnest custom dimensions, some example code is below:

library(tidyr) # pivot_wider() requires tidyr >= 1.0.0
library(purrr)
library(dplyr)

a_user$hits %>% 
  select(id, sessionId, activityTime, customDimension) %>% 
  # one row per custom dimension entry
  unnest(cols = customDimension) %>% 
  # pull the index and value out of each entry; missing values become NA
  mutate(cd_index = map_chr(customDimension, "index"), 
         cd_value = map_chr(customDimension, ~ .$value %||% NA_character_)) %>%
  filter(!is.na(cd_value)) %>%
  select(-customDimension) %>%
  distinct() %>%
  # one column per custom dimension index: customDim1, customDim2, ...
  pivot_wider(names_from = cd_index, values_from = cd_value, names_prefix = "customDim")

To unnest ecommerce and filter to only transactions, an example is shown below:

a_user$hits %>%
  filter(activityType == "ECOMMERCE") %>%
  select(id, sessionId, activityTime, ecommerce) %>%
  mutate(transaction = map(ecommerce, "transaction"),
         transactionRevenue = map_dbl(transaction, ~.[["transactionRevenue"]] %||% NA_real_),
         transactionId = map_chr(transaction, ~.[["transactionId"]] %||% NA_character_)) %>%
  filter(!is.na(transactionRevenue)) %>%
  select(-transaction, -ecommerce)

To get the traffic sources, you only need the first hit of each session.
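One way to pick out the first hit per session is sketched below on mock data. The mock_hits object is a hypothetical stand-in for a_user$hits, and the source and medium column names are assumptions about the response, so adjust them to the columns you actually receive.

```r
library(dplyr)

# Mock stand-in for a_user$hits; `source` and `medium` are assumed column names
mock_hits <- tibble(
  id = "1106980347.1461227730",
  sessionId = c("s1", "s1", "s2", "s2"),
  activityTime = as.POSIXct(c("2019-01-05 10:02:00", "2019-01-05 10:00:00",
                              "2019-01-09 18:30:00", "2019-01-09 18:31:00"),
                            tz = "UTC"),
  source = c("google", "google", "(direct)", "(direct)"),
  medium = c("organic", "organic", "(none)", "(none)")
)

session_sources <- mock_hits %>%
  group_by(id, sessionId) %>%
  # earliest hit in each session; with_ties = FALSE guarantees a single row
  slice_min(activityTime, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(id, sessionId, activityTime, source, medium)
```

slice_min() needs dplyr >= 1.0.0; on older versions, arrange(activityTime) followed by slice(1) within each group gives the same result.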

Filtering the response

If you specify the activity_type parameter, you can filter down the response to only the events you include in a vector.

The permitted types are: c("PAGEVIEW","SCREENVIEW","GOAL","ECOMMERCE","EVENT") - include some of these to specify which you would like to see.
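For instance, a call restricted to goal and ecommerce activity might look like the sketch below. It reuses the example IDs, view ID and date range from this page and assumes you have already authenticated in this session; the result name goals_and_sales is arbitrary.

```r
library(googleAnalyticsR)

# Assumes authentication has already been done in this session, e.g. via ga_auth()
# Only GOAL and ECOMMERCE activity is requested; other hit types are left out
goals_and_sales <- ga_clientid_activity(c("1106980347.1461227730", "476443645.1541099566"),
                                        viewId = 81416156,
                                        date_range = c("2019-01-01", "2019-02-01"),
                                        activity_type = c("GOAL", "ECOMMERCE"))
```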

Sampled response

The API response may be sampled; a message is shown if this happens. If it does, follow the advice in the API documentation, such as splitting the call into smaller date ranges.
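Splitting a date range can be sketched in base R as below. Both split_date_range() and fetch_in_chunks() are hypothetical helpers written for this illustration, not part of googleAnalyticsR, and the 14-day chunk size is an arbitrary choice.

```r
# Hypothetical helper: split [start, end] into consecutive chunks of `days` days,
# returned as a list of c(start, end) character vectors
split_date_range <- function(start, end, days = 7) {
  start <- as.Date(start)
  end <- as.Date(end)
  starts <- seq(start, end, by = days)     # chunk start dates
  ends <- pmin(starts + days - 1, end)     # chunk end dates, capped at `end`
  lapply(seq_along(starts), function(i) c(format(starts[i]), format(ends[i])))
}

chunks <- split_date_range("2019-01-01", "2019-02-01", days = 14)

# Hypothetical wrapper: one API call per chunk, returning a list of responses.
# Needs googleAnalyticsR installed and prior authentication when actually called.
fetch_in_chunks <- function(ids, viewId, chunks) {
  lapply(chunks, function(dr) {
    googleAnalyticsR::ga_clientid_activity(ids, viewId = viewId, date_range = dr)
  })
}
```

Each chunk costs one extra API call, so smaller chunks reduce the chance of sampling at the price of using more of your quota.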

Also bear in mind that each API call counts against your Analytics Reporting API v4 quota, which by default is 50,000 requests per day, so you won’t be able to fetch more user activity than that without increasing your API quota.