BigQuery + GA4: Output Pages Report

Remember from Universal Analytics the output reports? 

It was a default report that was within the behavioral heading under Content or Site content in which this table

was displayed.

It was a report of the main exit pages of our website, it is the report that is on the other side of the report of entry pages or landing pages. 

It was a report of the main exit pages of our website, it is the report that is on the other side of the report of entry pages or landing pages.

In the report as its name indicated it showed the main urls by which users leave our website.

UA - Exit pagesThe purpose of this query was to show the main urls by which users leave our website.

The purpose of this query is to identify which are the most common exit pages on a website, which can provide valuable information about user behavior, potential friction points in website design or user flow, and opportunities to improve user retention or optimize conversions.

We also provide you with a "UA - Exit pages" alt.

We also provide a description of what the query does in case you want to know, but you can skip the explanation and copy and paste the query directly.

This BigQuery query is designed to identify the exit pages of user sessions on a website using Google Analytics 4 data exported to BigQuery. Here's the breakdown of what the query does:

  • Subquery "prep":
      • Select user identifiers ('user_pseudo_id'), session identifiers ('session_id'), and the URLs of the pages ('page') where a page view event ('page_view') has occurred.
      • Pick up the 'event_timestamp' for each page view event.
      • Filters to include only events that are pageviews (event_name = 'page_view').
      • Sort the results by 'event_timestamp' to get a sequence of events by time.
  • Sub-query "prep_exit":
      • Use the temporary table 'prep' to select the same fields.
      • Apply the window function 'FIRST_VALUE' to get the last page viewed ('exit_page') in each session per user, sorting the events by 'event_timestamp' in a top-down manner, which means it gets the page with the last timestamp as the exit page.
  • Main inquiry:
    • Select the page ('exit_page') only if it matches the last page viewed in the session (indicating that it is the exit page).
    • Count the number of unique sessions ('exits') that ended at each specific exit page, using 'COUNT(DISTINCT ..') on the concatenation of 'user_pseudo_id' and 'session_id'..
    • Group the results by 'exit_page'..
    • Filters to include only those rows where 'exit_page' is not null, ensuring only exit pages are counted.
    • Sort the results by the number of exits in descending order.

We are going to pass you the query you have to use in Bigquery on your GA4 dataset or dataset of your website in order to see this type of report.

Remember that to do this you don't need to know Bigquery, just copy and paste what follows.


WITH prep AS (
  SELECT
    user_pseudo_id,
    (SELECT value.int_value FROM UNNEST(event_params) WHERE event_name = 'page_view'AND KEY = 'ga_session_id') AS session_id,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE event_name = 'page_view'AND KEY = 'page_location') AS page,
    event_timestamp
  FROM
    -- In events_2023* you can put a specific date: _20231001 (October 1, 2023), _202310* (all of October 2023), _2023* (all of October 2023), _2023* (all of 2023 so far)....
    `<project>.<dataset>.events_2023*``
  WHERE
    event_name = 'page_view'
  ORDER BY
    event_timestamp
),
  prep_exit AS(
  SELECT
    user_pseudo_id,
    session_id,
    page,
    event_timestamp,
    FIRST_VALUE(CONCAT(page, event_timestamp)) OVER (PARTITION BY user_pseudo_id, session_id ORDER BY event_timestamp DESC) AS exit_page
  FROM
    rep
  ORDER BY
    event_timestamp
  )
SELECT
  CASE
    WHEN concat (page, event_timestamp) = exit_page THEN page
  ELSE
  NULL
END
  AS exit_page,
  COUNT(DISTINCT CONCAT(user_pseudo_id,session_id)) AS exits
FROM
  prep_exit
GROUP BY
  exit_page
HAVING
  exit_page IS NOT NULL
ORDER BY
  exits desc

If you want a "closed" date range, for example from August 8 to September 23. 

Take this snippet from the query above: 


FROM
  -- Here you put the name of your GA4 dataset.In events_2023* you can put a specific date: _20231001 (October 1, 2023), _202310* (all of October 2023), _2023* (all of 2023 so far)....
  `<project>.<dataset>.events_2023*``
WHERE
  event_name = 'page_view'

And you replace it with: 


FROM
  `<project>.<dataset>.events_*``
WHERE
  event_name = 'page_view'
  AND _TABLE_SUFFIX BETWEEN '20230808'
  AND '20230923'

And that's it!

When you run the query you would get a table like this:

BQ - Query Results

In the table we can already see the main exit pages of our website and the number of times that users have exited through them in the selected period.

This is the end of today's article. I hope it will be useful and helpful.

ANTERIOR
SIGUIENTE

TIPS DE EXPERTOS

Suscríbete para impulsar tu negocio.

ÚLTIMOS ARTÍCULOS

Group your data like a pro: clustering with K-Means and BigQuery ML

Working with large volumes of marketing data—whether it’s web traffic, keywords, users, or campaigns—can feel overwhelming. These data sets often aren’t organized or categorized in a useful way, and facing them can feel like trying to understand a conversation in an unfamiliar language.

But what if you could automatically discover patterns and create data groups—without manual rules, endless scripts, or leaving your BigQuery analysis environment?

That’s exactly what K-Means with BigQuery ML allows you to do.

What is K-Means and why should you care?

K-Means is a clustering algorithm—a technique for grouping similar items. Imagine you have a table with thousands of URLs, users, or products. Instead of going through each one manually, K-Means can automatically find groups with common patterns: pages with similar performance, campaigns with similar outcomes, or users with shared behaviors.

And the best part? With BigQuery ML, you can apply K-Means using plain SQL—no need for Python scripts or external tools.

How does it actually work?

The process behind K-Means is surprisingly simple:

  1. You choose how many groups you want (the well-known “K”).

  2. The algorithm picks initial points called centroids.

  3. Each row in your data is assigned to the nearest centroid.

  4. The centroids are recalculated using the assigned data.

  5. This process repeats until the groups stabilize.

The result? Every row in your table is tagged with the cluster it belongs to. Now you can analyze the patterns of each group and make better-informed decisions.

How to apply it in BigQuery ML

BigQuery ML simplifies the entire process. With just a few lines of SQL, you can:

  • Train a K-Means model on your data

  • Retrieve the generated centroids

  • Classify each row with its corresponding cluster

This opens up a wide range of possibilities to enrich your dashboards and marketing analysis:

  • Group pages by performance (visits, conversions, revenue)

  • Detect behaviors of returning, new, or inactive users

  • Identify products often bought together or with similar buyer profiles

  • Spot keywords with unusual performance

How many clusters do I need?

Choosing the right number of clusters (“K”) is critical. Here are a few strategies:

  • Business knowledge: If you already know you have 3 customer types or 4 product categories, start there.

  • Elbow Method: Run models with different K values and watch for the point where segmentation no longer improves significantly.

  • Iterate thoughtfully: Test, review, and adjust based on how your data behaves.

Real-world examples

With K-Means in BigQuery, you can answer questions like:

  • What types of users visit my site, and how do they differ?

  • Which pages show similar performance trends?

  • Which campaigns are generating outlier results?

Grouping data this way not only saves time—it reveals opportunities and issues that might otherwise go unnoticed.

Conclusion

If you're handling large data sets and need to identify patterns fast, clustering with K-Means and BigQuery ML can be a game-changer. You don’t need to be a data scientist or build complex solutions from scratch. You just need to understand your business and ask the right questions—BigQuery can handle the rest.

Start simple: take your top-performing pages, group them by sessions and conversions, and see what patterns emerge. You might uncover insights that completely shift how you approach your digital strategy.

 

Claude 4.0: Advances and Challenges in Conversational AI

Artificial Intelligence (AI) continues to progress at an accelerated pace, and Claude 4.0, developed by Anthropic, marks a major milestone in this journey. This next-generation language model stands out for its ability to comprehend complex contexts, deliver accurate responses, and adapt to a wide range of business needs.

AlphaEvolve: The new coding agent powered by Gemini

In a world where technology advances at unprecedented speed, artificial intelligence has emerged as a key driver of transformation. Among the most promising innovations today is AlphaEvolve, an evolutionary coding agent that combines the creative power of large language models (LLMs) with automated evaluators, opening new frontiers in software development, algorithm optimization, and solving complex problems in mathematics and computing.

How AI Is Revolutionizing Design and Development

At its Config 2025 event, Figma made it clear: the future of digital design will be deeply shaped by artificial intelligence. Beyond announcing new features, the company highlighted a paradigm shift — design is no longer a standalone process, but the core that connects creativity, technology, and product development.

data
Mallorca 184, 08036
Barcelona, Spain