BigQuery + GA4: Page Navigation Report

If you've been working with GA4 for some time, you may have noticed that certain dimensions and metrics that were available in Universal Analytics are not present in GA4. For example, the Navigation report, where we could select a URL from our website and it would show the previous and next page paths in percentages, is no longer available:

 

UA - Exit pages-1

Neither of the two dimensions you see — Previous Page Path and Next Page — exist in GA4. However, the navigation report that existed in Universal Analytics was always very useful when we wanted to understand general behavior during navigation, focusing on specific content. It helps to understand user flow and improve the navigation experience or content strategy for the website.

In this post, we will explain how to generate this report from BigQuery.

Don't be scared! If you haven’t yet ventured into BigQuery, even though you already have the connection between GA4 and BigQuery set up, don’t worry, we will make it easy for you. You’ll only need to copy and paste the query we provide and make a few adjustments.

We’ll explain what each part of the function does, in case you're interested. But if not, just scroll down to the query section and copy it.

This BigQuery query uses data from Google Analytics 4 to analyze user navigation on a website, specifically around a given page URL. It is done in several stages:

"prep" subquery:

  • Selects 'user_pseudo_id' (an anonymous user identifier), 'session_id' (user session identifier), 'page' (URL of the page viewed), and 'event_timestamp' (the timestamp when the page view occurred).
  • Filters to include only page view events (event_name = 'page_view').
  • The data is extracted from the event tables for a specific date range in the Google Analytics 4 dataset (in this case, for the year 2023).

"prep_navigation" subquery:

  • Uses the 'prep' temporary table to retrieve a sequence of pages visited by each user and session.
  • Applies LAG and LEAD window functions to get the previous ('previous_page') and next ('next_page') pages, respectively, for each page view, ordered by 'event_timestamp' in ascending order. This is partitioned by 'user_pseudo_id' and 'session_id', meaning the page sequence is specific to each user's session.

Main query:

  • Replaces null previous and next pages with '(entrance)' and '(exit)' respectively, indicating that if there is no previous page, the page is the entry point, and if there is no next page, it is the exit.
  • Counts the number of unique sessions ('count') where a specific page has been visited, using 'COUNT(DISTINCT ...)' on the concatenation of 'user_pseudo_id' and 'session_id'.
  • Filters to show data only for the specific URL of the page you want to analyze.
  • Groups the results by 'previous_page', 'page', and 'next_page'.
  • Filters to ensure that the page is not the same as the previous or next page to avoid self-references.
  • Orders the results by 'count' in descending order.

Here’s the query you need:

WITH
  prep AS (
  SELECT
    user_pseudo_id,
    (
    SELECT
      value.int_value
    FROM
      UNNEST(event_params)
    WHERE
      event_name = 'page_view'
      AND KEY = 'ga_session_id') AS session_id,
    (
    SELECT
      value.string_value
    FROM
      UNNEST(event_params)
    WHERE
      event_name = 'page_view'
      AND KEY = 'page_location') AS page,
    event_timestamp
  FROM
    -- Aquí pon el nombre de tu conjunto de datos de GA4.En events_2023* puedes poner una fecha concreta: _20231001 (1 de octubre de 2023), _202310* (todo octubre de 2023), _2023* (todo lo que llevamos de 2023)...
    `<project>.<dataset>.events_2023*`
  WHERE
    event_name = 'page_view'
  ),
  prep_navigation AS (
  SELECT
    user_pseudo_id,
    session_id,
    LAG(page,1) OVER (PARTITION BY user_pseudo_id, session_id ORDER BY event_timestamp ASC)AS previous_page,
    page,
    LEAD(page,1) OVER (PARTITION BY user_pseudo_id, session_id ORDER BY event_timestamp ASC)AS next_page,
    event_timestamp
  FROM
    prep
  )
SELECT
  IFNULL(previous_page,'(entrance)') AS previous_page,
  page,
  IFNULL(next_page,'(exit)') AS next_page,
  COUNT(DISTINCT CONCAT(user_pseudo_id,session_id)) AS count
FROM
  prep_navigation
WHERE
  -- Copia y pega abajo la url de la página que quieres consultar.
  page = "https://www.hikeproject.com/como-visualizar-porcentajes-en-un-scorecard-de-data-studio/"
GROUP BY
  previous_page,
  page,
  next_page
HAVING
  page != previous_page
  AND page != next_page
ORDER BY
  count desc

To query for a specific date range, for example, from September 2nd to October 15th, you need to replace the query snippet:

sql
 

FROM
  -- Aquí pon el nombre de tu conjunto de datos de GA4.En events_2023* puedes poner una fecha concreta: _20231001 (1 de octubre de 2023), _202310* (todo octubre de 2023), _2023* (todo lo que llevamos de 2023)...
  `<project>.<dataset>.events_2023*`
WHERE
  event_name = 'page_view'),

Y lo sustituyes por: 


FROM
  `<project>.<dataset>.events_*`
WHERE
  event_name = 'page_view'
  AND _TABLE_SUFFIX BETWEEN '20230902'
  AND '20231015'),

And that’s it!

Once you run the query, you'll get a table like this:

(img: BQ - Query Results)

BQ - Query Results

The central column corresponds to the selected URL for analysis (Page), the left column shows the previous page, and the right column shows the next page.

This way, you can see how the post:

https://www.hikeproject.com/como-visualizar-porcentajes-en-un-scorecard-de-data-studio/

performed between September 2nd and October 15th:

With this type of report, you can not only analyze specific content on your website but also analyze user behavior in a process or task. For example, on a flight booking page, you can analyze what users do after performing a search, what percentage return to the homepage for another search, and what percentage move on to the next screen to choose a fare. It’s also very useful for analyzing the homepage of the website, especially for sites with multiple objectives on their homepage. Returning to the example of a flight booking page: searching for flights, checking in for a flight, finding information about an already purchased flight, or contacting for an issue or question.

Did you use the navigation reports in Universal Analytics? Were you missing them in GA4?

Hope this helps! Thanks for visiting our blog and using it as a source of knowledge for your daily learning.

ANTERIOR
SIGUIENTE

TIPS DE EXPERTOS

Suscríbete para impulsar tu negocio.

ÚLTIMOS ARTÍCULOS

Group your data like a pro: clustering with K-Means and BigQuery ML

Working with large volumes of marketing data—whether it’s web traffic, keywords, users, or campaigns—can feel overwhelming. These data sets often aren’t organized or categorized in a useful way, and facing them can feel like trying to understand a conversation in an unfamiliar language.

But what if you could automatically discover patterns and create data groups—without manual rules, endless scripts, or leaving your BigQuery analysis environment?

That’s exactly what K-Means with BigQuery ML allows you to do.

What is K-Means and why should you care?

K-Means is a clustering algorithm—a technique for grouping similar items. Imagine you have a table with thousands of URLs, users, or products. Instead of going through each one manually, K-Means can automatically find groups with common patterns: pages with similar performance, campaigns with similar outcomes, or users with shared behaviors.

And the best part? With BigQuery ML, you can apply K-Means using plain SQL—no need for Python scripts or external tools.

How does it actually work?

The process behind K-Means is surprisingly simple:

  1. You choose how many groups you want (the well-known “K”).

  2. The algorithm picks initial points called centroids.

  3. Each row in your data is assigned to the nearest centroid.

  4. The centroids are recalculated using the assigned data.

  5. This process repeats until the groups stabilize.

The result? Every row in your table is tagged with the cluster it belongs to. Now you can analyze the patterns of each group and make better-informed decisions.

How to apply it in BigQuery ML

BigQuery ML simplifies the entire process. With just a few lines of SQL, you can:

  • Train a K-Means model on your data

  • Retrieve the generated centroids

  • Classify each row with its corresponding cluster

This opens up a wide range of possibilities to enrich your dashboards and marketing analysis:

  • Group pages by performance (visits, conversions, revenue)

  • Detect behaviors of returning, new, or inactive users

  • Identify products often bought together or with similar buyer profiles

  • Spot keywords with unusual performance

How many clusters do I need?

Choosing the right number of clusters (“K”) is critical. Here are a few strategies:

  • Business knowledge: If you already know you have 3 customer types or 4 product categories, start there.

  • Elbow Method: Run models with different K values and watch for the point where segmentation no longer improves significantly.

  • Iterate thoughtfully: Test, review, and adjust based on how your data behaves.

Real-world examples

With K-Means in BigQuery, you can answer questions like:

  • What types of users visit my site, and how do they differ?

  • Which pages show similar performance trends?

  • Which campaigns are generating outlier results?

Grouping data this way not only saves time—it reveals opportunities and issues that might otherwise go unnoticed.

Conclusion

If you're handling large data sets and need to identify patterns fast, clustering with K-Means and BigQuery ML can be a game-changer. You don’t need to be a data scientist or build complex solutions from scratch. You just need to understand your business and ask the right questions—BigQuery can handle the rest.

Start simple: take your top-performing pages, group them by sessions and conversions, and see what patterns emerge. You might uncover insights that completely shift how you approach your digital strategy.

 

Claude 4.0: Advances and Challenges in Conversational AI

Artificial Intelligence (AI) continues to progress at an accelerated pace, and Claude 4.0, developed by Anthropic, marks a major milestone in this journey. This next-generation language model stands out for its ability to comprehend complex contexts, deliver accurate responses, and adapt to a wide range of business needs.

AlphaEvolve: The new coding agent powered by Gemini

In a world where technology advances at unprecedented speed, artificial intelligence has emerged as a key driver of transformation. Among the most promising innovations today is AlphaEvolve, an evolutionary coding agent that combines the creative power of large language models (LLMs) with automated evaluators, opening new frontiers in software development, algorithm optimization, and solving complex problems in mathematics and computing.

How AI Is Revolutionizing Design and Development

At its Config 2025 event, Figma made it clear: the future of digital design will be deeply shaped by artificial intelligence. Beyond announcing new features, the company highlighted a paradigm shift — design is no longer a standalone process, but the core that connects creativity, technology, and product development.

data
Mallorca 184, 08036
Barcelona, Spain