BigQuery + GA4: Page Navigation Report

If you've been messing around with GA4, you will have already noticed that there are certain dimensions and metrics that were in Universal Analytics, and that are not in GA4, for example, the Navigation report in which we chose a url of our website and indicated in percentages the path of the previous page and the path of the next page:

UA - Navigation Summary

None of the two dimensions you see: Previous Page Path and Next Page exist for GA4. However, this navigation report that existed for Universal Analytics has always been very useful when we want to know the general behavior during navigation focusing on a specific content. It serves to understand the user flow and improve the browsing experience or the content strategy of the website. 

In this post we are going to explain how to get this report from BigQuery

We don't want to scare you, if you have not yet thought of entering BigQuery to make any query even though you have already made the connection between GA4 and BigQuery, don't run away because in this post we are going to make it easy for you, you will only have to copy and paste the query that we are going to give you and change a number of things. 

Yes we are going to explain what each part of the function does in case you are interested, but if you are not just go down to where the query is and copy it. 

This BigQuery query uses Google Analytics 4 data to analyze user navigation on a website, specifically around a given page URL. It is performed in several stages:

1. "prep" sub-query:


  • Select 'user_pseudo_id' (an anonymous identifier for the user), 'session_id' (a user's session identifier), 'page' (the URL of the page viewed), and 'event_timestamp' (the time at which the page view occurred).
  • Filters to include only events that are pageviews (event_name = 'page_view').
  • Data is extracted from the event tables for a specific date range of the Google Analytics 4 dataset (in this case, for the year 2023).


2. "prep_navigation" sub-query:


  • Uses the temporary table 'prep' to get a sequence of pages visited by each user and session.
  • Apply the window functions LAG and LEAD to get the previous page ('previous_page') and next page ('next_page') respectively, for each page view, sorted by 'event_timestamp' in ascending order. This is partitioned by 'user_pseudo_id' and 'session_id', which means that the sequence of pages is specific to each session of a user.

3. Main query:

  • Replace null previous and next pages with '(entrance)' and '(exit)' respectively, which indicates that if there is no previous page, the page in question is the entrance to the site, and if there is no next page, it is the exit.
  • Count the number of unique sessions ('count') where the specific page has been visited, using 'COUNT(DISTINCT ..)' on the concatenation of 'user_pseudo_id' and 'session_id'.
  • Filter to display information only for the URL of the specific page you want to query.
  • .
  • Aggregate the results by 'previous_page', 'page' and 'next_page'.
  • Filter to ensure that the page is not equal to 'previous_page' or 'next_page' to not count self-references.
  • Sort the results by 'count' in descending order.

Attention. This is the query you need:


with prep as (


select

    user_pseudo_id,

   (select value.int_value from unnest(event_params) where event_name = 'page_view' and key = 'ga_session_id') as session_id,


   (select value.string_value from unnest(event_params) where event_name = 'page_view' and key = 'page_location') as page,


   event_timestamp

from

-- Here you put the name of your GA4 dataset.In events_2023* you can put a specific date: _20231001 (October 1, 2023), _202310* (all of October 2023), _2023* (all of 2023 so far)...

 

   ``tests-bigquery-351807.analytics_313378230.events_2023*``

where

    event_name = 'page_view'),

  

prep_navigation as (

select

    user_pseudo_id,

    session_id,

   lag(page,1)over(partition by user_pseudo_id,session_id order by event_timestamp asc)as previous_page,

    page,

   lead(page,1)over(partition by user_pseudo_id,session_id order by event_timestamp asc)as next_page,

   event_timestamp

from

    prep)

select

   ifnull(previous_page,'(entrance)')as previous_page,

    page,

   ifnull(next_page,'(exit)') as next_page,

   count(distinct concat(user_pseudo_id,session_id)) as count

from

        prep_navigation

where

 -- Copy and paste below the url of the page you want to consult.

 

    page = "https://www.hikeproject.com/como-visualizar-porcentajes-en-un-scorecard-de-data-studio/"

group by

    previous_page,

    page,

    next_page

having

   page != previous_page

   and page != next_page

order by

   count desc

desc

 

If you want a "closed" date range, for example from September 2 to October 15.

Take this snippet from the query above: 

from

from

-- Here you put the name of your GA4 dataset.In events_2023* you can put a specific date: _20231001 (October 1, 2023), _202310* (all of October 2023), _2023* (all of 2023 so far)...

 

   `tests-bigquery-351807.analytics_313378230.events_2023*`

where

    event_name = 'page_view'),

And you replace it with: 

from

 `tests-bigquery-351807.analytics_313378230.events_*``

where

event_name = 'page_view'

and _TABLE_SUFFIX between '20230902' and '20231015'),

And that's it!

When you run the query you would get a table like this: 

BQ - Query Results-1

The central column corresponds to the url selected to analyze (Page), the column to its left indicates the previous page and the column to the right, the next page.

The central column corresponds to the url selected to analyze (Page), the column to its left indicates the previous page and the column to the right, the next page.

This way we see that the post: 

https://www.hikeproject.com/como-visualizar-porcentajes-en-un-scorecard-de-data-studio/

 

In the period from September 2 to October 15 there have been:

With this type of reports you can not only make the analysis of a specific content of your website, they are also very useful to analyze user behavior in a process or a task, for example in a flight booking page, you can analyze what users do after performing a search, what percentage of them return to the home page perhaps to perform another search, what percentage of users go to the next screen to choose fares ...

It is also very useful to make an analysis of the home page of the website especially for websites that have more than one objective in their home, returning to the example of a flight booking page: search for flights, perform the check in of a flight of a reservation, find information about a flight already purchased, contact for an incidence or doubt...

Were you using the navigation reports in Universal Analytics, and were you missing them in GA4?

...

Were you missing them in GA4?

 

 

 

ANTERIOR
SIGUIENTE

TIPS DE EXPERTOS

Suscríbete para impulsar tu negocio.

ÚLTIMOS ARTÍCULOS

Group your data like a pro: clustering with K-Means and BigQuery ML

Working with large volumes of marketing data—whether it’s web traffic, keywords, users, or campaigns—can feel overwhelming. These data sets often aren’t organized or categorized in a useful way, and facing them can feel like trying to understand a conversation in an unfamiliar language.

But what if you could automatically discover patterns and create data groups—without manual rules, endless scripts, or leaving your BigQuery analysis environment?

That’s exactly what K-Means with BigQuery ML allows you to do.

What is K-Means and why should you care?

K-Means is a clustering algorithm—a technique for grouping similar items. Imagine you have a table with thousands of URLs, users, or products. Instead of going through each one manually, K-Means can automatically find groups with common patterns: pages with similar performance, campaigns with similar outcomes, or users with shared behaviors.

And the best part? With BigQuery ML, you can apply K-Means using plain SQL—no need for Python scripts or external tools.

How does it actually work?

The process behind K-Means is surprisingly simple:

  1. You choose how many groups you want (the well-known “K”).

  2. The algorithm picks initial points called centroids.

  3. Each row in your data is assigned to the nearest centroid.

  4. The centroids are recalculated using the assigned data.

  5. This process repeats until the groups stabilize.

The result? Every row in your table is tagged with the cluster it belongs to. Now you can analyze the patterns of each group and make better-informed decisions.

How to apply it in BigQuery ML

BigQuery ML simplifies the entire process. With just a few lines of SQL, you can:

  • Train a K-Means model on your data

  • Retrieve the generated centroids

  • Classify each row with its corresponding cluster

This opens up a wide range of possibilities to enrich your dashboards and marketing analysis:

  • Group pages by performance (visits, conversions, revenue)

  • Detect behaviors of returning, new, or inactive users

  • Identify products often bought together or with similar buyer profiles

  • Spot keywords with unusual performance

How many clusters do I need?

Choosing the right number of clusters (“K”) is critical. Here are a few strategies:

  • Business knowledge: If you already know you have 3 customer types or 4 product categories, start there.

  • Elbow Method: Run models with different K values and watch for the point where segmentation no longer improves significantly.

  • Iterate thoughtfully: Test, review, and adjust based on how your data behaves.

Real-world examples

With K-Means in BigQuery, you can answer questions like:

  • What types of users visit my site, and how do they differ?

  • Which pages show similar performance trends?

  • Which campaigns are generating outlier results?

Grouping data this way not only saves time—it reveals opportunities and issues that might otherwise go unnoticed.

Conclusion

If you're handling large data sets and need to identify patterns fast, clustering with K-Means and BigQuery ML can be a game-changer. You don’t need to be a data scientist or build complex solutions from scratch. You just need to understand your business and ask the right questions—BigQuery can handle the rest.

Start simple: take your top-performing pages, group them by sessions and conversions, and see what patterns emerge. You might uncover insights that completely shift how you approach your digital strategy.

 

Claude 4.0: Advances and Challenges in Conversational AI

Artificial Intelligence (AI) continues to progress at an accelerated pace, and Claude 4.0, developed by Anthropic, marks a major milestone in this journey. This next-generation language model stands out for its ability to comprehend complex contexts, deliver accurate responses, and adapt to a wide range of business needs.

AlphaEvolve: The new coding agent powered by Gemini

In a world where technology advances at unprecedented speed, artificial intelligence has emerged as a key driver of transformation. Among the most promising innovations today is AlphaEvolve, an evolutionary coding agent that combines the creative power of large language models (LLMs) with automated evaluators, opening new frontiers in software development, algorithm optimization, and solving complex problems in mathematics and computing.

How AI Is Revolutionizing Design and Development

At its Config 2025 event, Figma made it clear: the future of digital design will be deeply shaped by artificial intelligence. Beyond announcing new features, the company highlighted a paradigm shift — design is no longer a standalone process, but the core that connects creativity, technology, and product development.

data
Mallorca 184, 08036
Barcelona, Spain