GA4 - How to clean URL parameters

As we explained in another blog post, URL parameters can cause serious cardinality problems in your data, and especially now in GA4.

 

Remember that in GA4, high-cardinality data can cause rows like this one to show up in your reports, both standard reports and explorations:

 

[Screenshot: GA4 Cardinality]

 

Cardinality is the number of unique values a dimension takes. Some dimensions have a small, fixed number of unique values (e.g. Weekday, Device), while others can have high cardinality (e.g. page path, transaction ID). If a dimension takes more than 500 unique values in a day, GA4 considers it a high-cardinality dimension.
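
To make the definition concrete, here is a minimal sketch in Python that counts unique values per day for one dimension and compares the count with the 500-value threshold mentioned above. The dates, paths and function name are made up for illustration; this runs outside GA4 and is not how GA4 computes it internally.

```python
from collections import defaultdict

HIGH_CARDINALITY_THRESHOLD = 500  # unique values per day, as stated above


def daily_cardinality(rows):
    """rows: iterable of (date, dimension_value) pairs -> {date: unique count}."""
    seen = defaultdict(set)
    for day, value in rows:
        seen[day].add(value)
    return {day: len(values) for day, values in seen.items()}


# Hypothetical hits: 600 views of the same page, each with a different fbclid,
# followed by 600 views of the same page with a clean URL.
rows = [("2024-01-01", f"/shoes?fbclid={i}") for i in range(600)]
rows += [("2024-01-02", "/shoes")] * 600

for day, unique in daily_cardinality(rows).items():
    print(day, unique, unique > HIGH_CARDINALITY_THRESHOLD)
```

The same page produces 600 unique values on the first day and a single value on the second, which is exactly the effect a click ID has on the page path dimension.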

One of Google's recommendations for avoiding cardinality problems is to "Limit the collection of data especially those with high cardinality (Ex: URLs or url parameters)". Sure.

But you will agree that URL cardinality often comes from external sources that are outside our control, and that those sources tend to be ad platforms. Do these URL parameters look familiar?

fbclid=oiekhakjdvnkajsd089237498

creative=luesfglknadsnadsnlad989867983

If you run ad campaigns on Google Ads or Meta, they are sure to ring a bell.

Here is a way to clean all those parameters out of your URLs.

 

The first step is to identify those parameters.

    1. Create a new exploration in GA4.

    2. Add the Page path + query string dimension and the Views metric.

    3. Apply a filter to the table so that the Page path + query string dimension contains "?". By default the table displays 10 rows; extend it to 50 or 100 rows.

[Screenshot: GA4 - Exploration query parameters]
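
If the exploration returns many rows, you can export them and tally the parameter names instead of reading them one by one. The sketch below assumes you already have the page paths as a list of strings; the sample rows and the function name are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def count_query_params(page_paths):
    """Count in how many rows each query parameter name appears."""
    counts = Counter()
    for path in page_paths:
        query = urlsplit(path).query
        # A set, so a parameter repeated within one URL is counted once per row.
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


# Hypothetical rows copied from the "Page path + query string" column.
rows = [
    "/shoes?fbclid=oiekhakjdvnkajsd089237498",
    "/shoes?creative=luesfglknadsnadsnlad989867983&fbclid=abc",
    "/search?q=boots",
]
for name, count in count_query_params(rows).most_common():
    print(name, count)
```

Parameters that show up in many rows and carry no meaning for your analysis (click IDs, creative IDs) are the candidates to exclude. Something like a search term (`q` here) is usually worth keeping.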

Now that you know which parameters you want to exclude, go to GTM and open the container that manages that property:

  1. Click on Templates and then, under Variable Templates, on Search Gallery.

  2. In the gallery search box, type Trim Query and click on the template that appears.

  3. Click on Add to Workspace.

  4. Click Add to confirm.

Let's create a new variable:

  1. Go to Variables and click on New.

  2. Under variable type, select Trim Query.

  3. Open the target URL drop-down and select Page URL. Queries should be selected by default; if it is not, select it.

  4. Using Add Row, add one by one the parameters you want to remove, taken from the list you extracted from GA4.

  5. When you are done, click Save.
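
To see what this variable is meant to return, here is a rough Python equivalent of the trimming logic. It is only an illustration of the idea, not the template's actual code, and the function name and example URLs are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def trim_query(url, params_to_remove):
    """Return url without the listed query parameters; everything else is kept."""
    blocked = set(params_to_remove)
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in blocked
    ]
    # Note: urlencode re-encodes the remaining values (a space becomes "+").
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical landing page URL with one useful and one unwanted parameter.
url = "https://www.example.com/shoes?color=red&fbclid=oiekhakjdvnkajsd089237498"
print(trim_query(url, ["fbclid", "creative"]))
```

Parameters that are not in the list stay in place, so a filter such as `color=red` still reaches GA4 while the click ID does not.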

And now let's apply this variable. You have two options:

  • Go to each of the GA4 tags in GTM and add a new parameter that overwrites page_location with the variable you just created.
  • Create a new variable of type Google Tag: Event Settings that sets page_location to the variable you just created, and apply it to all event tags.

For the moment, this is the best way to clean up those annoying URLs that can also cause cardinality problems in your reports.

If you like, another day we will explain how to do the same cleanup in your Looker Studio reports.

LATEST ARTICLES

Group your data like a pro: clustering with K-Means and BigQuery ML

Working with large volumes of marketing data—whether it’s web traffic, keywords, users, or campaigns—can feel overwhelming. These data sets often aren’t organized or categorized in a useful way, and facing them can feel like trying to understand a conversation in an unfamiliar language.

But what if you could automatically discover patterns and create data groups—without manual rules, endless scripts, or leaving your BigQuery analysis environment?

That’s exactly what K-Means with BigQuery ML allows you to do.

What is K-Means and why should you care?

K-Means is a clustering algorithm—a technique for grouping similar items. Imagine you have a table with thousands of URLs, users, or products. Instead of going through each one manually, K-Means can automatically find groups with common patterns: pages with similar performance, campaigns with similar outcomes, or users with shared behaviors.

And the best part? With BigQuery ML, you can apply K-Means using plain SQL—no need for Python scripts or external tools.

How does it actually work?

The process behind K-Means is surprisingly simple:

  1. You choose how many groups you want (the well-known “K”).

  2. The algorithm picks initial points called centroids.

  3. Each row in your data is assigned to the nearest centroid.

  4. The centroids are recalculated using the assigned data.

  5. This process repeats until the groups stabilize.
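
The five steps above can be sketched in a few lines of plain Python. This is a toy version to show the loop, not what BigQuery ML runs internally. The sample data and the deterministic choice of initial centroids are assumptions made to keep the example reproducible.

```python
from math import dist


def kmeans(points, k, max_iter=100):
    """Plain K-Means on a list of numeric tuples; returns (centroids, labels)."""
    n = len(points)
    # Step 2: pick initial centroids. Evenly spaced rows keep this sketch
    # deterministic; real implementations usually pick them at random.
    centroids = [points[i * n // k] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Step 3: assign each row to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Step 5: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: recompute each centroid as the mean of its assigned rows.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


# Hypothetical pages described by (sessions, conversions). Step 1: K = 2.
pages = [(100, 2), (110, 3), (95, 1), (900, 40), (950, 45), (880, 38)]
centroids, labels = kmeans(pages, k=2)
print(labels)
print(centroids)
```

In this sketch the sessions column dominates the distance because its values are much larger. With real data you would normally put the columns on a comparable scale before clustering.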

The result? Every row in your table is tagged with the cluster it belongs to. Now you can analyze the patterns of each group and make better-informed decisions.

How to apply it in BigQuery ML

BigQuery ML simplifies the entire process. With just a few lines of SQL, you can:

  • Train a K-Means model on your data

  • Retrieve the generated centroids

  • Classify each row with its corresponding cluster

This opens up a wide range of possibilities to enrich your dashboards and marketing analysis:

  • Group pages by performance (visits, conversions, revenue)

  • Detect behaviors of returning, new, or inactive users

  • Identify products often bought together or with similar buyer profiles

  • Spot keywords with unusual performance

How many clusters do I need?

Choosing the right number of clusters (“K”) is critical. Here are a few strategies:

  • Business knowledge: If you already know you have 3 customer types or 4 product categories, start there.

  • Elbow Method: Run models with different K values and watch for the point where segmentation no longer improves significantly.

  • Iterate thoughtfully: Test, review, and adjust based on how your data behaves.
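
As an illustration of the Elbow Method, the sketch below runs a basic K-Means for several values of K and prints the sum of squared distances from each point to its centroid (often called inertia). The data set is invented and has three visible groups. The K-Means here is a simplified stand-in with a deterministic start, not the BigQuery ML implementation.

```python
from math import dist


def kmeans_inertia(points, k, max_iter=100):
    """Run a basic K-Means and return the sum of squared distances
    from each point to its assigned centroid."""
    n = len(points)
    centroids = [points[i * n // k] for i in range(k)]  # deterministic start
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Hypothetical data with three visible groups.
points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]
inertias = {k: kmeans_inertia(points, k) for k in range(1, 5)}
for k, value in inertias.items():
    print(k, round(value, 2))
```

On this data the value drops sharply up to K = 3 and barely moves at K = 4, which is the "elbow" you are looking for. Real data rarely gives such a clean bend, so treat it as a hint rather than a rule.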

Real-world examples

With K-Means in BigQuery, you can answer questions like:

  • What types of users visit my site, and how do they differ?

  • Which pages show similar performance trends?

  • Which campaigns are generating outlier results?

Grouping data this way not only saves time—it reveals opportunities and issues that might otherwise go unnoticed.

Conclusion

If you're handling large data sets and need to identify patterns fast, clustering with K-Means and BigQuery ML can be a game-changer. You don’t need to be a data scientist or build complex solutions from scratch. You just need to understand your business and ask the right questions—BigQuery can handle the rest.

Start simple: take your top-performing pages, group them by sessions and conversions, and see what patterns emerge. You might uncover insights that completely shift how you approach your digital strategy.
