Branded vs Non-Branded keywords in Search Console

In the previous post in this series on SEO data and reporting, we exported Google Search Console data and took it for a spin in Google Sheets to find winning vs. losing search queries over the past 28 days compared with the previous 28 days. This showed us which queries gained clicks and which ones lost them.

Now, I want to find out the patterns of organic visitors who came in intentionally with my brand name in mind. Taking the ‘Woorank’ brand as an example, I would look at search queries specifically mentioning said brand name:

  • Woorank
  • How to use Woorank?
  • Woorank pricing
  • Woorank vs X

But that’s only part of the story. Humans are not perfect, and might mistype or mispronounce your brand:

  • Woo rank
  • Whorank
  • Worank

It is especially important to capture these search queries as well when analyzing search intent and extracting brand searches. To take it even further, if certain misspellings of your brand name turn out to occur often, you might want to create content that targets them specifically. After all, your brand is your most precious asset, and branded searches are often more likely to convert into purchases.

Fuzzy Matching

There is no simple, deterministic way to extract branded searches with all their variations and combinations, so we want to match our brand name ‘Woorank’ against search queries that are approximately similar but not an exact match. Many sources online will explain how to use janky filters or regular expressions, but there is a better way: ‘Fuzzy Matching’. By applying approximate string-matching algorithms, we will be able to automatically capture branded search terms and their variations.

We will focus our efforts on the Levenshtein algorithm. I will spare you the mathematics, but in short: the algorithm takes 2 words as its input, counts how many single-character insertions, deletions or substitutions are required to change the first word into the second, and returns that total, also called the ‘distance’, as its output. Does the game ‘Wordle’ ring a bell?

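A couple of examples, using the misspellings from earlier:

  • Worank → Woorank: distance 1 (insert one ‘o’)
  • Whorank → Woorank: distance 1 (substitute ‘h’ with ‘o’)
  • Woo rank → Woorank: distance 1 (delete the space)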

There are several resources explaining the algorithm in greater detail, so there is no reason for us to rehash these, but here are a few takeaways to consider:

  • The lower the Levenshtein distance between 2 words, the more the 2 words look alike
  • It is called ‘Fuzzy matching’ for a reason: false positives and false negatives are likely, so keep a critical eye on your results.
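
For readers who like to see the mechanics, here is a minimal Python sketch of the distance calculation described above. It is a textbook dynamic-programming version, not necessarily what any particular spreadsheet template uses under the hood, and the function name ‘levenshtein’ is our own.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions or substitutions
    needed to turn `a` into `b`."""
    # previous[j] = distance between the characters of `a` handled so far
    # (none yet) and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning i characters of `a` into an empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    brand = "woorank"
    for candidate in ["woorank", "worank", "whorank", "woo rank", "pagerank"]:
        print(candidate, "->", levenshtein(candidate, brand))
```

All three misspellings from earlier come out at a distance of 1 from ‘woorank’, while an unrelated word such as ‘pagerank’ comes out at 4.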

Extract Branded Searches from Search Console

Enough with the theory, let’s get down to business! In the previous article, we provided formulas to copy/paste into an empty sheet. This time there are more areas and formulas involved, so we’re going to start with a template you can copy and play around with.

For this article, select the ‘BRANDED’ sheet. The template ships with sample data, so paste your own exported Search Console queries into the ‘GSC - Queries’ sheet.

Then fill in your brand name. The matching search queries from your Search Console data will show up underneath, next to ‘Branded Queries’. Several factors can influence how well the Levenshtein algorithm works, for example the length of your brand name, so in some cases it helps to tweak the sensitivity: the maximum Levenshtein distance allowed for 2 words to still count as a match.

Start at a value of 1 and check the resulting matches, then increase the value by 1 and compare the results. If the results improve, repeat the process; if not, stop and go back to the previous value. Every situation is different, so there is no magic number, but a value of 1 or 2 should work in most cases.
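
To get a feel for what the sensitivity does, here is a rough Python sketch of the same idea outside of Sheets. This is an illustration rather than the template’s actual logic: the helper names ‘is_branded’ and ‘branded_queries’ are hypothetical, and comparing the brand against every word and every pair of adjacent words in a query is our own assumption about how a query could be matched.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: insertions, deletions and substitutions each cost 1."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def is_branded(query: str, brand: str, sensitivity: int) -> bool:
    """True if any part of the query is within `sensitivity` edits of the brand."""
    # Assumption: compare case-insensitively and ignore basic punctuation.
    words = [word.strip("?!.,") for word in query.lower().split()]
    # Assumption: also glue adjacent words together, so "woo rank" is
    # compared as "woorank".
    candidates = words + [left + right for left, right in zip(words, words[1:])]
    return any(levenshtein(c, brand.lower()) <= sensitivity for c in candidates)


def branded_queries(queries, brand, sensitivity):
    """Keep only the queries that look branded at the given sensitivity."""
    return [q for q in queries if is_branded(q, brand, sensitivity)]


if __name__ == "__main__":
    # Made-up sample queries, for illustration only.
    sample = ["woorank pricing", "how to use woorank?", "woo rank",
              "whorank", "worank", "seo checker", "page rank"]
    for sensitivity in (1, 2, 3):
        print(sensitivity, branded_queries(sample, "Woorank", sensitivity))
```

With these made-up queries, a sensitivity of 1 already catches every brand variation. Raising it to 3 starts pulling in ‘page rank’, because ‘rank’ is only 3 insertions away from ‘woorank’. That is exactly the kind of false positive to watch for when increasing the value.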

Now that we have extracted the ‘Branded Queries’, we can bring these into formulas to analyze our search data. The branded queries cell lists the search queries separated by a ‘|’, which is the regular-expression equivalent of ‘OR’, so it can be used with the ‘matches’ operator in a QUERY formula:

=QUERY(
  'GSC - Queries'!A1:I,
  "SELECT A, B, D
   WHERE A matches '" & B7 & "'
   LABEL A 'Branded Queries'"
)
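
If it helps to see what this formula does in plain code, here is a hedged Python equivalent. In the query language, ‘matches’ only succeeds when the regular expression matches the whole cell, which is what ‘re.fullmatch’ does here. The row layout (query, clicks, impressions) mirrors columns A, B and D, and the helper names and sample numbers are made up for illustration. Unlike the sheet formula, this sketch escapes each query first, since characters such as ‘?’ would otherwise be read as regular-expression syntax.

```python
import re


def build_pattern(branded):
    """Join the branded queries with '|' (regex OR), escaping special characters."""
    return "|".join(re.escape(query) for query in branded)


def select_branded(rows, pattern):
    """Keep rows whose query text matches the pattern as a whole."""
    regex = re.compile(pattern)
    return [row for row in rows if regex.fullmatch(row[0])]


if __name__ == "__main__":
    # Made-up rows: (query, clicks, impressions).
    rows = [
        ("woorank", 120, 900),
        ("woorank pricing", 30, 200),
        ("seo checker", 50, 4000),
        ("woorank alternatives", 5, 60),
    ]
    pattern = build_pattern(["woorank", "woorank pricing"])
    for row in select_branded(rows, pattern):
        print(row)
```

Note that ‘woorank alternatives’ is not returned in this example: it is not in the list of branded queries, and a whole-cell match on ‘woorank’ does not cover it.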

And by now, we are getting the hang of it, so why not bring it all together in an aggregated report that sums clicks and impressions:

=QUERY('GSC - Queries'!A1:I, "SELECT SUM(B), SUM(D) WHERE A matches '" & B7 & "' LABEL SUM(B) 'Clicks', SUM(D) 'Impr'") 
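
The same aggregation can be sketched in Python. As a small extension beyond the formula, this version also totals the non-branded rows so the two groups can be compared side by side. The function name ‘split_totals’, the row layout (query, clicks, impressions) and the sample numbers are all assumptions for illustration.

```python
import re


def split_totals(rows, pattern):
    """Sum clicks and impressions separately for branded and non-branded rows."""
    regex = re.compile(pattern)
    totals = {
        "branded": {"clicks": 0, "impressions": 0},
        "non_branded": {"clicks": 0, "impressions": 0},
    }
    for query, clicks, impressions in rows:
        # Whole-string match, like the 'matches' operator in the formula.
        group = "branded" if regex.fullmatch(query) else "non_branded"
        totals[group]["clicks"] += clicks
        totals[group]["impressions"] += impressions
    return totals


if __name__ == "__main__":
    # Made-up rows: (query, clicks, impressions).
    rows = [
        ("woorank", 120, 900),
        ("woorank pricing", 30, 200),
        ("seo checker", 50, 4000),
        ("website audit", 5, 60),
    ]
    # Pattern in the same 'a|b' shape as the branded queries cell.
    print(split_totals(rows, "woorank|woorank pricing"))
```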

Summary

In the first post of this series, we dipped our toes in the water by bringing our Search Console data into Google Sheets and analyzing it in a unique way. In this second instalment of the series, you learned how to apply a simple fuzzy-matching algorithm, Levenshtein distance, to extract branded searches and compare them against non-branded searches. Stay tuned for the next part of this series, where we will throw Google Data Studio into the mix and step it up a notch, building even better reports!

Make sure to subscribe to our newsletter to be notified when this article drops.
