In this article, we’re going to learn how to create the rel canonical URL tag using Google Tag Manager, and how to insert it in every page of our website so that the correct canonical is automatically generated in each URL.
We’ll do it using Google Tag Manager and its variables.
Why send a canonical from each page to itself?
Javier Lorente gave us a very good explanation/reminder at the 2015 SEO Salad event in Zaragoza (Spain). In short, there may be various factors that cause Google to index unexpected variants of a URL, and this is often beyond our control:
- External pages that display our website but use another URL (e.g., Google’s own cache, other search engines and content aggregators, archive.org, etc.). This way, Google will know which one is the original page at all times.
- Parameters that are irrelevant to SEO/content such as certain filters and order sequences
By including this “standard” canonical in every URL, we are making it easy for Google to identify the original content.
How do we generate the dynamic value of the canonical URL?
To generate the canonical URL dynamically, we need to force it to always correspond to the “clean” (i.e., absolute, unique, and simplified) URL of each page (taking into account the www, URL query string parameters, anchors, etc.).
Remember that, in summary, the URL variables that can be created in GTM (Google Tag Manager) correspond to the components of a URL: protocol, hostname, port, path, query, and fragment.
We want to create a unique URL for each page, without queries or anchors. We need a “clean” URL variable, and we can’t use the {{Page URL}} built-in variable, for two reasons:
- Although the fragment isn’t part of {{Page URL}} by default, the query string parameters are
- Potential problems with protocol and hostname, if different options are admitted (e.g., SSL and www)
Therefore, we need to combine Protocol + Host + Path into a single variable.
Now, let's take a step-by-step look at how to create our {{Page URL Canonical}} variable.
1. Create {{Page Protocol}} to capture the protocol part of the URL, according to whether it’s http:// or https://
Note: We’re assuming that the entire website will always function under a single protocol. If that’s not the case, then we should substitute the {{Page Protocol}} variable for plain text in the final variable of Step #4. (This will allow us to force it to always be http/https, without exception.)
2. Create {{Page Hostname Canonical}}
We need a variable in which the hostname is always unique, whether or not it’s entered into the browser with the www. The hostname canonical must always be the same, regardless of whether or not it has the www. We can decide based on which one of the domains is redirected to the other, and then keep the original as the canonical.
How do we create the canonical domain?
- Option 2.1: Redirect the domain with www. to a domain without www. via 301
Our canonical URL is WITHOUT www. We need to create Page Hostname, but make sure we always remove the www:
- Option 2.2: Redirect the domain without www. to a domain with www. via 301
Our canonical URL is WITH www. We need to create Page Hostname without www (like before), and then insert the www in front using a constant variable:
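Outside of the GTM interface, the logic behind both options can be sketched in plain JavaScript. This is only an illustration, not the GTM configuration itself: canonicalHostname and its useWww flag are hypothetical names, and lowercasing the hostname is an extra assumption on our part (hostnames are case-insensitive, so it should be harmless).

```javascript
// Hypothetical helper (not a GTM API): returns the hostname in a single,
// predictable form, either always without "www." or always with it.
function canonicalHostname(hostname, useWww) {
  // Lowercase first (assumption), then strip a leading "www." if present.
  var bare = String(hostname).toLowerCase().replace(/^www\./, '');
  // Option 2.1 -> useWww = false; Option 2.2 -> useWww = true.
  return useWww ? 'www.' + bare : bare;
}
```

With Option 2.1 we would call canonicalHostname(hostname, false), and with Option 2.2 canonicalHostname(hostname, true); either way, both variants of the domain end up producing the same value.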
3. Enable the {{Page Path}} built-in variable
Note: Although we have the {{Page Hostname}} built-in variable, for this exercise it’s preferable not to use it, as we’re not 100% sure how it will behave in relation to the www (e.g., in this instance, it’s not configurable, unlike when we create it as a GTM custom variable).
4. Create {{Page URL Canonical}}
Link the three previous variables to form a constant variable:
{{Page Protocol}}://{{Page Hostname Canonical}}{{Page Path}}
Summary/Important notes:
- Protocol: returns http / https (without ://), which is why we enter this part by hand
- Hostname: we can force removal of the www. or not
- Path: included from the slash /. Does not include the query, so it's perfect. We use the built-in option for Page Path.
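To make the concatenation concrete, here is a minimal sketch of the same Protocol + Host + Path logic in plain JavaScript. The function name buildCanonicalUrl is hypothetical, and the two defensive steps (accepting a protocol with a trailing colon, and cutting off any query or fragment left in the path) are our own assumptions rather than something GTM requires.

```javascript
// Hypothetical helper (not a GTM API): joins protocol, hostname and path
// into a "clean" absolute URL without query string or fragment.
function buildCanonicalUrl(protocol, hostname, path) {
  // Accept both "https" and "https:" (window.location.protocol has the colon).
  var scheme = String(protocol).replace(/:$/, '');
  // Keep only the part before any "?" or "#"; fall back to "/" if empty.
  var cleanPath = String(path).split(/[?#]/)[0] || '/';
  // The "://" separator is added by hand, as in the constant variable above.
  return scheme + '://' + hostname + cleanPath;
}
```

For example, buildCanonicalUrl('https', 'example.com', '/shoes/?sort=price#top') gives https://example.com/shoes/, where example.com is just a placeholder domain.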
Now that we have created {{Page URL Canonical}}, we could even populate it into Google Analytics via custom dimensions. You can learn to do that in this Google Analytics custom dimensions guide.
How can we insert the canonical into a page using Tag Manager?
Let’s suppose we’ve already got a canonical URL generated dynamically via GTM: {{Page URL Canonical}}.
Now, we need to look at how to insert it into the page using a GTM tag. We should emphasize that this is NOT the “ideal” solution, as it’s always preferable to insert the tag into the <head> of the source code. But, we have confirming evidence from various sources that it DOES work if it’s inserted via GTM. And, as we all know, in most companies, the ideal doesn’t always coincide with the possible!
If we could insert content directly into the <head> via GTM, it would be sufficient to use the following custom HTML tag:
<link rel="canonical" href="{{Page URL Canonical}}" />
But we know that this won’t work, because the content inserted by custom HTML tags usually goes at the end of the <body>, meaning Google won’t accept or read a <link rel="canonical"> tag there.
So then, how do we do it? We can use JavaScript code to generate the tag and insert it into the <head>, as described in this article, but in a form that has been adapted for the canonical tag:
<script>
  // Create a <link> element, mark it as the canonical, and add it to the <head>
  var c = document.createElement('link');
  c.rel = 'canonical';
  c.href = {{Page URL Canonical}};
  document.head.appendChild(c);
</script>
And then, we can set it to fire on the “All Pages” trigger. Seems almost too easy, doesn’t it?
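If the page might already contain a hard-coded canonical, a slightly more defensive variant could update that tag instead of appending a second one. The sketch below is an assumption on our part, not part of the original recipe: upsertCanonical is a hypothetical name, and it takes the document as a parameter only so that the logic is easy to test. Whether overriding an existing canonical is desirable should be decided case by case.

```javascript
// Hypothetical helper (not a GTM API): reuses an existing canonical <link>
// if there is one, otherwise creates it and appends it to the <head>.
function upsertCanonical(doc, url) {
  var link = doc.querySelector('link[rel="canonical"]');
  if (!link) {
    link = doc.createElement('link');
    link.rel = 'canonical';
    doc.head.appendChild(link);
  }
  // In both cases, point the tag at the clean URL.
  link.href = url;
  return link;
}

// Inside a GTM Custom HTML tag, the call might look like this:
//   upsertCanonical(document, {{Page URL Canonical}});
```

Like the simpler tag, this would go inside a script element in a custom HTML tag fired on “All Pages”.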
How do we check whether our rel canonical is working?
Very simple: Check whether the code is generated correctly on the page.
How do we do that?
By looking at DevTools in Chrome, or by using a browser plugin like Firebug that returns the code generated on the page in the DOM (document object model). We won't find it in the source code (Ctrl+U).
Here’s how to do this step-by-step:
- Open Chrome
- Press F12
- Click on the first tab in DevTools (Elements)
- Press Ctrl+F and search for “canonical”
- If the URL appears in the correct form at the end of the <head>, that means the tag has been generated correctly via Tag Manager
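As an alternative to searching by hand, the same check can be sketched as a small function to paste into the DevTools Console. The name listCanonicals is hypothetical; it simply collects the href of every canonical tag found in the DOM, so an empty list means the tag was not generated and more than one entry means there are duplicates.

```javascript
// Hypothetical helper: returns the href of every <link rel="canonical">
// currently present in the given document.
function listCanonicals(doc) {
  var nodes = doc.querySelectorAll('link[rel="canonical"]');
  var hrefs = [];
  for (var i = 0; i < nodes.length; i++) {
    hrefs.push(nodes[i].href);
  }
  return hrefs;
}

// In the DevTools Console, after the page has loaded:
//   listCanonicals(document);
```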
That's it. Easy-peasy, right?
So, what are your thoughts?
Do you also use Google Tag Manager to improve your SEO? Why don’t you give us some examples of when it’s been useful (or not)?
Sadly, I've been testing GTM insertion of this as well as all the meta robots tags, and even though you can see them in inspect element, the search engine won't necessarily pick them up (because of GTM's async JavaScript load). Sometimes it works, but it seems to really depend on how many other scripts (both GTM and hardcoded) are on the page.
Don't get me wrong, I think it's a great idea and thanks for sharing, but so far the crawlers don't want to wait for all the JavaScript to load, which I can perfectly understand.
True, TomislavL. Since the early days of canonical problems we've been using htaccess to control www vs non-www (and https if that was relevant).
That seems to have worked. Am not so sure about going to a lot of trouble to protect against other potential canonical problems. Or am I missing something here?
I've been experimenting with GTM data insertion as well. There are a couple roadblocks I've run into.
Have you experienced the same issues? If so, do you have a work-around yet?
Hi Logan Ray,
Thanks for your comment and contributions.
1. Please read this post by Simo Ahava and its comments (especially the last one from Simo).
https://www.simoahava.com/seo/dynamically-added-met....
I can only say that in some projects we have run tests with canonical and even with hreflang, and they worked fine. For example, as a result, there are no URLs with parameters in Google's index. In the simplest canonical version explained here, that is what we did: strip the parameters from the canonical URL automatically.
It is logical that hard-coding a canonical tag works better. I think (and I've read) that whether the test succeeds will depend on how much JavaScript the robots must crawl, as this is a capability of Google's crawlers that is still more or less "recent". It is not clear what its future will be.
This post is intended to tell a little more about the variables, utilities, and possibilities of GTM.
2. Let me insist, please: I don't recommend it as a best practice; I'm just explaining how to do it in case we need it as a last resort.
As has already been said, it is not the most G-friendly solution, BUT it's one of those ideas that makes you think about the potential of what you can do, experiment with, improve, and achieve with GTM.
Totally thumbs up!!
Nik, agree its not Google friendly solution however it does the work. which it suppose to and i think sooner or later Google will consider its preference on this. As Google likes to control things in their end ( just assuming) :)
Actually i am just confused, I searched many of the website, but this tag i could find at all.
Totally agree!! It's not just about the exact topic of this post, but the ideas you may come up with after reading it.
Hi Lucia,
Canonicalization can be tricky, but you have covered most of the points very nicely along with google tag manager.
Really, I appreciate the article and the effort that went into writing it, a very well crafted one.
Regards,
Vijay
Hi TomislavL,
I agree with you, you are right that is not the "perfect" solution. This is just one possibility to use when it is suitable and there is no way to do so otherwise (ie. embedded in the actual HTML source code).
Thanks for you comment :))
I've tried to make this work, but it doesn't work for me. I made all the variables and the tag, but when I go to the website and press f12 I can't find the canonical (there is already one on the page, but there should be a second one, inserted with GTM). So do you know what I'm doing wrong? and just to be sure, this has to be the exact html code in the tag?: <script> var c = document.createElement('link'); c.rel 'canonical'; c.href = {{Page URL Canonical}}; document.head.appendChild(c); </script>
Hi Tom1996,
I'm not sure, but it may not work because there is already a canonical tag in the head. I don't know if the page code will accept a second canonical <link> tag.
Could you try the implementation on another website without a canonical, please? If you don't have a site to test on, you can try the "Tag Injector" plugin, which allows us to debug a GTM container elsewhere.
I think the code you pasted in your comment is correct, but please check that it is valid JavaScript. It is important to check that the quotation marks are straight ones and not curly ones. Be careful with line breaks too, and make sure there is a line break after each semicolon. That is to say:
<script>
var c = document.createElement('link');
c.rel = 'canonical';
c.href = {{Page URL Canonical}};
document.head.appendChild(c);
</script>
*Please note that when we copy-paste JS code it could be damaged or changed, especially if the source is a .docx document or similar.
Please let me know if you discover the solution.
Thanks for your comment and best regards :)
Thanks everyone for your time :)
Let's continue having fun with GTM.
Soon there will be an English version of our Google Tag Manager & Analytics online course (cursogoogletagmanager.com).
If you'd like to know more, you can visit tagmanagercourse.com or contact me.
Thanks (or gracias) Lucia, i never heard about Google Tag Manager before, now i´m getting my hands dirty :)
This is perfect and seems like another step to take reliance off developers, my only concern is when it comes to paginated pages and query strings and you don’t want to list a different page as the canonical.
The canonical URL needs to be in the DOM code if you want to generate it dynamically via GTM. I understand that you mean the case where the canonical URL cannot be extracted from the {{Page URL}} variable, right? Thanks for your comment :)
Brilliant, I just looked at your profile, Lucia, and I found your site, and your blog has a lot of info about SEO
Thanks for your comments webtematica :)
I'm glad you like it. I write for luciamarin.es, aukera.es and aukera.co.uk.
Aukera's Blog is the result of team work. It is focused on GTM, CRO, UX and SEO.
My personal blog is basically about GTM and I use it to promote our online course.
Have a good day!
Lucía, you did it again! ;)
It's not a 100% Google friendly solution YET but it was funny watching it work as a part of our R&D experiments.
Thank you, Bro Eneko, for your words of support and your friendliness ;)
Lucía Marín
luciamarin.es
Hi Lucia,
Thats really interesting...thank you for sharing this. Its very helpful and quick way to implement the canonical. Frankly, i didn't know it before.
Maybe we don’t yet know how many things we could do with Tag Manager and a good idea.
With GTM, you can do everything from embedding a conversion code to segmenting and separating the visits that come from Google AdWords. With just a parameter and GTM URL variables, we can distinguish them.
Also, you can capture WordPress data such as pageType, archive page, tag, category… thanks to a GTM custom variable called “DOM Element”.
You can create as many variables as you need, and you can use them for everything from firing tags to collecting relevant DOM, URL, and last-event element info.
The DOM is ready when all the HTML content has been loaded. At this point we can capture lots of data, in case we cannot get as many dataLayers configured as we would like … of course I prefer the dataLayer when it’s possible; this is again just an alternative way of doing things.
Kind regards :)
Lucia
I totally agree with you. on this there is much to explore. :)
Hi Lucía, no doubt there is more SEO possibilities in GTM but more exiting could be a dedicated SEO tool based on the same technology.
Best regards,
Chris
Undoubtedly a very complete tutorial, I would call "The Definitive Guide".
Excellent contribution; in this article I have learned a lot about this topic, which I had been looking for information on for a long time.
Thank you very much...
How does a canonical tag say that "this page is the original". I don't get it. If everyone implements a canonical that points to itself, including scrapper sites, Google’s cache, archive.org etc. as you mention, how would Google know what is the original simply by the canonical tag? One would think that Google would assume that the first page indexed is the original, no?
Hi dcrader,
Thanks for your comment and contribution to the post :)
You are right, sorry for this. We would need to create some GTM trigger exceptions for these sites (Google’s cache, archive.org, etc.) and apply them to the tag with the simplest canonical mentioned in the post (just the URL without params, supposing this is suitable for your website as a canonical URL pattern).
So, for https://mysite.com/?params....
the canonical will be https://mysite.com/
The exception trigger could be something like this: {{Page Hostname}} does not contain "mysite.com".
OK?
After this, we should create another "logical" and "dynamic" canonical for those external sites that show our pages (in case it is possible to recover the real canonical from the DOM code).
Best regards :)
Hmmm. Not sure I'm following. How would we have access to the 3rd party site's code? I know there is the rel=publisher tag to tell google who is the original source, but I just don't understand why each page would need a canonical to point to itself for the purpose of telling google which is the original?
If you have a page with parameters, then sure a canonical to the shortened url without the parameters make sense, but that's not what I'm talking about.
Hi again dcrader :)
In fact, we will have access to the 3rd party site's code, in some way... I will try to explain myself better... As the 3rd party site has embedded our website code, it has also embedded our GTM container code. Because of this, with GTM and the correct variables and triggers, maybe we could create a real canonical URL for these cases (I would have to think more about them...).
Last week, I ran some tests on it, and on the Google cache page for a site with the general canonical rule mentioned in the post, the canonical failed, but only because the dynamic {{Page Hostname}} on the cache pageview was the Google hostname, not the website hostname. We could fix this issue easily by forcing a static hostname for this exception... We should test it case by case.
About this:
why each page would need a canonical to point to itself for the purpose of telling google which is the original?
Just to protect itself from duplicate content, in case Google has indexed some duplicate page variant by error.
Best regards :)
Lucia
Very interesting idea and good post! I love the idea of it but like everyone unsure if google will recognise it. Would be great for some situations when the developers just don't have the time to do it.
One issue I could see happening is if you're working with a CMS that uses alias URLs. Therefore I wouldn't want it being placed on every page, as this would end up causing a serious duplicate content problem. I might look into a good way you could extend this idea for page exceptions.
Hi Sally, thanks very much for your comment. This is an interesting contribution ;)
Of course, in the post I just mention the simplest canonical example (just cleaning parameters), but there will be a lot of exceptions to this "bulk" rule.
Already mentioned in the comments: Google cache/translate, redirections (as you mention), etc. In those cases where the bulk rule does not work, we will have to study each one separately. And we will have to create new canonical rules that work for them (in case it is possible, that is to say, in case we can recover the correct canonical with GTM variables).
Best regards, nice to meet you, and thanks to everyone for reading and commenting this post so much :))
Uau! Google Tag Manager is the new tool that will overtake to Analytics. Thanks for the post! ;)
GTM is a wonder tool most specially for users who have little knowledge and are afraid to edit the source code. I don't use it because I didn't know this kind of tool exist. Though I have a bit of HTML skill, I will try this because I think it can really save time for us marketers. Thanks for the article. Another gem in my box!
Thanks for this, it is what I was looking for as well. Maybe you could include more examples in the post itself.
Nice one!
Clever use of GTM. Thanks for sharing.
Thanks for sharing this informative post. A Complete Guide about how to generate and insert rel canonical with Google tag manager..........
Thank you so much Lucia for this post.
Nice post Lucia. We implemented dynamic canonical for our dynamic website, but somehow it didn't work & created numerous errors, so we removed it. After reading this post we want to implement it again. Thanks in advance.
I'm yet to use GTM but will keep this in mind.
Hello Lucia,
Really great article. Because, on big eCommerce site, like magento and wordpress. They apply lot's of filters to show different variation. So, canonical is best solution to address those issue.
As my thinking(not sure), if we instruction to google bot through crawl-delay:20 in robots.txt. Means, inject the code in respective pages then google bot see it and crawl it.
Let me know, your suggestion, Lucia.
Thank You...!!!!
Thanks for your comment, Pintu Dabhi! ;) I am not very sure, but It sounds OK, good idea!
Maybe someone here can tell us more about this option?
Kind regards :)
Very helpful and worth reading it. I think wordpress idea is batter for me :)
Thanks for great post.
Canonicalization has always been a headache for me. I don't get to understand why so easy concept troubles me so much. Hope that this way I finally have a grip on it :)
Thanks!
Thanks Nik_Oppes and Ikkie and everyone for likes, sharing and commenting :)
First, thanks for providing useful information and i have concern also regarding canonical tag where i used canonical tag with my website but when i check into seo tool then it's showing error with canonical tag. Can you explain what is the reason behind it?
I think this is because the canonical tag is dynamically embedded and not in the original source code. It is in the DOM code but not in the page code.
So, I would like to repeat that this is not the perfect solution; it is just an additional resource for some cases in which we are not able to embed the canonical in the programming code.
Thanks for your comment :)
Wow, this is your first post! Congratulations!
Good and useful read - this is what we all need on Moz.
ATB,
PopArt Studio
Thanks very much for your words PopArt-Studio :))
Hi Lucía
Generate and insert rel canonical with GTM may not be the best solution today, but it is a step that we anticipate.
Thank you very much for the information
Thanks for this, Lucía! I think there might be a slight error in your code for the custom HTML tag. You need to declare the rel="canonical" link attribute: line 3 should be:
c.rel = 'canonical';
You've got:
c.;
So the whole tag should look like this:
<script>
var c = document.createElement('link');
c.rel = 'canonical';
c.href = {{Page URL Canonical}};
document.head.appendChild(c);
</script>
Your screenshot doesn't match the code you've got on the page.
Could you perhaps use a syntax highlighter in the body of the blog post to make it easier to copy your custom HTML tag?
Thanks!
Yours aye,
Chris
I am experiencing a problem with my SEO landing pages: the URLs with capital letters give 404s, while the ones without go to the right page. Example:
https://www.mylocals.nl/content/eten/spare-ribs-be...
https://www.mylocals.nl/content/eten/spare-ribs-be...
Will this method also fix this problem?
I’m glad you liked this post. Using these search operators can really simplify your search results.
Very informative post.. Thanks for sharing..
Really Good share, Thanks and I just want to how you spending time to get some new content to your blog.
Thanks Vivek Ravi, sometimes it is difficult to get some "free" time to do it.
It is extra work but I like it and the GTM online course is a good source of ideas to writing posts.
Best regards
Lucia
Thanks for the mention, Lucía! My experience tells me that, at least for now, anything beyond JSON-LD that you inject via GTM doesn't work very smoothly. Neither titles nor canonicals. Sometimes it works, sometimes it doesn't. I have no doubt that later on it will always work, but as of today not all crawlers understand it.
Thanks for sharing, very informative and clever use of GT
Google Tag Manager makes things much more simpler and easier. This tutorial has been really helpful. Great post shared Lucia!!
Hello Lucia,
It's so interesting and today I am learning something new through this article. I will try it first and contact you soon if any problem.
Thanks for the post Lucia.
Very informative post.. Thanks for sharing..