It's ten o'clock. Do you know where your logs are?
I'm introducing this guide with a pun on a common public-service announcement that has run for years on late-night TV news broadcasts in the United States because log analysis is extremely newsworthy and important.
If your technical and on-page SEO is poor, then nothing else you do will matter. Technical SEO is the key to helping search engines crawl, parse, and index websites -- and thereby rank them appropriately -- long before any marketing work begins.
The important thing to remember: Your log files contain the only data that is 100% accurate in terms of how search engines are crawling your website. By helping Google to do its job, you will set the stage for your future SEO work and make your job easier. Log analysis is one facet of technical SEO, and correcting the problems found in your logs will help lead to higher rankings, more traffic, and more conversions and sales.
Here are just a few reasons why:
- Too many response code errors may cause Google to reduce its crawling of your website and perhaps even lower your rankings.
- You want to make sure that search engines are crawling everything, new and old, that you want to appear and rank in the SERPs (and nothing else).
- It's crucial to ensure that all URL redirections will pass along any incoming "link juice."
However, log analysis is something that is unfortunately discussed all too rarely in SEO circles. So, here, I wanted to give the Moz community an introductory guide to log analytics that I hope will help. If you have any questions, feel free to ask in the comments!
What is a log file?
Computer servers, operating systems, network devices, and computer applications automatically generate something called a log entry whenever they perform an action. In an SEO and digital marketing context, one type of action is whenever a page is requested by a visiting bot or human.
Server log entries are typically output in the standardized Common Log Format. Here is one example from Wikipedia with my accompanying explanations (a short command-line sketch follows the field breakdown):
127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
- 127.0.0.1 -- The remote hostname. An IP address is shown, like in this example, whenever the DNS hostname is not available or DNSLookup is turned off.
- user-identifier -- The remote logname / RFC 1413 identity of the user. (It's not that important.)
- frank -- The user ID of the person requesting the page. Based on what I see in my Moz profile, Moz's log entries would probably show either "SamuelScott" or "392388" whenever I visit a page after having logged in.
- [10/Oct/2000:13:55:36 -0700] -- The date, time, and timezone of the action in question in strftime format.
- GET /apache_pb.gif HTTP/1.0 -- "GET" is one of the two commands (the other is "POST") that can be performed. "GET" fetches a URL while "POST" is submitting something (such as a forum comment). The second part is the URL that is being accessed, and the last part is the version of HTTP that is being used.
- 200 -- The status code of the document that was returned.
- 2326 -- The size, in bytes, of the document that was returned.
Note: A hyphen is shown in a field when that information is unavailable.
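Once you know the field positions, standard command-line tools can pull individual fields straight out of a raw log. Here is a minimal sketch that assumes a whitespace-delimited Common Log Format file named access.log (the filename is just a placeholder):

# Print the status code (ninth field) and requested URL (seventh field) of the first few entries
awk '{print $9, $7}' access.log | head

# Count how many requests each client IP address made, most active first
awk '{print $1}' access.log | sort | uniq -c | sort -nr | head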
Every single time that you -- or the Googlebot -- visit a page on a website, a line with this information is output, recorded, and stored by the server.
Log entries are generated continuously, and anywhere from several to thousands can be created every second -- depending on the activity level of a given server, network, or application. A collection of log entries is called a log file (or, colloquially, "the log" or "the logs"), and it is displayed with the most recent log entry at the bottom. Individual log files often contain a calendar day's worth of log entries.
Accessing your log files
Different types of servers store and manage their log files differently. Here are the general guides to finding and managing log data on three of the most-popular types of servers (a quick command-line check follows the list):
- Accessing Apache log files (Linux)
- Accessing NGINX log files (Linux)
- Accessing IIS log files (Windows)
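If you have shell access, a quick sanity check is to tail the access log and watch new entries arrive. The paths below are only common defaults -- they vary by distribution and server configuration:

tail -f /var/log/apache2/access.log    # Apache on Debian/Ubuntu
tail -f /var/log/httpd/access_log      # Apache on RHEL/CentOS
tail -f /var/log/nginx/access.log      # NGINX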
What is log analysis?
Log analysis (or log analytics) is the process of going through log files to learn something from the data. Some common reasons include:
- Development and quality assurance (QA) -- Creating a program or application and checking for problematic bugs to make sure that it functions properly
- Network troubleshooting -- Responding to and fixing system errors in a network
- Customer service -- Determining what happened when a customer had a problem with a technical product
- Security issues -- Investigating incidents of hacking and other intrusions
- Compliance matters -- Gathering information in response to corporate or government policies
- Technical SEO -- This is my favorite! More on that in a bit.
Log analysis is rarely performed regularly. Usually, people go into log files only in response to something -- a bug, a hack, a subpoena, an error, or a malfunction. It's not something that anyone wants to do on an ongoing basis.
Why? Here is a screenshot of ours showing just a very small part of an original (unstructured) log file:
Ouch. If a website gets 10,000 visitors who each go to ten pages per day, then the server will create a log file every day that will consist of 100,000 log entries. No one has the time to go through all of that manually.
How to do log analysis
There are three general ways to make log analysis easier in SEO or any other context:
- Do-it-yourself in Excel
- Proprietary software such as Splunk or Sumo Logic
- The ELK Stack open-source software
Tim Resnik's Moz essay from a few years ago walks you through the process of exporting a batch of log files into Excel. This is a (relatively) quick and easy way to do simple log analysis, but the downside is that one will see only a snapshot in time and not any overall trends. To obtain the best data, it's crucial to use either proprietary tools or the ELK Stack.
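If you do go the Excel route, it helps to pre-filter the raw log down to just the bot entries before importing. A rough sketch, assuming an access log in the combined format:

# Keep only lines whose user agent mentions a major search-engine bot
grep -iE "googlebot|bingbot|yandex|baiduspider" access.log > bot-requests.log

You can then import bot-requests.log into Excel as a space-delimited file (set the text qualifier to a double quote so that the quoted request and user-agent fields stay intact).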
Splunk and Sumo Logic are proprietary log analysis tools that are primarily used by enterprise companies. The ELK Stack is a free and open-source set of three platforms (Elasticsearch, Logstash, and Kibana) that is owned by Elastic and used more often by smaller businesses. (Disclosure: We at Logz.io use the ELK Stack to monitor our own internal systems and as the basis of our own log management software.)
For those who are interested in using this process to do technical SEO analysis, monitor system or application performance, or for any other reason, our CEO, Tomer Levy, has written a guide to deploying the ELK Stack.
Technical SEO insights in log data
However you choose to access and understand your log data, there are many important technical SEO issues to address as needed. I've included screenshots of our technical SEO dashboard with our own website's data to demonstrate what to examine in your logs.
Bot crawl volume
It's important to know the number of requests made by Baidu, Bingbot, Googlebot, Yahoo, Yandex, and others over a given period of time. If, for example, you want to get found in search in Russia but Yandex is not crawling your website, that is a problem. (You'd want to consult Yandex Webmaster and see this article on Search Engine Land.)
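As a rough illustration of how you might pull these counts from a raw access.log at the command line (the filename and user-agent tokens are assumptions, and user agents can be spoofed, so a reverse-DNS check of the bots is still worthwhile):

# Count requests per major search-engine crawler by user-agent token
for bot in Googlebot bingbot Baiduspider YandexBot Slurp; do
    printf '%-12s %s\n' "$bot" "$(grep -ci "$bot" access.log)"
done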
Response code errors
Moz has a great primer on the meanings of the different status codes. I have an alert system set up that tells me about 4XX and 5XX errors immediately because those are very significant.
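A quick way to surface these errors from a raw log -- assuming, as in the Common Log Format example earlier, that the status code is the ninth whitespace-delimited field and the URL is the seventh:

# List 4XX and 5XX responses by status code and URL, most frequent first
awk '$9 ~ /^[45]/ {print $9, $7}' access.log | sort | uniq -c | sort -nr | head -20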
Temporary redirects
Temporary 302 redirects do not pass along the "link juice" of external links from the old URL to the new one. Almost all of the time, they should be changed to permanent 301 redirects.
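To find candidates for that cleanup, a sketch like this (same field assumptions as above) lists every URL that returned a 302:

# URLs answered with a temporary 302 redirect, most frequent first
awk '$9 == "302" {print $7}' access.log | sort | uniq -c | sort -nr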
Crawl budget waste
Google assigns a crawl budget to each website based on numerous factors. If your crawl budget is, say, 100 pages per day (or the equivalent amount of data), then you want to be sure that all 100 are things that you want to appear in the SERPs. No matter what you write in your robots.txt file and meta-robots tags, you might still be wasting your crawl budget on advertising landing pages, internal scripts, and more. The logs will tell you -- I've outlined two script-based examples in red above.
If you hit your crawl limit but still have new content that should be indexed to appear in search results, Google may abandon your site before finding it.
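One rough way to see where Googlebot is actually spending its requests is to rank the URLs it hits (matching on the user-agent string is only an approximation):

# Which URLs receive the most Googlebot requests?
grep -i "googlebot" access.log | awk '{print $7}' | sort | uniq -c | sort -nr | head -25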
Duplicate URL crawling
The addition of URL parameters -- typically used in tracking for marketing purposes -- often results in search engines wasting crawl budgets by crawling different URLs with the same content. To learn how to address this issue, I recommend reading the resources on Google and Search Engine Land here, here, here, and here.
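To gauge how much of that crawling is parameter-driven, a sketch like this groups Googlebot's requests for parameterized URLs by their base paths (query strings stripped):

# Parameterized URLs crawled by Googlebot, grouped by path without the query string
grep -i "googlebot" access.log | awk '$7 ~ /\?/ {sub(/\?.*/, "", $7); print $7}' | sort | uniq -c | sort -nr | head -20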
Crawl priority
Google might be ignoring (and not crawling or indexing) a crucial page or section of your website. The logs will reveal what URLs and/or directories are getting the most and least attention. If, for example, you have published an e-book that attempts to rank for targeted search queries but it sits in a directory that Google only visits once every six months, then you won't get any organic search traffic from the e-book for up to six months.
If a part of your website is not being crawled very often -- and it is updated often enough that it should be -- then you might need to check your internal-linking structure and the crawl-priority settings in your XML sitemap.
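A simple way to see that distribution of attention is to count Googlebot hits per top-level directory (again assuming the Common Log Format and an access.log file):

# Googlebot requests per top-level directory of the site
grep -i "googlebot" access.log | awk '{print $7}' | awk -F/ '{print "/" $2}' | sort | uniq -c | sort -nr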
Last crawl date
Have you uploaded something that you hope will be indexed quickly? The log files will tell you when Google has crawled it.
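For a single URL, a sketch like this prints the timestamp of Googlebot's most recent request (the path /my-new-ebook/ is purely a hypothetical example):

# Timestamp of the last Googlebot request for a specific (hypothetical) URL
grep -i "googlebot" access.log | grep " /my-new-ebook/ " | tail -1 | awk '{print $4, $5}'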
Crawl budget
One thing I personally like to check is Googlebot's real-time activity on our site, because the crawl budget that the search engine assigns to a website is a rough indicator -- a very rough one -- of how much it "likes" your site. Google ideally does not want to waste valuable crawling time on a bad website. Here, I could see that Googlebot had made 154 requests of our new startup's website over the prior twenty-four hours. Hopefully, that number will go up!
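If you want the same rough check from the command line -- assuming one log file per day and GNU grep:

# Total Googlebot requests in today's log file
grep -ci "googlebot" access.log

# Watch new Googlebot hits arrive in real time
tail -f access.log | grep -i --line-buffered "googlebot"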
As I hope you can see, log analysis is critically important in technical SEO. It's eleven o'clock -- do you know where your logs are now?
Additional resources
- Log File Analysis: The Most-Powerful Tool in Your SEO Toolkit (Tom Bennet at BrightonSEO)
- SEO Finds in Your Server Log (part two) (Tim Resnik on Moz)
- Googlebot Crawl Issue Identification Through Server Logs (David Sottimano on Moz)
- More information on the Logstash and Kibana parts of the ELK Stack (Logz.io)
What a fantastic post, Samuel.
So few SEOs have practical guides for log file analysis, and although I know there are some out there, this is about as comprehensive as any I have read. There is nothing more that I can add to this, as it is pretty much what I already do.
About to share on Twitter for you too :)
-Andy
Thanks for the nice comment -- and the Twitter share!
For any marketing tool or process, there is an almost infinite number of potential uses, so I'm happy to hear from another log analyzer that I covered them all. I was worried I might have missed a few! :)
Hello Andy and Samuel,
First of all, I totally agree that this is a great post. Whilst there's a lot of information you can gather from Google Analytics and the Webmaster Tools ("search console" sorry), sometimes it's good to get your hands dirty and get down to the real nitty-gritty. It's amazing what you can find.
One guide which I've used before is this 2009 WebmasterWorld thread on "log walking": https://www.webmasterworld.com/google_adsense/3830... The thread is old, but the practice and principle of "digital dumpster diving" still hold true.
Hi, I wrote a post about how to do log analysis, but it is in Spanish :(
You can use Google Translate; the post is here:
https://www.mecagoenlos.com/Posicionamiento/usar-lo...
And a video example :)
https://www.mecagoenlos.com/logs.avi
Excellent article.
Your article is also good, Errioxa; it's even better that it is in Spanish, so I can read it fluently.
This is one of the techniques I have not seen many SEOs use, largely because it involves getting your hands dirty. Thanks for sharing.
Greetings.
Great post! As a long-term developer, I know that log files contain everything, and they're the only source that we can trust.
Because log analysis can be a boring procedure, external tools can be used. One of them is the well-known Sawmill. There is also other analysis software on the market, but all of it is expensive.
For small companies, there is a solution too! Using *nix tools (sort, uniq, and awk/sed) can save a lot of time. A few examples:
awk -F\" '{print $6}' access.log | sort | uniq -c | sort -fr
this will sort all user agents on their frequency
awk '{print $9}' access.log | sort | uniq -c | sort
this will show all HTTP statuses
awk '($9 ~ /404/)' access.log | awk '{print $9,$7}' | sort
this will shown only 404 requests
awk -F\" '{print $2}' access.log |sort|uniq -c|sort -nr |head -10
this will return top 10 URL requested
Of course combinations is almost endless. But sad news is that only *nix users (Linux, BSD, OSX and rest) can use commands native. For Windows users they must install cygwin package and then use that commands.
Great article -- I should be able to create my own system that updates me on Googlebot's on-site activities. That's a bit of data harvesting that I had not considered. I could turn it into a dashboard component of some description. It should not be that hard. Excellent -- thanks for the insight.
Great post Samuel!
It is one of the best guides to log file analysis I've read.
Thanks to your article, I have new knowledge for the SEO world.
Thank you!!!
It's amazing how much information I have gained thanks to your guidance, and I haven't even started yet. I've never fully trusted the results of the Google tools; now I have reliable data. Thank you very much for the information. Regards.
Great post... I do agree that many of us don't really focus on logs or try to figure out what is going on with the search engine bots beforehand until we get some alert.
Good work.
Hi Scott, very good post. A little note: Google seems not to consider priority and frequency in the sitemap https://www.seroundtable.com/google-priority-chang...
For logs, I created a simple Windows tool called "GrepMrx": if you want, you can try it (it's fully free) to grep big log files, filtering by regexp.
Thanks
Thanks for the reference! To be honest, I had never seen John Mueller's statement on that before. As a result, I don't know enough to have an opinion -- I never really want to give thoughts on something I don't know.
So, I'll throw it open to Mozzers! Thoughts, anyone?
I think that, for sitemaps today, priority, frequency, and the other values are all effectively determined by the navigation tree, internal links, and backlinks.
Thanks!
This post is great! Although it's very technical, it's really easy to understand. Thank you for sharing this valuable information :)
Great article. I'm keen to try the ELK stack and see if it is scalable for analyzing very large websites (with billion+ pages).
I'm also wondering about the same questions as @Oneclickhere... are you recommending blocking CSS resources? Google has made it clear that we should keep JS and CSS files crawlable.
Thanks for the comment! That's an important point -- I should have been more precise with my example.
Google DOES need to crawl JavaScript and CSS -- but other things such as infinite calendar scrolling should be blocked.
Hi Samuel.
great post! Thank you.
Are you able to find out the exact paths that Google follows, to identify issues with internal linking? Or are you able to check each page's crawling against its "noindex" status?
Are you able to "draw" a website's map/structure graph with this tool?
Are there any other SEO-related benefits from such analysis that you didn't mention in the article?
Thanks in advance :)
Br,
Roman
Roman, thank you for the comment. I really don't want to bore the community by talking about my company's product. So, feel free to e-mail me at samuel (at) logz.io about the first three questions.
However, I'm always open to questions about log analysis in general! Regarding your last question, I think I covered every SEO use of log analytics -- but I'm always open to suggestions from Mozzers in case I missed anything. :)
You say:
'"GET" is one of the two commands (the other is "POST") that can be performed.'
However, there are more than just those two "methods" in the HTTP protocol. The most important one in this case is the HEAD request. It is often used by crawlers if they only want to check the availability of a page.
You can find more information about HTTP methods at https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.h...
Yes, I should have been more precise in my description there. Thanks so much for the clarification! :)
Great article; it has given me a long list of what I now need to check.
I now need to convince the developers to give me access to the log files -- that will be a bigger challenge than analysing the data.
Yes, just getting the data can be a chore -- especially if you want to review the logs on a regular basis. It's why I recommend that one use either proprietary software or the ELK stack and set them up once with log-file feeds.
Great analysis, Samuel. I have a question here: we know that we cannot control Google's crawl delay from robots.txt, but what about the option given in GWT? Does it work from there?
This is an interesting article about SEO log analysis. Thank you so much for your clarity and the easy way you explain this process.
Thanks a lot, this is great!
Nice post!
If you are familiar with PIWIK, you may also try one of its features: 'Log Analytics'.
Crikey, this is serious stuff!
Great article, it has given me a long list of what I now need to check.
Great Post! You can find a lot of SEO related posts out there, but this one is pretty technical and talks about aspects you don't easily find elsewhere. Thanks!
I'm not in my IT department, and they tell me that our site's log files are too big to pull. I am not an expert, so is this something you think could happen?
The log files themselves might be very large indeed -- it's why the best solution is to set up automatic, continuous feeds one time into one's desired proprietary log analysis software or the ELK Stack.
That's amazing! Log analysis is very important for larger websites; Web Log Explorer is very good for log analysis.
Good article. It focuses a lot on technical analysis of the data, but it doesn't hurt to remember periodically that data analysis is an essential part of SEO. Without a good analysis of our strategy's data, it will be very difficult to get results.
These files exist typically for technical site auditing & troubleshooting, but can be extremely valuable for SEO auditing as well.
I agree -- it's why log analysis is one of the most under-discussed topics in SEO!
Hi Samuel, I haven't read the entire article yet, but the headline intrigued me, so I bookmarked it for later :) However, as I skimmed through the article, I noticed you mentioned the words "crawl budget waste" on a .css file... So, to draw a quick conclusion, one would block that .css file from being crawled, but then Google has a fit when doing a PageSpeed Insights test and says your website is blocking CSS or JS files. How do we "safely" block CSS and JS files from being crawled without annoying Google?
Thanks for the comment! I'm copying in part my response to a Mozzer with a similar question.
I should have been more precise with my example. Google DOES need to crawl JavaScript and CSS -- but other things such as infinite calendar scrolling should be blocked.
Hello, Mozzers! I'm the author of this essay -- I'd love to hear any thoughts and comments you have on the post. Feel free to post, and I'll respond as soon as I can. :)
Great article, Samuel, thank you for sharing! Just a question about wasting resources: in your opinion, is there a way to reduce but not stop crawler hits on a certain resource? I usually prefer to avoid robots.txt rules just to be compliant with John Mueller's advice...
I'd suggest looking at the crawl-priority settings in your XML sitemap. You can set different resources to be crawled daily, monthly, or at other intervals. Of course, these are only suggestions -- Google can still decide on its own to crawl at its own rates.
Hi Samuel,
Great post about using server logs for SEO. Nice and interesting to see the 'hook' for clients: too many errors and duplicate content issues waste your (daily) crawl budget and stand in the way of your website getting indexed right.
Thanks!
-Bob
I always have questions about server-side log entries.
We know the client-side logs, but we have not identified the effect of our work on the server side.
What does the server actually do after our work is done? What types of effects does our content have on the server side?
The main thing is the caching of our website's content.
Hello,
Can I do all of this from the cPanel of my host?
I have www.CatalogServicii.com hosted at www.claus.ro
THANK YOU for including Nginx links! I moved my personal site's web server to Nginx last year, and this is the first article I've not had to follow with immediate research :)
My pleasure! I wanted to include three of the most-popular types of servers for exactly that reason. :)
Great article. I never thought about analyzing the response codes of my 'crawl budget'.
thank you for giving me the most amazing thing I've ever seen
Server log files are the most amazing thing you've ever seen?