Good news, everyone: November's Mozscape index is here! And it's arrived earlier than expected.
Of late, we've faced some big challenges with the Mozscape index — and that's hurt our customers and our data quality. I'm glad to say that we believe a lot of the troubles are now behind us and that this index, along with those to follow, will provide higher-quality, more relevant, and more useful link data.
Here are some details about this index release:
- 144,950,855,587 (145 billion) URLs
- 4,693,987,064 (4 billion) subdomains
- 198,411,412 (198 million) root domains
- 882,209,713,340 (882 billion) links
- Followed vs nofollowed links
- 3.25% of all links found were nofollowed
- 63.49% of nofollowed links are internal
- 36.51% are external
- Rel canonical: 24.62% of all pages employ the rel=canonical tag
- The average page has 81 links on it
- 69 internal links on average
- 12 external links on average
- Correlations with Google Rankings:
- Page Authority: 0.363
- Domain Authority: 0.245
- Linking Root Domains to URL: 0.306
You'll notice this index is a bit smaller than much of what we've released this year. That's intentional on our part, in order to get fresher, higher-quality stuff and cut out a lot of the junk you may have seen in older indices. DA and PA scores should be more accurate in this index (accurate meaning more representative of how a domain or page will perform in Google based on link equity factors), and that accuracy should continue to climb in the next few indices. We'll keep a close eye on it and, as always, report the metrics transparently on our index update release page.
What's been up with the index over the last year?
Let’s be blunt: the Mozscape index has had a hard time this year. We've been slow to release, and the size of the index has jumped around.
Before we get into the details of what happened, here's the good news: we're confident that we've found the underlying problem and that the index can now improve. For our own peace of mind and to ensure stability, we will grow the index slowly over the next quarter, planning for a release at least once a month (or more frequently, if possible).
Also on the bright side, some of the improvements we made while trying to find the problem have increased the speed of our crawlers, and we are now hitting just over a billion pages a day.
We had a bug.
There was a small bug in our scheduling code (this is different from the code that creates the index, so our metrics were still good). Previously, this bug had been benign, but due to several other minor issues (when it rains, it pours!), it had a snowball effect and caused some large problems. This made identifying and tracking down the original problem relatively hard.
The bug had far-reaching consequences...
The bug caused lower-value domains to be crawled more frequently than they should have been. Here's how: we crawled a huge number of low-quality sites over a 30-day period (we'll elaborate on this further down) and then generated an index that included them. That raised these sites' Domain Authority above the threshold at which they would otherwise have been ignored. Once they crossed that threshold (moving from a DA of 0 to a DA of 1), the previously benign bug kicked in: when crawls were scheduled, these domains were treated as if they had a DA of 5 or 6. Billions of low-quality sites flooded the schedule, consuming crawl budget and leaving fewer pages to crawl on high-quality sites.
...And index quality was affected.
We noticed the drop in pages being crawled from high-quality domains. In response, we started using more and more data to build the index and increased the size of our crawler fleet, expanding daily capacity to offset the low numbers and make sure we had enough pages from high-quality domains for a quality index that accurately reflected PA/DA for our customers. This was a somewhat manual process, and we got it wrong twice: once on the low side, causing us to cancel index #49, and once on the high side, making index #48 huge.
Though we worked aggressively to maintain the quality of the index, importing more data meant it took longer to process the data and build the index. Additionally, because of the odd shape of some of the domains (see below), our algorithms and hardware cluster were put under unusual stress that caused hot spots in our processing, exacerbating some of the delays.
However, in the final analysis, we maintained the approximate size and shape of the good-quality portion of the index, and thus the quality of PA and DA was preserved for our customers.
There were a few contributing factors:
We imported a new set of domains from a partner company.
We basically did a swap: we showed them all the domains we had seen, and they showed us all the domains they had seen. We had a corpus of 390 million domains, while they had 450 million. A lot of these overlapped, but afterwards, we had approximately 470 million domains available to our schedulers.
On the face of it, that doesn't sound so bad. However, it turns out a large chunk of the new domains we received were domains in .pw and .cn. Not a perfect fit for Moz, as most of our customers are in North America and Europe, but it does provide a more accurate description of the web, which in turn creates better Page/Domain Authority values (in theory). More on this below.
Palau, a small island nation in the middle of the Pacific Ocean.
Palau has the TLD of .pw. Seems harmless, right? In the last couple of years, the registry behind Palau's TLD has been aggressively marketing .pw as the "Professional Web" TLD. This seems to have attracted a lot of spammers (enough that even Symantec took notice).
The result was that we got a lot of spam from Palau in our index. That shouldn't have been a big deal in the grand scheme of things. But, as it turns out, there's a lot of spam in Palau. In one index, .pw domains reached 5% of all the domains we had. As a reference point, that's more than most European countries.
More interestingly, though, there seem to be a lot of links to .pw domains, but very few outlinks from .pw to any other part of the web.
Here's a graph showing the outlinks per domain for each region of the index:
China and its subdomains (also known as FQDNs).
In China, it seems to be relatively common for domains to have lots of subdomains. Normally, we can handle a site with a lot of subdomains (blogspot.com and wordpress.com are perfect examples of sites with many, many subdomains). But within the .cn TLD, 2% of domains have over 10,000 subdomains, and 80% have several thousand. This is much rarer in North America and Europe, in spite of a few outliers like WordPress and Blogspot.
Historically, the Mozscape index has slowly grown its total number of FQDNs, from ¼ billion in 2010 to 1 billion in 2013. Then, in 2014, we started to expand and reached 6 billion FQDNs in the index. In 2015, one index had 56 billion FQDNs!
We found that a whopping 45 billion of those FQDNs were coming from only 250,000 domains. That means, on average, these sites had 180,000 subdomains each. (The record was 10 million subdomains for a single domain.)
Chinese sites are fond of links.
We started running across pages with thousands of links per page. It's not terribly uncommon to have a large number of links on a particular page. However, we started to run into domains with tens of thousands of links per page, and tens of thousands of pages on the same site with these characteristics.
At the peak, we had two pages in the index with over 16,000 links each. These could have been quite legitimate pages, but it was hard to tell, given the language barrier. In terms of SEO analysis, however, these pages were providing very little link equity and thus not contributing much to the index.
This is not exclusively a problem with the .cn TLD; this happens on a lot of spammy sites. But we did find a huge cluster of sites in the .cn TLD that were close together lexicographically, causing a hot spot in our processing cluster.
We had a 12-hour DNS outage that went unnoticed.
DNS is the backbone of the Internet. It should never die. If DNS fails, the Internet more or less dies, as it becomes impossible to look up the IP address of a domain. Our crawlers, unfortunately, experienced a DNS outage.
The crawlers continued to crawl, but marked all the pages they crawled as DNS failures. Generally, when we have a DNS failure, it’s because a domain has "died," or been taken offline. (Fun fact: the average life expectancy of a domain is 40 days.) This information is passed back to the schedulers, and the domain is blacklisted for 30 days, then retried. If it fails again, then we remove it from the schedulers.
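Here's a minimal sketch of that failure-handling flow, under assumed names and data structures; only the 30-day blacklist, retry, then remove behavior comes from the description above:

```python
# Rough sketch of the per-domain DNS-failure handling described above. The
# class and function names are assumptions; the 30-day blacklist / retry /
# remove flow comes from the post.
from datetime import datetime, timedelta

BLACKLIST_PERIOD = timedelta(days=30)

class DomainRecord:
    def __init__(self, name: str):
        self.name = name
        self.dns_failures = 0
        self.blacklisted_until = None  # None means the domain is schedulable

def handle_dns_failure(domain: DomainRecord, now: datetime) -> None:
    domain.dns_failures += 1
    if domain.dns_failures == 1:
        # First failure: assume the domain has died; retry after 30 days.
        domain.blacklisted_until = now + BLACKLIST_PERIOD
    else:
        # Failed again on the retry: remove it from the schedulers.
        domain.blacklisted_until = datetime.max

def is_schedulable(domain: DomainRecord, now: datetime) -> bool:
    return domain.blacklisted_until is None or now >= domain.blacklisted_until
```

Note that logic like this can't tell "the domain's DNS is dead" apart from "our own resolvers are down," which is exactly the gap the safeguards described later in this post (DNS monitoring with alarms, and no automatic bans for high-quality domains) are meant to close.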
In a 12-hour period, we crawl a lot of sites (approximately 500,000). We ended up banning a lot of sites from being recrawled for a 30-day period, and many of them were high-value domains.
Because we banned a lot of high-value domains, we filled that space with lower-quality domains for 30 days. This isn't a huge problem for the index itself, as we use more than 30 days of data; in the end, we still included the quality domains. But it did skew what we crawled, sending us on a deep dive into the .cn and .pw TLDs.
This caused the perfect storm.
We imported a lot of new domains (whose initial DA was unknown) that we had not seen previously. These would have been crawled slowly over time and would likely have been assigned a DA of 0, because their linkage with other domains in the index would have been minimal.
But, because we had a DNS outage that caused a large number of high-quality domains to be banned, we replaced them in the schedule with a lot of low-quality domains from the .pw and .cn TLDs for a 30-day period. These domains, though not connected to other domains in the index, were highly connected to each other. Thus, when an index was generated with this information, a significant percentage of these domains gained enough DA to make the bug in scheduling non-benign.
With lots of low-quality domains now being available for scheduling, we used up a significant percentage of our crawl budget on low-quality sites. This had the effect of making our crawl of high-quality sites more shallow, while the low-quality sites were either dead or very slow to respond — this caused a reduction in the total number of actual pages crawled.
Another side effect was the shape of the domains we crawled. As noted above, domains on the .pw and .cn TLDs seem to follow a different linking strategy, both externally to one another and internally to themselves, compared with North American and European sites. This data shape caused a couple of problems when processing the data, increasing the time required to process it (due to the unexpected shape and the resulting hot spots in our processing cluster).
What measures have we taken to solve this?
We fixed the originally benign bug in scheduling. This was a two-line code change to make sure that domains were correctly categorized by their Domain Authority. We use DA to determine how deeply to crawl a domain.
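For intuition, here's a minimal sketch of DA-banded scheduling, assuming hypothetical bands, budgets, and function names (this is illustrative, not our actual scheduler code); it shows how the misclassification described earlier could flood the schedule:

```python
# Hypothetical sketch of DA-banded crawl scheduling. The bands and budgets are
# invented for illustration; only the DA-0 cutoff and the "treated as if it had
# a DA of 5 or 6" behavior come from the post.

def pages_to_schedule(domain_authority: int) -> int:
    """How many pages of a domain to put on the crawl schedule this cycle."""
    if domain_authority == 0:
        return 0          # no known link equity: effectively ignored
    if domain_authority <= 2:
        return 5          # barely-linked domains get a token crawl
    if domain_authority <= 10:
        return 200
    if domain_authority <= 40:
        return 5_000
    return 100_000        # deep crawl for strong domains

def buggy_pages_to_schedule(domain_authority: int) -> int:
    if domain_authority == 0:
        return 0
    # Bug (illustrative): anything above DA 0 is scheduled as if it had a DA of
    # at least 6, so a DA-1 domain claims 40x its fair share of crawl budget.
    return pages_to_schedule(max(domain_authority, 6))
```

The real fix was, as mentioned, only a two-line change; the point of the sketch is simply that a small misclassification around the DA 0-to-1 boundary, multiplied across billions of domains, consumes an outsized share of the crawl budget.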
During this year, we have increased our crawler fleet and added some extra checks in the scheduler. With these new additions and the bug fix, we are now crawling at record rates and seeing more than 1 billion pages a day being checked by our crawlers.
We've also improved.
There's a silver lining to all of this. The interesting shapes of data we saw caused us to examine several bottlenecks in our code and optimize them. This helped improve our performance in generating an index. We can now automatically handle some odd shapes in the data without any intervention, so we should see fewer issues with the processing cluster.
More restrictions were added.
- We have a maximum link limit per page (the first 2,000).
- We have banned domains with an excessive number of subdomains.
- Any domain that has more than 10,000 subdomains has been banned...
- ...Unless it is explicitly whitelisted (e.g. Wordpress.com).
- We have ~70,000 whitelisted domains.
- This ban affects approximately 250,000 domains (most with .cn and .pw TLDs)...
- ...and has removed 45 billion subdomains. Yes, BILLION! You can bet that was clogging up a lot of our crawl bandwidth with sites Google probably doesn't care much about. (A rough sketch of how these filters might be applied follows this list.)
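Here's that rough sketch, with assumed names and structure; only the 2,000-link cap, the 10,000-subdomain threshold, and the whitelist idea come from the list above:

```python
# Illustrative sketch of the new crawl/index filters. Names and structure are
# assumptions; the numeric limits and the whitelist come from the list above.

MAX_LINKS_PER_PAGE = 2_000
MAX_SUBDOMAINS_PER_DOMAIN = 10_000
WHITELIST = {"wordpress.com", "blogspot.com"}  # ~70,000 entries in practice

def keep_links(outlinks: list[str]) -> list[str]:
    # Only the first 2,000 links found on a page are kept.
    return outlinks[:MAX_LINKS_PER_PAGE]

def domain_is_banned(root_domain: str, subdomain_count: int) -> bool:
    # Domains with an excessive number of subdomains are banned,
    # unless explicitly whitelisted.
    if root_domain in WHITELIST:
        return False
    return subdomain_count > MAX_SUBDOMAINS_PER_DOMAIN
```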
We made positive changes.
- Better monitoring of DNS (complete with alarms).
- Banning domains after DNS failure is not automatic for high-quality domains (but still is for low-quality domains).
- Several code quality improvements that will make generating the index faster.
- We've doubled our crawler fleet, with more improvements to come.
Now, how are things looking for 2016?
Good! But I've been told I need to be more specific. :-)
Before we get to 2016, we still have a good portion of 2015 to go. Our plan is to stabilize the index at around 180 billion URLs by the end of the year and to release an index predictably every three weeks.
We are also in the process of improving our correlations to Google’s index. Currently our fit is pretty good at a 75% match, but we've been higher at around 80%; we're testing a new technique to improve our metrics correlations and Google coverage beyond that. This will be an ongoing process, and though we expect to see improvements in 2015, these improvements will continue on into 2016.
Our index struggles this year have taught us some very valuable lessons. We've identified some bottlenecks and their causes. We're going to attack these bottlenecks and improve the performance of the processing cluster to get the index out quicker for you.
We've improved the crawling cluster and now exceed a billion pages a day. That's a lot of pages. And guess what? We still have some spare bandwidth in our data center to crawl more sites. We plan to improve the crawlers to increase our crawl rate, reducing the number of historical days in our index and allowing us to see much more recent data.
In summary, in 2016, expect to see larger indexes, released on a more consistent schedule, using less historical data, and mapping more closely to Google's own index. And thank you for bearing with us, through the hard times and the good; we could never do it without you.
Postscript from Rand: Many folks have been asking about rising and falling Domain/Page Authority scores after this update. I've put together a comprehensive thread about why DA/PA fluctuate, and suggestions for how to use these scores here.
TL;DR — Remember that PA/DA are relative scores, tied to correlations against Google's rankings. Page Authority/Domain Authority for a site could fall after an update even if that site is gaining links and ranking better (and PA/DA may still be better predictive of Google's rankings, due to how the relative scaling works). Happy to answer questions as best I can here and in that Q+A thread.
11/24/2015 Update: We had incorrectly low correlation numbers for PA/DA, which have now been updated to reflect reality. Correlations with Google saw a +20% bump when we used the right numbers.
Hey folks,
I wanted to chime in with some early quality tests we've been running against the index. Bear with me for a moment, because I will have to give some context. There is a tl;dr at the bottom, though, for those of you who want to hastily skip my beautiful prose.
The Crawl Paradox: Bigger not Better
Before I joined Moz, I released a study that pointed out a somewhat paradoxical situation: it seemed that the deeper you crawled the web, the less the index looked like Google's. Upon further investigation, this made sense. All crawlers have to prioritize which page to visit next. Small differences in how we (and our competitors) prioritize pages vs. how Google does would cause the crawlers to diverge from one another more and more as they crawled deeper and deeper. Thus, the path to building a link index like Google's meant less focus on having the biggest index, and more focus on an index that prioritizes, schedules, and crawls like Google. Quality had more to do with the shape of our index than with its size.
Measuring Quality:
After coming to Moz in late August, I had the privilege to partner with a number of individuals and teams looking at index quality - big shout out to Neil and Dr. Matt for their help! While I am not even close to qualified to handle the kind of work the Big Data team addresses, I have had the opportunity to work on measuring the results. While we have built up a number of metrics to determine index quality, two that I kept a close eye on for this release were the relationship between Page Authority and rankings, and our hit rate for pages in Google's index (i.e., are pages that rank in Google's search results also in Mozscape?). This one is also potentially paradoxical in that - bear with me here - NOT having full coverage of Google's index can artificially boost your correlation metrics.
Imagine if you only crawled 1/10th of what Google did. Chances are, that 1/10th would be the most important pages because you would find them faster since they have so many links. You'd find Facebook, Wikipedia and Youtube really quickly in your crawl, but probably not joeschmoesblog.com. These popular pages would likely rank in the top 2 or 3 for important search phrases. Your small index would probably not include many of the pages that rank #8, #9 or #10. In fact, you could imagine that as you dropped down from #1 to #10, the odds that the page is in your index at all would drop dramatically. Thus, when you calculate quality metrics, the pages ranking near the bottom would mostly have 0s because they aren't in your index at all, and the ones at the top will likely have at least some authority. What happens when the ones at the top have some authority, and ones at the bottom have none? You get a positive correlation not because you have figured out Google's ranking factors, but because you just happened to not index the lower ranking pages!
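To make that concrete, here's a toy simulation (my own illustration, not our actual measurement methodology): the authority scores below are pure random noise, yet a correlation with rankings appears simply because lower-ranking pages are less likely to be in the small index and therefore score 0.

```python
# Toy simulation of the coverage paradox. Authority scores are random noise;
# the only structure is that deeper-ranking pages are less likely to be indexed.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(42)
n_serps, depth = 1_000, 10

positions, scores = [], []
for _ in range(n_serps):
    for pos in range(1, depth + 1):
        authority = rng.uniform(20, 60)       # knows nothing about Google's rankings
        p_in_index = 1.0 - 0.09 * (pos - 1)   # #1 almost always indexed, #10 rarely
        in_index = rng.random() < p_in_index
        positions.append(pos)
        scores.append(authority if in_index else 0.0)

rho, _ = spearmanr(scores, positions)
# rho comes out clearly negative (scores fall as position worsens), which would
# look like predictive power even though the scores carry no signal at all.
print(f"Spearman rho between (random) authority and rank position: {rho:.2f}")
```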
So, here was my concern: Were our measures of predictive power going to plummet because we were doing a better job indexing the stuff Google indexes? Were we going to win one battle and lose another? Or could we increase our coverage of URLs in Google's search results AND simultaneously maintain (or even increase) the predictive capacity of Page Authority?
Hell yes we can!
The first thing I did when I heard that the index was live was re-run our quality metrics on these two issues. And the verdict is in. We both increased our hit rate in Google Search (by about 3%) AND we increased the predictive capacity of Page Authority (by about 1.7%).
This is exciting for a number of reasons. First, Moz's index is based on a rolling crawl. The improvements made by the team have only had a fractional impact on this index; the next two should be even better, because the changes already made will have more and more influence over the whole index. Second, it shows that we can consciously increase the quality of our index relative to Google's. And finally, it shows that links still matter a lot :-)
tl;dr
Despite a slightly smaller index, both our hit rate against Google's index and our correlation with rankings went up. This is the definition of focusing on quality over quantity: we got the good stuff that matters to Google and quantified it better than before. Kudos to the Big Data team and everyone at Moz for putting Mozscape on the right path.
You'll notice this index is a bit smaller than much of what we've released this year. That's intentional on our part, in order to get fresher, higher-quality stuff and cut out a lot of the junk you may have seen in older indices.
While I understand the problems and reducing the overall size helps, I have to reiterate for what feels like the tenth time that OSE needs to include *as much low quality stuff as possible* so that we can use it in link cleanup, penalty reversal and disavow files.
It does me literally no good to know I have a link from NYTimes.com; I already know that. It does me a LOT of good to instead know I have a crappy link coming from 16mb or some other spam site. Does that make sense?
Thanks for your thoughts. We recognize that a bigger index has its values, but it also comes at a cost of less trustworthy metrics. If you have a chance, take a look at this research I did prior to joining Moz describing the inverse relationship between index size and proportional relation to Google's index.
That being said, putting my SEO hat back on for a second: if you are concerned about finding each and every link, you need to be using every link index out there, including Moz. There is a huge disparity between the major link indexes in terms of which sites block which bots. You would be surprised at how many sites block one or more of the link indexes' bots. There are literally millions of pages that you would miss were you to subscribe only to Moz plus one other index, or to any of the others individually.
It is worth saying that the Big Data team has put in place improvements that will allow steady index growth over time. We will put up big numbers again, but not at the expense of quality. Our goal isn't just to be the longest list of links out there, but rather the best.
Thanks Russ - I've seen your link and do understand the "most Google-like" goal. From an SEO point of view, I also know that Search Console doesn't have all my links - not even all the ones Google counts. I've had clients come to us with penalties and notices from Google citing sample links that aren't even found within Search Console...
And maybe your last line in this reply and a line near the end of your linked post help me a lot:
"our goal isn't to be the longest list of links out there."
and
"Having the most Google-like data is certainly a crown worth winning."
And you're right, I appreciate that millions of sites are blocking Ahrefsbot, Rogerbot and all the others. We're never going to find 100% of the links.
One thing you said actually doesn't work for me and I have to address it - "a bigger index ... comes at the cost of less trustworthy metrics."
Pretty much everyone (who needs to know) knows it's easy to spam OSE until you get the number you want. I can make Page Authority go up +10 points on any small business root domain I want within a single index update. That would sure look like I know how to do my job if a client were to be shown both numbers before & after - but it doesn't mean anything other than you kept the good and discarded the bad. With Penguin, don't bad links help you define "trustworthy metrics"? We're in an era where it often feels that the bad links hurt you more than good links help you.
Thanks for the response. There are certainly risks with a smaller index, like it being "gameable," although anyone who is paying their SEOs based on their ability to push up PA/DA is making a horrible mistake :-) With Spam Score in place, it is easier than ever for clients to see whether their SEOs are building trustworthy links in OSE, so hopefully we are mitigating some of those concerns.
That being said, while it is easier to manipulate a single metric within the system, this pales in comparison with the literally billions of URLs (and tens of billions of links) that a large index will pick up that are not in Google's index at all, throwing off the link graph as a whole.
I'm sure that, with the size of Moz's index and the rolling basis on which updates are made, it would be difficult, but is there a way to have a "Google-similar" link profile that more closely correlates with their ranking factors, and a "Depth Crawl" that includes a LOT more lower-quality links for the purposes of disavowing and keeping a handle on that? Like Russ says, he knows NYTimes linked to him, but what about some other guy?
I hope that makes sense. I'm sure it is a huge undertaking to even start to try and tease that kind of density out of something so large.
Totally with you. We want to increase index size too, which is why we got aggressive earlier this year. However, as we saw (and as Martin shared in this post), bigger isn't better unless we're crawling the right stuff. We're working on that now, and you should see index size rise moderately in the short term and much more in the long term, but all with a greater eye to things that Google actually sees, crawls, keeps in main index, and counts toward rankings.
I realize that makes the Spam Score feature less valuable/comprehensive, but we believe it's the right choice to keep our metrics useful and our index high quality until we can get to a processing system that can support 10X the URL/link size (which is coming eventually). For the time being, sadly, my recommendation if you're working on link cleanup and trying to see mostly spam (rather than mostly good stuff) is to use an index like Majestic or Ahrefs.
Makes sense Rand. I know there are always going to be choices when you have to prioritize the crawl. You and I exchanged a couple thoughts in a previous comment section about the index sizes and how (honestly) great Ahrefs does. (Majestic Fresh vs Majestic Historic is a different battle ...)
As a very-active Moz member I get a LOT of questions on Twitter & in person about Moz. When I talk to other practicing SEOs about OSE, the question they always ask me is "how can I use it?"
What may be a really great post or feature is for someone to write up the ways OSE can help. It can be part of link cleanup, it's great for competitor research, it gives you a bit of "insight" into what Google sees (minus the spam), etc.
Completely understand (at a certain level) the complex challenges you guys face and 100% respect the work the team does to make this index rock for everyone. :)
That would actually be a really awesome write-up from Moz - more of a tutorial on the best ways to utilize OSE day to day. When doing link cleanups I've used Google Search Console data, Moz, Ahrefs, and really anything else I could get my hands on to get a bigger picture of the link profile (since every company has a slightly different index), so I agree Moz alone isn't great as an all-in-one solution.
Are there already resources on the Moz site outlining the best ways to use OSE? I think that would be really useful.
Yeah - I think we can certainly do that. As Matt points out, if you're doing link cleanup, you want every link source available (which means Moz, Ahrefs, Majestic, and Google Search Console). For other applications, one index or a couple of them may be required.
My previous DA was 21, and now it's 11. Why?
https://www.marvelindia.com/
Hi there! Check out the Q&A post Rand wrote here that explains some reasons why Domain Authority can drop with an index update: https://moz.com/community/q/da-pa-fluctuations-how...
I hope that helps! Always feel free to write into the friendly Help Team at [email protected] if you need further assistance, too. :)
Hi, I noticed a huge drop in DA/PA in this update. Did Moz change the index algorithm, or is there some bug, or what?
A few words from Moz may help, as I know it's a sad day for Moz (DA/PA) lovers :(
Check out https://moz.com/community/q/da-pa-fluctuations-how... which has a lot more detail on DA/PA fluctuations specifically.
Can someone confirm exactly why DA has dropped? Is it just because you're no longer indexing the old crappy domains that might link to us?
Yes - check out https://moz.com/community/q/da-pa-fluctuations-how... which has more detail on why DA/PA fluctuate. In this update, I suspect you're correct that a large portion is related to the bias toward crawling quality stuff and ignoring a lot of the domains we'd injected that appeared not to provide good links/URLs (like the .cn and .pw stuff Martin described in the post).
I'm struggling to find out why my site dropped 12 points, down to 18 from 30. Competitors have dropped a few points, e.g., 26 to 21. Dropping 12 points is a huge drop; previously, before I was a 30, I was a 26. Can someone please explain how this can happen? I appreciate any help and advice. Thank you.
Hi Jada - see this Q+A thread for more information on why scores can fluctuate. There are a variety of reasons, and any of them, or any combination, may be in play for your site: https://moz.com/community/q/da-pa-fluctuations-how...
Regardless, if your organic search traffic and rankings are good, I wouldn't worry about the DA drop. DA is a relative metric designed to be useful for comparisons against other sites. Correlations and coverage in this index have improved, so the metric should be more useful for that purpose (to see the relative distance between one site and another), and I wouldn't recommend trying to shoehorn it in as an absolute metric to measure progress over time. It scales with each index as our index shifts and as Google does, too.
It didn't happen to your site only; just check the other comments.
I noticed even prominent newspaper websites lost 5 to 10 DA. Though correlating with Google rankings is a good thing, the exhaustiveness of the index is not totally irrelevant. Already, Moz was showing fewer links than Majestic and Ahrefs; I don't know how this will affect it further.
I think a better solution would be to keep those domains in the index but not count them toward DA/PA.
You've actually got our attention. We have started an internal discussion about holding on to untrustworthy links, almost like a supplemental index, for the purpose of helping webmasters with certain issues like link removal problems. Great idea!
This is a really good update. I want to wish the Moz big data engineers the best of luck in tackling the enormous challenge of web indexing!
Nice to see you Stephen :-)
Do you have stats on the overall domain-level performance for the new update?
At least in my industry, 15/15 domains dropped significantly in their DA, on average by 15%. But the sample size is obviously small.
That's great (and very smart) that you're monitoring a number of domains in your space. It's the best way to see how relative performance is going.
My guess is that, in this update, we saw the sites at the largest DA levels (90-100) rising dramatically in the number and quality of links they earned, thus "stretching" the DA-scale and making many other sites fall in DA scores. This has happened before when we've biased our crawl away from lower quality stuff (like Martin noted above) and toward higher-quality stuff.
My website declined from PA 17 to PA 1 and from DA 10 to DA 5. The same happened with mozRank and other values; only the social numbers held steady and kept growing. How can I fix it? Has the same thing happened to anyone else? Thank you for your help!
Hi there! This post by Rand in our Q&A explains some reasons why metrics may fluctuate with an index update: https://moz.com/community/q/da-pa-fluctuations-how...
It's not uncommon for those numbers to fluctuate from time to time, and it does seem that others experienced the same thing with this update - you're not alone. :) And please don't hesitate to contact our Help Team if you have any more in-depth questions! You can reach them at [email protected]. :)
We have seen a significant dip in DA across all of our client accounts as well. We've earned great, credible backlinks lately for a lot of our clients and are still seeing a huge dip. Is our only option to wait and see over the next few months?
Please remember that a drop in DA doesn't mean that you didn't earn great links or that your rankings in Google have fallen. Since DA is on a 100-point scale, the relative scaling feature of it means that if folks at the top of the scale are earning more/better links at a very fast rate, it can stretch the scale such that other DA scores fall and it gets harder to earn higher DA scores. This can happen even as correlations and coverage with Google get better (as Russ noted above). The best way to determine if you're falling behind is to watch A) your rankings and organic traffic and B) a comparison against a dozen or two sites in your niche to see if they've also fallen/risen and how much. If your DA went down, but the competition fell further, it's probably a good sign for you and means you're gaining ground.
It's great to hear about the release of the new index. However, a few issues have been noticed: DA and PA are fluctuating dramatically. One of our clients' DA was 38 and is now 11; yesterday it was showing 19, whereas before that it was showing 23. Same story with PA.
Hoping to see stable data soon. :)
If you're seeing numbers fluctuate, please let us know by writing in to [email protected]. There should only be a short period (maybe an hour) yesterday when the numbers were actually changing over from one index to the next.
I'm monitoring approximately 20 domains in the German insurance market (comparison portals, insurance companies, etc.), and they all lost Domain Authority today. Some of them by two-digit numbers.
Could this have anything to do with this update?
Yeah - almost certainly related to this update, and very possibly to the quantity of domains we were previously crawling and counting from .cn and .pw. Going forward, PA/DA should keep getting slightly more accurate (as Russ noted above).
Thanks! That saves me a lot of headache over the weekend ...
We saw decreases in PA/DA across the board as well, good to know we aren't the only ones!
Hi Loki Astari, thank you very much!
looks very nice
All great news! I'm thinking your partner for exchanging link data is one of the other top 4 link tool companies. I'm guessing someone will prove and identify who that is. Can you beat them to the chase and identify your link partner? You'll get the cred, the kudos, and the thank-yous for full transparency. It'll come out one way or the other, right?
No, it's DomainTools (a local Seattle company). They've been awesome to work with - they just have data on a lot of domains that have been registered that don't appear to impact Google's rankings (at least, not positively in the countries our customers care about).
I'm a broker of expired domains, and I need to say that, unfortunately, the drop in Moz metrics that I saw across all my domains after this update was dramatic & unfounded.
For example on one domain:
Over the last 30 days, Ahrefs was still picking up backlinks that were created long ago.
https://oi67.tinypic.com/23lgjdx.jpg
+150 Backlinks
+20 RDs
However, this domain went from DA 30 to DA 23. The same thing happened to dozens of domains in my stock.
Looking forward to the next update; I already want to forget this one...
You're a broker of expired domains and you use DA/PA? They're too easy to manipulate. Use TF/CF like (pretty much) everyone else.
Moz will never handle expired domains well for the same reason Majestic beats everyone. Majestic keeps the historic link profile. Once a link is broken, Moz & Ahrefs forget about it.
So over time, almost every expired domain will see its PA & DA drop completely out. Big drops just mean Moz re-indexed a lot that month.
I do agree about how easily DA & PA can be manipulated, but I only sell domains with natural backlink profiles, so I don't know why I'd care about that. Also, I just follow what my market demands, and unfortunately that includes the Moz metrics.
I've been selling domains for more than two years and I never saw a drop like this; that's why I even bothered to comment.
Hey Matt,
"Once a link is broken, Moz & Ahrefs forget about it"
The part about Ahrefs is no longer true :)
Yep, I think that also.
TF/CF? Can you clarify?
TF = Trust Flow / CF = Citation Flow --> the major Majestic metrics.
True.
It's Friday 13th for all websites around the world :-)
FYI, Palau is in the Pacific Ocean, not the Indian Ocean.
Thanks Gary - fixed up.
Great work! My DA went up!
There has been a drop in DA. Thanks for resolving our doubts and saving us a lot of time and headache.
Hi there! This post by Rand in our Q&A explains some reasons why metrics may fluctuate with an index update: https://moz.com/community/q/da-pa-fluctuations-how... It's not uncommon for those numbers to fluctuate from time to time, and it does seem that others experienced the same thing with this update - you're not alone. :)
I think there are still some errors in this / it is still rolling out to the accounts.
I have a client whose DA was 23, and in the account it says it is still 23; however, the MozBar says 26. Why would this happen?
We also have loads of clients and competitors that have stayed exactly the same, which isn't normal.
I would have to talk to the team behind the MozBar, but my guess is that you are just getting cached data that will refresh within a certain amount of time.
I thought that too, but the results in the MozBar don't match the old DA either. I.e., in Moz Analytics it says:
Previous DA - 23
New DA - 23
Moz Bar then says 26
Can you send us details? The Mozscape API powers the Mozbar, OSE, and every other application that uses Moz link data, so it should all match after a brief period of switchover yesterday. Email [email protected] and they can look into it. Thanks!
I'll send it over now. I've got a few clients with the issue.
Hello,
That also happened to me, but the opposite way :-). For a while, the bar indicated 25 instead of 21, which is now my website's official DA; however, it stopped after a while and now shows the right one. With your website, I believe it's going to increase to 26. Please check it again after some time. Moz's crawl isn't an instant process, of course, and it takes some time for the real values to finalize in both the bar and the regular checker.
Great news! Thanks for the article.
Best,
VS
This is a fantastic update, well done Rand and the Mozzle team!
What does this mean exactly for Spam Score? I would imagine this update makes it less reliable now, right?
Yeah, that's correct (well, sorta - "reliable" might not be the best word, but "less-coverage of spammy sites" is certainly true). We've made the decision to bias to get the good stuff, try to improve our metrics, get more Google-shaped, and worry less about the dark, spammy corners of the web. However, this definitely means that Spam Score's coverage isn't as good as it would be if we broadly crawled everything (though that biasing would hurt everything else about our index, which is why we choose not to).