Great Scott! I am finally back again for another spectacularly lengthy post, rich with wonderful titles, and this time - statistical goodness. It just so happens that, in my past short-lived career, I was a Forecast Analyst (not this kind). So today, class, we will be learning about the importance of forecasting organic traffic and how you can get started. Let's begin our journey.
Forecasting is Your Density. I Mean, Your Destiny
Why should I forecast? Besides the obvious answer (it's f-ing cool to predict the future), there are a number of benefits for both you and your company.
Forecasting adds value in both an agency and in-house setting. It provides a more accurate way to set goals and plan for the future, which can be applied to client projects, internal projects, or overall team/dept. strategy.
Forecasting creates accountability for your team. It allows you to continually set goals based on projections and monitor performance through forecast accuracy (Keep in mind that exceeding goals is not necessarily a good thing, which is why forecast accuracy is important. We will discuss this more later).
Forecasting teaches you about inefficiencies in your team, process, and strategy. The more you segment your forecast, the deeper you can dive into finding the root of the inaccuracies in your projections. And the more granular you get, the more accurate your forecast, so you will see that accuracy is a function of segmentation (assuming you continually work to improve it).
Forecasting is money. This is the most important concept of forecasting, and probably the point where you decided that you will read the rest of this article.
The fact that you can fix inefficiencies in your process and strategy through forecasting means you can effectively increase ROI. Every hour and resource allocated to a strategy that doesn't deliver results can be reallocated to something that proves to be a more stable source of increased organic traffic. So finding out which strategies consistently deliver the results you expect means you're investing money into resources that have a higher probability of delivering a larger ROI.
Furthermore, providing accurate projections, whether it’s to a CFO, manager, or client, gives the reviewer a more compelling reason to invest in the work that backs the forecast. Basically, if you want a bigger budget to work with, forecast the potential outcome of that bigger budget and sell it. Sell it well.
Okay. Flux Capacitor, Fluxing. Forecast, Forecasting?
I am going to make the assumption that everyone’s DeLorean is in the shop, so how do we forecast our organic traffic?
There are four main factors to account for in an organic traffic forecast: historical trends, growth, seasonality, and events. Historical data is always the best place to start when creating your forecast. You will want as many historical data points as possible, but the accuracy of the data should come first.
Determining the Accuracy of the Data
Once you have your historical data set, start analyzing it for outliers. An outlier to a forecast is what Biff is to George McFly: something you need to punch in the face and then make wash your car 20 years in the future. Well, something like that.
The quick way to find outliers is to simply graph your data and look for spikes in the graph. Each spike is associated with a data point, which is your outlier, whether it spikes up or down. This way does leave room for error, as the determination of outliers is based on your judgement and not statistical significance.
The long way is much more fun and requires a bit of math. I'll provide some formula refreshers along the way.
Calculating the mean and the standard deviation of your historical data is the first step.
Mean: the sum of all your data points divided by the number of data points, x̄ = (Σx) / n.
Standard Deviation: the square root of the average squared deviation from the mean, σ = √( Σ(x - x̄)² / n ).
Looking at the standard deviation can immediately tell you whether you have outliers or not. The standard deviation tells you how close your data falls near the average or mean, so the lower the standard deviation, the closer the data points are to each other.
You can go a step further and set a rule by calculating the coefficient of variation (COV). As a general rule, if your COV is less than 1, the variance in your data is low and there is a good probability that you don’t need to adjust any data points.
Coefficient of Variation (COV): the standard deviation divided by the mean, COV = σ / x̄.
If all the signs point to you having significant outliers, you will now need to determine which data points those are. A simple way to do this is to calculate how many standard deviations away from the mean each data point is.
Unfortunately, there is no clear-cut rule for qualifying an outlier by its deviations from the mean, because every data set is distributed differently. However, I would suggest starting with any data point that is more than one standard deviation from the mean.
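To make these refreshers concrete, here's a minimal Python sketch of the long way; the monthly visit numbers are made up for illustration, and the one-standard-deviation cutoff is just the starting rule of thumb mentioned above.

```python
import statistics

# Hypothetical monthly organic visits for the last 12 months (illustrative only)
visits = [14200, 13800, 15100, 14900, 16300, 27500,
          15800, 15200, 14700, 16100, 15400, 15900]

mean = statistics.mean(visits)
std_dev = statistics.pstdev(visits)  # population std dev; use stdev() to treat the data as a sample
cov = std_dev / mean                 # coefficient of variation

print(f"Mean: {mean:.0f}  Std Dev: {std_dev:.0f}  COV: {cov:.2f}")

# Flag candidate outliers: anything more than one standard deviation from the mean
for month, value in enumerate(visits, start=1):
    distance = abs(value - mean) / std_dev
    if distance > 1:
        print(f"Month {month}: {value} visits ({distance:.1f} std devs from the mean)")
```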
Making your decision about whether outliers exist takes time and practice. These general rules of thumb can help you figure it out, but it really relies on your ability to interpret the data and understand how each data point affects your forecast. You have the inside knowledge about your website; your equations and graphs don't. So put that to use and start adjusting your data accordingly.
Adjusting Outliers
Ask yourself one question: should we account for this spike? Having spikes or outliers is normal; whether you need to do anything about it is what you should be asking yourself now. You want to use that inside knowledge of yours to determine why the spike occurred, whether it will happen again, and ultimately whether it should be accounted for in your future forecast.
In the case that you don’t want to account for an outlier, you will need to accurately adjust it down or up to the number it would have been without the event that caused the anomaly.
For example, let’s say you launched a super original infographic about the Olympics in July last year that brought your site an additional 2,000 visits that month. You may not want to account for this as it will not be a recurring event or maybe it fails to bring qualified organic traffic to the site (if the infographic traffic doesn’t convert, then your revenue forecast will be inaccurate). So the resulting action would be to adjust the July data point down 2,000 visits.
On the flipside, what if your retail electronics website has a huge positive spike in November due to Black Friday? You should expect that rise in traffic to continue this November and account for it in your forecast. The resulting action here is to simply leave the outlier alone and let the forecast do its business (this is also an example of seasonality, which I will talk about more later).
Base Forecast
When creating your forecast, you want to create a base for it before you start incorporating additional factors. The base forecast is usually a flat forecast or a line straight down the middle of your charted data. In terms of numbers, this can be as simple as using the mean for every data point. The line down the middle of the data follows the trend of the graph, so it is the equivalent of the average but accounting for slope too. Excel provides a formula which actually does this for you:
=FORECAST(x, known_y's, known_x's)
Given the historical data, Excel will output a forecast based on that data and the slope from the starting point to the end point. Depending on your data, your base forecast could be where you stop, or where you begin developing a more accurate forecast.
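If you're curious what that Excel formula is doing under the hood, here's a rough Python equivalent: an ordinary least-squares line fit to the historical months, extended forward to produce the base forecast. All numbers are placeholders.

```python
import statistics

# Hypothetical adjusted monthly organic visits, oldest month first (illustrative only)
history = [14200, 13800, 15100, 14900, 15300, 15500,
           15800, 15200, 14700, 16100, 15400, 15900]

x = list(range(1, len(history) + 1))  # month numbers 1..n
mean_x, mean_y = statistics.mean(x), statistics.mean(history)

# Least-squares slope and intercept, the same idea as Excel's FORECAST()
slope_num = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, history))
slope_den = sum((xi - mean_x) ** 2 for xi in x)
slope = slope_num / slope_den
intercept = mean_y - slope * mean_x

# Base forecast for the next 12 months
base_forecast = [intercept + slope * month
                 for month in range(len(history) + 1, len(history) + 13)]
print([round(value) for value in base_forecast])
```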
Now how do you improve your forecast? It's a simple idea: account for anything and everything the data might not be able to account for. Now, you don't need to go overboard here. I would draw the line well before you start forecasting the decrease in productivity on Fridays due to beer o'clock. I suggest accounting for three key factors and accounting for them well: growth, seasonality, and events.
Growth
You have to have growth. If you aren't planning to grow anytime soon, then this is going to be a really depressing forecast. Including growth can be as simple as adding 5% month over month based on a higher-level estimate from management, or as detailed as estimating incremental search traffic by keyword from significant ranking increases. Either way, the important part is being able to back your estimates with good data and knowing where to look for it. With organic traffic, growth can come from a number of sources, but here are a couple of key components to consider:
Are you launching new products?
New products mean new pages, and depending on your domain's authority and your internal linking structure, you can see an influx of organic traffic. If you have analyzed the performance of newly launched pages, you should be able to estimate, on average, what percentage of search traffic from relevant target keywords they can bring over time.
Google Webmaster Tools CTR data and the AdWords tool for search volume are your best bet for acquiring the data you need to estimate this. You can then apply this estimate to search volumes for the keywords that are relevant to each new product page and determine the additional growth in organic traffic that new product lines will bring.
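As a rough sketch of that estimate (the keywords, search volumes, and CTR below are assumptions for illustration, not benchmarks), you might multiply each target keyword's monthly search volume by the average CTR you've observed for similar pages once they settle into their typical ranking position:

```python
# Hypothetical monthly search volumes for keywords targeting a new product page
keyword_volumes = {
    "wireless charging pad": 6600,
    "fast wireless charger": 2400,
    "qi charger for desk": 880,
}

# Average CTR observed for comparable product pages at their typical ranking
# position (e.g., pulled from Webmaster Tools); this is an assumed figure
expected_ctr = 0.04

estimated_monthly_visits = sum(volume * expected_ctr
                               for volume in keyword_volumes.values())
print(f"Estimated additional organic visits per month: {estimated_monthly_visits:.0f}")
```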
Tip: Make sure to consider your link building strategies when analyzing past product page data. If you built links to these pages over the analyzed time period, then you should plan on doing the same for the new product pages.
What ongoing SEO efforts are increasing?
Did you get a link building budget increase? Are you retargeting several key pages on your website? These things can easily be factored in, as long as you have consistent data to back them up. Consistency in strategy is truly an asset, especially in the SEO world. With the frequency of algorithm updates, people tend to shift strategies fairly quickly. However, if you are consistent, you can quantify the results of your strategy and use it to improve your strategy and understand its effects on the applied domain.
The general idea here is that if you know historically the effect of certain actions on a domain, then you can predict how relative changes to the domain will affect the future (given there are no drastic algorithm updates).
Let's take a simple example. Say you build 10 links to a domain per month, and the average Page Authority is 30 and the Domain Authority is 50 for the targeted pages and domain when you start. Over time, you see your organic traffic increase by 20% for the pages you targeted in this campaign as a result. So if your budget increases and allows you to apply the same campaign to other pages on the website, you can estimate an increase in organic traffic of 20% for those pages.
This example assumes the new target pages have:
- Target keywords with similar search volumes
- Similar authority prior to the campaign start
- Similar existing traffic and ranking metrics
- Similar competition
While this may be a lot to assume, it is for the purpose of the example. However, these are things that will need to be considered, and these are the types of campaigns that should be invested in from an SEO standpoint. When you find a strategy that works, repeat it and control the factors as much as possible. This will provide an outcome that is least likely to diverge from expected results.
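A tiny sketch of that logic, with made-up page names and traffic numbers: take the lift you measured on the original campaign and apply it to the current organic traffic of the new, comparable target pages.

```python
# Lift measured on the original link building campaign (the 20% from the example above)
measured_lift = 0.20

# Hypothetical current monthly organic visits for the new, comparable target pages
new_target_pages = {"/page-a": 3200, "/page-b": 2100, "/page-c": 1500}

projected = {page: round(visits * (1 + measured_lift))
             for page, visits in new_target_pages.items()}
print(projected)  # e.g., {'/page-a': 3840, '/page-b': 2520, '/page-c': 1800}
```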
Seasonality
To incorporate seasonality into an organic traffic forecast, you will need to create seasonal indices for each month of the year. A seasonal index is a measure of how that month's expected value relates to the average expected value. So in this case, it would be how each month's organic traffic compares with the average, or mean, monthly organic traffic.
So let's say your average organic traffic is 100,000 visitors per month and your adjusted traffic for last November was 150,000 visitors; your index for November is then 1.5. In your forecast, you simply multiply each month by its corresponding index.
To calculate these seasonal indices, you need data of course. Using adjusted historical data is the best solution, if you know that it reflects the seasonality of the website's traffic well.
Remember all that seasonal search volume data the Adwords tool provides? That can actually be put to practical use! So if you haven't already, you should probably get with the times and download the Adwords API excel plugin from SEOgadget (if you have API access). This can make gathering seasonal data for a large set of keywords quick and easy.
What you can do here is gather data for all the keywords that drive your organic traffic, aggregate it, and see if the trends in search align with the seasonality you are observing in your adjusted historical data. If there is a major discrepancy between the two, you may need to dig deeper into why, or shy away from accounting for it in your forecast.
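Here's a small sketch of how the indices might be computed from two years of adjusted history and applied to a flat base forecast; all figures are illustrative.

```python
import statistics

# Two years of adjusted monthly organic visits, January through December (illustrative)
year_1 = [90000, 85000, 95000, 100000, 98000, 97000,
          99000, 96000, 102000, 110000, 150000, 130000]
year_2 = [95000, 88000, 99000, 104000, 101000, 100000,
          103000, 99000, 106000, 115000, 158000, 136000]

overall_mean = statistics.mean(year_1 + year_2)

# Seasonal index per calendar month: that month's average across years / overall mean
seasonal_indices = [statistics.mean([m1, m2]) / overall_mean
                    for m1, m2 in zip(year_1, year_2)]

# Apply the indices to a flat base forecast for the coming year
base_forecast = 105000  # e.g., the mean or trend value you expect next year
seasonal_forecast = [round(base_forecast * index) for index in seasonal_indices]
print(seasonal_forecast)
```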
Events
This one should be straightforward. If you have big events coming up, find a way to estimate their impact on your organic traffic. Events can be anything from a yearly sale, to a big piece of content being pushed out, or a planned feature on a big media site.
All you have to do here is determine the expected increase in traffic from each event you have planned. This all goes back to digging into your historical data. What typically happens when you have a sale? What's the change in traffic when you launch a huge content piece? If you can get an estimate of this, just add it to the corresponding month when the event will take place.
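In spreadsheet or code terms, this step is just an addition on top of the months affected; a quick sketch with hypothetical numbers:

```python
# Seasonal forecast for the next 12 months (carried over from the previous sketch)
forecast = [100000, 92000, 104000, 107000, 104000, 103000,
            106000, 102000, 109000, 118000, 162000, 140000]

# Planned events and their estimated lift, keyed by forecast month (1 = first month)
event_lifts = {
    6: 8000,   # e.g., an annual summer sale
    11: 5000,  # e.g., a planned feature on a large media site
}

for month, lift in event_lifts.items():
    forecast[month - 1] += lift

print(forecast)
```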
Once you have this covered, you should have the last piece to a good looking forecast. Now it's time to put it to the test.
Forecast Accuracy
So you have looked into your crystal ball and finally made your predictions, but what do you do now? Well the process of forecasting is a cycle and you now need to measure the accuracy of your predictions. Once you have the actuals to compare to your forecast, you can measure your forecast accuracy and use this to determine whether your current forecasting model is working.
There is a basic formula you can use to compare your forecast to your actual results, which is the mean absolute percent error (MAPE):
MAPE = (1/n) * Σ ( |Actual - Forecast| / Actual )
This formula requires you to calculate the mean of the absolute percent errors across your time periods, giving you your forecast accuracy for the total given forecast period.
Additionally, you will want to analyze your forecast accuracy for individual periods if your overall forecast accuracy is low. Looking at the percent error month to month will allow you to pinpoint where the largest error in your forecast is and help you determine the root of the problem.
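A minimal sketch of that calculation, using placeholder actuals and forecast values; the month-by-month loop is the per-period error check described above.

```python
# Hypothetical forecast vs. actual monthly organic visits
forecast = [100000, 92000, 104000, 107000, 104000, 103000]
actual = [97000, 95000, 101000, 112000, 99000, 108000]

# Absolute percent error for each month
ape = [abs(a - f) / a for a, f in zip(actual, forecast)]

mape = sum(ape) / len(ape)
print(f"MAPE: {mape:.1%}")  # one common convention reports accuracy as 1 - MAPE

# Month-by-month errors help pinpoint where the forecast went wrong
for month, error in enumerate(ape, start=1):
    print(f"Month {month}: {error:.1%} error")
```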
Keep in mind that accuracy is crucial if organic traffic is a powerful source of product revenue for your business. This is where exceeding expectations can be a bad thing. If you exceed your forecast, this can result in stock-outs on products and a loss of potential revenue.
Consider the typical online consumer: do you think they will wait to purchase your product on your site if they can find it somewhere else? Online shoppers want immediate results, so making sure you can fulfil their order makes for better customer service and fewer bounces on product pages (which can affect rank, as we know).
Top result for this query is out of stock, which will not help maintain that position in the long term.
Now, this doesn't mean you should over-forecast. There is a price to pay on both ends of the spectrum. Inflating your forecast means you could be bringing in excess inventory as it ties to product expectations. This can create unnecessary inventory expenses such as increased storage costs and tie up cash flow until the excess product is shipped. And depending on product life cycles, continuing this practice can lead to an abundance of obsolete product and huge financial problems.
So once you have measured your forecast against actuals and considered the above, you can repeat the process more accurately and refine your forecast! Well, this concludes our crash course in forecasting and how to apply it to organic traffic. So what are you waiting for? Start forecasting!
Oh and here is a little treat to get you started.
Are you telling me you built a time machine...in Excel?
Well no, Excel can't help you time travel, but it can help you forecast. The way I see it, if you're gonna build a forecast in Excel, why not do it in style?
I decided that your brain has probably gone to mush by now, so I am going to help you on your way to forecasting until the end of days. I am providing a stylish little excel template that has several features, but I warn you it doesn't do all the work.
It's nothing too spectacular, but this template will put you on your way to analyzing your historical data and building your forecast. Forecasting isn't an exact science, so naturally you need to do some work and make the call on what needs to be added to or subtracted from the data.
What this Excel template provides:
- The ability to plug in the last two years of monthly organic traffic data and see a number of statistical calculations that will allow you to quickly analyze your historical data.
- The frequency distribution of your data.
- Highlighting of the data points that are more than a standard deviation from the mean.
- Some of the metrics we discussed (mean, growth rate, standard deviation, etc.).
Oh wait there's more?
Yes. Yes. Yes. This simple tool will graph your historical and forecast data, provide you with a base forecast, and give you a place to easily add anything you need to account for in the forecast. Lastly, for those who don't have revenue data tied to Analytics, it provides a place to add your AOV and average conversion rate to estimate future organic revenue as well. Now go have some fun with it.
________________________________________________________________________________________
Obviously we can't cover everything you need to know about forecasting in a single blog post. That goes both from a strategic and mathematical standpoint. So let me know what you think, what I missed, or if there are any points or tools that you think are applicable for the typical marketer to add to their skillset and spend some time learning.
Wow, what a great YouMoz post! This helps me out with a challenge I've been facing in my work for a long time: how to best put together a forecast with any shred of accountability. And building out the Excel template is icing on the cake.
Amazing - thanks for giving so much back to the community, Dan!
Thanks Jon! Glad to be of service and always happy to help out the community and you.
I'm going to go get a mathematics degree real quick and then I'll get back to you.
Hey Clayburn, I bet you could come close to doing that with some of the amazing resources out there on the web. Academic Earth has a few recordings of intro classes at Berkeley here.
Khan Academy has an awesome set of video courses on probability and descriptive statistics. After a couple weeks of lunch breaks dedicated to watching these, you will be back in no time.
Hello everyone! After some input and discussions from others, I wanted to let you know the tool has been updated to add some optional fields for events. The instructions cover how you can use these in your forecast.
It also adds a separate section for where you can adjust your history. This way you can see the original historical data and see that the standard deviations don't change as you adjust data.
Great article, Dan. The only disappointment is that you didn't incorporate a pivot table. Next time.
Dan, you seem to be the man for putting great analysis templates together...Much of this is beyond me since I'm such a small business. However, the seasonality trends and outliers are definitely hitting home with me since I've just made some major changes for my site. I need to really go back and analyze the data before to now. Many thanks for getting me going!
Thanks Ginny. I am happy to get you going on that deep dive into seasonal trends and outliers!
Another couple helpful tools for that is always Google Trends and Google Correlate. While we typically use them to see if keyword volumes are trending up, they can also be used to validate whether past seasonal trends you might come across are in fact real and should be accounted for.
Mmmm, Google Trends and Insights are my favorites. I follow trends daily; if I start a new team, it will surely be based on Google Trends.
Did not know about either of those...will check them out!
Hey, I checked Google Correlate; I think it may still be experimental.
Its results are funny.
When I checked the term "attorneys," Google Correlate gave results like "Obama shirts," loan requirements, etc.
Brahmadas,
Google Correlate is going to deliver keyword phrases that most closely correlate with the phrase you enter. However, this does not mean that they will necessarily be relevant. The old phrase "correlation is not causation" applies here.
The results you get can sometimes be funny as it's only factoring in how all searches correlate to your query based on search data but not relevancy. It's up to you to interpret and validate the results to determine whether the causality exists between your query and the results.
You might want to try the weekly or monthly time series options in Correlate the next time you try it. Those options tend to deliver less funny results.
I must admit that I've never seen Back to the Future, but all was not lost on me. Great work, this is a potent tool for proving your case when trying to get budget at your company, or earn a new client.
And the Wal-Mart example is spot on. Although I try to not be seen in a Wal-Mart within a 30 mile radius of my house, I always compare their prices to Amazon online, and they are out of stock like 30% of the time. Maybe they need to hire some serious forecasters.
It's concerning that you haven't seen one of the best movies of our generation. However, I'm glad you managed to understand it underneath all my references!
Wal-Mart is a serious offender, and I catch things like that all the time in SERPs. It's amazing that Google hasn't yet figured out how to drop pages from ranking more quickly based on stock. All in all, the lesson learned is, I guess, that everyone needs more serious forecasters haha.
Dan,
Yes, I am new here, but that doesn't say anything. Perhaps I have found a new home? We aren't experts at SEO so maybe we'll up our game from here?
You have taken this personally and overreacted, as I see you are putting up a link to our competition, laced with sarcasm and a dismissal that marketers can't deal with anything other than this.
Teaching people unfamiliar with complicated statistical approaches the simple way first is the tried and true approach at US universities. They start simple, and then as you get into advanced classes they start trotting out more complex methodologies. The problem is that with time series analysis, simple methods are nearly always a disaster due to the necessary complexities. The KISS approach is dangerous. Let me explain....
You could have responded that you would look into my recommendations, but you have chosen to dismiss them as being "not consequential". Your stance is that SEO users would benefit from your simple approaches. I argue that they could be worse off, as you can make decisions from bad analysis and, so to speak, "put an innocent man to death", otherwise known as a Type II error in statistics. Since these forecasts are driving the financial forecasts, they had better be as good as you can get!
I downloaded your Excel tool and took a look at it. The tool is set up to use only 24 months of data. Sometimes you are limited by what history you are given, but we recommend 3 or 4 years of data in order to get good reads on seasonality and trend and to avoid false positives. If you are stuck with only 24 months of data, you can use techniques like the ones you are using by assuming that there is seasonality, and that sometimes does need to happen.
The time series that you have in the Excel tool looks contrived: the numbers are very even, ending in 000s, which makes it seem like the data is not real and lowers the bar for this discussion, as the data seems to be seasonal and then adjusted to create a story. For example, the first three months of each year are almost identical, while the other months had some larger adjustments. We simulate data using models, so we have spent some time in this area. Textbooks are littered with examples that "work" instead of the nasty real time series that exist on companies' servers. You have two comments in your Excel tool where you point out that certain months were unusual (June 2011 and April 2012). You adjusted one of the months, but not the other, which is curious. We reversed your stated adjustment to June 2011 back to the original amount and ran Autobox in automatic mode, and it found both months (6 and 16) as pulse outliers and created two causal variables to adjust the history for their impact. The forecast is a simple flat line of 16,472 for the next 12 months. The June outlier impact was identified as 9,128 vs. your adjustment of 12,000, which is itself an educated guess, so there is no way to pin this down; that is what outliers are. Of course, this data is contrived, so it's not the best example. The model built looks as follows: it is a simple mean and two pulse outliers. It's nice to have a system tell you about the outliers instead of you trying to guess them.
Y(T) = 16472. +[X1(T)][(+ 9128.4 )] :PULSE 6 +[X2(T)][(- 7771.6 )] :PULSE 16
We stress that if you have knowledge about the system you should provide causal variables to help explain the variability in the history and then help your forecast get adjusted for future impacts. The two months where you had knowledge could be applied as two causal variables in your tool. You could fill the column with 0's and a 1 when the month had the impact. Now, the key is to leverage the large spike from Infographic in June 2011 and to place a "1" in the future when you expect that event to happen again and have the forecast automatically adjusted for the lift as had been seen in June 2011.
Modeling and forecasting is a process. You need to identify a model that explains the historical data, like a pair of glasses customized for you. Your tool ASSUMES a model instead of building one. I will point to the work of Box-Jenkins, as they laid out a process to do this. FYI: Smart Forecasts assumes a list of models, and one of them is going to be your model, with no specialized customization. Kind of like borrowing your friend's glasses and trying to look out of them when you were in elementary school.
Let's use a driving analogy. By using Causal variables, you are using the road signs you see out of your windshield in front of you as well as the learnings from the rear view mirror as to what happened in the history in order to generate the forecast.
Tom,
Sure there was a little sarcasm in my comment. A little sarcasm never hurt anyone. I only did so and made that observation about your account because I thought your initial comment wasn't very constructive (the questions you raised were valid of course).
Typically when someone writes an article on SEOmoz, readers take the time to leave more constructive feedback and approach the author in a more pleasant manner, even when they disagree or question their approach. If you originally left this comment I am responding to, I think we would have engaged in a more productive conversation from the start. I am glad though that we are now able to discuss some greater points in detail.
I did say that I appreciated your feedback and the resources, the first of which I took the time to go through. Sorry if that came across as an overreaction or personal offense taken; I just wanted to ensure that we were having the right conversation, since you just joined the community and left several questions without diving into them further. With discussions on the internet, I suppose sarcasm may be the only thing that comes across clearly (joke).
I pointed to Smart Forecasts because I already mentioned your system and Smart Forecasts is the system that I used in my prior work, which worked well for my prior company. In no way did I mean to say one is better than the other, it was just an example and relevant resource.
I completely agree with you that these simple methods could do more damage if they were incorporated into forecasts which drive demand planning or finances. However, I am more so looking to help people advance their ability to make more accurate projections for goal setting, business development, and budget setting in SEO. I am not saying that these methods are the only or right solution, but if no analysis similar to this is being done currently, then they should prove useful and be a starting point.
From my experience and discussions with colleagues, forecasts given for these purposes are either basic estimates or made based on the general performance of past campaigns on other domains. In some cases, they aren't provided at all.
With that in mind, I still believe these simple practices can improve these types of projections, given people continue to refine their process and methods for deriving their forecasts. My goal here is to help solve an issue that I believe may be commonplace for those who don't have forecasting systems or the resources to create more advanced projections.
I understand there are consequences of simpler methods, such as increasing the risk of Type II errors. I still think that completing some form of analysis in this way prior to projecting your traffic is better than not, especially if you are consistently missing your mark with arbitrary or even simpler projections, which is something I think some people do in the SEO world.
These practices aren't for everyone of course. If a marketer reading this is capable of doing more advanced modeling or works at large company where this process is outdated, then I would imagine they would disregard this and keep doing what they are doing. So in no way am I trying to dismiss marketers' capabilities to handle anything other than this. Rather, I'm directing this article to an audience where this might be something they consider, given their current skill level.
In terms of the Excel doc, sure, the data isn't real. If I could provide real data, I would be more than happy to. Unfortunately, I am unable to, as it would raise data privacy issues with clients that I manage. It definitely isn't a spectacular tool, which I forewarned readers about, but it does its job, which is to give readers a way to get started and visualize how one might approach this process. The doctored data is there simply to provide an example so that people can understand how to use it.
I would typically use more data than two years, but given that organic traffic can have large variance in a short amount of time, doing so would skew the forecast even if it provided increased seasonality confidence. With organic traffic, your traffic is decided by the keywords you rank for within a search engine. Unfortunately, search engines update their algorithms frequently throughout the year, causing these rankings to fluctuate.
With many of the sites I have analyzed, historical data that goes back further than two years can become irrelevant because there have been so many shifts in what keywords they rank for and their positioning. Also, given the simple model used to calculate the forecast, I cannot provide more data and add more weight to the recent historical data. This is why I provide the YOY seasonal variance and suggest using other data provided by search engines (see Google AdWords and Google Trends) to validate the occurrence of seasonality.
For the events that are marked in the document, their purpose was only to show how you might document your adjustments and annotate unusual occurrences. So I agree that April 2012 should have been adjusted if I were developing a real forecast. I appreciate that you interpreted it in that sense, and I'm going to make an appropriate adjustment to the doc for future users.
In regards to the June outlier (despite the contrived data), you mention that my adjustment of 12,000 is an educated guess, while the system more appropriately adjusts this data point by 9,128. The 12,000 isn't an educated guess. It's the exact amount of visits brought to a website's infographic through organic traffic. Through our analytics systems (Google Analytics or Omniture), we are able to track this data in detail and would utilize these systems to inform our adjustments. In the article I also use a similar example to convey this practice, although I may not discuss this specific process in detail. So in this case I don't see that the system automatically adjusts any more accurately. I do think it does a pretty good job of coming to a very similar conclusion. So I can see why you prefer the ARIMA methodology, since it reduces room for human error.
I think it's great that you took the time to drop the data into your system and analyze the data. It's definitely nice to have a system that can identify outliers. If you don't have a similar system, what do you recommend doing? I know I mentioned STD dev as a guideline and it's obviously not the best identifier in many cases, but again I was trying to use something most people would get.
I definitely agree that I should have included a place for causal variables. In my mind, I assumed that would be handled by just adjusting the final forecast numbers for these variables if needed. That may be something that isn't straightforward or could be overlooked, so I'll see to making adjustments for this as well.
I do truly appreciate you taking the time out of your weekend to dive into this in depth. I think it's great that I can get a concrete opinion from someone well-versed in time series modeling on this guide.
Here is the bigger question of the discussion I suppose. Since you argue that the simple process and analysis can be more devastating because of bad analysis, what is your suggestion on refining projections that are less advanced than this and the analyst doesn't have the resources to use your recommended methods?
P.S. If you truly want to up your SEO game, you definitely came to the right place. Many in this community (myself included) would probably be more than happy to help.
It's your clicks, Marty! Something's gotta be done about your clicks!
Outliers are a great idea, something I think I've kinda been doing half-heartedly, but it's good to make it more quantifiable, so to speak.
Thanks for this, one of the best YouMoz posts I've seen for a while!
First response: damn, that's a fine-looking spreadsheet!
Looking forward to plugging some data in and having a play, thank you.
"I decided that your brain has probably gone to mush by now"
Oh, how right you were! I grabbed the download, will continue to try to make some sense of it all. Thanks Marty McForecaster.
SEO Skills..check, Analytics skills check....Math skills...crap. I need to go back in the future to junior year in high school. Great post by the way.
Nice post, I have built a few forecasting excel sheets over the years, so some very nice information here. Thanks for writing this for the community mate =)
Nice post, and I am looking forward to putting some organic search forecasts together for review. Perfect timing: I have a budget meeting, and forecasting future organic growth is a great way to justify budget.
Thanks, glad I can help at the right time!
Fantastic post! Skimmed because I couldn't resist and saved to read in detail later :)
Same deal, great idea and I can't wait to dig into this later tonight. Thanks Dan
Thanks! I'm sure you'll find something actionable to take out of it once you take a deeper dive. Feel free to shoot over any questions you have in the comments afterward. Enjoy!
Great post. Thanks for the spreadsheet, messing around with it now.
Super post Dan, great results from your overall vision. Thanks for sharing
I think I hold a lot of data connected with my previous projects, like spreadsheets, screenshots, etc. Most of it was recorded for reporting purposes. I spent a lot of time preparing those reports and supporting files when I was a website analyst at my previous SEO company; I think that data can now be effectively utilized.
Big post, Dan.
Great results from your overall vision. Thanks for sharing!!
Interesting! I hope my forecast comes true.
In gathering historical data for forecasting organic rankings, what unit is this data in: years, months, or days?
I'm not sure if you are referencing the tool, if so, it's monthly organic visits. If you are talking generally about best practices for gathering historical data, I suggest using monthly data. It's less volatile than weekly or daily and frequent enough for you to collect enough data to start defining trends.
Frankly... a great post! How accurate have you found the spreadsheet to be in general? Have you found it's more accurate in some niches than others?
The accuracy of the spreadsheet is really on a case-by-case basis. Everyone's data and distribution of data is different, so it really depends. Data that fluctuates more may return less accurate results, as it's harder to pinpoint what will happen in the future.
Another major factor is how you approach adjusting your data and incorporating outside knowledge and factors. This will vary from person to person, making it hard to say how accurate it can be.
I don't think certain niches will have more accurate forecasts. However, sites with larger volumes of visitors tend to have more stable traffic and more keywords referring traffic, which will make for a more accurate forecast as a few ranking fluctuations may not have a significant impact.
Additionally, sites with strong branded organic traffic can rely on that segment of the forecast being fairly accurate, as branded demand will only change significantly with larger marketing initiatives.
To try and manage your expectations in terms of accuracy, I suggest running some tests with historical data. This is a practice I did to ensure my forecast model works and the spreadsheet is reliable from a basic forecasting perspective. The way you would test the accuracy is by trying to forecast a time period that you already have historical data for.
For example, you can plug in 2010 and 2011 monthly traffic to the spreadsheet, make your adjustments, and compare the resulting 2012 forecast with what actually happened in 2012. Based on the accuracy of the forecast to actuals here, you can get an idea of whether your process and this model play well with your data.
Hope this helps!
Does anybody in this thread perhaps still have a copy of the Excel template that Dan originally posted? I'm not sure that he's still checking the comments here anymore. If so, send it to me at muchobrento{at}gmail
FYI...You could always check the Wayback Machine: https://web.archive.org/web/20150310094759/https://danpeskin.com/organic-forecasting-excel-template/. I was able to download a copy there.
Great post!! I love the back to the future theme... thanks for sharing!
I have read many articles on the subject, and this has seemed to me one of the most complete and most understandable of them.
Dan-
This post is fantastic. In my role at my current company, I've also been creating future forecasts based on increased budgets, and it has been a HUGE value to our company. But, you're taking it to the next level. Love it!!!
But, the link to your site is currently broken, so I cannot see the template that you have over there.
I hope that gets resolved. Anxiously awaiting it!
Hi Dan, would you mind updating the link to the Excel template? I'd love to check it out. Thanks!
Working on fixing that. Check the link tomorrow or the day after. It should be up again by then!
Hi Dan, the link is giving a redirect loop. Any chance you could update? I would like to take a look and play around with it. Thanks
Interesting concepts here. I think I need to head back to calculus class...
Question: How would you go about projecting a forecast without so much data to begin with? Let's say that you were a startup with only a few months of hard data behind you.
Is it a potentially dangerous and highly inaccurate thing to even try in that case, or is it possible to make a halfway accurate assumption about future trends based on this small amount of data?
That's a tough call. Unless you are very experienced with data analysis/statistics, you may want to avoid making those projections. As you said, it can be potentially dangerous because your traffic might be volatile in the early stages, and less historical data means less accurate projections.
For startups, it might be smarter to set short-term goals until you have enough historical data (at least a year) and monitor how the growth trend changes as you go. For example, set goals for organic traffic just for the next quarter and shift those projections as you see growth move upward or downward. Another thing you can do is use industry trends for these short-term projections. Google Trends, Google Correlate, or STAT can help you do this.
Fantastic post!
Dan,
My "rain theme" isn't a good way to start a conversation either. :)
I like a lot of your comments and you clearly have spent some significant time with forecasting and especially your comment "given people continue to refine their process and methods for deriving their forecasts".
You mention that data does exist for more than 24 months, but the data has a lot of "variance". This can cause havoc if you are just using the history; if it is attributable to changes in keywords, then using causals might help. I would argue that even if there is a lot of variance, using more data can help with overall trend, and if the patterns change you can identify these changes through level shifts and parameter change detection (see the work of Gregory Chow) to flag these breakpoints and purge your older data from a different regime.
Datasets can be anonymized by scaling the data. You can multiply all of the data by let's say .7532. The model would not be impacted. The forecast could be transformed back to the original scale by multiplying it by the inverse (1/.7532). It might be fun to dig deeper with real data as it will have all the warts and nasty things that cause havoc and would lead to deeper discussions.
From your description, SEO data analysis requires a lot of things to consider!
Thanks Tom. SEO data is definitely a beast of its own, which tends to act much differently. I did run a couple of tests with real data and found that two years worked the best, but of course a couple of tests isn't sufficient to conclude that 24 months is the best data set to start with.
My expectation is that people will adjust the template as necessary to fit their needs. As you said earlier, it's only assuming a basic model, and with all the factors, people need to create a model that fits their data best - including deciding what data and causal variables to use.
I definitely will take a look at Gregory Chow, that's one I'm not familiar with. I'd love to discuss this more sometime and I definitely plan on continuing to test and refine this process/excel tool to see how well it can work.
Given the number of times I've tried to explain to clients how my forecasting Excel spreadsheets work, only to end up with the words "please trust me on this one," I really should write something like this! (Or crib it!) Well done.
"Keep in mind that accuracy is crucial if organic traffic is a powerful source of product revenue for your business."
Under promise and over deliver right? The last thing you want to do is forecast awesome and only get okay because you got your Greek letters mixed up. I think it's always better to err on the side of caution and stay conservative.
Accuracy is more important. You need to get this right; as the OP points out, being cautious on account of a lack of confidence in your methods doesn't cut it. The best thing to do is present your projections honestly, but tell your client your margin of error. Don't hide data because you want to look good in a year!
Totally agree with you. The under-promise and over-deliver idea is not the smartest way of thinking if you want long-term success. Forecasts should be based on real data and real expectations. Along with that, stating all your assumptions and your margin of error is a perfect way to make sure things are transparent in the case that your forecast is inaccurate.
I think you should explain to clients that you can't make an accurate forecast without lots of data. I would give them an initial forecast, and then update the forecast every month in accordance with the new data I have.
Coming out of my "commenting sabbatical" to comment on this interesting post! When it comes to forecasting, especially revenue forecasting and planning a marketing budget, I have always used historical data because of its accuracy. Forecasting organic search traffic, I realize, is a different beast altogether. Conversion, in my opinion, has a direct effect on SEO, which I know many people will berate me for stating! Conversion, in turn, is a derivative of attribution. Where do we account for these two metrics in this model? I am only asking because this approach is something I want to investigate a little! Anyway, thank you very much for a mind-numbing post.
If you have eCommerce setup in your analytics platform and revenue data is populated automatically and segmented by organic traffic, you can plug in those data points into the tool provided and ignore the Organic Revenue Forecast.
On the flip side, if your revenue data is maintained outside of that system, you can use conversion rates from your goal tracking in your analytics platform to estimate how much traffic will convert and then apply an average order value to that converting traffic segment. I actually provide a place for this in the tool, although I don't go into it in the article.
In terms of attribution, it's tough to account for organic conversions when they are assisted conversions. You may be able to roll your multi channel attribution data into your conversion data from direct organic conversions and then recalculate your conversion rate outside of the system.
However, I would recommend that you analyze how consistent your assisted conversions are to make sure their impact should truly be accounted for.
Let me know if that answers your question(s) and glad you enjoyed the post!
Great post - thanks so much for sharing!
Impressive, so much data and so interesting. I read it, but I wanted to print it to reread several times. Thanks.
Great idea dude, keep it up.
Sorry, but I lost you after I saw the Greek letters :-)
Don't give clients any ideas for something else to ask for! :) Who doesn't love that dreaded question: "Well, if my traffic is currently X, then what do you predict my traffic will be 3 months from now and 6 months from now?" I always feel like answering, "Well, just tell me what you consider to be the right answer and that is what I will tell you - whatever you want to hear." But in actuality, I always take the high road so as not to tarnish the industry further and just say "well, there is not predictive results because I cannot predict what Google will do or if it will even be around tomorrow". Nevertheless, excellent insight.
Stathis Edel
FindMyCompany.com CEO
Hi Stathis,
Thanks for your input! I have to disagree with you though, about this idea: "well, there is not predictive results because I cannot predict what Google will do or if it will even be around tomorrow"
I have confidence that Google will be around for a significant amount of time, given the size of the company and the service it provides to the world. What Google might do in terms of algorithm or product changes is the real question.
I think that way of thinking, which is common in the SEO industry, is what keeps us from progressing as a more trusted form of marketing. It's also one of the reasons why I wrote this article: to show that it isn't impossible to predict traffic if you have the right data and a decent knowledge of math. And if you don't, my hope is that this post encourages people to step up their game so we can bring more to the table as a whole.
Think of a Google update as a natural disaster. Businesses need to forecast their future demand for many of the same reasons mentioned above. However, they don't stop because there is a possibility of a natural disaster (which might temporarily shut down operations and logistics causing them to fall behind on orders for months). They do their best to plan for what they know will happen or can reasonably conclude will happen and then roll with the punches.
So yes, your projections will suck at first and possibly many times after that (especially when unforeseen events arise). Still, it's no reason to forgo doing it. Since this is a process and takes practice, if you keep working at it, you learn from your mistakes and might actually get good at it.
Anyways, just my perspective on things. I know we have tons to do for clients/potential clients, so obviously it's a lofty goal. Hopefully we get there one day. Glad you enjoyed the article!
Thanks for the fresh update on this topic. Right, Excel templates really are one of the things that you must present well.
Loved the article. I'm not a time machine expert, but this is good beginning information for me. Thanks. https://spidercon.com/blog
I am a time series expert. Sorry to rain on the parade when everyone is so glowing about the article, but the methodologies listed above are circa 1960s.
Review this presentation after reading my points and it will fill you in on some background. https://www.autobox.com/pdfs/vegas_ibf_09a.pdf
Outliers - The approach listed above makes a lot of assumptions. What happens when you have an inlier? What happens when you have a change in the pattern where Junes become high, and you want to make them an outlier when in fact they are part of a new pattern? What happens if you had a level shift up in the average due to, let's say, a merger: are the new values outliers or real? Well, they are real, and how are you going to identify outliers in this new data set when the level shift has messed with your std dev? Primitive approaches use std dev to determine outliers, whereas modern-day approaches use methods like this: https://www.unc.edu/~jbhill/tsay.pdf
Base Forecast - The approach of creating a baseline and then adding in seasonal factors goes back to the 1920's. See slide 39 of the ppt. Using ARIMA modeling, you can do this much more accurately.
Try the series 1,9,1,9,1,9,1,5 and find the outlier using these simple approaches.
Tom Reilly, "Bringer of Rain"
Hi Tom,
I realize that you might be a VP for Automatic Forecasting Systems and are definitely an expert in the field. I truly appreciate your input and that you took the time to create an account just to leave this comment as the "Bringer of Rain," but I don't think you quite understand the point of this article.
The point of this article isn't to show readers the best and most current methodologies for forecasting. If that were the case, I would just point them to a great forecasting system. Or maybe I would discuss when to use Holt's or Winters' exponential smoothing methods to account for certain seasonal changes. However, I don't believe anything I have written is misleading or an incorrect approach. Sure, it might not be advanced, but if you knew the audience here, you'd see it's going to help them get started if they take the time to do so.
The point of this article is to teach online marketers the basics of forecasting and how it can be applied to organic search traffic. The audience here isn't the people who will actually be forecasting their company's sales; it's people who may help inform those people on what's impacting sales from an organic search perspective. It's also about teaching marketers basic principles in forecasting to make sure that if their managers or peers request a forecast, they make more precise projections than they are used to.
The point here is to help search marketers start to identify and investigate historical data points and analyze how their search strategies and campaigns are affecting organic traffic and sales. It's not going to be a perfect process and may not be able to account for inliers or level shifts, but it can help marketers make more informed decisions, obtain better budgets, and increase accountability.
In the online marketing world, there are still plenty of people that don't even know about these primitive practices, so I would rather take some simple steps before unleashing the most current and complex methodologies on them.
Again I appreciate your input and the resources you provided. You do raise some valid points in certain scenarios, but given the purpose of this post, it doesn't make sense to address those scenarios. Given this is forecasting, aren't there always plenty of what ifs? People need to start somewhere and then look at more advanced ways at limiting inaccuracies in their projections.
Thanks,
Dan
Thanks, Dan, for your work on this. I don't refer to myself as a time series expert, but I was a professor focused on time series modeling. We dealt with time series that impact weather-dependent business and engineering systems. I was more on the business-applications side, which was often deeply humbling when our tidy theories met the messy realism of business volatility. (Kinda like a golf club design expert shouldn't expect to be (a-hem) on par with Tiger Woods.) Often, we felt at least some success if we were simply able to construct field-verifiable sensitivity analyses. We also learned that, in the face of business reality, we shouldn't discount any methodology if it helped us understand and respond to the Brownian-like character of business well enough to enhance profits. I like your approach for just that reason. Thanks for your effort to put it together.
Thanks Lee. It's definitely challenging to convey a complete process and its implications in one article, especially when there isn't any one defining approach that will guarantee results.
It's always great to hear experiences and input from someone well-versed in the subject. So thank you for taking the time to consider my approach and join the discussion.