A New Google Filter is Born
In early December some astute webmasters noticed that some of their long-term (in some cases, many years) #1 or #2 ranking pages in Google now rank at #6. Just as with the Google -30 and Google -950 penalties, some people will maintain this is fiction, but too many smart people experienced the same thing at the same time for it to be coincidence.
Background Information
Tedster started a WMW thread on the topic on December 26th. From Tedster's post, these are some common traits of the sites that were hit:
- Well established site with a long history.
- Long-time good rankings for a big search term, usually #1
- Other searches that returned the same URL at #1 may also be sent to #6, but not all of them
- Some reports of a #2 result going to #6.
My Site That Got Hit
My site, which saw a ranking dive on December 18th, had its homepage hit, and interior pages hit for some (but not all) related phrases. Here are some noteworthy conditions of the site that was hit:
- The site ranked entirely through SEO. There is no ad budget outside of PPC ads and link buying, and no brand recognition outside of the search results. Outside of one linkbait there is nothing remarkable about the site.
- The homepage did not get any new quality links in over a year.
- Much of the link building was done years ago when I was far spammier and far more aggressive with anchor text than I would be today, though I did use some semantic variation to pick up rankings for many different keyword permutations.
- The internal pages still rank #1 for some semi-related longer queries, while also being filtered down to #6 for some more obviously connected shorter search queries.
- The site continues to buy PPC ads and gets decent conversion rates for the keywords that were hit, and gets great conversion rates for more focused related terms, some of which the site was hit for and some of which the site still ranks great for. This conversion data is being sent to Google via the AdWords conversion tracker.
- This affected alternate permutations of acronyms (letters strung together or pulled apart).
- For my site this affects rankings on alternate versions of words (i.e., singular vs. plural). At least one person on WMW did not see it affect both the singular and plural versions of their keywords.
- This affects the same words mixed into a different order.
- This affects many longer search queries containing the core words or closely related words.
- This did not affect obvious domain name or brand related queries, even if the brand contained one of the words overlapping with the penalized set. If a filtered word outside the domain name / brand name is appended to the query, the rankings are killed and the site is stuck at #6.
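One way to picture the behaviors above is that the filter seems keyed to a core set of terms rather than to literal query strings, so plurals, reordered words, and acronym spacing variants all get caught together. Here is a minimal, purely hypothetical sketch of that idea; the `normalize` function, its naive stemming, and the example queries are all my own illustrative assumptions, not anything Google has disclosed:

```python
# Hypothetical illustration: reduce query variants to one order-independent
# "core term" key. A penalty attached to the key would then suppress
# rankings for every variant at once. NOT Google's actual method.

def normalize(query: str) -> frozenset:
    """Reduce a query to an order-independent set of crude word stems."""
    words = query.lower().replace(".", "").split()
    # naive singularization: strip a trailing "s" from longer words
    stems = [w[:-1] if w.endswith("s") and len(w) > 3 else w for w in words]
    return frozenset(stems)

# Plural, reordered, and dotted-acronym variants all collapse together:
variants = [
    "seo books",
    "books seo",
    "seo book",
    "s.e.o. books",
]
keys = {normalize(q) for q in variants}
print(len(keys))  # all four variants share one normalized key
```

This would also explain why brand queries escape: a query containing the domain or brand name normalizes to a different term set, until a filtered word is appended and the penalized key matches again.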
Usage Data or Improved Phrase Relationship Detection of Anchor Text?
Why I do Not Think it is Usage Data
Based on feedback in the WMW thread it is hard to isolate this to any one variable with certainty. Two possibilities that have been thrown out are Google rolling more usage data into the search results, or a better understanding of word and phrase relationships. It is easy to think of usage data as a possibility given my site's lack of marketing and lack of integration into the organic web, but that would not explain why some pages and queries were hit while similar pages and queries still rank, especially when Google gets strong conversion data via AdWords on some of those pages. Also, for that homepage I wrote an aggressive page title and meta description that draw in many clicks, and the landing page is exceptionally relevant for the query.
Why I Think it is Phrase Relationships
I think this issue is likely tied to a stagnant link profile with a too tightly aligned anchor text profile, the anchor text being over-optimized when compared against competing sites.
The fact that some related queries were hit, but not all, makes me think this is about word and phrase relationship improvements rather than usage data. If Google got better at understanding word relationships, many pages that once fit the criteria to rank may now have anchor text that is too focused and too well aligned with the target keywords, especially if Google compares your anchor text to the anchor text of other sites competing for the same phrases. Once possible manipulation is identified via artificial anchor text, your rankings across the site can be suppressed for a basket of semantically related terms, as noted in some of Google's phrase-based indexing patents.
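To make the hypothesis concrete, here is a small sketch of what "anchor text too well aligned compared to competitors" could mean statistically. Everything here, the function names, the exact-match metric, and the `slack` threshold, is an illustrative assumption of mine, not a confirmed Google signal:

```python
# Hypothetical sketch: flag a link profile whose share of exact-match
# anchor text is far above the norm for competitors ranking on the
# same phrase. Purely illustrative; not a known Google algorithm.
from collections import Counter

def exact_match_ratio(anchors: list[str], keyword: str) -> float:
    """Fraction of inbound anchors that exactly match the target keyword."""
    counts = Counter(a.lower().strip() for a in anchors)
    total = sum(counts.values())
    return counts[keyword.lower()] / total if total else 0.0

def looks_over_optimized(my_anchors, competitor_profiles, keyword, slack=2.0):
    """True when our exact-match ratio far exceeds the competitor average."""
    mine = exact_match_ratio(my_anchors, keyword)
    mean = sum(exact_match_ratio(p, keyword)
               for p in competitor_profiles) / len(competitor_profiles)
    return mine > slack * mean

# 80% exact-match anchors vs. competitors averaging 20%:
mine = ["seo book"] * 8 + ["click here", "great read"]
competitors = [
    ["seo book", "click here", "Aaron's site", "resources", "here"],
    ["seo book", "homepage", "click here", "link", "read this"],
]
print(looks_over_optimized(mine, competitors, "seo book"))  # True
```

A profile built naturally over time tends to accumulate messy anchors (brand names, "click here", URLs), so a stagnant profile dominated by one exact phrase would stand out as an outlier under any comparison like this.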
Matt Cutts Does Not Know What Happened
This filter has also been called the minus 5 penalty, but many of the sites that were hit still rank at #6 even if they were ranking #2 or #3 before the hit. When Barry posted about this, Matt Cutts said "Hmm. I'm not aware of anything that would exhibit that sort of behavior," but some past SEO issues, like the famed Google sandbox, have been accidentally introduced as side effects of Google upgrades:
What's a sandbox, Matt?
"Some people have asked, "does this apply to newer sites?" Essentially, the way to think about it is, around 2003 Google switched to a new method of updating its index. Before that we had monthly Google dances. So as a result, new data is always being folded into the index. It's not like there was one pivotal moment when anyone can say, "Hah! This is the change!" In fact, even at different data centers we have different binaries, different algorithms, different types of data always being tested.
"I think a lot of what's perceived as the sandbox is artifacts where, in our indexing, some data may take longer to be computed than other data."
Great Comments About the Filter
3 great posts from the WMW thread:
Your Feedback Needed
With my sample set of one site, my current hypothesis might be out to lunch. If you have any sites that you feel were hit and want to share them to help everyone figure out what is going on, please do so in the comments below. If you have any ideas or feedback on what happened, please leave a comment with that too.