The Google Penguin Update: Over-Optimization, Webspam, & High Quality Empty Content Pages
Huge Update
Google recently launched their webspam Penguin update. While they claim it only impacted about 3.1% of search queries, the 3.1% it impacted were largely in the "commercial transactional keywords worth a lot of money" category.
Based on the number of complaints online about it (there is even a petition!) this is likely every bit as large as Panda or the Florida update. A friend also mentioned that shortly after the update WickedFire & TrafficPlanet both had sluggish servers, yet another indication of the impact of the update.
Spam vs OOP
In the lead-up to launch, the update was sold as being about over-optimization. However, when it launched it was given no pet name and was instead called the webspam update. Thus anyone who complained about the update was by definition a spammer.
A day after declaring that the update didn't have a name, Google changed positions and called it the Penguin update.
Why the quick turnaround on the naming?
If you smoke a bunch of webmasters & then label them all as spammers, of course they are going to express outrage and look for the edge cases that make you look bad & promote those. One of the first ones out of the gate on that front was a literally blank blogspot blog that was ranking #1 for make money online.
As I joked with Eli, if it is blank then they couldn't have done anything wrong, right? :D
Another site that got nailed by the update was Viagra.com. It has since been fixed, but it is pretty hard for Google to state that the sites that got hit are spam, blend the search ads into the results so much that users can't tell them apart & force Pfizer to buy their own brand to rank. If that condition didn't get fixed quickly I am pretty certain it would lead to lawsuits.
Google also put out a form to collect feedback about the update. They only ever do that if they know they went too far and need to refine it. Or, put another way, if this was the Penguin update then this is GoogleBot:
So Worried About Manipulation That They Manipulate Themselves
When I was a kid I used to collect baseball cards. As the price of pictures from sites like iStockphoto has gone up I recently bought a few cards on eBay (in part for nostalgia & in part to have pictures for some of our blog posts). Yesterday I searched for baseball card holders for mini-cards & in the first page of search results was:
- a big ecommerce site where the review on that product stated that the retailer described the quantity as being 10x what you actually get (the same site had other better pages)
- a user-driven aggregator site with a thin affiliate post made years ago & attributed to a site that no longer exists
- a Facebook note that was auto-generated from a feed
- an old blogspot splog
- a broader tag page for a social site
- a Yahoo! Shopping page that was completely empty
That blank Yahoo! Shopping page is also what showed up in Google's cache. So I am not claiming that they were spamming Google in any way, rather that Google just has bad algorithms when they rank literally blank pages simply because they are on an authoritative domain name.
The SERPs lacked expert blogs, forum discussions, & niche retailers. In short, too much emphasis on domain authority yet again.
Part of the idea of the web was that it could connect supply and demand directly, but an excessive focus on domain authority leads users to have to go through another set of arbitragers. Efforts to squeeze out micro-parasites have led to the creation of macro-parasites (and micro-parasites that ride on the macro-parasite platforms).
SEO-based Business Models
Now more than ever SEO requires threading the needle: being sufficiently aggressive to see results, but not so aggressive that you get clipped for it (and hopefully building enough protection that makes it harder for others to clip you). That requires a tighter integration of the end-to-end process (tying efforts into analytics & analytics back into efforts), a willingness to view SEO through a broader marketing lens & throwing up a number of Hail Mary passes that likely won't back out on their own but will give you a lower risk profile when combined with your other stuff.
And your business model is probably far more important than your SEO skill level is. Imagine running a consulting company for a lot of small business customers for a few hundred dollars a month each, based on stable rankings & then dealing with a tumultuous update that hits a number of them at the same time. And then they see an older (abandoned even) competing site of lower quality with fewer links ranking and they think you are selling them a bag of smoke. These sorts of updates harm the ability to do SEO consulting for anyone who isn't consulting the big brands. Yes many people made it through this update unscathed, but how many of these sorts of updates can one manage to slide through before eventually getting clipped?
The Unknowable Future
As search evolves, invariably anyone who is doing well in the ecosystem will at some point face setbacks. Those may happen due to an algorithm update or an interface change where Google inserts itself in your market. If you never get hit, it means you were only operating at a fraction of your potential. If you consistently get hit, you might be aiming too low. Many trends can be predicted, but the future is unknowable, so set up a safety cushion when things are going well.
This year Google has moved faster than any year in their history (massive link warnings, massive link penalties, tighter integration of Panda & now Penguin) & the rate of change is only accelerating. Go back about 125 years and a candle wick adjuster was cutting edge technology marketed as brand spanking new:
Blekko has a decently competitive search service which they manage to run for only a few million a year. As computers get cheaper & Google collects more data think of all the different data points they will be able to layer into their relevancy algorithms. In some markets Chrome has more marketshare than Internet Explorer does & Android is another deep data source. And they can know what user data to trust most by tracking things like if they have a credit card or phone verified on file & how often they use various services like Gmail or YouTube. Google+ is just icing on the cake.
At the same time, they need to improve. As the search algorithms get better, so do the business models that exploit them:
I asked Kristian Hammond what percentage of news would be written by computers in 15 years. “More than 90 percent.”
There will be many more casualties in that war.
Comments
The worst and most disappointing aspect of this Penguin update is that the search results are not better. It is hard to believe that Google is not paying more attention to user behavior statistics. They have so much data about websites in terms of page views, bounce rates, time on site, reviews, conversion rates, etc.
It is beyond me why so many quality sites got hit by this update. I thought Google used their logs to determine successful searches, measuring customers that did a search, went to a site, found what they were looking for, and did not need to visit another site or do another search. These stats need to be weighted into measuring the authority of websites!
Proven quality sites should not be penalized; they should only have their links devalued.
I personally think a big factor in why quality sites got hit by Penguin has to do with low anchor text variation, or over use of keywords in anchor text. I wrote a blog post about it here:
http://www.seo-services.com/the-google-over-optimization-penalty/general/
I would love to know if you think anchor text variation was a major factor in Penguin or not.
Thank you for another great post Aaron.
Well said, Aaron. Your review on this is refreshing after reading the other Google drone "SEO editors" rehashing Google's propaganda; they forget who they represent in this industry.
Google used to be in the business of providing relevant *organic* search results.
They are now clearly an advertising company in hot pursuit of as much cash as they can get, because they apparently need it so much.
These new results are clearly not relevant or quality to the user, they are just from authority sites. This of course is a problem to the webmasters of these quality relevant targeted sites.
But are these new irrelevant SERPs such a bad thing for Google? They would have been in the old relevant SERP, user-first business model days, but in the advertising business model...
What happens with lower quality unstable organic SERP?
-> Ads magically become more relevant, more people click on the ads ( more money for Google)
-> More webmasters spend money on Adwords to desperately try and get the traffic they used to get in organic (more money for Google)
-> More companies shift their marketing spend from the unreliable SEO bucket to the Adwords bucket (again more money for Google)
Is this the end of it? Unlikely... Google will not stop their relentless pursuit of billions while the vast majority are left in their wake.
I think that elukew nailed it. Google obviously plans its work carefully, and one can be sure that they will err on the side of ad-driven-revenues whenever they make changes to their search algo. Google pollutes trusted organic search by placing ads in essentially unmarked 'yellow'-ish barely visible boxes at the top of the results, and even in the middle of organic results sometimes.
I can see a point in the future where Google search produces a page one of pure ads. They will be carefully camouflaged so 90% of searchers will not notice, and they will be intermittent, so as not to irritate too many people simultaneously.
Google is a marketing company now. Selling ads is their primary business, and the search algo is their 'special expertise' or profession which brings customers in the door. Google will continue to water down the organic search results until they reach a threshold where Bing or some other smart company equals them in percent of searchers.
The best thing SEO developers can do for our future is to promote other search engines, the ones that produce clean organic results that most accurately satisfy the needs of the searchers. We should always promote about three engines other than Google, to force them to compete for quality. Public awareness of Google alternatives is important for our profession, although I only optimize for Google.
They will fix their Penguin mods. Obviously they overshot and will detune to eliminate the blank pages (Stupid, Google.) But those ads in the mix with organic search results are worse. Google needs to maintain the purity of organic search in order to maintain its hold on search.
It's something that I have always believed - it does not pay for Google to produce what the searcher really wants. Getting people to advertise and getting people to click on those ads is what pays. Adwords and Adsense are killing the web IMO. Google frowns on / penalises MFA sites but are willing, it seems, to do the same with their index.
Google is currently trading north of $600/share... let's go buy a couple 100. The point and bottom line (SEO aside just briefly) is that G is responsible and accountable to shareholders... (damn and I thought it was us) shareholders with A LOT at stake. And admittedly strictly anecdotal, since G's IPO it's been headed this way.
===== From IBD =====
But the 12% CPC decline in Q1, which followed an 8% decline in Q4, has attracted attention.
Google CEO Larry Page emphasized in an open letter and again on a conference call Thursday that Google's falling CPC is the result of a long-term strategy to EXTRACT MORE REVENUE (emphasis added) from new sources, including mobile users.
The company is expanding its mobile reach, helped by its Android operating system, and has been tweaking its search algorithm to produce better mobile ad results, Page says.
Still, the drop in CPC the past two quarters represents the first such declines since Q2 2009.
Analysts on Friday said they see Google's investment in mobile winning in the end, even though ads for that market aren't yet valued as highly as desktop ads.
"Once advertisers figure out they get a good ROI (return on investment), they'll come back and say, 'Oh wow, this really works,' and PRICES WILL SHOOT UP (emphasis added)," said Sameet Sinha, an analyst with Riley & Co.
=======
In the end, G's focus (or stakeholder financial pressure exerted on them) isn't really on quality organic results anyway.
Google aren't rewarding quality content and punishing poor content, they're rewarding and punishing sites based on their link profiles. Google have "lost it" over the years in becoming WAY too obsessed with links. Now their new obsession is using links to punish sites. It doesn't matter if a site is truly useful to humans if that site has "dirty links" - those links turn a genuinely useful site into "webspam" purely because of those links. Now it seems punishments - not rewards - determine the SERPs. You rise up merely because the sites that were above you got punished - you don't rise up on merit.
Perhaps these measures will drive up their CPC though - or maybe this is a defining moment with Google where they've just gone too far and webmasters will seek more "grass roots" methods of driving traffic to their sites and Google suddenly stops becoming THE focal point for driving traffic - perhaps that is a healthy thing in the long run.
Try searching google.co.uk for the keyword "Cialis": the number 8 result is a French website and the number 9 result is someone's personal homepage. Seems like Google is broke.
The update's too aggressive; it's left a lot of legit sites hurt.
Spam does need to be addressed, but not to this extent.
Penguin is actually a very poor algorithm and doesn't even approach what human editing of a directory could achieve. It was also not tested thoroughly on sufficiently large datasets or the mess would not be so evident. They would be better off killing free SERPs and making their model completely based on AdSense, or at least having a manual submission and qualification process. As it stands, they are likely to find themselves in hot water with many complaining of a 'bait' and 'switch' strategy. Petitions are being filled out and they will get sent to the FCC and the European Information commissioner. Even if Google are innocent it just looks like a strategy that many are saying is designed to improve CTR on AdWords. And an increase in CTR on AdWords will be a metric which they may get judged by. Google should not even be using links really, as anything outside of a website can lead to external abuse. For example, the idea that negative SEO has not been happening in many markets like real estate is just naive.
My own site has been dumped, even though I have always spent lots of time getting links taken down and filing DMCA notices (most of which have been successfully applied). My only real crime that I can think of is I had two blogspots, a Posterous microblog and a Tumblr microblog linking to me several times from each. The blogspots were hardly ever touched or updated given my own preference for Tumblr. There was one HubPages page and one Squidoo page from years ago too. It hardly makes me spammer of the century. It is not like they were hidden, given 3 of them were joined to my Google+ account and contained different content but on the same subject matter. These have now all been deleted.
Still I find spamming by their definition that I am not responsible for. Getting it removed is not easy.
Anyway. What I really want to say is: where did the quality control go? And the lack of public relations is astounding. As an ex senior exec in a major corporation, I can say it is nothing less than a shambles.
On Google.com in France you can search for PayPal France. The first 3 pages are splattered with Viagra sites from position 4 downwards. Now I think offering PayPal a few free AdWords discount vouchers is not going to be sufficient. If I were at PayPal, the lawyers would be ringing Google right now for associating my company's good name with what are clearly spammy sites.
Screenshots available on request.
Thanks Aaron, I found your post a bit more helpful than the regurgitated crud they're peddling on Search Engine Watch (at least so far). But I need real insight into how to recover from this. How did this update actually work... what onsite or on page factors were affected? And one more thing... just have to agree with you that I like that line about 3.1% of queries. Ha friggin ha! So, in actual fact that means about 20-25% of queries that are not related to Katy Perry, the Avengers, Kanye West, One Direction, Dark Knight Rises, etc.
...we don't always discuss all that stuff fully transparently publicly. Our main business model is community & information (rather than selling tools). If we were to be fully transparent publicly about such updates this site probably wouldn't have a legitimate business model.
I'm actually surprised at all the negative comments. I was hit hard by the Panda in April of 2011 and unfairly so. I have a website which I have loadable over for 10 years. The entire site was/is quality content and was always written for the end user in mind, not Google. I have gone crazy redesigning the entire site so the "new" Google would like it again. Never changed my original content mind you, just repackaged it. Well with every new panda update things just kept getting worse. It's been a horrible nightmare, until the Penguin. As of April 19th, 2012 when the Penguin hit my traffic doubled. Finally a positive effect from a Google algorithm change. WooHoo. I'm jumping for joy and loving the Penguin! I finally feel Google is properly reading my content as being quality. Very happy with this latest change. Very happy indeed!
That was supposed to say "Labored over for 10 years". Still working on first cup of coffee. All still foggy.
...for every algorithm update there will be some winners and some losers. Some of those will be justified & others not so much.
Did you also notice how eHow started popping up everywhere again in the SERPs? This weekend there was news about Demand Media turning down a $1.2 billion buyout offer & their stock is up nearly 40% in pre-market trading right now.
Thanks Aaron for your insightful post. I have been developing quality sites for over 10 years. We have excellent teams for writing useful and original content and our sites are liked by users - a fact that can be verified by bounce rate and other rankings like Alexa. We had never employed any blackhat link-building techniques. Panda updates did not harm our sites.
However, this is the first time in my life that over 70% of my sites were affected. All the sites affected were the sites with high demand high paying keywords. The sites that were not affected were the new sites or sites with keywords that are not of any significant value. I saw a jump in traffic for about 5% of my websites, but when I checked the statistics, I found that I am getting traffic for keywords that are not at all relevant to my site and I had never targeted those keywords too. And strangely the one page lenses we had made on web 2.0 properties started appearing on 1st or 2nd page of Google - but our 1000 page sites with quality content were pushed to 10th or 20th page.
What's happening here? When I worked as a software project manager, before releasing any product we would thoroughly test the software, look for possible bugs and fix those bugs before releasing the software. Google is employing PhDs from Harvard, Stanford and MIT - and they don't even know this simple principle of a product release? It's a shame on Google's engineers and its top management, who have become blind in the pursuit of money. So far Google had its success because of people who used their search engine. However, with such poor quality of SERPs which only favor the so-called high authority domains - users are going to drift away from Google. I've myself switched to Bing now - it delivers far better results.
Thumbs down to Google - they have to remember that everything that goes up has to come down !!
...they don't mind a short term decline in relevancy if it leads people to invest far less in trying to manipulate them.
I am sure they are well aware of the problems they created with their own results, but I think they likely felt that a bit of short term pain for them to hose people they felt were manipulating them was a worthwhile expense to try to retard much of the related investments.
It sucks to get hit by such updates, but it's impressive that they are willing to harm themselves badly just so they could harm others more. Making matters worse though, is that as they tell people to increase quality & investment they keep eating up more of the search results with their own scraped together Google thin affiliate offerings. If Google was doing one or the other then strategy would be pretty easy, but the balancing act is much harder when they do both at the same time.
I have too little knowledge about the subject; perhaps that's what makes my opinion valuable. I commend Google. In my opinion they are aiming to make real content, great content, genuine content. Content that does not take keywords into consideration when being written. Not even one keyword. Content that communicates what needs to be communicated without the slightest pre-conceived manipulation, without prior keyword research to write a freakin blog post. It seems Google is looking to MAKE CONTENT THE CENTER OF THE WEB AGAIN as it was prior to the invasion of SEOs (much to the blame of Black Hats, but also to the blame of over-analytical White Hats). Great content will get you through every update, Panda, Penguin, Crocodile, Urchin, or any name they decide to put on it, so that instead of searching for "SEO Agency" to find the right bunch of guys to manage you online you search for "Content Agency"....Moises.
... I mean, I just showed examples of pages with 0 content ranking & you suggest Google is rewarding content? I think links are now as important as ever, it's just that one has to be more diverse & measured in their approach.
Notice the stellar sites now showing up under "payday loan online". It seems the new algo, aimed at the over-optimizers (many of whom are not drinking the Kool-Aid, and are working for a living), failed to address the major loophole of blog comment spam manipulation, and a new nugget: stuffing thousands of fake ratings on a page. Too bad "the Penguin" did not start with the easy-to-find, low-hanging fruit.
This update doesn't upset me much because while a lot of the longtail searches have become crap as a user, I own a few hundred exact match domains and these have suddenly flooded into Google, even where I've done no real work on the sites. I'm now rushing to get these developed with basic lead gen contact forms so I can make hay while the sun shines.
Curiously, though, at the end of last year I caught a dropped domain with what looked like a decent link profile (PR4) and developed it as a loans site. I did a simple social media profile blast (spammed about a thousand basic Pligg installs for $40) and the site started to show some progress, taking positions on page 3 for money keywords, which was a good start.
Since the Penguin update, the site has tanked and I have an unnatural links warning in webmaster tools. Fair enough, I burned the domain, catch another dropped one and move on.
However, what it has shown me is that it should no longer be hard to Googlebowl a competitor. I figure brand sites might be hard to knock, but most of the SME market is a sitting duck. And for just $40 per domain? Lol!
People have spent over a decade trying to learn how to manipulate Google's ranking algorithm. Now Google have released a new penalty algorithm that looks even easier to manipulate.
Thing is, Google have just opened up a completely new market in Negative SEO that people like me can suddenly join in with. We're living through the biggest economic crisis in a century. Am I tempted? Hell, no, I'm sold! I'm already preparing to sell this service!
I think Google have started a war, one where webmasters are able to take down each other's site. They say in the pills, porn, and casino sectors this is normal business practice, which is why I never touched those areas. I'm not skilled enough to compete. But I should be able to take down general sites by request easy enough if my experience of Penguin is anything to go by. We'll soon find out with my newest clients. (:
So you would be willing to destroy someone's life for $40?
I get the intention of the Penguin update - get rid of spam. I'm sure it worked in some cases with this update, but for the cases that it didn't, the results are disastrous.
Here's where the Penguin update completely missed its intentions:
A lot of websites have been penalized. Some rightly so and some wrongly so. I don't care to debate how Google can wrongly penalize good sites. Collateral damage happens. That's not good, but I get it. The disastrous part of it all is that for all the penalized sites, scraper sites that stole the penalized site's content are ranking well, while the original articles are nowhere to be found.
It's happened to one of my sites. Now I'm spending all my time sending DMCA requests to ISPs to get that content removed. Perhaps that's what Goog wants - get webmasters to knock off the scraper sites with DMCA requests because Goog can't figure out how to do it themselves.
... has been a core feature of Google's "algorithms" for well over a year now. I mean it had been around for years before that, but with Panda this stuff really came to the fore ... and now they are doing that crap again with Penguin.
Wish you the best of luck with that & sorry to hear it still happens :(
Okay... ALL my authority sites that got dropped by Penguin have been scraped and others are using MY content to rank... WTF? We're talking about 15+ different sites I have verified using my EXACT content and ranking HIGHER than my penalized site.
I verified their DNS was Hostgator and asked HG to take the content down... they said "If it's not copyrighted, nothing we can do".... So... I have to copyright EVERY niche site I build now?.... this f'cking sucks.
4 years of hard work, $500+/day revenue, 20+ happy clients ranking #1 for years, all wiped away and replaced by f'cking scrapers. Google must be a bunch of dumbasses. I honestly can't figure out any other way this would've happened.
Joe, that's exactly the situation that this Penguin update created. Penguin is "Another step to reward high-quality SCRAPER sites." It's a nightmare.
Did you send Hostgator an actual DMCA request or just an email? I thought copyright was automatic; you don't have to apply for it.
By default any work that is created is copyrighted. If he files a DMCA it should sort out the issue.
Links. It's - still - all about links.
The Penguin and Panda updates have made results more natural and have made it almost impossible to force your way up the rankings. It's just a shame that they didn't come sooner; that way black hat SEO-ers would have been forced to reform a lot sooner and unsuspecting SEO clients wouldn't have been penalized somewhat unjustly.
... I prefer those doorway pages & 0 content pages on big authority domains over higher quality independent websites. 10 times out of 10. Perfect really.
Hey Aaron (& seobook-ers),
Might be of interest to you and your readers, we got hammered too *but also* were able to find out pretty much exactly why from Matt Cutts himself (via an interview the SMH did with him on the topic.. yet to be published over there).
Basically it's nonsense, we've got some idiot pirating our software, linking to us for credibility and thus getting us whacked!!!
More info here: http://wpmu.org/wordpress-penguin-google-matt-cutts/
Hope it's interesting / helpful!
Cheers, James
The fact your site is respected (in terms of social markers) and hammered anyway illustrates why this is so wrong. The same can be said for a site that is drilled out of SERP existence on purpose, which is happening every day too - even though Matt says it is really hard and expensive to do - it is neither. But the examples I can cite, can't be cited, because of the retribution it would cause - so we need more stories like yours, and people willing to say what is happening. That traffic graphic was simply painful to see - I wish I didn't have any that had similar falls in them, but I do. Good luck in recovery man...here's hoping.
Thanks Marty, it'll be extremely interesting to see what we can do, if anything, about it... even though hope is ebbing somewhat :/
"Thus anyone who complained about the update was by definition a spammer." That's right. Only a spammer wastes his energy complaining. He knows he's powerless and all he can do is cry like a baby. On the other side, a non-spamming marketer understands. He's logical and regularly thinks Penguin can work to his benefit.
...that comment entirely misses the point.
Some of the sites that were hit had been sabotaged by competitors, thus a person who complains about the absurdity & stupidity of such a set up is not a cry baby. Nice try though.
Dear sir
My name is Javid and I launched my site about 8 months ago. (apam.ir)
We started with 30 articles a day and we have 7 thousand of them by now.
Lately (around the first of April) I read on some websites and weblogs that if you noindex your tag pages you will get much better traffic from Google, so I used robots and noindex tags to remove our tag pages.
By about April 20 there were no tag pages from our site in Google anymore.
Then about 2 days later our visits and rankings fell suddenly.
I also took a few screenshots that show our decline in visits on April 25 and April 26.
There was no change in this situation for about one month. For instance, when we publish an article for the first time, after a few hours other websites that copied our article get a better ranking than us, and apparently all of our incoming traffic is from Google Images.
We also saw a big fall in keyword search traffic.
Now here is the question: what can I do about this?
Did it happen because I noindexed all of the tags on my site?
Did it happen because of Google Penguin?
We had about 10 thousand tags that were removed from Google, even though it is normal to have so many tags on Persian websites.
I also added our site name to all of the article titles.
Our titles looked like this before:
Title
But I’ve changed it to:
Article Title- Title of Site
And 6-7 thousand more of these changes.
I have sent a revised form but it has not been answered yet.
Some weblogs say Matt Cutts recommended launching a new site.
What if I set up a new domain and restore a backup of my site there?
What is your suggestion for me?
Here are some pictures from our Google Webmaster Tools.
My traffic from Google fell to one third and most of my visits are from Google Images.
apam.ir/wp-content/uploads/2012/05/1.png
apam.ir/wp-content/uploads/2012/05/2.png
apam.ir/wp-content/uploads/2012/05/3.png
apam.ir/wp-content/uploads/2012/05/4.png
Yours Sincerely
Javid
apam.ir
PageRank has given links on the web the power to influence the visibility of web content. Everyone on the web is fighting to have their content seen. More links can only lead to more chances people will see your content. Now people are finding that the links they have out there are making their content less visible or practically non-existent in Google, or at least so buried it might as well be non-existent.
How will new websites be found on Google if they don't use SEO tools?
Sometimes when these algorithmic changes roll out, one of the wisest moves is to be patient and carefully analyze any changes before you react blindly to the latest penalty – because by the time you do that, Google will release the latest Panda or its next iteration of Penguin.