Two Diametrically Opposed Google Editorial Philosophies
An "Algorithmic" Approach
When it comes to buying links, Google not only fights it with algorithms; it has also run a 5-year-long FUD campaign, introduced nofollow as a proprietary filter, encouraged webmasters to rat on each other, and put engineers on the hunt for paid links. On top of that, Google's link penalties range from subtle to overt.
Google claims that they do not want to police low-quality content by trying to judge intent, that doing so would not scale well enough to solve the problem, & that they need to handle it algorithmically. At the same time, Google is willing to manually torch some sites and basically destroy the associated businesses. Talk to enough SEOs and you will find stories of carnage - complete decimation.
Economics Drive Everything
Content farms are driven by economics. Make them unprofitable (rather than funding them) and the problem solves itself - just like Google AdWords does with quality scores. Sure you can show up on AdWords where you don't belong and/or with a crappy scam offer, but you are priced out of the market so losses are guaranteed. Hello $100 clicks!
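To make the AdWords comparison concrete: the classic pricing formula Google has publicly described is actual CPC = (the ad rank of the advertiser below you) / (your quality score) + $0.01. Here is a quick sketch of that math, with all the quality scores and ad ranks invented for illustration:

```python
# Rough sketch of the classic AdWords pricing formula Google has described
# publicly: actual CPC = (ad rank of the advertiser below you / your quality
# score) + $0.01. All quality scores and ad ranks here are invented.

def actual_cpc(ad_rank_below: float, quality_score: float) -> float:
    """Price paid per click in the generalized second-price auction."""
    return ad_rank_below / quality_score + 0.01

# A legitimate advertiser with a strong quality score pays pennies...
print(round(actual_cpc(ad_rank_below=5.0, quality_score=9.0), 2))   # 0.57
# ...while a scam offer with a rock-bottom quality score competing for the
# same slot is priced out of the market.
print(round(actual_cpc(ad_rank_below=5.0, quality_score=0.05), 2))  # 100.01
```

A rock-bottom quality score multiplies the cost per click; the auction itself makes the junk unprofitable.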
How many content farms would Google need to manually torch to deter investment in the category? 5? Maybe 10? 20 tops? Does that really require a new algorithmic approach on a web with tens of millions of websites?
When Google nuked a ton of article banks a few years back the damage was fairly complete and lasted a long time. When Google nuked a ton of web directories a few years back the damage was fairly complete and lasted a long time. These were done in sweeps where in one day you would see 50 sites lose their toolbar PageRank & take a swan dive in traffic. Yet content farms are a sacred cow that needs an innovative "algorithmic" approach.
One Bad Page? TORCHED
If Google feels an outright ban would be too much, it could even dial such sites down over time, deterring them without immediately killing them. Some bloggers who didn't know any better got torched based on a single blog post:
The Forrester report discusses a recent “sponsored conversation” from Kmart, but I doubt whether it mentions that even in that small test, Google found multiple bloggers that violated our quality guidelines and we took corresponding action. Those blogs are not trusted in Google’s algorithms any more.
One post and the ENTIRE SITE got torched.
An Endless Sea of Garbage
How many garbage posts have you seen on content farms?
When you look at garbage content there are hundreds of words on the page screaming "I AM EXPLOITATIVE TRASH." Yet when you look at links, they are often embedded inline, with little surrounding context to tell whether a link is an organic reference or something that was paid for.
Why is it that Google is comfortable implying intent with links, but must look the other way when it comes to content?
Purchasing Distribution
Media is a game of numbers, and so content companies mix in various layers of quality to make it harder for Google to separate signal from noise. Yahoo! has fairly solid content in their sports category, but then fluffs it out with top 10 lists and such from Associated Content. Now Yahoo! is hoping they can offset lower quality with a higher level of personalization:
The Yahoo platform aims to draw from a user’s declared preferences, search items, social media and other sources to find and highlight the most relevant content, according to the people familiar with the matter. It will be available on Yahoo’s Web site, but is optimized to work as an app on tablets and smartphones, and especially on Google Android and Apple devices, they said.
AOL made a big splash when they bought TechCrunch for $25 million. When AOL's editorial strategy was recently leaked it highlighted how they promote cross-linking their channels to drive SEO strategy. And, since the acquisition, TechCrunch has only scaled up the volume of content they produce. In the last 2 days I have seen 2 advertorials on TechCrunch where the conflicting relationship was only mentioned *after* you read the post. One was a Google employee suggesting Wikipedia needs ads, and the other was some social commerce platform guy promoting the social commerce revolution occurring on Facebook.
Being at the heart of technology is a great source of link equity to funnel around their websites. TechCrunch.com already has over 25% as many unique linking domains as AOL.com does. One of the few areas that is more connected on the social graph than technology is politics. AOL just bought the Huffington Post for $315 million. The fusion of political bias, political connections, celebrity contributors, and the early promotion of a guy selling an (ultimately empty) promise of hope and change quickly gave the Huffington Post even more link equity than TechCrunch has.
Thus they have the weight to do all the things that good online journalism is known for: ads so deeply embedded in content you can't tell them apart, off-topic paginated syndicated duplicate content, and meaningless posts devoid of substance written around Google Trends data. As other politically charged mainstream media outlets have shown, you don't need to be factually correct (or even attempt honesty) so long as your bias is consistent.
Ultimately this is where Google's head-in-the-sand approach to content farms backfired. When content farms were isolated websites full of trash Google could have nuked them without much risk. But now that there is a blended approach and content farms are part of public companies backed by politically powerful individuals, Google can't do anything about them. Their hands are tied.
Trends in Journalism
Much like the middle class has been gutted in the United States, Ireland (and pretty much everywhere that is not Iceland) by economic policies that bleed the average person to protect banking criminals, we are seeing the same thing happen to the value of any type of online journalism. As we continue to ask people to do more for less, we suffer through a lower-quality user experience with more half-content that leaves out the essential bits.
How to build a brick wall:
- step 1: get some bricks
- step 2: stack them in your workplace
- step 3: build the brick wall
The other thing destroying journalism is not only lean farms competing against thick and inefficient organizations for distribution, but also Google pushing to control more distribution via their various data grabs:

- YouTube video & music
- graphical CPA ads in the search results
- lead generation ads in the search results
- graphical AdSense ads on publisher sites that drive searches into those lead generation funnels
- grouping like data from publishers above the organic search results
- offering branded navigational aids above the organic search results
- acquiring manufacturer data
- scraping 3rd party reviews
- buying sentiment analysis tools
- promoting Google Maps everywhere
- Google product pages & local review pages
- extended ad units, etc.

If most growth in journalism is based on SEO & Google is systematically eating the search results, then at some point that bubble will get pricked and there will be plenty of pain to go around.
My guess is that in 3 to 4 years the search results will become so full of junk that Google pushes hard to rank chunks of ebooks wrapped in Google ads directly in the search results. Books are already heavily commoditized (it's amazing how much knowledge you can get for $10 or $20), and given that Google already hard-codes their ebooks in the search results, it is not a big jump for them to work on ad deals that pull publishers in. It follows the trend elsewhere: "Free Music Can Pay as Well as Paid Music, YouTube Says."
It's Not All Bad
The silver lining there is that if you are the employer your margins may grow, but if you are an employee just scraping by on $10 an hour, then it becomes more important to do something on the side to lower your perceived risk & increase your influence. A few years back Marshall Kirkpatrick started out on AOL's content farms. The tips he shared on standing out would be a competitive advantage in almost any vertical outside of technology & politics:
one day Michael Arrington called and hired me at TechCrunch. "You keep beating us to stories," he told me. I was able to do that because I was getting RSS feeds from key vendors in our market delivered by IM and SMS. That's standard practice among tech bloggers now, but at the time no one else was doing it, so I was able to cover lots of news first.
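That monitoring setup is trivial to replicate today. Here is a minimal sketch using the feedparser library, with hypothetical vendor feed URLs and a print statement standing in for IM/SMS delivery:

```python
# Minimal sketch of the "beat everyone to the story" setup: poll key vendor
# RSS feeds and alert yourself the moment a new item appears. Feed URLs are
# hypothetical; the first pass will flag whatever is already in each feed.
import time

import feedparser  # pip install feedparser

FEEDS = [
    "https://example-vendor.com/blog/feed",  # swap in the vendors in your niche
    "https://another-vendor.com/news/rss",
]

def notify(entry):
    # Stand-in for IM/SMS delivery; wire this to whatever gateway you use.
    print(f"NEW: {entry.title} -> {entry.link}")

seen = set()
while True:
    for url in FEEDS:
        for entry in feedparser.parse(url).entries:
            uid = entry.get("id", entry.link)
            if uid not in seen:
                seen.add(uid)
                notify(entry)
    time.sleep(60)  # poll every minute
```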
Three big tips on the "becoming a well-known writer" front for new writers are...
- if short-form junk content is the standard then it is easier to stand out by creating long-form, well-edited content
- it is easier to be a big fish in a small pond than to try to get well known in a saturated area, so it is sometimes better to start working for niche publishers that have a strong spot in a smallish niche
- if you want to target the bigger communities, the most important thing to them (and the thing they are most likely to talk about) is themselves
Another benefit to publishers is that as the web becomes more polluted, people will become far more likely to pay to access better content and smaller, tighter communities.
Prioritizing User Feedback?
On a Google blog post about web spam they state the following:
Spam reports are prioritized by looking at how much visibility a potentially spammy site has in our search results, in order to help us focus on high-impact sites in a timely manner. For instance, we’re likely to prioritize the investigation of a site that regularly ranks on the first or second page over that of a site that only gets a few search impressions per month.
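Taken at face value, that policy is simple enough to express in a few lines. A sketch of the prioritization rule as stated, with the report structure and the impression counts invented for illustration:

```python
# Sketch of the stated policy: work spam reports in order of the reported
# site's search visibility. Field names and numbers are invented.
spam_reports = [
    {"site": "tiny-spam-blog.com", "monthly_impressions": 40},
    {"site": "example-content-farm.com", "monthly_impressions": 4_000_000},
]

# High-impact sites first: a site seen by millions of searchers each month
# should land at the top of the review queue.
for report in sorted(spam_reports, key=lambda r: r["monthly_impressions"], reverse=True):
    print(report["site"], report["monthly_impressions"])
```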
Given the widely echoed complaints about content farms, it seems Google takes a different approach with them, especially considering that the top farms are seen by millions of searchers every month.
Implying Intent
If end users can determine when links are paid (with limited context), then why not trust their input on judging the quality of the content as well? The Google Toolbar has a PageRank meter for assessing link authority. Why not add a meter for publisher reputation & content quality? I can hear people saying "people will use it to harm competitors," but I have also seen websites torched in Google because a competitor went on a link-buying spree on behalf of their fellow webmaster. At least if someone gives you a bad rating for great content, the content still has a chance to defend its own quality.
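And competitor abuse is a solvable problem. One option would be damping each page's score toward a neutral prior until enough independent ratings accumulate, so a handful of malicious votes can't sink good content. A minimal sketch, with the prior and weight invented:

```python
# Minimal sketch of an abuse-damped quality meter (a Bayesian average):
# a few malicious 1-star votes dent the score but cannot destroy it, and
# good content "defends itself" as honest ratings accumulate.
def quality_score(ratings, prior=3.0, prior_weight=20):
    """Average rating, pulled toward `prior` until enough votes arrive."""
    return (prior * prior_weight + sum(ratings)) / (prior_weight + len(ratings))

attack = [1] * 5                          # five competitor down-votes
print(round(quality_score(attack), 2))    # 2.6: dented, not destroyed

defended = attack + [5] * 50              # then real readers rate the content
print(round(quality_score(defended), 2))  # 4.2: quality wins out
```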
With links there is one final opinion and that is it. Not only do particular techniques carry varying levels of risk, but THE prescribed analysis of intent depends on who is doing it!
A Google engineer saw an SEO blog post about our affiliate program passing link juice, and our affiliate links stopped passing weight. (I am an SEO, so the appropriate intent is spam.) Then something weird happened. A few months later a Google engineer *publicly* stated that affiliate links should count. A few years later Google invested in a start-up which turns direct links into affiliate links while hiding the paid compensation in the background. (Since Google is doing it, the intent is NOT spam.)
Implying Ignorance
Some of the content mills enjoy the benefit of the doubt. Jason Calacanis lied repeatedly about "experimental pages" and other such nonsense, but when his schemes were highlighted he was offered the benefit of the doubt. eHow also enjoys that benefit of the doubt. It doesn't matter that Demand Media's CEO was the chairman of an SEO consulting company which sold for hundreds of millions of dollars. What matters is the benefit of the doubt (even if his company flagrantly violates quality guidelines by doing bulk 301 redirects of EXPIRED domains into eHow ... something where a lesser act can put you up for vote on a Google engineer's blog for public lynching).
The algorithm. They say. It has opinions.
What Other Search Engines Are Doing
A Bing engineer accused Google of funding web pollution. Blekko invites end users to report spam in their index, and the first thing end users wanted booted out was the content mills.
But Google needs to be "algorithmic" when the problems are obvious and smack them in the face. And they need to "imply intent" where the problems are nowhere near as overt.
Makes sense, almost!
Comments
great article - interesting and new ideas. it's a hard one to police - it also begs the question - should all paid links be frowned upon if the content is clearly sponsored but the blog post is very good content and useful to users? i don't think so
may not be on the ban list, but look what a search for mahalo.com brings up on blekko: http://blekko.com/ws/mahalo.com
I'm totally new to SEO.
This site is a damn goldmine for all there is to learn on the topic. I can't imagine having any kind of dependence on SEO or wanting to optimize it and NOT paying attention to your findings.
You're like a Search crusader who's made it his mission to stay on top of what's going down in the trenches of traffic and I'm grateful that I found you Aaron. Keep your finger on the pulse! You're doing great work!
that site was banned from day 1. :)
Even when Jason was wishing Rich good luck on Twitter and Rich responded with a thanks, Mahalo was already banned and was to stay that way. ;)
When Microsoft was the pack leader, antitrust cases were in abundance, but it is not so for Google now; Google CEO Eric Schmidt is on the presidential advisory committee, Google is leading the Net Neutrality (what a deceptive name) talks, and Google has a total monopoly on search (you can technically argue that)... I see the future bringing more bad stuff than just Google's lack of motive for cracking down on the content farms.
You said, "My guess is that in 3 to 4 years the search results become so full of junk..."
If so, where will people search? As a content publisher, if I am aiming for that spot on the horizon, what is it?
It is just that the results will be full of different "stuff" ... you won't be competing only against web pages, but against MUCH more of Google's scrape/spin/mash game. In the search results you will be competing against content from books, archived news and magazine content, more YouTube, more Google Maps, etc. But there will be larger ad units and more navigational "aids" to misdirect searchers before they even see any "organic" results.
Also there will be a lot of mobile search... a lot of that stuff will be verticalized (like Yelp) but a huge chunk of it will still be Google. The only differences will be how people convert and how much SERP real estate they can see on the smaller screen. That will make even more of the search traffic go through the monetized door (pay per click and pay per call).
The reason sites are treated differently is size. We've seen this from the start. If a small blog does something wrong, they're going to be penalized severely. If a major company does something wrong, Google will ask them nicely to change it and/or give a temporary penalty until they get their way. It's mostly a bluff with large companies, but typically has worked. I understand it from their perspective. No one cares about the unknown car blog not being in the search results but they do care about Ford not being in there. They sound like hypocrites in their webmaster guidelines of course, but I think by now most of us see through their PR garbage.
When it comes to the content farms/mills, I think it's Adsense. I'm normally not "conspiracy" driven, but I can't think of any other logical explanation for allowing some of these sites with zero original content to dominate their search results. So when I run a search for something and see an Answers.com, Mahalo, Livestrong, etc that is just scraping someone else's content, I notice that they are all running Adsense. Scraped sites rarely offer anything of value and are just doorway pages. And the sheer size of some of these are definitely not flying under Google's radar. So what could possibly be the reason they dominate in the search results? They all run Adsense.
It's a second chance for Google to reach a user. They search for something, click an organic result, and get another opportunity to put ads in front of them. With the pages being of such low quality, a searcher has little choice but to click an ad. I definitely could be wrong on this, but it's the one time that Google has had a major flaw in their search results and turned a blind eye to it. They typically don't do that unless there is something to gain from the situation.
So why does Google police links/sites differently? Money. The bigger sites are producing them money through Adsense. The smaller ones aren't.
dude, great post.
Remember that some of the biggest corporate content farms also run third- or fourth-rate search engines. You'd better believe that Google is afraid of the FTC, so they don't want to get into any situation where other search engines can claim they are being penalized by the world's dominant competitor.
Personally the thing that bugs me the most is the proliferation of reporting-free news sites, of which the Huffington Post is a prominent (but not the worst) example. When I look at links that go by on Twitter, so many of them are to sites where somebody wrote or stole a paragraph about some news story they read on the New York Times or TechCrunch or some other site that actually paid somebody to research and write the story. It's (probably) not plagiarism because they (probably) wrote the summary themselves and they cite the source, but because they're not adding any value, they're subtracting value. I'd rather just go to the source, which is better written, and where I might be supporting the people who wrote it.
@DottedSean, the role of Adsense in all of this is complicated. I think the #1 reason you see it on crappy sites is that it's the easiest way to put ads on a small or medium-size site. With any other ad network you're going to get reviewed by a salesperson and qualified... If you don't have a lot of traffic or if you don't fit the (usually irrational and arbitrary) rules of the network, you don't get in.
Many good sites are supported by Adsense too, but Adsense doesn't pay a whole lot so if you want to pay your expenses and then some, you've really got to keep those expenses low, every way you can, and if having an A.I. team up with a distributed team of Oompa-Loompas is what it takes, that's what it takes.
Wow great post, the time you must put into some of your recent posts is crazy. You seem to have a negative view on the future and it really does seem like that's what the future will present us with :(.
Why do people think that if Google filters algorithmically it is like God did it or something, but if they hand-chop you it is evil? Either way Google is controlling the results. Algorithmic is not a free pass. Algorithmic does not mean it is natural and unbiased and can't be questioned. Google designs the algo. It is no less biased than a manual review; it just takes less/different manpower.
Besides the algo is slowly crumbling in its effectiveness. Anti-trust and monopoly issues are real. Broad sentiment is swinging against Google. "I have given her all she's got Captain." AT&T watches on with a knowing smile.
I don't think it is all bad...but I often focus on how the game & platform are changing. That means that certain things that are getting shut out are getting covered, but keep in mind that every time they close off one area they likely open up another opportunity elsewhere.
And even if that opportunity is within a vertical Google database, they still need to find a way to score relevancy there & are working with fewer signals on it (so if you figure out how to influence those it can in some cases be easier than trying to rank in the regular global search results).
Aaron, you are always to the point and I really liked this post of yours, as always. I have another point to mention here, which I also sent as tweets to Matt Cutts. As usual he is ignoring those questions. As we all know, Matt Cutts keeps talking about "building good quality content" and "generating some good inlinks towards that content". That's all nice and good, but so rosy. In the real world, especially with ecommerce companies, you barely get links back to your product listings, except when people are reselling them on Craigslist or eBay, where listings are deleted regularly.
Since we are being advised and brainwashed to follow this pattern, I am wondering WHY GOOGLE never follows these steps on their own??? Back in the early days, they started gmail.com, froogle.com and other domains, but those failed miserably because they couldn't build enough reputation on their own. They were not even ranking at the highest spots they wanted for their results. So what did they do? Instead of working hard to build those sites, they gave up on them and pushed everything under www.google.com/xyz
I don't know the legal implications, but this seems to me like very unfair competition. The Twitter team kills themselves to build a reputation for their product, and the next day Google starts a service called Google Buzz. Guess where? www.google.com/buzz. The same goes for Android, Nexus, Places, Offers, you name it.
If Matt Cutts and Google are so honest about building a GOOD reputation around your honest content and product, they have to start using subdomains or individual domains for their own products. If not, they are just PhD-level hypocrites.
With universal search, Google doesn't even need to feed off its domain authority to rank. Rather, they can just program their relevancy algorithms to 'mix in some video results' over here. And they never rolled out universal search until *AFTER* they bought YouTube. And more recently they finally started listing some related competing services near the YouTube listings, but that only came about after about a half-decade and some regulatory scrutiny.
Likewise, they hard-coded rankings for their ebooks in the results the same day they launched the ebooks for sale.
There are lots of other interesting things like that which Google has done. For years and years they didn't disclose AdSense partner payout percentages. And then they were even sending some of that traffic through ad units into a search result. So not only does Google compete in their own ad auction (it's like playing poker with everyone else's cards turned up on the table), but they went so far as creating proprietary ad units where they controlled what % of the click value was attributed to which piece of the funnel...all the while they not only didn't disclose that, but they also were not disclosing even the basic revenue share stats.
If Google really wants to get serious about spam, they need to start reviewing AdSense sites, thoroughly and aggressively. They also need some kind of re-review program (3 mo, 6 mo, then annual) to make sure the sites stay within the guidelines. It seems to me that if they made a serious effort to reduce the AdSense sites that provide nothing more than regurgitated gibberish, there would be a whole lot less spam in the world. They could then algorithmically handle the rest, and they would probably be more accurate (and the search results would improve).
The way they are headed, though, Google could just look for sites running AdSense and knock all of those out of the SERPs. The good sites that get knocked out with the spam would probably fall within their usual rates of tolerance for collateral damage. :-)
As an aside, is it me or does it look like Google News is taking slightly longer snippets of text all of a sudden? Maybe it's just a formatting thing (or a need for coffee this morning). Considering how many sites are built around scraping Google News search queries, I hope Google has its sh*t together and can identify and credit the original source regardless of whether links are included.
nice, we don't usually think about such things from the perspective mentioned in the above post. that was great actually!