The Real Reason Google Doesn't Like Paid Links

Being a (Near) Monopoly is Expensive

The more I think about it, the more I realize why Google doesn't like the various flavors of paid links. It has nothing to do with organic search relevancy. The problem is that Google wants to broker all ad deals, and many forms of paid links are more efficient than AdWords is. If that news gets out, AdWords and Google crumble.
DoubleClick was the wrong model until Google bought them. But smart marketers do not want to waste millions of dollars on overpriced brand ads.

Google Doesn't Sell Social Ads

If you are buying ads on Google you are trying to reach everyone searching for a keyword. If you buy contextual ads you are trusting relevancy matching algorithms. Those used to be the standard, but now there are far more efficient ways to reach early adopters. Social influence is far more important than most people give it credit for.

Content as Ads & Cheap Social Ads

People game Digg, draft stories for specific trusted editors, suggest stories to popular blogs, buy reviews on blogs, create products or ideas with marketing baked in, link nepotistically, etc. There are a lot of cheap and effective ways to reach early adopters.

Editorial and social relationships have far more value than Google realizes, and Matt Cutts's recent outbursts are just a hint at how Google is losing their dominant control over the web. And they deserve to, because...

The Web Doesn't Want to be Controlled

Sure Google likes link baiting today, but that is the next paid link. Google is backing themselves into a corner, destroying each signal of quality they once trusted, until one day the web is a piece of junk or Google is no longer relevant.

Cats and Mice: The Shifting Sea of Search Results

Google can never show the most relevant results for everything. No matter what algorithmic loopholes they close they inadvertently open up others. And anything they trust gets abused by marketers. Cat and Mouse.

  • Search engines trusted page titles and meta descriptions. Marketers stuffed them full of keywords, so search engines had to move more toward trusting page content. Marketers responded with hidden text and other similar techniques.

  • Search engines trust links. SEOs buy and sell them and create link farms. So search engines only allow some sites to vote, let some sites pass negative votes, and make certain votes count more than others.
  • Search engines place weight on anchor text. SEOs abuse it, so search engines create filters for too much similar anchor text, and offset those by placing more trust on domain names that exactly match the search query.
  • Search engines place weight on exact match domain names and domainers start developing nearly 100% automated websites.
  • Too many new sites are spammy, so search engines place more weight on older sites. SEOs buy old sites and add content to them.
  • Search engines place more weight on global link authority. Spammers find cross-site scripting exploits on .edu domains, and media sites start posting lead generation advertisement forms on their sites.
  • Bloggers are too easy to get links from and comment links are easy to spam. Search engines introduce nofollow to stop comments from passing PageRank. Then Matt Cutts pushed nofollow to try to get webmasters to use it on advertisements.
  • Too many people are creating automated sites, especially affiliates, who crank out large numbers of them. Search engines employ human reviewers, get better at duplicate content detection, and require a minimum link authority on a per page level to keep deep pages indexed.
  • Social news sites provide a sea of easy link opportunities and low quality information, and too many people are doing linkbait. Perhaps Google will eventually only count so many citations in a given amount of time.

When your site changes in rankings it may not be just because of changes you made or changes in your field; it may also be due to Google trying to balance:

  • old sites vs new sites

  • old links vs new links
  • big sites vs small sites
  • fresh content vs well linked pages
  • exact match vs rough phrase match
  • etc

This Content is an Automated Personalized Ad Optimized to Rank for You & Exploit Your Personality Flaws - Enjoy!

Are you lonely, broke, ugly, overweight, tired, depressed, stressed, or looking for the best incest bestiality porn online today?

Featured offer: Click here for an online blissful excursion leading to eternal consumer driven happiness.

Machines optimized for market efficiency and profit don't have ethics, and do not promote businesses that do. How much will we allow ourselves to trust personalization and quality scores?

"They knew they were being lied to, but if lies were consistent enough they defined themselves as a credible alternative to the truth. Emotion ruled almost everything, and lies were driven by emotions that were familiar and supportive, while the truth came with hard edges that cut and bruised. They preferred lies and mood music...." - J G Ballard, Kingdom Come

Google is the Biggest Web Spammer

Andrew Goodman recently posted about SEO industry reputation woes, but the real reason for the problem is the self-serving agenda of search engines. Don't underestimate the marketing of the search engines, which, outside of their own link buying and selling, generally like to hint at the equation SEO = spam.

People spam everything though - media creating biased news, misquoting interviewees, blending ads into content, ads as content, free votes driving communities, deceptive article titles, spinning numbers from small sample sets, bogus posturing formatted as research studies, etc.

Look at how much Google had to clean up their PPC ads. Yet we don't associate PPC service providers with people pushing thin content arbitrage sites, fraudulent search engine submission services, and off-target cookie stuffing offer spam. Should we?

If spam is hosted by Google, ranked by Google, and displays Google ads, then why the need for outsourcing that fault? Why can't we just call those people Google affiliates and leave it at what it is, Google = spam?

Some people claim that Google is out for the best interests of their users, but why the need for cost per action ads that are only labeled as ads on a scroll over? Ads cloaked as content are what is best for users? In a couple of years we will see:

The game is now to manipulate consumers not only to click, but to take some further action. And I don't use the word 'manipulate' arbitrarily. This is about turning the web into one big pile of junk mail, aimed at getting you to sign up, buy, or commit to something that you hadn't necessarily wanted.

Google Checkout Logos on AdSense Ads

When Google introduced their AdSense network they not only created an ad syndication network, but also a way to syndicate the Google brand. At first it was the cute Ads by Gooooooooooooogle stuff. Then they started marketing Google Checkout heavily by offering $10 off coupons. Then they started syndicating flash and video ads for Gmail, then Google Pack, and now they are placing Google Checkout icons in the AdSense advertiser ads.

It's a nice deal for Google that they smart price some of the inventory down to virtually nothing, then buy it off themselves. Given that they have no real competition, could you fault them for doing so? Even classier of them to put ads for their own products inside ads that advertisers are paying for. But their marketing is good enough that nobody cares. Who else could do that?

Google AdWords to Show Contextual Ad Location URLs

Jen noticed that Google's Kim Malone announced that in the next couple months AdWords will start displaying content targeted ad locations.

Google AdSense pays most publishers crumbs for their ad space. People who are running AdSense ads are willing to sell ads. And sites that have AdSense ads on them are probably actively managed.

Is there a better way to get a list of relevant pages to acquire links from than to run a content targeted AdSense ad campaign and ping those webmasters?

Google Algorithm Update / Refresh

Not sure if it is correct to call it an algorithm update, but for a number of keywords I watch I have seen large authority sites get demoted in favor of smaller niche players with spammy, keyword-rich backlink profiles. I am seeing things like spammy new(ish) lead generation sites outranking Fortune 500s and long-standing industry association sites.

This is probably the first update in about a year where I have seen Google do anything major that bucks the trend of placing more and more emphasis on legitimate authoritative domains, although things are still shifting around quite a bit and will probably head back the other direction soon.

What are you seeing?

Update: Thanks for all the great comments below. I think Cygnus summed up the change best so far:

I see a few things that can probably be summed up as one change...the sandbox/trustbox was modified to be less restrictive on age and theme. I'm betting it'll tighten up again, but hopefully just on the theme.

To me, this was their way of tackling the ever-growing .edu spam. A lot of that is gone from some of the SERPs I watch; of course, now I see even more blogspots a few pages into the listings, so who knows how much tweaking they'll do over the next couple of weeks.

View All Your Google Supplemental Index Results

[Update: use this supplemental ratio calculator. Google is selfish and greedy with their data, and broke ALL of the methods listed below because they wanted to make it hard for you to figure out which pages of your site they don't care for.]

A person by the nickname DigitalAngle left the following tip in a recent comment:

If you want to view ONLY your supplemental results you can use this command site:www.yoursite.com *** -sljktf

Why Are Supplemental Results Important?

Pages that are in the supplemental index are placed there because they are trusted less. Since they are crawled less frequently and have fewer resources diverted toward them, it makes sense that Google does not typically rank these pages as high as pages in the regular search index.

Just as cache date can be used to gauge the relative health of a page or site, the percentage of a site stuck in supplemental results and the types of pages stuck there can tell you a lot about information architecture related issues and link equity related issues.

Calculate Your Supplemental Index Ratio:

To get your percentage of supplemental results, divide your number of supplemental results by your total results count:

site:www.yoursite.com *** -sljktf
site:www.yoursite.com
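
If it helps to see the arithmetic spelled out, here is a minimal sketch in Python; the two counts are hypothetical placeholders you would copy by hand from the two site: queries above.

    # Minimal sketch: supplemental index ratio from two hand-gathered result counts.
    # The counts below are hypothetical placeholders, not real data.

    def supplemental_ratio(supplemental_count, total_count):
        """Return the share of indexed pages sitting in the supplemental index."""
        if total_count == 0:
            return 0.0
        return supplemental_count / float(total_count)

    # Example counts copied from:
    #   site:www.yoursite.com *** -sljktf   -> supplemental results
    #   site:www.yoursite.com               -> total results
    supplemental = 140   # hypothetical
    total = 1000         # hypothetical
    print("Supplemental ratio: %.1f%%" % (supplemental_ratio(supplemental, total) * 100))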

What Does My Supplemental Ratio Mean?

The size of the supplemental index and the pages included in it change as the web grows and Google changes their crawling priorities. It is a moving target, but one that still gives you a clue to the current relative health of your site.

If none of your pages are supplemental, you likely have good information architecture and can put up many more profitable pages for your given link equity. If some of your pages are supplemental, that might be fine as long as those pages duplicate other content and/or are generally of lower importance. If many of your key pages are supplemental, you may need to look at improving your internal site architecture and/or marketing your site to improve your link equity.

Comparing the size of your site and your supplemental ratio to similar sites in your industry may give you a good grasp on the upside potential of fixing common information architecture related issues on your site, what sites are wasting significant potential, and how much more competitive your marketplace may get if competitors fix their sites.

Google Using Search Engine Scrapers to Improve Search Engine Relevancy

If something ranks and it shouldn't, why not come up with a natural and easy way to demote it? What if Google could come up with a way to allow scrapers to actually improve the quality of the search results? I think they can, and here is how. Non-authoritative content tends to get very few natural links. This means that if it ranks well for competitive queries where bots scrape the search results, it will get many links with the exact same anchor text. Real resources that rank well will tend to get some number of self-reinforcing unique links with different, mixed anchor text.

If the page was ranking for the query because it was closely aligned with a keyword phrase that appears in the page title, the internal link structure, and heavily throughout the page itself, each additional scraper link could push the page closer and closer to the threshold of looking spammy, especially if it is not picking up any natural linkage.
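
As a loose illustration of that idea, the sketch below (hypothetical Python, invented data, not anything Google has published) measures how concentrated a page's anchor text profile is; a backlink profile dominated by one exact phrase looks more like scraper-fed linkage than an organically cited resource.

    # Hypothetical sketch: measure anchor text concentration in a page's backlink profile.
    # A very high share for the single most common anchor suggests scraper-style links
    # rather than natural, mixed editorial citations.
    from collections import Counter

    def top_anchor_share(anchors):
        """Return the fraction of backlinks using the single most common anchor text."""
        if not anchors:
            return 0.0
        counts = Counter(a.strip().lower() for a in anchors)
        return counts.most_common(1)[0][1] / float(len(anchors))

    # Made-up example data: 90 identical scraper anchors plus a handful of natural ones
    anchors = ["blue widget reviews"] * 90 + ["widgets", "great resource", "this post"] * 3 + ["example.com"]
    print("Top anchor share: %.0f%%" % (top_anchor_share(anchors) * 100))  # 90% here, which looks unnatural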

How to Protect Yourself:

  • If you tend to get featured on many scraper sites make sure you change your page titles occasionally on your most important and highest paying pages (see the sketch after this list).

  • Write naturally, for humans, and not exclusively for search bots. If you are creating backfill content that leverages a domain's authority score, try to write articles like a newspaper; if you are not sure what that means, look at some newspapers. Rather than paying people to write articles optimized for a topic, pay someone who does not know much about SEO to write them, and make sure they don't use the same templates for the page titles, meta descriptions, and page headings.
  • Use variation in your headings, page titles, and meta description tags.
  • Filters are applied at different levels depending on domain authority and page level PageRank scores. Gaining more domain authority should help your site bypass some filters, but it may also cause your site to be looked at with more scrutiny by other types of filters.
  • Make elements of your site modular so you can quickly react to changes. For example, many of my sites use server side includes for the navigation, which allows me to make the navigation more or less aggressive depending on the current search algorithms. Get away with what you can, and if they clamp down on you, ease off the position.
  • Get some editorial deep links with mixed anchor text to your most profitable or most important interior pages, especially if they rank well and do not get many natural editorial votes on their own.
  • Be actively involved in participating in your community. If the topical language changes without you then it is hard to stay relevant. If you have some input in how the market is changing that helps keep your mindshare and helps ensure you match your topical language as it shifts.
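
As referenced in the first bullet, here is a minimal sketch of one way to rotate page titles on a schedule. It is hypothetical Python; the title variants and rotation period are placeholders you would tune for your own pages, and the same approach works in whatever templating system your site already uses.

    # Hypothetical sketch: rotate among a few hand-written title variants so that
    # scrapers copying your page over time end up linking with mixed anchor text.
    import datetime

    TITLE_VARIANTS = [
        "Blue Widget Reviews & Buying Guide",          # placeholder titles
        "How to Choose a Blue Widget (2007 Guide)",
        "Blue Widgets Compared: Reviews and Prices",
    ]

    def current_title(variants, period_days=30):
        """Pick a variant based on the date so the title changes every period_days."""
        bucket = datetime.date.today().toordinal() // period_days
        return variants[bucket % len(variants)]

    print(current_title(TITLE_VARIANTS))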

New Directory, URL, & Keyword Phrase Based Google Filters & Penalties

WebmasterWorld has been running a series of threads about various penalties and filters aligned with specific URLs, keyword phrases, and in some cases maybe even entire directories.

Some Threads:

There is a lot of noise in those threads, but you can put some pieces together from them. One of the best comments is from Joe Sinkwitz:

1. Phrase-based penalties & URL-based penalties; I'm seeing both.
2. On phrase-based penalties, I can look at the allinanchor: for that KW phrase, find several *.blogspot.com sites, run a copyscape on the site with the phrase-based penalty, and will see these same *.blogspot.com sites listed...scraping my and some of my competitors' content.
3. On URL-based penalties allinanchor: is useless because it seems to practically dump the entire site down to the dregs of the SERPs. Copyscape will still show a large amount of *.blogspot.com scraping though.

Joe has a similar post on his blog, and I covered a similar situation on September 1st of last year in Rotating Page Titles for Anchor Text Variation.

You see a lot more of the auto-gen spam in competitive verticals, and having a few sites that compete for those types of queries helps you see the new penalties, filters, and re-ranked results as they are rolled out.

Google Patents:

Google filed a patent application for Agent Rank, which is aimed at allowing them to associate portions of page content, site content, and cross-site content with individuals of varying degrees of trust. I doubt they have used this much yet, but the fact that they are even considering such a thing should indicate that many other types of penalties, filters, and re-ranking algorithms are already at play.

Some Google patents related to phrases, as pointed out by thegypsy here:

Bill Slawski has a great overview post touching on these patent applications.

Phrase Based Penalties:

Many types of automated and other low quality content creation produce pages that are barely semantically related to the local language of the topic, while other types of spam generation produce pages that are too heavily aligned with it. Real content tends to fall within a range of semantic coverage.

Cheap or automated content typically tends to look unnatural, especially when you move beyond comparing words to looking at related phrases.

If a document is too far off in either direction (not enough OR too many related phrases) it could be deemed not relevant enough to rank, or flagged as a potential spam page. Once a document is flagged for one term it could also be flagged for other related terms. If enough pages from a site are flagged, a section of the site or the whole site can be flagged for manual review.
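
To make the range check concrete, here is a rough sketch; it is purely illustrative Python with an invented phrase list and invented thresholds, not a reconstruction of the scoring in Google's patent filings.

    # Illustrative sketch: count how many "related phrases" for a topic appear in a
    # document and flag documents whose coverage falls outside an expected range.
    # The phrase list and thresholds are invented for illustration.
    RELATED_PHRASES = ["interest rate", "monthly payment", "credit score", "down payment",
                       "loan term", "refinance", "closing costs"]

    def phrase_coverage(text, phrases=RELATED_PHRASES):
        text = text.lower()
        return sum(1 for p in phrases if p in text)

    def looks_suspicious(text, min_hits=2, max_hits=6):
        """Too few related phrases reads as off-topic filler; too many reads as stuffed."""
        hits = phrase_coverage(text)
        return hits < min_hits or hits > max_hits

    doc = "Compare interest rate offers and estimate your monthly payment before you refinance."
    print(phrase_coverage(doc), looks_suspicious(doc))  # 3 hits -> within the expected range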

URL and Directory Based Penalties:

Would it make sense to prevent a spam page on a good domain from ranking for anything? Would it make sense for some penalties to be directory wide? Absolutely. Many types of cross site scripting errors and authority domain abuses (think rented advertisement folder or other ways to gain access to a trusted site) occur at a directory or subdomain level, and have a common URL footprint. And cheaply produced content also tends to have section wide footprints where only a few words are changed in the page titles across an entire section of a site.

I recently saw an exploit on the W3C. Many other types of automated templated spam leave directory wide footprints, and as Google places more weight on authoritative domains they need to get better at filtering out abuse of that authority. Google would love to be able to penalize things in a specific subdomain or folder without having to nuke that entire domain, so in some cases they probably do, and these filters or penalties probably affect both new domains and more established authoritative domains.
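
As a toy illustration of how a section-wide footprint might be spotted, the sketch below (hypothetical Python with invented URLs, titles, and threshold) groups pages by directory and flags sections whose page titles are near-identical templates.

    # Toy sketch: group pages by directory and flag sections whose titles look like
    # templates where only a word or two changes. URLs and titles are invented examples.
    from collections import defaultdict
    from difflib import SequenceMatcher

    pages = {
        "/articles/cheap-widgets.html":  "Cheap Widgets - Buy Widgets Online",
        "/articles/cheap-gadgets.html":  "Cheap Gadgets - Buy Gadgets Online",
        "/articles/cheap-gizmos.html":   "Cheap Gizmos - Buy Gizmos Online",
        "/blog/why-i-like-widgets.html": "Why I Like Widgets (and You Might Too)",
    }

    by_dir = defaultdict(list)
    for url, title in pages.items():
        by_dir[url.rsplit("/", 1)[0]].append(title)

    for directory, titles in by_dir.items():
        if len(titles) < 2:
            continue
        pairs = [(a, b) for i, a in enumerate(titles) for b in titles[i + 1:]]
        avg = sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)
        if avg > 0.8:  # invented threshold
            print("Templated-looking section:", directory)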

How do You Know When You are Hit?

If you had a page which typically ranked well for a competitive keyword phrase, and you saw that page drop like a rock, you might have a problem. Another indication of a problem is inferior pages ranking where your more authoritative page ranked in the past. For example, let's say you have a single mother home loan page ranking for a query where your home loan page used to rank, but no longer does.

Textual Community:

Just as link profiles create communities, so do the type and variety of text on a page.

Search results tend to sample from a variety of interests. With any search query there are assumed common ideas that may be answered by a Google OneBox, by related phrase suggestions, or by the mixture of the types of sites shown in the organic search results. For example:

  • how do I _____

  • where do I buy a ____
  • what is like a _____
  • what is the history of ______
  • consumer warnings about ____
  • ______ reviews
  • ______ news
  • can I build a ___
  • etc etc etc

TheWhippinpost had a brilliant comment in a WMW thread:

  • The proximity, ie... the "distance", between each of those technical words, are most likely to be far closer together on the merchants page too (think product specification lists etc...).

  • Tutorial pages will have a higher incidence of "how" and "why" types of words and phrases.
  • Reviews will have more qualitative and experiential types of words ('... I found this to be robust and durable and was pleasantly surprised...').
  • Sales pages similarly have their own (obvious) characteristics.
  • Mass-generated spammy pages that rely on scraping and mashing-up content to avoid dupe filters whilst seeding in the all-important link-text (with "buy" words) etc... should, in theory, stand-out amongst the above, since the spam will likely draw from a mixture of all the above, in the wrong proportions.
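
A crude sketch of that intuition follows; it is hypothetical Python with invented word lists, simply counting signal words from each category and reporting the mix, which is the kind of proportion check the comment describes.

    # Crude sketch: count signal words from a few page-type vocabularies and report the mix.
    # The word lists are invented; the point is the proportion check described above.
    import re

    VOCAB = {
        "tutorial": {"how", "why", "step", "first", "then"},
        "review":   {"found", "surprised", "durable", "robust", "recommend"},
        "sales":    {"buy", "price", "shipping", "order", "discount"},
    }

    def category_mix(text):
        words = re.findall(r"[a-z']+", text.lower())
        return {cat: sum(w in vocab for w in words) for cat, vocab in VOCAB.items()}

    page = "First we show how to install it, then why the settings matter. Buy now with free shipping!"
    print(category_mix(page))  # {'tutorial': 4, 'review': 0, 'sales': 2}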

Don't forget that Google Base recently changed to require certain fields so they can help further standardize that commercial language the same way they standardized search ads to have 95 characters. Google is also scanning millions of books to learn more about how we use language in different fields.
