Welcome to GoogleNet!
Hitwise recently mentioned that Google controls over 1/3 of UK web traffic.
With that much usage data, if you were Google, wouldn't you use it in your relevancy algorithms?
An Army of Google Search Editors
They could easily use algorithms to detect
- sites that Google sends a lot of traffic to relative to their total traffic (comparing the ratio of search referrals to overall toolbar data)
- sites which have seen a rapid spike in traffic from Google
- sites which people quickly bounce away from (and do not later return to)
- sites which get a lot of traffic from Google but get few navigational queries
and flag anything out of the ordinary for human review (see the sketch below). Marissa Mayer has stated that Google has 10,000 reviewers.
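To make those checks concrete, here is a minimal sketch of how such flagging heuristics might be wired together. The metric names, fields, and thresholds are hypothetical assumptions for illustration only; nothing like this has been disclosed by Google.

```python
# Hypothetical sketch of the review-flagging heuristics described above.
# Every field name and threshold here is an illustrative assumption.

from dataclasses import dataclass

@dataclass
class SiteMetrics:
    toolbar_visits: int        # total visits observed via toolbar data
    search_referrals: int      # visits referred by Google search
    referral_growth: float     # week-over-week multiple of search referrals
    bounce_rate: float         # share of searchers who quickly return to the results
    navigational_queries: int  # queries naming the site directly

def flag_for_review(m: SiteMetrics) -> list[str]:
    """Return the reasons, if any, a site looks out of the ordinary."""
    reasons = []
    # Nearly all traffic arrives from search rather than directly.
    if m.toolbar_visits and m.search_referrals / m.toolbar_visits > 0.9:
        reasons.append("search-dependent traffic profile")
    # Sudden spike in traffic from Google.
    if m.referral_growth > 5.0:
        reasons.append("rapid spike in search referrals")
    # Searchers quickly bounce back to the results and do not return.
    if m.bounce_rate > 0.8:
        reasons.append("high bounce-back rate")
    # Lots of search traffic, yet almost nobody searches for the site by name.
    if m.search_referrals > 10_000 and m.navigational_queries < 50:
        reasons.append("high traffic, few navigational queries")
    return reasons
```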
Does Your Site Look Good to Google's Relevancy Algorithm?
As the web keeps getting richer and deeper, and Google increasingly uses human review for demoting spam, all the aesthetic things matter:
- domain name
- site design
- content formatting
- branding and public relations
As search evolves, so too will spam. Some spam sites will LOOK and FEEL better than most non-spam sites. And so the remote quality raters will be given more data to look at - perhaps eventually even a sample of backlinks or other related data.
False positives will occur - sites and careers built around Google without proper support stilts will crumble. Unless your site is of social significance (you are a big corporation, a non-profit organization, a government institution, an educational institution, a top blogger, an official Google partner, or YouTube/Google house content), part of the optimization process revolves around not only creating sites that pass a hand review, but also creating sites that do not get flagged for review - especially if you are a thin affiliate site.
How do you not get flagged for review?
- Build enough quality signals and direct traffic that your site looks like a real part of the web.
- Build something people keep coming back to.
- Do not make drastic changes to your site unless you are comfortable with it going under review.
How do you pass a review?
Short term, I think the aesthetic things matter a lot. Longer term, it is best if your site satisfies a few criteria:
- exclusive content that people value and keep coming back to (Google loses if they remove the best content from their index)
- a brand that people care about and search for (Google looks dumb if they do not rank your site)
- a meaningful and reliable traffic stream outside of Google (many quality signals may stem from this exposure, which will help keep your overall profile more organic)
- the ability to cause public relations harm to Google and diminish their brand value in the eyes of thousands of people (removing your site has real opportunity cost)
Usage Data for Algorithmic Site Promotion
Creating Fake User Accounts is Harder Than it Sounds
If usage data were ever used to promote sites, Google could look at regional data and promote sites based on what is popular locally. Searchers reveal their location through their IP address and the queries they search for.
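As a rough sketch, a regional boost could look something like the following. The boost formula and the 25% cap are assumptions made up for illustration, not a known ranking mechanism.

```python
# Hypothetical sketch of a regional popularity boost. The formula and
# the cap are illustrative assumptions.

def regional_boost(base_score: float,
                   site_visits_in_region: int,
                   total_visits_in_region: int) -> float:
    """Nudge a site's score upward when it is disproportionately
    popular with searchers in the searcher's own region."""
    if total_visits_in_region == 0:
        return base_score
    regional_share = site_visits_in_region / total_visits_in_region
    # Cap the boost so local popularity adjusts, rather than dominates,
    # the underlying relevancy score.
    return base_score * (1 + min(regional_share, 0.25))
```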
The Trusted Few
Google could rely on a trusted subset of users when letting usage data affect relevancy (perhaps users with six months of account history, a credit card on file via Google Checkout, and a normal email profile).
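A minimal sketch of that kind of trust filter, assuming hypothetical account fields that mirror the speculation above; none of this reflects an actual Google system.

```python
# Hypothetical filter for the "trusted few" idea. The account fields
# and the six-month/credit-card criteria are illustrative assumptions.

from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class UserAccount:
    created: datetime
    has_checkout_card: bool   # credit card on file via Google Checkout
    email_looks_normal: bool  # e.g. not a throwaway address pattern

def is_trusted(user: UserAccount, now: datetime) -> bool:
    """Only let long-standing, verifiable accounts feed the usage signal."""
    return (now - user.created >= timedelta(days=180)
            and user.has_checkout_card
            and user.email_looks_normal)
```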
Why Usage Data is Tricky
Much of the signal from usage data is likely mirrored by PageRank, so the lift might not be that great until they really refine the technology.
Some tricky parts with promoting sites based on usage data are:
- usage data is quite noisy, and
- it may not favor informational sites over commercially-oriented ones the way that PageRank does. That informational bias in the organic search results is a large part of why AdWords is so profitable.
Microsoft recently presented a paper on finding authority pages based on browsing habits.