No Effort Longtail SEO Revenues, from FindTheBest
In our infographic about the sausage factory that is online journalism, we had a throwaway line about how companies were partnering with FindTheBest to auto-generate subdomains full of recycled content. Apparently, a person named Brandon who claims to work for FindTheBest didn't think our information was accurate:
Hi Aaron,
My name is Brandon. I have been with FindTheBest since 2010 (right after our launch), and I am really bummed you posted this Infographic without reaching out to our team. We don't scrape data. We have a 40 person+ product team that works very closely with manufacturers, companies, and professionals to create useful information in a free and fair playing field. We some times use whole government databases, but it takes hundreds-of-thousands of hours to produce this content. We have a product manager that owns up to all the content in their vertical and takes the creation and maintenance very seriously. If you have any questions for them about how a piece of content was created, you should go to our team page and shoot them a email. Users can edit almost any listing, and we spend a ton of time approving or rejecting those edits. We do work with large publishers (something I am really proud of), but we certainly do not publish the same exact content. We allow the publishers to customize and edit the data presentation (look, style, feel) but since the majority of the content we produce is the factual data, it probably does look a little similar. Should we change the data? Should we not share our awesome content with as many users as possible? Not sure I can trust the rest of your "facts", but great graphics!
I thought it was only fair that we aired his view on the main blog.
...but then that got me into doing a bit of research about FindTheBest...
In the past when searching for an issue related to our TV I saw a SERP that looked like this
Those mashed sites were subdomains on trusted sites like VentureBeat & TechCrunch.
Graphically the comparison pages appear appealing, but how strong is the editorial?
How does FindTheBest describe their offering?
In a VentureBeat post (a FindTheBest content syndication partner) FTB's CEO Kevin O’Connor was quoted as saying: “‘Human’ is dirty — it’s not scalable.”
Hmm. Is that a counter view to the above claimed 40 person editorial research team? Let's dig in.
Looking at the top listed categories on the homepage of FindTheBest I counted 497 different verticals. So at 40 people on the editorial team that would mean that each person managed a dozen different verticals (if one doesn't count all the outreach and partnership building as part of editorial & one ignores the parallel sites for death records, grave locations, find the coupons, find the company & find the listing).
Google shows that they have indexed 35,000,000 pages from FindTheBest.com, so this would mean each employee has "curated" about 800,000 pages (which is at least 200,000 pages a year over the past 4 years). Assuming they work 200 days a year that means they ensure curation of at least 1,000 "high quality" pages per day (and this is just the stuff in Google's index on the main site...not including the stuff that is yet to be indexed, stuff indexed on 3rd party websites, or stuff indexed on FindTheCompanies.com, FindTheCoupons.com, FindTheListing, FindTheBest.es, FindTheBest.or.kr, or the death records or grave location sites).
Maybe I am still wrong to consider it a bulk scrape job. After all, it is not unreasonable to expect that a single person can edit 1,000 pages of high quality content daily.
Errr....then again...how many pages can you edit in a day?
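For anyone who wants to check the back-of-envelope math above, here is a minimal sketch. The inputs are the figures quoted in this post (35,000,000 indexed pages, a 40 person team, 497 verticals, 4 years, 200 work days a year); the `workload` function and its field names are just my own illustration.

```python
# Back-of-envelope check of the per-person workload implied by the figures above.
INDEXED_PAGES = 35_000_000   # pages Google reports as indexed on FindTheBest.com
TEAM_SIZE = 40               # product team size claimed in Brandon's comment
VERTICALS = 497              # categories counted on the homepage
YEARS = 4                    # roughly the time since launch
WORK_DAYS_PER_YEAR = 200     # assumption used in the text


def workload(pages, team, verticals, years, days_per_year):
    """Return the per-person workload figures as a dict."""
    pages_per_person = pages / team
    pages_per_year = pages_per_person / years
    return {
        "verticals_per_person": verticals / team,
        "pages_per_person": pages_per_person,
        "pages_per_person_per_year": pages_per_year,
        "pages_per_person_per_day": pages_per_year / days_per_year,
    }


figures = workload(INDEXED_PAGES, TEAM_SIZE, VERTICALS, YEARS, WORK_DAYS_PER_YEAR)
for name, value in figures.items():
    print(f"{name}: {value:,.1f}")
```

With these inputs each person ends up responsible for about a dozen verticals and a bit over 1,000 pages every working day, which is where the "at least 1,000" figure comes from.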
Where they lost me though was with the "facts" angle. Speaking of not trusting the rest of "facts" ... how crappy is the business information for SEO Book on FindTheBest, which says that our site launched in 2011, that we have $58,000 in sales, and that we are a book wholesaler?
I realize I am afforded the opportunity to work for free to fix the errors of the scrape job, but if a page is full of automated incorrect trash then maybe it shouldn't exist in the first place.
I am not saying that all pages on these sites are trash (some may be genuinely helpful), but I know if I automated content to the extent FTB does & then mass emailed other sites for syndication partnerships on the duplicate content (often full of incorrect information) that Google would have burned it to the ground already. They likely benefit from their CEO having sold DoubleClick to Google in the past & are exempt from the guidelines & editorial discrimination that the independent webmaster must deal with.
One of the ways you can tell if a company really cares about their product is by seeing if they dogfood it themselves.
Out of curiosity, I looked up FindTheBest on their FindTheCompany site.
They double-list themselves and neither profile is filled out.
That is like having 2 sentences of text on your "about us" page surrounded by 3 AdSense blocks. :D
I think they should worry about fixing the grotesque errors before worrying about "sharing with as many people as possible" but maybe I am just old fashioned.
Certainly they took a different approach ... one that I am sure would get me burned if I tried it. An example sampling of some partner sites...
- analytics-software.businessknowhow.com ("BusinessKnowHow ended the relationship with find the best as soon as we realized how spammy they were." - Janet Attard)
- accountants.entrepreneur.com
- acronyms.sciencedaily.com
- alternative-fuel.cleantechnica.com
- antivirus.betanews.com
- apps.edudemic.com
- atvs.agriculture.com
- autopedia.com/TireSchool/
- autos.nydailynews.com
- backup-software.venturebeat.com
- bags.golfdigest.com
- beer.womenshealthmag.com
- best-run-states.247wallst.com
- bestcolleges.collegenews.com
- bikes.cxmagazine.com
- bikes.triathlete.com
- birds.findthelisting.com
- birth-control.shape.com
- brands.goodguide.com
- breast-pumps.parenting.com
- broker-dealers.minyanville.com
- businessschools.college-scholarships.com
- camcorders.techcrunch.com
- cars.pricequotes.com
- cats.petharbor.com
- catskiing.tetongravity.com
- chemical-elements.sciencedaily.com
- comets-astroids.sciencedaily.com
- companies.findthecompany.com
- companies.goodguide.com
- compare-video-editing-software.burnworld.com
- compare.consumerbell.com
- compare.guns.com
- compare.roadcyclinguk.com
- comparemotorbikes.motorbike-search-engine.co.uk
- congressional-lookup.nationaljournal.com
- courses.golfdigest.com
- crm.venturebeat.com
- cyclocross-bikes.cyclingdirt.org
- dealers.gundigest.com
- death-record.com
- debt.humanevents.com
- design-software.underworldmagazines.com
- destination-finder.fishtrack.com
- diet-programs.shape.com
- digital-cameras.techcrunch.com
- dinosaurs.sciencedaily.com
- dirt-bikes.cycleworld.com
- dogbreeds.petmd.com
- dogs.petharbor.com
- donors.csmonitor.com
- e-readers.techcrunch.com
- earmarks.humanevents.com
- earthquakes.sciencedaily.com
- ehr-software.technewsworld.com
- fallacies.sciencedaily.com
- fec-candidates.theblaze.com
- fec-committees.theblaze.com
- federal-debt.nationaljournal.com
- fha-condos.realtor.org
- fha.nuwireinvestor.com
- financial-advisors.minyanville.com
- findthebest.com
- findthebest.motorcycleshows.com
- findthecoupons.com
- findthedata.com
- firms.privateequity.com
- franchises.fastfood.com
- ftb.cebotics.com
- game-consoles.tecca.com
- game-consoles.venturebeat.com
- gin.drinkhacker.com
- golf-courses.bunkershot.com
- gps-navigation.techcrunch.com
- gps-navigation.venturebeat.com
- green-cars.cleantechnica.com
- guns.dailycaller.com
- ham-radio.radiotower.com
- hdtv.techcrunch.com
- hdtv.venturebeat.com
- headphones.techcrunch.com
- headphones.venturebeat.com
- high-chairs.parenting.com
- highest-mountains.sciencedaily.com
- hiv-stats.realclearworld.com
- horsebreeds.petmd.com
- hospital-ratings.lifescript.com
- hr-jobs.findthelistings.com
- inventors.sciencedaily.com
- investment-advisors.minyanville.com
- investment-banks.minyanville.com
- iv-housing.dailynexus.com
- laptops.mobiletechreview.com
- laptops.techcrunch.com
- laptops.venturebeat.com
- lawschool.lawschoolexpert.com
- locategrave.org
- mammography-screening-centers.lifescript.com
- mba-programs.dealbreaker.com
- medigap-policies.findthedata.org
- military-branches.nationaljournal.com
- motorcycles.cycleworld.com
- mountain-bikes.outsideonline.com
- nannies.com
- nobel-prize-winners.sciencedaily.com
- nursing-homes.caregiverlist.com
- nursing-homes.silvercensus.com
- onlinecolleges.collegenews.com
- phones.androidauthority.com
- pickups.agriculture.com
- planets.realclearscience.com
- planets.sciencedaily.com
- plants.backyardgardener.com
- presidential-candidates.theblaze.com
- presidents.nationaljournal.com
- privateschools.parentinginformed.com
- processors.betanews.com
- project-management-software.venturebeat.com
- projectors.techcrunch.com
- pushcarts.golfdigest.com
- recovery-and-reinvestment-act.theblaze.com
- religions.theblaze.com
- reviews.creditcardadvice.com
- saving-accounts.bankingadvice.com
- sb-marinas.noozhawk.com
- sb-nonprofits.noozhawk.com
- scheduling-software.venturebeat.com
- scholarships.savingforcollege.com
- schools.nycprivateschoolsblog.com
- scooters.cycleworld.com
- smartphones.techcrunch.com
- smartphones.venturebeat.com
- solarpanels.motherearthnews.com
- sports-drinks.flotrack.org
- stables.thehorse.com
- state-economic-facts.nationaljournal.com
- steppers.shape.com
- strollers.parenting.com
- supplements.womenshealthmag.com
- tablets.androidauthority.com
- tablets.techcrunch.com
- tablets.venturebeat.com
- tabletsandstuff.com/tablet-comparison-chart
- tallest-buildings.sciencedaily.com
- technology.searchenginewatch.com
- telescopes.universetoday.com
- tequila.proof66.com
- texas-golf-courses.texasoutside.com
- tires.agriculture.com
- tractors.agriculture.com
- tsunamies.sciencedaily.com
- us-hurricanes.sciencedaily.com
- video-cameras.venturebeat.com
- volcanic-eruptions.com
- waterheaters.motherearthnews.com
- wetsuits.swellinfo.com
- whiskey.cocktailenthusiast.com
- whiskey.drinkoftheweek.com
- white-house-visitors.theblaze.com
- wineries.womenshealthmag.com
we have seen search results where a search engine didn't robots.txt something out, or somebody takes a cookie cutter affiliate feed, they just warm it up and slap it out, there is no value add, there is no original content there and they say search results or some comparison shopping sites don't put a lot of work into making it a useful site. They don't add value. - Matt Cutts
That syndication partnership network also explains part of how FTB is able to get so many pages indexed by Google, as each of those syndication sources is linking back at FTB on (what I believe to be) every single page of the subdomains, and many of these subdomains are linked to from sitewide sidebar or footer links on the PR7 & PR8 tech blogs.
And so the PageRank shall flow ;)
Hundreds of thousands of hours (eg 200,000+) for 40 people is 5,000 hours per person. Considering that there are an average of 2,000 hours per work year, this would imply each employee spent 2.5 full years of work on this single aspect of the job. And that is if one ignores the (hundreds of?) millions of content pages on other sites.
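The same arithmetic as a quick sketch. The 200,000 hours is my illustrative lower bound for "hundreds-of-thousands of hours" and the 2,000 hour work year is the assumption stated above. The seconds-per-page figure at the end is an extra derived number that divides those hours by the 35,000,000 pages Google shows indexed on the main site, so treat it as a rough indication only.

```python
# Rough check of the "hundreds of thousands of hours" claim.
CLAIMED_HOURS = 200_000        # illustrative lower bound used in the text
TEAM_SIZE = 40                 # product team size claimed in Brandon's comment
HOURS_PER_WORK_YEAR = 2_000    # average work year assumed in the text
INDEXED_PAGES = 35_000_000     # main-site pages in Google's index, quoted earlier

hours_per_person = CLAIMED_HOURS / TEAM_SIZE
years_per_person = hours_per_person / HOURS_PER_WORK_YEAR

# Extra derived figure: time available per indexed page if every claimed hour
# went into the main site alone.
seconds_per_page = CLAIMED_HOURS * 3600 / INDEXED_PAGES

print(f"hours per person: {hours_per_person:,.0f}")
print(f"full work years per person: {years_per_person:.1f}")
print(f"seconds of work per indexed page: {seconds_per_page:.1f}")
```

Even when every one of those hours is counted, that leaves well under a minute of human attention per indexed page on the main site.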
How does TechCrunch describe the FTB partnership?
Here’s one reason to be excited: In its own small way, it combats the recent flood of crappy infographics. Most TechCrunch writers hate the infographics that show up in our inboxes— not because infographics have to be terrible, but because they’re often created by firms that are biased, have little expertise in the subject of the infographic, or both, so they pull random data from random sources to make their point.
Get that folks? TechCrunch hosting automated subdomains of syndicated content means less bad infographics. And more cat lives saved. Or something like that.
How does FTB describe this opportunity for publishers?
The gadget comparisons we built for TechCrunch are sticky and interactive resources comprised of thousands of SEO optimized pages. They help over 1 million visitors per month make informed decisions by providing accurate, clear and useful data.
SEO optimized pages? Hmm.
Your comparisons will include thousands of long-tail keywords and question/answer pages to ensure traffic is driven by a number of different search queries. Our proprietary Data Content Platform uses a mesh linking structure that maximizes the amount of pages indexed by search engines. Each month—mainly through organic search—our comparisons add millions of unique visitors to our partner’s websites.
Thousands of long-tail keyword & QnA pages? Mesh linking structure? Hmm.
If we expand the "view more" section at the footer of the page, what do we find?
Holy Batman.
Sorry that font is so small, the text needed to be reduced multiple sizes in order to fit on my extra large monitor, and then reduced again to fit the width of our blog.
Each listing in a comparison has a number of associated questions created around the data we collect.
For example, we collect data on the battery life of the Apple iPad.
An algorithm creates the question “How long does the Apple iPad tablet battery last?” and answers it
So now we have bots asking themselves questions that they answer themselves & then stuffing that in the index as content?
Yeah, sounds like human-driven editorial.
After all, it's not like there are placeholder tokens on the auto-generated stuff
{parent_field}
Ooops.
Looks like I was wrong on that.
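To be clear about what this kind of thing probably looks like under the hood, here is a minimal sketch of template-driven question generation. It is my guess at the general technique, not FTB's actual code: the templates, the `render` and `unfilled_tokens` helpers and the sample record (including the battery figure) are all hypothetical. It shows how a question like the iPad one gets stamped out from a data row, and how a field missing from that row can leave a raw token like `{parent_field}` on the published page.

```python
from string import Formatter


class LeaveMissing(dict):
    """dict that leaves unknown placeholders in place instead of raising KeyError."""

    def __missing__(self, key):
        return "{" + key + "}"


def render(template, record):
    """Fill a template from one data record; unknown tokens stay as-is."""
    return template.format_map(LeaveMissing(record))


def unfilled_tokens(text):
    """List placeholder names that survived rendering."""
    return [field for _, field, _, _ in Formatter().parse(text) if field]


# Hypothetical data row; the battery value is made up for illustration.
record = {"brand": "Apple", "product": "iPad", "category": "tablet", "battery_hours": 10}

question = render("How long does the {brand} {product} {category} battery last?", record)
answer = render("About {battery_hours} hours.", record)
heading = render("More {parent_field} comparisons", record)  # no parent_field in the row

print(question)
print(answer)
print(heading, unfilled_tokens(heading))
```

A human editor would catch the leftover token on the first read. A pipeline only catches it if somebody bothered to add a check like `unfilled_tokens` before publishing.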
And automated "popular searches" pages? Nice!
As outrageous as the above is, they also include undisclosed affiliate links in the content, and provide badge-based "awards" for things like the best casual dating sites, to help build links into their site.
That in turn led to them getting a bunch of porn backlinks.
If you submit an article to an article directory and someone else picks it up & posts it to a sketchy site you are a link spammer responsible for the actions of a third party.
But if you rate the best casual dating sites and get spammy porn links you are wonderful.
Content farming never really goes away. It only becomes more corporate.
Comments
Never, ever f%ck with an actual SEO.
Recently one of my clients who monetizes FTB traffic in a couple categories and actually gets significant revenue from them discovered that hundreds of unique, hand written snippets had been copied from their site and syndicated by FTB to hundreds of sites in their network through both widgets and syndication. When confronted with this fact, FTB followed the Google Kenya model and just blamed a "rogue contractor" and promised to improve their "quality control"
... is fairly minimal in terms of the destructive impact it has on the original source & the person who had to bear true editorial costs to create the original content that was scraped.
I love how all these "at scale" platforms claim to do magic, but when they get caught being shady it is always "a contractor" who did it. The "we lost control over the contractors" part of the messaging never really makes it into the public relations spiels about the magic.
Now a days Panda prefer those links who are coming from relevant site ....
so my question is " How to find relevant directories or Article directories.?There is any specific term or any tools is there so I can search the relevant directories and Article for creat the link building".............
They've got my business listed with the correct phone number, but in Gainesville - 150 miles away.
"This listing is for [My Company Name]'s Single Location in Gainesville, Florida. It primarily operates in the Computer Systems Design and Related Services Companies industry. " They also estimate that I made $140,000 in 2011, and that my business was founded in 2007 - none of which is even close to being the truth.
However for $151.99 they offer a "Comprehensive Insight Report/Comprehensive report on credit, financial stress, and other information."
Although I'd be curious to know what the "other" information is, I get the feeling that I'd get a more accurate report from a palm reader.
Saw the Twitter prequel of this article on your wall yesterday. FindTheBest's argument sounded like another childishly optimistic statement marketing would come up with.
Slap 'em Aaron.
"....are exempt from the guidelines & editorial discrimination that the independent webmaster must deal with."
Well said!
This stuff makes me so mad. We slave away trying to publish real engaging content and garbage like that gets all the Google love. Scraping is a very attractive business model I must say. Why write and research content when we can all just scrape the web and become Adsense millionaires?
...was driven in part by a comment SerpSleuth made about a half year ago:
Thanks for this damning breakdown of FTB. Just goes to show that spam & low quality content when done at scale by big companies still works in spades. Corporate content farming is exactly right.
And Brandon might want to read up on the Streisand Effect.
Speaking of scraping, it seems like Google is increasingly being aggressive with their own scraping, displaying more and more information directly in the SERPs, rather than driving traffic to the sites from which that information was scraped. Larry Page has said himself that he wants to see Google become more of a knowledge engine, providing people with answers to their questions rather than giving them a list of 10 blue links. Recent changes like the knowledge graph and the image search update are a testament to this. Scraping seems like what all the big companies are doing, with the actual content producers being squeezed out of revenue.
Great point there. With Knowledge Graph it seems like more and more information is being pulled in from authoritative websites and used for the benefit of Google, which may or may not benefit the user. But on the topic of FTB: Hahaha! The moon is too small of a goal I suppose, I'd say that big black hole in the universe.
...see the above quoted SerpSleuth comment about skunk farm project.
Google's Andrew Hogue is pretty revealing with some of Google's intent with their Knowledge Graph program. In this video he states: "It's never going to be 100% accurate. We're not even going to claim that it is 100% accurate. We are going to be lucky if we get 70% accuracy ... we are going to provide users with tools to correct the data."
" Scraping seems like what all the big companies are doing, with the actual content producers being squeezed out of revenue."
Valid point. Who is going to actually produce content if there is nothing left to scrape?
...the only difference is going to be the distributions between value creation and value extraction. In Europe Google is already paying what is effectively kickbacks to news & media companies in France & Belgium, as well as to some ISPs like Orange. Smaller players won't enjoy those sorts of kickbacks. They will either need to get big enough to enjoy various other benefits (eg: brand bias) or operate with increasing levels of efficiency to stay in business.
Also keep in mind that Google can pit various media formats against one another: news articles vs web pages (and that can be broken down into things like blogs vs magazines vs forums vs qna sites) vs videos vs social layers vs ebooks. All the while they can keep inserting their own layers like local and shopping and so on, and buy enough complementary pieces to make a service of average to slightly above average quality with outsized margins due to their immense automated distribution via stuffing the search channel.
That distribution cuts both ways too...it increases Google's margins while making some competitors weaker. And then Google can always partner with competitors 3 through n in a vertical to weaken the lead of anyone who is strong enough that the search channel isn't enough to overwhelm them.
"noindex,nofollow" might be the only way out of the Google Box we are all trapped in...risky and costly decision to make that most will never consider, but if followed by many could force a change?
...and it is not in the index, Google will pay someone to steal it & submit it to the index.
No need to look any further than the illegal movies that keep reappearing on YouTube, though if you do, you will find the same elsewhere, again and again.
Anyone find one of these emails in their inbox recently?
What I love is that the only "reviews" of the products in this gentlemen's "vertical" on FTB are done by himself personally. Gotta start somewhere I guess:
Hey %site_owner_name%,,
My name is Mike LaFirenza, and I work for FindTheBest, an online
comparison platform that hosts a broad spectrum of comparisons raging
from Dog Breeds to Smartphones to %your_vertical_name _here%. We are
a Kleiner Perkins funded company whose network reaches over 20+
million unique visitors per month.
The reason I'm messaging you is because I think that we can help
eachother, and more importantly our users. You have an awesome site
with a lot of great information pertaining to %your_vertical_name _here%.
We have a %your_vertical_name _here% comparison that has a lot of
products in common with the ones you review on your site
(%your_vertical_name _here%.findthebest.com). If you're interested, we can
redesign this comparison tool to fit the skin of your website, and
allow your users to use the tool without leaving your domain.
Furthermore, we can offer a share of the revenue generated from visits
to this page.
Please let me know if you are interested in this. I believe that
having this comparison on your site could help us both reach a broader
audience, and help both of our users more easily find the necessary
resources to make decisions regarding %your_vertical_name _here%.
Hope to hear from you - Thanks!
Mike LaFirenza
I love how "not leaving your domain" is supposed to mean YOU get the traffic as the partner site...what good is that when you host and can influence none of the content on "your domain"??
...certainly leaves the domain ;)
And, as I showed in the images above, they will partner with multiple competing sites on the same vertical (like laptops). Initially those publisher sites will outrank FTB, but if they get a couple dozen partners all pointing links back to FTB on the same vertical, then eventually FTB will likely outrank those partner sites much of the time.
Explaining the scale before accuracy approach is the need for growth:
Eric Schmidt : "Brands are the solution, not the problem," Mr. Schmidt said. "Brands are how you sort out the cesspool."
FTB is really proving that point for him isn't it.
Would love to see if Matt has an opinion on this, we should twitter bomb him with this link.
I subscribe and read via Owl RSS, and it's 1:25 AM in my Lazy-Boy as I read this fantastic post. No one, and I do mean No One, does his homework like you do, boyo. I would rather read your scoop on the latest hypocrisy of Google than anything else on the Internet. Great job. I hope Google et al reads your articles and forwards them around internally. But I doubt they do :)