The Past, Present, and Future of [Content] Farming | Search Engine Journal

Mar 10 2011

Am I the only one who doesn’t have a problem with content farms? Based on the sheer number of sites that engage in this technique (and I’m not just referring to the high-profile ones), I’m guessing the answer is no.

However, I’m fairly sure that I’m in the minority when I say that I welcome content farms, auto-content generating bots, site scraping scripts, etc. with open arms. Most self-respecting SEOs will gasp and wince, wondering how I can possibly side with all of those hack spammers out there who are making life tough for legitimate SEO practitioners who do things the “right” way.

Some might even read this and automatically put me on their ish list and unfollow me, unsubscribe from my RSS feed, etc. But for those of you that are willing to stick around and hear me out with an open mind I offer up what I believe to be a valid explanation:

Artificial Intelligence.

Yep, I said it. Good old AI. You know, the stuff that’s mostly relegated to science fiction flicks and geeky forums like Slashdot. At this point, you might think that old Hugo has had one caipirinha too many. But please, bear with me.

At its core, Google’s search engine is a primitive form of AI. And by primitive, I mean that it doesn’t yet come close to mimicking human intelligence (although I agree with Danny Sullivan when he asserts that Big G could likely whip humans and give Watson a run for his money on Jeopardy).

That said, if you listen to Google’s lead scientists, the search engine will eventually evolve from being a basic tool for discovering information via keyword queries to an almost living and breathing entity that suggests information based on much deeper semantic and social inputs. In fact, their goal is for Google to suggest information that you didn’t even know you wanted to find or that you would be interested in.

Now back to content farming, automatic content generation, scraping, etc.

When viewed through the lens of internet history, all of these techniques are in their infancy. And sure enough, virtually all past and current attempts to execute on these techniques have proven to be clunky, spammy, and moderately useful at best (although I’ll be the first to admit that I’ve found use in an ehow.com article from time to time). Content farming in particular seems to be following a similar pattern to real-life food farming in that it started off as a fairly inefficient and disparate effort with few if any large players but has now reached a point where it is dominated by mega entities that produce at a gargantuan scale.

And the fundamental thread that ties all of these techniques together is automation. Automation of content creation. Automation of internal linking architecture. Automation of keyword research and selection. Automation of page and site-level SEO. Automation of reporting and analytics.

And make no mistake about it. There are some very bright minds in the programming world that are hard at work building smarter and more efficient forms of automation that pump out better and better content. And by “better” I mean content that is both search-engine friendly and aesthetically pleasing and useful to humans.

One might even say that what they are working on is a form of AI that will be capable of creating content that is as good as or (gasp!) better than the stuff being cranked out by most humans. In fact, I’ve heard from a very honest and well-respected peer that some folks are already cranking out machine-generated content that is capable of passing for human. Maybe not New Yorker material, but good enough.

But back to that in a moment.

The reason why I’m a fan of these unpopular (yet clearly effective) techniques is that:

  1. They help Google further refine their algorithms, which is ultimately a win/win situation for the user in me that craves better search results, and a better search experience in general
  2. They make SEO harder, and that puts interactive marketers who can consistently find success despite the shifting sands of search in a very favorable position
  3. There’s a lot that a conventional, white-hat-only SEO can learn from deconstructing the methodologies that the more successful content farms use (e.g. the ones that nobody is talking about but that continue to rake in the search engine traffic)

I’m pretty sure that I’m not the only one who feels this way, particularly about the second bullet point above. For example, Alan Bleiweiss put together an interesting article on what he believes are the main facets of Google’s latest “Farmer” update. In it, he goes on to mention how clients that took his advice and focused on things like original content/copy and internal linking architecture not only came out of this update unscathed, but in some cases, actually benefited from an increase in traffic.

I agree with virtually all of Alan’s assertions. But what he didn’t mention is that there are a lot of alleged “content farms” that also came out unscathed and even gained traffic as a result of the latest algorithm. I know because I’ve seen it first-hand (by analyzing the traffic patterns of networks set up by colleagues of mine who shall remain nameless). How is that possible? Because the very best content “farmers” (both in the publishing and e-commerce sectors) have actually worked hard to apply the very principles that Alan outlined in his article.

In other words, there are content farms out there that are challenging the very definition of the term “content farm.”

Meanwhile, Google is hard at work fashioning an engine that:

  1. Maximizes their revenue potential (let’s not kid ourselves)
  2. Rewards the right kinds of content, even if they are a little “farmy”

In the future, I expect the content farm landscape to continue mirroring real life farming trends. What I mean is that there will be a movement to farm locally (focusing on local and even hyper-local search queries) and sustainably (e.g. creating truly unique, useful, and in some cases remarkable content) as well as a return to a smaller scale of production (building out content in tight niches as opposed to catch-alls like Demand Media). In a certain sense, this is already beginning to happen. It just doesn’t make the cover of mainstream publications.

I also believe that some day in the future, machine-generated content will rival what’s created by mere mortals and it will be constructed to appeal to Google’s search machine. In the meantime, wise interactive marketers will avoid getting stuck in philosophical debates about search engine boogiemen and instead focus on ways to push the envelope and strike a balance between truly original, human content and automated, scraped, and otherwise manufactured varieties.

And you know what? As long as the content is useful to searchers, I’m fine with either or.

Written By:

Hugo Guzman | @hugoguzman

Hugo Guzman focuses on online marketing strategy for enterprise brands. He can be found at hugoguzman.com or on Twitter @hugoguzman

  • I don’t necessarily have a problem with content farms; I have a problem with the majority of the content they push. If a content farm is churning out useful, relevant information, by all means churn away. If a human writer (or a machine) can produce 100 quality articles in the span of a few days, why stop them? It’s when quality is sacrificed for quantity that it gets me.

  • What is quality? Who am I, or Google for that matter, to say what is or is not a quality article? Yours is, on one level, a quality blog post, yet when I read it there are words missing from sentences. My definition of great content is different from the next man’s.
    And there lies the rub. It should not be for any algorithm to determine but should be left to the individual to decide.
    There are apps that allow the user, the final arbiter of quality, the opportunity to determine what, in their view, is not up to scratch. They vote with their app – the website is shown the red card and results no longer include those from that website.
    Conversely, a user can determine that a website meets their quality criteria and bookmark the site. Social bookmarking though is prime territory for the backlinking spammer.
    It is time to stop whining about the results received back after a query. It is time to vote for or against particular web-pages. The ‘farmer’ update (Panda) to the Google algorithm is not the answer. Farms have evolved to satisfy the requirement for immediate answers and, in the main, they satisfy that requirement. As users we have brought them upon ourselves. We should be grateful for all the hard work put in by the share-croppers and fiefs who supply the food that we pay so little for.
    Wake up and support those that are producing your kind of ‘quality’ with a FairTrade policy of voting them up. For those that you feel are scamming the system, vote them down – however you want to do it.

  • Content farms and SEO seem to be an “if you build a better mouse trap, someone builds a better mouse” scenario. Google modifies their algorithm to keep content farms out of the top of the results, and someone then goes and re-engineers the true definition of a content farm.

    As you mentioned, there are content farms that did benefit from the latest algorithm, because they changed the definition of a content farm. Flying under the radar is always a good approach as you attract less attention, but still benefit in the long run.

    With the power of computers increasing at an alarming rate, I believe that one day automated content will rival what humans can produce. Actually, with some of the human written content of today, it wouldn’t take much for a computer to produce better material.

  • I love that last sentence, Paul!

  • I dig the perspective, humagaia. And I’m pretty sure that Google would too.

  • Saying “I like content farms because they help Google refine their algorithm”, is like saying, “I like criminals because they help the police refine their operations”, or “I like people that walk slowly on the pavement, because they help me refine my walking strategy.”

    I agree that an objective definition of ‘what quality is’ is not the responsibility of Google – but then they aren’t required to provide that. They are required to define ‘what quality is – in Google’. Which is what the algo does. If its standard of quality doesn’t match yours, you go somewhere else.

    For instance, I go to a restaurant and order a meal. It’s a well-presented meal, but the chef has served (what I consider to be) poor quality meat. His definition of quality doesn’t match mine, so I go to another restaurant from thereon. However, there’s nothing to stop someone else coming in after, finding the quality completely acceptable, and becoming a returning customer.

  • Those are not really parallel analogies, especially the first one, because building a website in a “farm” style is not against the law.

    And if you want to go the biped-analogy route, you’d have to go with something like building a running track, in that it needs to be built to work with all types of bipeds, ranging from slow walkers to world-class sprinters.

  • I hope this will help a lot of people. I will tell my friends to read this. Thanks

  • Hugo,

    I only read past the opening of this article because you’re the author – otherwise, I would have yes, jumped ship on this one 🙂 And I’m glad I stuck around. Not because you refer to one of my articles – instead because you are describing the concept of pushing the bounds of technology for positive results.

    Looking at content farms and believing they’ll never be able to generate quality content with proper IA implemented is the equivalent of people who saw the first cars not believing we could put people on the moon.

    While it’s not inevitable, with enough time, leverage and willingness, it is quite possible. Heck in my site audits for large sites, ecommerce sites, I routinely provide step by step how-to for automating the vast majority of the SEO – humans don’t have to touch every single page when it comes to the SEO aspects of a site on that scale. Haven’t for years. At least since I came up with several methods, though I doubt I was the first.

    And my methods include the IA issues – that’s already been resolved, though content farms can forget asking me for input regarding the IA on sites they churn out crappy content on. 🙂

    It’s the content that’s the most challenging because too much of it requires human emotion, passion, opinion that currently no machine comes anywhere close to. That’s the biggest factor. Next up from there is the way content from existing sites is re-purposed. If a human isn’t reading it, thinking about it, evaluating how to communicate it from scratch in “their own voice”, it’s not going to be good content. Yet. Because that also requires emotional filters machines cannot replicate. Yet.

    And until they do, content farms just need to die. Painfully. But since they don’t feel, I’ll settle for the die part.

  • Thanks for the comment and for sticking around, Alan!

    Really well said. AI is definitely in its infancy, but since technology advances on an exponential scale, don’t be surprised to see emotional, passionate, opinionated bot articles in the next decade or so.
