We’ve all seen it. Within otherwise peaceful groups, there can arise certain differences of opinion that are instantly polarizing: Coke or Pepsi. The Oxford comma. How to pronounce “gif.”

In the SEO world we have quite a few debates like that, and one of our most hot-button issues is duplicate content.

Clients often ask, “Is duplicate content something we really need to worry about?”

As with so many things in life, the answer isn’t black and white. While duplicate content isn’t as bad as some shady SEO practitioners make it out to be, it’s certainly not as good as those selling duplicate content claim—and rarely is it part of a great business’s marketing strategy.

What Is Duplicate Content?

You’d think defining duplicate content would be pretty straightforward. Instead, the answer we get from Google is like that time your ninth-grade English teacher said plagiarism could technically be any two words in the same order. Sure, it’s an explanation, but now you’re even more bewildered than before.

According to Google, “Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar.”

As someone who once said that Tom Hanks and Nicolas Cage were “appreciably similar” and accidentally started an hours-long debate, Google’s definition concerns me.

Most SEOs agree that duplicate content could potentially be defined as any of the following:

  • An article or page that is a word-for-word copy of another article or page that exists elsewhere on the internet
  • An article or page that shares at least 70 percent of its content, word-for-word, with another article or page found elsewhere on the internet
  • A syndicated article that lacks a link back to its original source indicating where the content first appeared
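
The “70 percent word-for-word” figure above is a rule of thumb, not a number Google publishes, and Google’s actual detection methods are proprietary. Still, the idea is easy to sketch. Here’s an illustrative Python snippet (the threshold and the `duplicate_ratio` helper are my own, not an official tool) that measures word-level overlap between two texts:

```python
from difflib import SequenceMatcher

def duplicate_ratio(text_a: str, text_b: str) -> float:
    """Return the word-level similarity between two texts, from 0.0 to 1.0."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    return SequenceMatcher(None, words_a, words_b).ratio()

original = "Duplicate content refers to substantive blocks of content that match other content."
reworded = "Duplicate content refers to substantive blocks of content that match other pages."

# Flag the pair if it crosses the rule-of-thumb 70 percent threshold.
if duplicate_ratio(original, reworded) >= 0.70:
    print("Likely duplicate content")
```

A real audit tool would also strip markup and normalize punctuation, but even this crude comparison shows how two “appreciably similar” pages can score well above the threshold.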

But duplicate content doesn’t only come from people getting lazy and copy-pasting stuff.

Other sources of duplicate content include:

  • Back-end technical issues that cause a site to accidentally render page contents on multiple URLs, instead of just one
  • Purchasing copy from a vendor who sells that same copy to their other clients
  • An improperly coded syndicated article or improperly generated dynamic aspect of a website

To put it simply, duplicate content can occur as a result of technical SEO errors, but it can also originate from human error on the part of writers, developers or marketing managers.

Matt Cutts—the longtime head of Google’s webspam team—said that roughly 25 to 30 percent of the internet is duplicate content. Unfortunately, he qualified his statement only by saying that duplicate content is problematic when it’s “spammy,” without really defining what “spammy” means.

The resulting confusion has been amplified by the fact that Google’s own documentation appears to have flip-flopped on what it considers “duplicate content.” In the most recently published 2017 Search Quality Raters Guidelines from Google, the term “duplicate content” isn’t used at all. Instead, Google says that “copied content” may be penalized algorithmically or manually. However, in the Help section of the Google Search Console, the term “duplicate content” is still used and defined.

How Does Google Handle Sites with Duplicate Content?

Effects from duplicate content can range from mild to severe, depending on how the content is interpreted by Google. (Outcomes are also influenced by the person who is speaking on behalf of Google’s algorithm, but more on that later.)  

Bad Outcome Number One: Removal of the Offending Page from the Index

Google isn’t in the business of being a jerk to websites. Google actually anticipates that much of the duplicate content that exists is not created maliciously. What Google really cares about is how happy users are with their search results.

Thus, Google ensures that results do not contain the same article over and over. To prevent someone researching a topic from getting five identical results, Google may simply remove articles it deems redundant from its index altogether.

Being removed isn’t necessarily harmful, nor does it put you at risk for a penalty. What it does do is dilute the amount of comprehensive, valuable content that Google thinks you have. In short, if a page or article can’t be ranked and isn’t benefiting your site’s authority, you’d better have a good reason for creating it.

Bad Outcome Number Two: Manual Penalties

Penalties are the Big Bad Wolf of the SEO world, but the notion that duplicate content will somehow automatically penalize you is mostly a jargon-inspired urban legend. Oh, jargon! What starts as a way to identify a particular concept can so quickly turn into a word that no one actually understands.

“Manual action” is the term Google uses when they penalize a website’s rankings. In SEO circles, this has become known as a “manual penalty.” What it means is that Google has determined that a site is purposefully acting outside of Google’s webmaster guidelines.

When this happens, Google decides that its algorithm isn’t catching something it should, so Google manually overrides itself to suppress a site in the rankings. The penalty is logged and brought to the attention of the site’s webmaster via their Google Search Console account.

But then what?

Google Search Console’s Help Center says, “In the rare cases in which Google perceives that duplicate content may be shown with intent to manipulate our rankings and deceive our users, we’ll also make appropriate adjustments in the indexing and ranking of the sites involved. As a result, the ranking of the site may suffer, or the site might be removed entirely from the Google index, in which case it will no longer appear in search results.”

Despite this scary-sounding edict, penalties from duplicate content are extremely rare and, as hinted in this documentation, are reserved for sites that are doing things that are clearly manipulative, such as stealing high-ranking content from another website and attempting to pass it off as their own.

Manual penalties aren’t the only way a site can drop in the rankings though.

Bad Outcome Number Three: Duplicate Content Gives an Advantage to Competitors

As a competitive person, I love SEO. Google search engine results pages are a clearly ordered list where you know who’s number one, who’s number two and so on. When you do something right, you move up the list. When a competitor does something right, they move up the list.

If Google gives an algorithmic advantage to unique content, that unique content can increase a website’s rankings. The key thing to remember is that rankings are effectively zero-sum: any increase one site sees means a decrease for another.

Understanding Algorithmic Penalties

This very purposeful shifting in rankings based on how Google works leads to what some SEO experts call an algorithmic penalty. Algorithmic penalties most often occur when a site drops significantly in search engine results after Google makes a change to how its algorithm ranks a site.

When do ranking drops become severe enough to be considered a penalty? Some use the term algorithmic penalty to mean, essentially, a “diet penalty” or “penalty light,” but for some sites, an algorithmic penalty can create much longer-lasting damage than a manual one.

You’ve probably heard of Panda penalties. Google’s Panda update boosted content quality standards and caused algorithmic penalties galore.

Google views sites with fresh, unique content in a positive light and gives them an overall boost in rankings. Because of this, the most common problem with duplicate content is that it gives an advantage to your competitors who are producing original content.

It doesn’t stop at the page or article level either. By having more quality content in Google’s index, a competitor will be more likely to rank higher for a variety of terms, especially if their site’s architecture and internal linking are strong.

There’s an important distinction to note here. In SEO terms, your competitors gaining such an advantage is not, by any stretch of the imagination, an official penalty from Google.

Still, duplicate content can, and often does, lead to lower rankings. That’s because Google’s algorithm acts just as it was designed: high-quality, original content is rewarded. Since that priority guides the entire system, ranking losses incurred as a result of duplicate content are far more common than manual penalties.

What To Do About Duplicate Content

Knowing what we know now, three things become clear: One, content that appears somewhere else on the internet is probably not something you want on your site. Two, Google isn’t setting out to severely punish you the second a sentence exists on your site that exists somewhere else. And three, duplicate content does not actively serve the best interests of a website with regard to ranking online.

Multi-Channel Marketing Conflicts

Doing digital marketing the right way means working strategically across many channels, and sometimes SEO best practices can conflict with otherwise passable ideas. Content that already appears on other websites—the kind provided by content mills or subscription services—will often be suggested as part of a well-intentioned content marketing or social media marketing strategy. But publishing digital content to gain more traffic is best done in ways that don’t put other efforts, like SEO, at risk. In other words, all copy that lives on a website’s domain should strive to be original.

As a digital marketer, I’m committed to only offering advice within Google’s guidelines and with the long-term search success of a website in mind. In reality though, we all know most businesses are teams of people with different needs and initiatives, all vying for the resources and attention they need to succeed. Sometimes, duplicate content happens and can’t be avoided.

In the event you find yourself in a position where you absolutely have to add content that could be construed as duplicate or copied content, use a noindex tag in the head of the page. The tag should look like this:

<meta name="robots" content="noindex">

This isn’t a catch-all or guaranteed workaround, but it’s better than nothing. It tells search engine robots not to add that piece of content to their index.
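
If you want to confirm a page actually carries the tag (templates and CMS plugins sometimes drop it), you can check programmatically. This is a minimal sketch using only Python’s standard-library HTML parser; the `page_blocks_indexing` helper name is my own, not part of any official SEO tool:

```python
from html.parser import HTMLParser

class NoindexChecker(HTMLParser):
    """Detects a <meta name="robots" content="noindex"> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.has_noindex = False

    def handle_startendtag(self, tag, attrs):
        # Treat self-closing <meta ... /> the same as <meta ...>.
        self.handle_starttag(tag, attrs)

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if attr_map.get("name", "").lower() == "robots" and \
           "noindex" in attr_map.get("content", "").lower():
            self.has_noindex = True

def page_blocks_indexing(html: str) -> bool:
    """Return True if the page's HTML contains a robots noindex directive."""
    checker = NoindexChecker()
    checker.feed(html)
    return checker.has_noindex

print(page_blocks_indexing('<head><meta name="robots" content="noindex"></head>'))  # True
```

Run it against a page’s HTML source after every template change, since a single deploy that strips the tag can put the duplicate page back into the index.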

Maximizing Your Brand’s Power

Zooming out even further, it’s important to consider the overall voice of your brand, which typically extends beyond your digital presence. Taking the time to develop original, thoughtful content is a cornerstone in building your brand’s voice, and ultimately, its reputation.

Trying to be too many things to too many people brings on the haters.

Successful businesses are those that understand how they’re uniquely situated to help their customers. Knowing what makes their business stand out, and carefully crafting messaging that connects that unique value to customers’ problems is what sets the great brands apart from the good ones.

Building a strong brand reputation in the communities your customers belong to will ultimately help you succeed across all marketing channels.