Duplicate Content

Duplicate content (also abbreviated as DC) refers to web content that can be accessed in identical form under different URLs on the Internet.

Duplicate content is content that appears in identical or very similar form under more than one URL, either on the same website or across different websites. Search engines like Google try to filter out duplicate content and may downgrade websites that carry (too much) of it in their index. Especially if manipulation (for SEO purposes) is suspected, pages with copied content can suffer ranking losses or even deindexation.

Why is duplicate content bad?

Search engines rate duplicate content negatively because it offers the user no added value. At the same time, every duplicate page still has to be crawled and indexed, and thus consumes resources.

Since webmasters often filled websites with duplicate content in the past (also for SEO purposes), Google started to take action against content used more than once. With algorithm changes such as the Panda update, the search engine provider ensured that pages with duplicate content were downgraded in the ranking.

What helps against duplicate content?

Duplicate content usually does not lead to an immediate downgrade by the search engine. However, since there is a risk that duplicate content will be rated negatively and no longer indexed, website operators should take a few important measures to avoid it:

301 redirects

A redirect with a 301 status code is useful for always leading the search engine and the reader to the desired page and thus bypassing old content. If, for example, a page is completely replaced by another one with a different URL (e.g. in the case of a relaunch), a 301 redirect is a good solution. That way there are not two pages with identical content; instead, visitors are led directly to the new, matching page even if they select the old URL.

Google sees this redirection as unproblematic. However, to make it as user-friendly as possible, webmasters should only redirect to pages that are an appropriate replacement for the original page.
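As an illustration, assuming an Apache server, such a relaunch redirect could be set up in the .htaccess file like this (the paths are placeholders):

```apache
# Hypothetical example: permanently redirect a replaced page to its successor
Redirect 301 /old-page.html https://www.example.com/new-page.html

# Permanently redirect a whole retired section to its new location
RedirectMatch 301 ^/old-section/(.*)$ https://www.example.com/new-section/$1
```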

Ensure the use of correct URLs

To prevent duplicate content, using consistent URLs is particularly important. Google itself advises always using one version of a web address consistently, for example only one of www.example.com/name, www.example.com/name/ or www.example.com/name/index.htm.
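One URL version can also be enforced server-side. A minimal sketch, assuming an Apache server with mod_rewrite (host names are placeholders):

```apache
# Hypothetical .htaccess sketch: redirect the bare domain to the
# www version so only one host serves the content
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]
```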

Website operators should also use Google Search Console (formerly Webmaster Tools) to specify the preferred address of a page: http://www.example.com or http://example.com, etc. The canonical tag (see below) can also help here to identify the correct page.

Google also advises using country-specific top-level domains to specify content more clearly. For example, webmasters should prefer www.beispiel.de to URLs such as de.beispiel.com.

Many content management and tracking systems can inadvertently produce duplicate content by rearranging page URLs. Through pagination or the creation of archives, the CMS may change the URL of a page (for example, example.com/text/022015 instead of example.com/text), so that the website exists under different URLs. The same applies to (automatically generated) tracking parameters, which append a snippet to the original URL. If the search engine does not handle these snippets correctly, it may treat the tracked URL as a new one and count the page twice. Webmasters and SEO experts should therefore check their CMS and analytics system for these vulnerabilities.
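As a sketch of the idea, the following Python snippet collapses tracking variants of a URL back to one canonical form. The parameter names (the utm_ prefix, gclid, fbclid) are common conventions and serve here only as assumptions:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed tracking-parameter conventions (adjust to your analytics setup)
TRACKING_PREFIXES = ("utm_",)
TRACKING_PARAMS = {"gclid", "fbclid"}

def normalize_url(url: str) -> str:
    """Strip known tracking parameters and a trailing slash from a URL."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    # Keep only query parameters that are not tracking snippets
    kept = [
        (key, value)
        for key, value in parse_qsl(query, keep_blank_values=True)
        if key not in TRACKING_PARAMS and not key.startswith(TRACKING_PREFIXES)
    ]
    # Use one consistent path version (no trailing slash)
    path = path.rstrip("/") or "/"
    return urlunsplit((scheme, netloc, path, urlencode(kept), ""))

print(normalize_url("https://example.com/text/?utm_source=mail&gclid=abc"))
# prints: https://example.com/text
```

Deduplicating URLs this way before logging or comparing them prevents the same page from being counted several times.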

Minimize duplicate content

Website operators should avoid duplicate content as much as possible and produce unique content. On many sites, individual text modules have to be reused, and occasionally even the duplication of complete pages cannot be avoided. However, webmasters should limit this as far as possible and, if necessary, point out to the search engine via a link in the HTML code that a page with the same content already exists.

In addition to self-generated duplicate content, it can also happen that other websites produce duplicate content, for instance when a website operator passes on or sells its content to different websites, or when other websites use the content without permission. In both cases, once the incident is known, website operators should ask the operator of the other site to mark the copied content with a backlink to the original or with a noindex tag. This way, the search engine can recognize which version is the original and which content it should index.

Use canonical, hreflang or noindex tags, or a robots.txt disallow

With the help of various tags in the source code, certain forms of duplicate content can be prevented. The canonical tag in the <head> area, for example, signals to Google that it should index the page to which the tag points, while the crawler should ignore the copy of this page (the one in which the tag is integrated).
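For illustration, the canonical tag sits in the <head> of the duplicate page and points to the version that should be indexed (the URL is a placeholder):

```html
<!-- In the duplicate page's head: tell search engines which URL is the original -->
<link rel="canonical" href="https://www.example.com/original-page/">
```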

The noindex meta tag tells the search engine to crawl the page but to refrain from indexing it. Unlike a disallow entry in robots.txt, the webmaster thus still allows the Googlebot to access the page and its content.
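A minimal example of the noindex meta tag, placed in the <head> of the page that may be crawled but should not be indexed:

```html
<!-- Page may be crawled, but should not appear in the index -->
<meta name="robots" content="noindex">
```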

Disallow can be used in the robots.txt file to exclude entire pages, page types or content types from crawling by Google and other search engines. The robots.txt is a file that regulates which content may be captured by a search engine's crawler and which may not; Disallow states that the crawler has no access to the defined content. Note, however, that a disallowed URL can still end up in the index if other pages link to it, because the crawler never sees a noindex instruction on a page it is not allowed to access.
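A short robots.txt sketch with hypothetical placeholder paths, blocking crawlers from duplicate views of the same articles:

```text
# Block all crawlers from print-view and archive duplicates
User-agent: *
Disallow: /print/
Disallow: /archive/
```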

The hreflang attribute can be used to indicate to search engines that a page is a translation or regional variant of another page. For example, if a domain exists under both .co.uk for the UK market and .com for the US market, hreflang signals that each page is a counterpart of the other, preventing the search engine from evaluating the pages as duplicate content.
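As an illustration of that example, each language version carries hreflang annotations in its <head> (the domains are placeholders):

```html
<!-- Both variants reference each other, so neither counts as a duplicate -->
<link rel="alternate" hreflang="en-gb" href="https://www.example.co.uk/page/">
<link rel="alternate" hreflang="en-us" href="https://www.example.com/page/">
```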
Conclusion

Duplicate content can become a problem for webmasters and SEO experts, as search engines are reluctant to spend resources on duplicate content. At the same time, Google wants to provide unique content to its users. As a result, DC can be considered negative and, in the worst case, the page can be downgraded in the ranking or, if manipulation is suspected, even deindexed. Website operators have various options to prevent or eliminate duplicate content – including clean redirects, tags in the source code and unique texts.
