18 Mar 2009

How to Detect and Stop Plagiarism

Preventing plagiarism through purely protective measures seems to be impossible. The only methods available are reactive, but at least such methods exist, and they can keep you aware of whether your articles have been plagiarized. You can then do something about it and have the offending material removed.

Using JavaScript

I mentioned defensive methods of protecting your original content from being copied. There are numerous code snippets, largely written in JavaScript, that can be inserted into your webpages to disable many browser and mouse functions. Such scripts can block the highlight and copy commands, disable the right mouse button so that the context menu never appears, and switch off various other features. However, a determined plagiarist will have no problem getting around such basic defences, switching off JavaScript in the browser being the most obvious way. These scripts are also likely to irritate normal users, who may suddenly wonder why they can no longer perform simple navigation tasks such as opening a link in a new tab. The nature of the internet is that as soon as a person lands on your webpage, all the information they need has already been downloaded and cached by their browser.

Copyscape

Copyscape is a service that protects writers and content buyers from plagiarism. Its free service lets you enter the address of one of your article pages, and Copyscape then searches for similar pages elsewhere on the web. However, the rest of their services require payment. Their Premium service allows searches for blocks of text before online publication, and includes a tracking system so you can keep tabs on any copyright infringements that require your response. You can also filter results to avoid flagging copies that you have posted to article directory websites yourself. At the time of writing this service costs what seems a low 5 cents per search, but if you have 100 articles that comes to $5 per check, and if you are checking them every month the cost soon adds up.

The alternative is to use their CopySentry service, which automatically checks for plagiarism on either a daily or weekly basis. This service starts at $4.95 a month for 10 pages plus $0.25 for each additional page. This seems poor value compared to their standard 5 cents a page, but it is useful if you don't know how to automate their API. One interesting feature is the ability to set the sensitivity of the searches, so that pages are found even if they have been slightly edited.
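To make the comparison concrete, here is a minimal Python sketch of the arithmetic using the prices quoted above. The function names are my own, and I am assuming that the $0.25 charge applies per extra page per month and that each manual check of a page counts as one paid search.

```python
# Prices quoted in the article, held in cents to avoid floating-point rounding.
PREMIUM_CENTS_PER_SEARCH = 5
SENTRY_BASE_CENTS = 495          # monthly fee covering the first 10 pages
SENTRY_INCLUDED_PAGES = 10
SENTRY_EXTRA_PAGE_CENTS = 25     # assumed to be per extra page, per month


def premium_monthly_cents(pages, checks_per_month=1):
    """Cost of manually searching every page a given number of times a month."""
    return pages * checks_per_month * PREMIUM_CENTS_PER_SEARCH


def sentry_monthly_cents(pages):
    """Cost of the automated plan: flat base fee plus a charge per extra page."""
    extra_pages = max(0, pages - SENTRY_INCLUDED_PAGES)
    return SENTRY_BASE_CENTS + extra_pages * SENTRY_EXTRA_PAGE_CENTS


if __name__ == "__main__":
    for pages in (10, 100):
        manual = premium_monthly_cents(pages) / 100
        automated = sentry_monthly_cents(pages) / 100
        print(f"{pages} pages: manual ${manual:.2f}, automated ${automated:.2f}")
```

On these assumptions, 100 pages checked by hand once a month come to $5.00, against $27.45 for the automated plan. The automated plan checks far more often, though, so the two figures are not measuring quite the same thing.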

Given the costs involved in using Copyscape, I suspect that the service is aimed not so much at writers as at buyers of written content. For example, Constant Content uses Copyscape on their site, and this makes sense as it gives buyers confidence that an article they are paying for is genuinely original. A buyer can very quickly check whether the article appears elsewhere, and may be less interested in whether it is copied in the future. As a corporate expense it may be justified, but for the freelance writer it seems expensive to me, considering that an almost identical service is available for free.

Google Alerts

Google Alerts is a new service, still in beta, that searches for a string of characters as Google crawls the net and alerts you whenever a page is found that includes a matching string. This means you can enter any long unique string from one of your articles and let Google do the same as Copyscape, but for free. You can decide how often to be informed: once a day, once a week or as it happens. You can either receive the alerts by email or add them to a feed and view them in your favourite feed reader.

For the struggling online freelance writer this strikes me as a better option. You won't be able to enter the whole text, since Google accepts a maximum of 2048 characters in its search queries. This means you should pick fairly long sentences with what seems to be a unique combination of words. Remember to put the whole string in quotation marks; otherwise Google will search for occurrences of each word anywhere in the document, rather than for the words in the order written. If you start getting too many false alerts, you can just go into your Alerts account and edit the string of text. You can also use Google's Advanced Search features, but you must then copy the whole query, including the advanced operators, from the Google search box into your Alerts search.
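As a rough illustration of picking and quoting a phrase, here is a minimal Python sketch. The function names are my own, and the "longest sentence" rule is only a stand-in for uniqueness, so check the result by eye before using it. The 2048 figure is the limit mentioned above, and nothing here talks to Google; it only prepares the text you would paste into the Alerts form.

```python
import re

MAX_QUERY_CHARS = 2048  # limit quoted in the article, not verified here


def longest_sentences(text, count=3):
    """Return the longest sentences in the text as candidate alert phrases."""
    # Naive split on sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return sorted(sentences, key=len, reverse=True)[:count]


def as_exact_phrase(sentence):
    """Wrap a sentence in quotation marks so the words are matched in order."""
    # Drop the final punctuation and any embedded double quotes,
    # which would otherwise end the quoted phrase early.
    phrase = sentence.strip().rstrip(".!?").replace('"', "")
    query = f'"{phrase}"'
    if len(query) > MAX_QUERY_CHARS:
        raise ValueError("phrase is too long for a single query")
    return query


if __name__ == "__main__":
    article = (
        "Short one. This is a considerably longer sentence with unusual "
        "wording. Mid length here."
    )
    for sentence in longest_sentences(article, count=1):
        print(as_exact_phrase(sentence))
```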

Watermarking Text

Watermarking of images and documents is a common way to embed a copyright notice that is almost impossible to remove. Indeed, adding a text watermark to an image means that anybody copying it will actually be giving you some free advertising. But watermarking text – plain text rather than a document such as Word – is very difficult. The very nature of plain text means that trying to embed any other kind of code can easily be traced and removed by any determined plagiarist. For example, there are scripts to embed invisible Unicode characters into the blank spaces between words. However, I think most writers want to concentrate on their writing and have a method that is both simple to implement as well as simple to check if and when an article has been copied.
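To show why the invisible-character approach is so fragile, here is a minimal Python sketch. It assumes the zero-width space (U+200B) as the hidden character, which is my choice for illustration; the scripts mentioned above may well use something different. The function names are hypothetical too.

```python
ZERO_WIDTH_SPACE = "\u200b"  # assumed marker character, invisible when rendered


def embed_marker(text):
    """Hide a zero-width space after every ordinary space."""
    return text.replace(" ", " " + ZERO_WIDTH_SPACE)


def has_marker(text):
    """Report whether the text still carries the hidden character."""
    return ZERO_WIDTH_SPACE in text


def strip_marker(text):
    """What a plagiarist would do: remove the hidden character in one step."""
    return text.replace(ZERO_WIDTH_SPACE, "")


if __name__ == "__main__":
    marked = embed_marker("plain text here")
    print(has_marker(marked), has_marker(strip_marker(marked)))
```

Embedding the marker takes one line, and so does removing it, which is the weakness described above.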

Look at this phrase: iťs оnly a simрle methоd. It looks ordinary enough, but copy and paste it into your wordprocessor. Notice anything strange? If you have an automatic spell checker switched on you will notice that nearly the whole phrase is flagged as incorrectly spelt. It looks perfectly ordinary, can be copied and pasted and still look perfectly ordinary, but some of the characters used are from the extended character set. The would-be plagiarist notices nothing out of the ordinary but you can now use Google Alerts to warn you if this phrase is ever used again.

The advantage of this method over simply using Google Alerts for every article is that you can generate maybe half a dozen key phrases using non-standard characters and need not worry about adding a new Alert phrase every time you post a new piece of writing. The key phrases can also be relatively short and common, so you no longer have to look for a long unique phrase in each article. This obviously won't stop the plagiarist, but it will alert you when it happens.
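If you would rather generate such phrases than assemble them by hand, here is a minimal Python sketch of the idea. It assumes Cyrillic lookalikes as the substitute characters, which is my own choice for illustration. The mapping and function names are hypothetical, and the second function is just a way to check which characters in a phrase are not plain ASCII.

```python
import unicodedata

# Latin letters mapped to visually similar Cyrillic letters (assumed mapping).
LOOKALIKES = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
}


def watermark(phrase, letters="op"):
    """Swap the chosen Latin letters for their lookalikes throughout the phrase."""
    table = str.maketrans({letter: LOOKALIKES[letter] for letter in letters})
    return phrase.translate(table)


def non_ascii_report(text):
    """List (position, character, Unicode name) for every non-ASCII character."""
    return [
        (index, char, unicodedata.name(char, "UNKNOWN"))
        for index, char in enumerate(text)
        if ord(char) > 127
    ]


if __name__ == "__main__":
    marked = watermark("only a simple method")
    print(marked)
    for index, char, name in non_ascii_report(marked):
        print(index, name)
```

Substituting only a couple of letters keeps the phrase looking normal while still making it distinct from the plain spelling. Whether a search engine treats the lookalike spelling as a different string is something to test with your own phrase before relying on it.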

And Finally

As we've seen, trying to lock your precious articles away from thieving plagiarists just doesn't work. The open nature of the internet means that many things, apart from your articles, are open to abuse. There are, however, ways of keeping track of your original content and staying informed about any acts of plagiarism. Getting the offending content removed requires a number of official letters stating your claim of copyright. These steps will be left for another article.

