Blog comment spam… What exactly is it?

How do you know whether you have become an easy target?

How do you stay ahead of the comment spammers?

I decided to put this guide together for two reasons, which I will outline below. First, though, let’s look at what blog comment spam is and why it has become such a major issue.

Blog comment spam is sometimes called spamdexing, comment spam, spam in blogs and a number of other variations on the same theme.

It exists purely because of the value of links in SEO. One of the most important factors in ranking well in the search engine results pages is having a large number of back-links pointing at your website. Google and the other search engines treat these as votes for your site and therefore count them as a powerful factor when allocating positions.

So, obviously, the more links you can get the better your site will do. (I have over-simplified here to avoid a full search engine optimisation explanation; you can read a fairly comprehensive explanation here.)

Unfortunately, when it comes to working online, many people prefer to find ways to circumvent the safeguards which have been put in place to protect the integrity of the search results. By using “black hat” techniques, many marketers break the search engines’ guidelines in order to avoid the large amount of manual work and patience that “white hat” SEO requires.

By sending massive amounts of automated spam out across the web, with the attitude that some of it will “stick”, these cowboys become the nuisance that we have all grown to seriously dislike…

Google made a major change way back in 2005 to address this issue. The theory was that adding a nofollow attribute to links in comments and trackbacks would discourage automated blog comment spam, as there would no longer be any SEO benefit from the comments. Google’s post is here: Preventing Comment Spam.
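To make the nofollow idea concrete, here is a minimal Python sketch of what "adding a nofollow attribute" to comment links could look like. It is not how Google or any blogging platform actually implements it; the function name add_nofollow is made up for illustration, and the regular expression is deliberately simple (for example, it would be fooled by a URL that itself contains "rel=").

```python
import re

# Matches an opening <a ...> tag; group 1 captures its attributes.
ANCHOR_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)


def add_nofollow(comment_html):
    """Add rel="nofollow" to every link in a comment that has no rel attribute.

    Hypothetical helper for illustration only, not a real platform API.
    """
    def fix(match):
        attrs = match.group(1)
        # Leave tags that already declare a rel attribute untouched.
        if re.search(r"\brel\s*=", attrs, re.IGNORECASE):
            return match.group(0)
        return "<a" + attrs + ' rel="nofollow">'

    return ANCHOR_TAG.sub(fix, comment_html)


print(add_nofollow('Nice! <a href="http://example.com">my site</a>'))
```

A link marked this way tells the search engines not to count it as a vote, which is exactly what removes the incentive for the spammer.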

Unfortunately there are millions of blogs which do not have the nofollow attribute (we voluntarily set our comments as dofollow in order to reward real comments but understand that we need to be more vigilant about spam because of this decision).

Blog spam software launches millions of comments per day at the web and the main problem is with the number of blogs where the owner auto-approves any comment. It is not unusual to find a blog of very low quality which has thousands of spam junk comments per post… Uggh!

Fortunately Google seeks out websites which have a disproportionate number of these low quality comment based links and penalises or even bans the websites which are using these techniques. Frustratingly though, it can take a while for them to identify some of the perpetrators and so ethically SEO’d websites can be relegated below the cowboys for periods of time.

The two reasons why I have created this guide?

#1 We are constantly batting away blog comment spam here – I am sure we will get a large number of spam comments on this post which we will have to identify and delete. If we can help others keep the spammers at bay we at least are making a small contribution to making the web a better place.

#2 I had a very interesting conversation with a local “black hat SEO” a few weeks back. This was after I had completed an evaluation on a website. This evaluation led me to a range of totally non-compliant websites created with masses of duplication and highly dubious linking strategies involving scary comment spam automation.

I will talk about #2 first. When I examined the link profile of one of this guy’s client websites I was pretty horrified by what I found.

Following the back-links to his site I found a large number of sites where he had picked up links. It was like that scene in The Matrix where Keanu Reeves first sees the world as it really is – large ugly robots tending to a crop of human battery cells! This ugly online world was overrun with robot spam from sites promoting porn, casinos, Viagra, fake Rolex watches, pirate products, Ponzi schemes and every other dodgy online activity you can think of!

Nice!

On top of that there were links from automatically approved web directories in Russian and Chinese carrying tens of thousands of spam links, and from abandoned blogs with comment auto-approve switched on and thousands of barely intelligible comments using anchor text phrases which I could not publish here…

Not really the sort of sites any business with integrity would like to be associated with.

This should give you an insight into the ethics of the comment spammer – fast, non-compliant, automated, lazy and unethical!

Now… how to recognise whether you are a comment spam target.

Anyone with a blog is a potential target – and many get suckered into approving automated spam simply because it seems so genuine and, at times, complimentary. If you approve these comments you will get more and more flooding in – guaranteed!

The very first thing you should do is to install Akismet. Akismet has “zapped” more than 40 billion spam comments so far… yes you read that correctly – 40 billion! Akismet took out 7,000 spam comments from our site in only a few months.

The next thing you should do is to add a reverse Turing test to comments – this is the CAPTCHA form which asks a commenter to identify a distorted series of numbers and letters in order to post a comment.

Trivia: CAPTCHA is an acronym for Completely Automated Public Turing test to tell Computers and Humans Apart – hahaha excellent!

You can be certain that some comment spam will still make it past these filters and this is how you can identify it.

Common spam comment formats:

  • Any short comment that says “Great post. I really enjoyed this” or any other short complimentary submission.
  • Any comment which says “I have bookmarked your site and will return”
  • Any comment which says “I have recommended this site to my brother, work colleagues, followers…”
  • Any comment which says “What sort of website are you using? I really like the design…”
  • Or this one “I disagree with what you said in the second paragraph…” without stating what it actually was
  • Any poorly spelled and grammatically incorrect comment; it has likely been put through a content spinner, which is why the synonyms are clumsy or just plain wrong
  • Any comment with links in it
  • A comment written entirely in another language
  • A nonsense series of letters – Duh!
  • Any comment which talks about how the writer has been looking for a post exactly like yours but fails to address any point you have made

Basically, if the comment does not mention anything you have written about it will almost definitely be spam. So, put away the ego – those comments praising your incredible writing skills are coming from robots! You could be almost illiterate and still be receiving high praise for your authorship!
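The checklist above can be roughly sketched in Python. This is only an illustration under stated assumptions: the function name spam_signals, the stock phrase list, the stop-word list and the eight-word threshold are all invented for the example, and a real filter such as Akismet works very differently. It simply flags a few of the patterns listed: stock praise, links, very short comments, and comments that share no meaningful words with the post.

```python
import re

# Hypothetical stock phrases, taken from the examples in the list above.
STOCK_PHRASES = (
    "great post",
    "i have bookmarked your site",
    "i have recommended this site",
    "what sort of website are you using",
)

# Small, made-up stop-word list so filler words do not count as
# "mentioning something from the post".
STOP_WORDS = {
    "that", "this", "with", "have", "your", "what", "about",
    "really", "very", "post", "blog", "site", "great",
}


def keywords(text):
    """Lower-cased words of four or more letters, minus stop words."""
    return {w for w in re.findall(r"[a-z]{4,}", text.lower()) if w not in STOP_WORDS}


def spam_signals(comment, post_text):
    """Return a list of reasons why a comment looks like spam (empty if none)."""
    signals = []
    lowered = comment.lower()
    if any(phrase in lowered for phrase in STOCK_PHRASES):
        signals.append("stock phrase")
    if re.search(r"https?://|www\.", lowered):
        signals.append("contains link")
    if len(comment.split()) < 8:  # arbitrary threshold for "short"
        signals.append("very short")
    if not keywords(comment) & keywords(post_text):
        signals.append("no overlap with post")
    return signals


post = "Akismet and CAPTCHA help filter blog comment spam"
print(spam_signals("Great post. I really enjoyed this", post))
```

None of these signals proves anything on its own; they are best treated as prompts to look more closely before approving a comment.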

Once you have weeded these out, there are a number of more sophisticated spam techniques to look out for.

Here are some of the goodies I have come across:

  • A comment which is a sentence or two taken from your actual post – this is pretty smart, as it may be a little while since you wrote it and the comment sounds good enough (you wrote it originally, after all!)
  • This one I like (Not!) – a comment which is copied from a previous comment! If you have many people commenting on your blog it is going to be difficult to remember all of the previous comments, particularly if only a sentence of the original comment is used.

I have had to trawl back through our comments to clear out a few of these after originally approving them.
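Both of these tricks come down to a sentence being reused word for word, so a rough check is possible in code. The following Python sketch is an assumption-laden illustration, not a feature of any existing plugin: the names normalise and copied_from and the six-word minimum are invented for the example. It looks for any sentence of a new comment that appears verbatim in the post or in an earlier comment.

```python
import re


def normalise(text):
    """Lower-case, drop punctuation and collapse whitespace,
    so trivial edits do not hide a copy."""
    text = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return " ".join(text.split())


def copied_from(comment, post_text, earlier_comments, min_words=6):
    """Return the sources ("post" or "comment N") that contain a
    sentence of the comment word for word."""
    sources = [("post", normalise(post_text))]
    sources += [
        ("comment %d" % i, normalise(c))
        for i, c in enumerate(earlier_comments, start=1)
    ]
    hits = []
    for sentence in re.split(r"[.!?]+", comment):
        needle = normalise(sentence)
        if len(needle.split()) < min_words:
            continue  # too short to be meaningful evidence
        for name, haystack in sources:
            if needle in haystack and name not in hits:
                hits.append(name)
    return hits


post = "Blog spam software launches millions of comments per day at the web. Akismet helps."
print(copied_from("Blog spam software launches millions of comments per day at the web.", post, []))
```

A check like this only catches exact reuse; a spun or lightly reworded copy would slip through, so it complements reading the comment rather than replacing it.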

And this one is the most sophisticated one I have come across so far.

It escapes all of the filters we have in place.

The software is able to target only search engine optimization related posts.

The comment is intelligently thought out and could easily be a real SEO comment.

Check this out:

[Image: screenshot of the blog comment spam example]

Why was I suspicious about this comment?

Even though it was well written and seemed to be congruent with what I had written about, I had a niggling feeling that something was not right. My initial thought was that it might have been lifted from a search engine optimization blog post somewhere, so I thought I would check around to see where else it was published.

To do this I put quotes around a sentence from the comment and searched for it, like this:

“Generally, SEO can be defined as the activity of optimizing Web pages or whole sites in order to make them more search engine-friendly, thus getting higher positions in search results.”

(You can do this to see where any content appears elsewhere online. The quotes mean that only results showing the exact same words in the exact same order will appear.)
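If you want to run this check often, the quoted search can be built in a couple of lines of Python. This is just a convenience sketch: the function name exact_phrase_search_url is made up, and the base address assumes Google's ordinary search page with its q parameter. It only builds the link for you to open in a browser; it does not fetch or count results.

```python
from urllib.parse import urlencode


def exact_phrase_search_url(sentence, base="https://www.google.com/search"):
    """Build a search URL that looks for the sentence as an exact phrase.

    Hypothetical helper; the base URL is an assumption and can be
    swapped for any search engine that accepts a q parameter.
    """
    # Strip any straight or curly quotes already around the sentence,
    # then wrap it in plain double quotes for an exact-match search.
    phrase = sentence.strip().strip('"“”')
    query = '"' + phrase + '"'
    return base + "?" + urlencode({"q": query})


print(exact_phrase_search_url("blog comment spam"))
```

Paste a distinctive sentence from the suspicious comment into the function, open the resulting link, and see how many other places the same words turn up.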

And guess what? This exact phrase match features over 20,000 times on the web!

Don’t tell me that this hard-working SEO has meticulously submitted or published this piece of content over 20,000 times?

Of course not!

This is a very clever version of blog spam and is one that I have not encountered before (or maybe not noticed). The spammer has managed to harvest 20,000 links so far from blogs, forums and from publishing the same piece many times on SEO websites (mainly in India).

In this particular comment the link points to a UK company. I am sure that the insurance related business which has hired the Indian SEO company, either directly or through a UK SEO supplier, would be disturbed to think that their name was associated with comment spam.

And this is the thing…

I have seen many self-proclaimed ethical SEO firms outsource a large share of the hard link building graft to cheap SEO companies elsewhere – and this makes sense. But there have to be checks and measures on how these companies are getting the links.

It is perfectly OK to publish a piece of content on multiple sites which you own with a link to a client’s website. This is totally compliant in SEO and is beneficial to the client. Your content is not going to crowd the search results anyway, as I explained in this post on duplicate content problems.

What is not OK is the spamming of blogs and forums using automated means.

If you use the tips I have outlined above your blog will be a lot more interesting as only genuine comments will make it through. A great blog has great comments which discuss aspects of what has been posted. Sometimes it is hard to distinguish the good from the automated but if we all do our best we can do our bit to make the web a less frustrating place.

Have I missed any sneaky blog spam techniques?