- rants.org - https://rants.org -

Spam Insidy.

There’s a particularly insidious kind of comment spam nowadays, one that cannot be defeated by automated measures such as captcha [1]. I’ve been noticing more and more of it on QuestionCopyright.org [2], and I’m sure other web site admins are experiencing the same wave.

I don’t know if this phenomenon has a name yet; perhaps you can tell me.

Basically, it’s comment spam written by people — real human beings, not robots — who are paid to surf the web. They scan each article as quickly as they can and then leave a “drive-by” comment. The comment is usually on-topic, more or less, but of extremely low quality, and contains a commercial link back to whoever’s paying that person to surf. This is now a business model: there are intermediaries who hook up people willing to leave links for money with companies looking to boost their search engine rankings. The intermediary charges a flat rate per comment (US $0.20 seems to be the going rate [3]), keeping a percentage and paying the rest to the surfer. The customers buy in bulk, of course, and the surfers are paid in bulk; the intermediary’s business model is based on economies of scale and smoothing things out.

Such comments now comprise the vast majority of new comments on QuestionCopyright.org. That is, we still get genuine comments at the same rate we did before — in fact, that rate may even be slowly increasing — but it’s dwarfed by the number of paid-link spams we’re getting now. It’s a total deluge. To a first approximation, all our new comments are spams, and then if you look closely there are a few hams (good comments) scattered randomly in the flood.

No Turing Test [4] can possibly solve this, because actual humans are involved. Making your captcha puzzles harder won’t help: all it does is drive up the price-per-comment a bit for the buyers, while also making it harder for legitimate commenters to leave their remarks. The only way to detect such comments is to have human editors reading and making judgements.

This has profound implications for the user interfaces by which editors filter out spam. But before we get to that, let me show you how insidious the problem is. Here are some examples, all taken recently from the same article [5] on QuestionCopyright.org. (Note that this is just the tip of the iceberg — the same thing is happening on all the articles on the site.)

This article is very
Submitted by Anonymous on Thu, 2008-06-19 12:53.

This article is very useful.I read it carefully and I agree with the main idea of the author.The opportunities that the internet offers to make our life and work simpler should be taken advantage of. It is absolutely necessary in the field of copyright.

Internet marketing [6]

That one was a pretty easy call. Even the link text (“Internet marketing”) practically screams spam.

But how about this next one?

Extremely Informative
Submitted by Sedona [7] on Mon, 2008-06-09 11:32

Thank you for this extremely informative article. I agree I don’t feel its about creativity but the publishing entities sustaining a mood of “go cautiously” and keep a big legal war chest.

Thank you,
Sedona [8]

A little harder to tell, that time. The comment doesn’t exactly say anything, but it’s not immediately clear that it’s nonsense — you have to read it somewhat carefully to figure that out. Sedona is actually a registered user of the site (note that her name is highlighted in the header line), and the link text at the bottom is just the name “Sedona” too. But the link points to www.sedona-spiritual-vacations.com.

It turned out that this comment was also pretty clearly spam: “Sedona” left similar comments elsewhere on the site, always with the same commercial link at the bottom. None of her comments said anything much, let alone responded in a meaningful way to the content of the article.

But it gets worse. Some link spam comments actually say something. The paid surfer reads the article, apparently enjoys it and has some kind of non-trivial thought about it, and leaves a halfway decent (or sometimes even better than that) comment — but still with a paid link. Like this:

Patent and CopyRight
Submitted by Anonymous on Thu, 2008-05-22 15:41

Apart from the middle man and distributors its probably Lawyers who benefit the most from these **laws**.

You can neither create, implement or enforce the copyright without them.

One only has to look at the case of RIM ( Blackberry ) in Canada who was forced to pay 600 Million to what was in essence a group of 30+ lawyers who pro bono backed a patent that was actually overthrown in court ( but not before RIM was told to pay ).

This is not an isolated example where a claim jumper has been given a ridiculous patent by the patent office ( who frequently revokes them after they are challenged )

My uncle, who is somewhat of an economist, likes to say that lawyers are one of the very few professional groups who do not contribute to the gross national product of a country.

I am not against lawyers, they are a useful bunch. But like many government employees ( which they are not ) when allowed, too many of them actively attempt to overvalue their services within the scope or measurement of a country’s forward economic progress.

Signed, A Poker Lover [9]

Not a great comment, I admit, but not completely pointless either. If there had been no commercial link, nothing else about it would have raised my suspicions. It was followed up some days later by this one:

Authors & Artists
Submitted by Anonymous on Sun, 2008-06-08 14:14

I agree very much with the previous poster. How many lawsuits are brought up about copyright a year. The millions of dollars which are thrown at law companies around the world to up hold a ‘companies’ intellectual rights is pathetic to say the least.

The only person who truely has a right to claim stack is the writer, producer artist. I qould pay my way to anyone who does work for me. If they provide a service like my electric or water company I pay them. But what do these middle men companies do? They look out for themselve and only themselve. It is time the power was taken away from the big corperations and given back to the people who really deserve to be paid. Those who created it in the first place.

Regards,

David of PC Sport Live [10]

Wow, the link spammers are following up to each other’s posts! Actually, it’s possible that the “David” of the second post is the same person who wrote the first post, even though he (or she?) portrays himself as being a different person. I’ve noticed they do that a lot. You can often tell, from a combination of the writing style and the link destination, that supposedly distinct commenters are really the same person.
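The link-destination half of that check is easy to mechanize, even if the writing-style half is not. Here is a rough sketch in Python, assuming you can export comments as (author, link URL) pairs; that data shape and the function names are made up for illustration and have nothing to do with Drupal's actual storage. It only surfaces candidates for a human to look at, it doesn't prove anything.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_link_domain(comments):
    """Map each linked domain to the set of author names that linked to it.

    `comments` is an iterable of (author, link_url) pairs. This shape is an
    assumption for the sketch, not how any particular CMS stores comments.
    """
    groups = defaultdict(set)
    for author, url in comments:
        # hostname is None for URLs without a scheme; those are skipped.
        host = urlparse(url).hostname or ""
        if host.startswith("www."):
            host = host[4:]
        if host:
            groups[host].add(author)
    return dict(groups)


def shared_domains(comments):
    """Return only the domains linked by more than one distinct author name."""
    return {
        domain: authors
        for domain, authors in group_by_link_domain(comments).items()
        if len(authors) > 1
    }
```

Two different signatures pointing at the same domain is a hint that they may be the same person, nothing more. An editor still has to read the comments.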

The next day, someone followed up to “David”’s comment:

Copyright laws
Submitted by Anonymous on Mon, 2008-06-09 16:34

Hello Everyone,

Today, I note that RedHat Founder Bob Young also weighed in on the copyright issue :

A new open source software group has added its voice to the opposition against the Conservative government’s ( Canada )impending copyright reform bill. Lulu CEO Bob Young likens the legislation to banning screwdrivers because they could be used by burglars.

……

Young said the proposed bill will cater too heavily to the content industry and not to the engineers and software developers that are going to be most severely impacted by the new laws. The proposed anti-circumvention legislation, he said, is similar to making the use and ownership of screw-drivers and pliers illegal because they can be used to commit crimes such as burglary.

Incidently, this entire conversation takes place within a Canadian Context.

Young further says,

“The copyright philosophy behind the U.S. DMCA is that it’s illegal to do what software engineers do every day of the week and what they’ll have to continue to do in order to build better technology for all companies,” Bob Young, spokesperson for the Canadian Software Innovation Alliance (CSIA) and a former founder and CEO at Red Hat Inc., said. “The biggest concern is we’re going to have law substitute for good technology. We’re crafting these laws without having anyone from the technology industry engaged in the process.”

The complete article is here itworldCanada.com [11]

An interested internet marketing guy [12]

Hmmm. It’s clumsily written, and consists mostly of quotes from someone else, but there’s real content there: that quote about the screwdriver is terrific. I have to admit that the comment actually contributes something to the site. I think it probably lies somewhere between typical paid link spam and a real comment: it might be from a person who is actually associated in some permanent way with the business being linked to, and who just makes a habit of always signing his posts with a link back to his business. Or it might be the usual kind of paid link spam. I frankly can’t tell.

I could go on and on; the above is a tiny fraction of what we’ve been getting on the site. There are obvious spams, semi-spams, maybe-not-spams, clearly-not-spams, and every gradation in between. I sometimes have to exercise real judgement when doing comment moderation; it’s not always clear what’s spam and what’s not.

In fact, it is no longer possible to divide comments into “spam” and “not spam” in an unambiguous, binary way. A given comment can now fall into both categories. Paid-link spammers are humans, and may have genuine reactions to the articles they read, even though most of the time they’re reading primarily to get just enough of a sense of the topic to be able to write a drive-by comment. Editors will just have to deal with comments on a case-by-case basis. It may be possible to apply some automatable heuristics, but they will always be imperfect, because the problem of categorization has become arbitrarily complex.
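To make "automatable heuristics" a little more concrete, here is a minimal sketch in Python of the kind of thing I have in mind. It doesn't try to decide what is spam. It only sorts comments into rough queues so that the editor's attention goes to the right place first. Everything in it is an assumption made for illustration: the queue names, the threshold, the idea that comment bodies are plain text containing bare URLs, and the `seen_counts` dictionary of how often each host has already appeared in comments on the site.

```python
import re
from urllib.parse import urlparse

# Matches bare http/https URLs in plain text (an assumption about the input).
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def linked_hosts(text):
    """Return the set of hostnames linked from a comment body."""
    hosts = set()
    for url in URL_RE.findall(text):
        try:
            host = urlparse(url).hostname
        except ValueError:
            # Malformed URL; ignore it rather than fail the whole comment.
            continue
        if host:
            hosts.add(host)
    return hosts


def triage(text, seen_counts, repeat_threshold=3):
    """Sort a comment into a rough review queue; never a final verdict.

    `seen_counts` maps hostname -> number of earlier comments linking to it.
    """
    hosts = linked_hosts(text)
    if not hosts:
        return "no-link"
    if any(seen_counts.get(h, 0) >= repeat_threshold for h in hosts):
        return "repeated-link"
    return "has-link"
```

A comment in the "repeated-link" queue deserves a look first, but as the examples above show, even those can contain something worth keeping, and a "no-link" comment is not automatically a good one.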

This phenomenon has implications for both site editors and software designers. For the former:

The implications for software designers (particularly of content-management systems such as Drupal [14], which is what we’re running on QuestionCopyright.org) are equally important:

Any other ideas, folks? It’s a whole new world out there…