If you’ve ever run your own blog you may have noticed that the Internet is a cesspool of evil-doers who set up their own WordPress blogs to run around other blogs’ RSS feeds and leech new posts to drop onto their own sites. They write no original content of their own, and oftentimes they don’t even pick which feeds to suck up: there are WordPress plugins that not only leech and publish but also hunt down RSS feeds themselves that appear relevant to whatever topics the scumbag selects. This is known as autoblogging, blog scraping and splogging. It’s so prevalent that there are actually multiple terms for the practice.
There are some autoblogging plugins, if you can believe it, hosted on the main WordPress repository and not some shady torrent site, that go into each post they leech and use sophisticated thesaurus-like tricks to reword and rephrase segments, and even somehow link tags, of each post so that the actual author has a harder time Googling down his leeched content. More often than not they also hotlink the accompanying article images, meaning that when someone views their blog, that visitor’s browser hits our server, draining a bit of bandwidth each time and adding to our CPU load. Both bandwidth, as you may have noticed if you’ve tried to load our website, and CPU are very finite resources for us. Each little hit matters; it’s not just a point of pride.
What’s even more frustrating is that these guys and their ISPs are about as responsive to requests to cease the theft as those door close buttons in elevators are (close, damnit!). You’re lucky if you can pressure the guy into removing your site from his list, and that’s a hollow victory, as he’ll keep on doing his thing to a bunch of other blogs and just tell you, “Hey, relax guy, thought you’d be cool with it.” Among their other active plugins, which undoubtedly and invariably include AdSense widgets, are advanced search engine optimization (SEO) plugins that sift through the autoblog’s unoriginal content to dynamically compile the best set of meta keywords and descriptions in order to rank higher on Google.
In short, they make money by registering a cheap domain name, installing WordPress (if their ISP didn’t already install it) and a couple of plugins, signing up for an AdSense account, and then leaving the thing on autopilot to pull in money without any further work. Google could be an ally to the likes of us here, as no one is better placed to detect this activity and downrank these sites accordingly, but Google has no way of knowing that we don’t have an arrangement with these guys, so that ain’t happening. But WordPress hosting the plugins that facilitate this? Could someone please tell me what possible redeeming value there may be in a plugin with the following in its description to warrant its redistribution through official channels? Look (but don’t download, please) at this one, which has had over 150,000 downloads so far. Just look at it. Blows my mind:
We stumble upon sites that do this to us routinely and it makes us mad and dizzy. Maybe our panties wad easily, but it’s just infuriating. Even though trying to stop each one is a pointless game of whack-a-mole, we try anyway. The real cost is not so much that people who would (possibly) otherwise be visitors to our site go to some other guy’s site and click his ads; it’s that we collectively spend a bunch of hours on each pursuit, time and energy we’d otherwise spend on writing more original content, which should be our top priority. And then we’ll go back and forth with each other on, for example, whether to install an RSS fingerprint plugin to detect these guys automatically, or whether to do a server trick that sends some sort of nasty image expressing our disapproval to the sites that hotlink our images. Then another one of us will note that that might screw up some of our RSS or Google Reader views, then another will say well no, not if you add this to .htaccess or apache2.conf, or we should try this cloaking thing or … argh. A simple cost/benefit analysis, in my estimation, says to let it slide in almost every instance, but we’re human and get emotional.
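For the curious, the hotlink argument above mostly comes down to one decision about the Referer header on each image request. Here is a rough Python sketch of that decision, just to show the logic; it is not what we actually run. In practice this would more likely live in rewrite rules in .htaccess or apache2.conf. The function names, the allow-list and the path of the "nasty" image are all made up for illustration, and treating www.google.com as the Google Reader host is an assumption.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical allow-list: our own domain plus a web-based feed reader host.
ALLOWED_HOSTS = {"mobilitydigest.com", "www.mobilitydigest.com", "www.google.com"}

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")


def is_hotlink(request_path: str, referer: Optional[str]) -> bool:
    """Return True if an image request appears to come from someone else's page."""
    # Only images are of interest; feeds and pages are left alone.
    if not request_path.lower().endswith(IMAGE_EXTENSIONS):
        return False
    # No Referer at all: desktop feed readers and privacy tools often omit it,
    # so give these requests the benefit of the doubt.
    if not referer:
        return False
    try:
        host = (urlparse(referer).hostname or "").lower()
    except ValueError:
        # Malformed Referer value; treat it as an unknown foreign host.
        host = ""
    return host not in ALLOWED_HOSTS


def image_to_serve(request_path: str, referer: Optional[str],
                   nasty: str = "/images/disapproval.png") -> str:
    """Pick the real image, or the (hypothetical) disapproval image for hotlinkers."""
    return nasty if is_hotlink(request_path, referer) else request_path
```

The empty-Referer pass-through is the bit that keeps most feed readers working; it also means anyone who strips the header gets the real image, which is part of why this is whack-a-mole.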
I was writing this just as a rant, to vent, but hey, if you’re ever in the mood, maybe Google a fragment of a recent post, maybe one posted six or seven slots down, and see if you find anyone. If you spot a content thief, of either our content or that of another blog you frequent, and you happen to want to be a lawyer when you grow up, see if you can write an intimidating or otherwise persuasive letter to make them stop, addressed to whoever you can figure out is the blog’s contact (from the blog itself or the whois data), to the ISP, or to both. Actually, you know what, don’t do that, it’s a waste of time. And please don’t tip us off either, as we’ll just go crazy whack-a-moling. Just keep reading our site instead and we’ll try to keep you entertained, informed and fully digested of mobility stuff, or whatever would belong in our mission statement were we to write one. And to those of you reading this post anywhere other than our actual site, RSS or Google Reader or whatever, please come on over to MobilityDigest.com as, though the content may be similar to what you’re reading now, we worked harder on our theme than your guy did.
That’s it, I’m done.