The Google Penguin update is Google’s newest ranking algorithm update! It is aimed at nuking spam content and websites that employ spammy tactics. But what really is spam, anyway? What counts as a spammy tactic?
As we already know, Google gave a few examples of what it considers nonsensical spun content and keyword stuffing. This stuff is obvious and, quite frankly, if the examples Google gave were not already being marked as spam, Google should be embarrassed.
One thing is for sure with Google’s Penguin update: there have been some serious backfires. Even on Google’s own blog post, lots of very respected people in the SEO industry are raising their voices in concern. Apparently Google and white hat SEO proponents have two different definitions of spam.
I use Gmail myself and, as usual, every time I open my account I am bombarded by lots of spam emails in my spam folder. Then a light bulb went on. Gmail is run by Google, so a closer look at it can give us clues as to how Google identifies and classifies spam. Why not investigate Gmail? I found some pretty interesting information on how Google identifies spam in Gmail, which suggests how it may be identifying spam in the content of websites.
First and foremost, something I was not aware of before: if you open a spam email, Google shows a little warning message telling you why the email was marked as spam. Please see below:
So I decided to learn more by clicking through the link, which leads to a Google support page that explains a bit more about how Google identifies spam emails. The support page gives the following reasons why an email is marked as spam:
Yes, the first reason that an email would be marked as spam is phishing. It is no surprise, as Google protects its users and doesn’t want them to get duped into giving up financial or personal information to scammers!
Lesson: Google doesn’t like websites with malicious software (malware), or perhaps untrusted merchants. Google is not a fan of any website that tries to infect its visitors with viruses or otherwise harm them.
Messages from an Unconfirmed Sender
Google’s second reason has to do with messages from an unconfirmed sender. This basically happens when someone sends you emails from what appears to be the official address of a trusted provider, but the sender isn’t really affiliated with that website.
Lesson: This must have something to do with hacked or hijacked websites, websites not registered with Google Webmaster Tools, or other websites with suspicious ownership details.
Messages You Send to Your Spam Box
The next reason that an email would be marked as spam is that you previously marked similar email as spam, such as persistent messages from the same sender or with identical subjects.
Lesson: In the Google Chrome browser you can block sites from your search results. Also, websites with high bounce rates and little engagement would probably fall into this category.
The fourth reason deals with similarity to suspicious email messages. Google says here “Gmail uses automated spam detection systems to analyse patterns and predict what types of messages are fraudulent or potentially harmful.” Google then lists some examples of such signals: typical spam language (get rich quick, adult, etc.), email messages from IP addresses or accounts that have previously sent spam, and suspicious attachments.
Lesson: Here’s where things get a little tricky. How does Google determine what is usually associated with spam? Of course, we know the obvious ones. But what if a legitimate website in a legitimate industry was suddenly viewed or marked as spam?
‘Community Clicks’ is the name of the first method Google points out for combating spam. As more and more users mark content as spam, Google uses that data to determine which messages are likely to be spam. We have all been trying to figure out what the sites that got hit by the Google Penguin update have in common. What can we see in all of their content? We can see their links, external and internal. What is the one thing we cannot see on those websites? We cannot see their usage statistics. Google can!
Google even writes this about its battle against spam in Gmail: “Our team of leading spam-fighting scientists uses a number of advanced Google technologies. Though in many cases our best weapon is you.” Can it be more revealing than this? Gmail admits that its best asset in dealing with spam email is its users’ feedback.
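To make the idea of user feedback as a spam signal more concrete, here is a minimal Python sketch of how reports from many users could be combined into a verdict. This is only an illustration under my own assumptions, not Google’s actual system: the `CommunityFeedback` class, the `threshold` and `min_votes` parameters, and the example domain names are all hypothetical.

```python
from collections import defaultdict


class CommunityFeedback:
    """Toy aggregator of user spam reports (hypothetical, not Google's system)."""

    def __init__(self, threshold=0.5, min_votes=5):
        self.threshold = threshold    # share of spam votes needed to flag an item
        self.min_votes = min_votes    # below this, there is too little data to judge
        self.spam_votes = defaultdict(int)
        self.total_votes = defaultdict(int)

    def record(self, item, is_spam):
        """Store one user's verdict on an item (a sender, a site, and so on)."""
        self.total_votes[item] += 1
        if is_spam:
            self.spam_votes[item] += 1

    def spam_rate(self, item):
        """Share of users who reported the item as spam, or None if unseen."""
        total = self.total_votes.get(item, 0)
        if total == 0:
            return None
        return self.spam_votes.get(item, 0) / total

    def is_likely_spam(self, item):
        """Flag the item only once enough users have voted on it."""
        total = self.total_votes.get(item, 0)
        if total < self.min_votes:
            return False
        return self.spam_votes.get(item, 0) / total >= self.threshold


feedback = CommunityFeedback(threshold=0.5, min_votes=4)
for verdict in [True, True, True, False]:
    feedback.record("reported.example", verdict)

print(feedback.is_likely_spam("reported.example"))
print(feedback.spam_rate("brand-new.example"))
```

Notice that an item nobody has voted on yet has no spam rate at all in this sketch, so it cannot be flagged until feedback comes in.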
Meanwhile, think about this: why are there so many new websites in each of the major keyword SERPs? Why are there so many websites that have never been there before? Simply because Google has no data about them yet! Yes, they have never been in the SERPs, but now that they are, Google will quickly realise (through this new Google Penguin update) whether those results are good or whether they are spam.
In addition, Gmail says that it can filter email messages by their content and look for language that is typically associated with spam. For instance, I’ve noticed that anything related to insurance, loans and pharmaceuticals gets dumped into my spam folder. What does this tell us? Gmail knows that the language commonly associated with those products is likely to be spam!
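Filtering by content can be pictured with a very small Python sketch. It learns, from a handful of labelled messages, how often each word shows up in spam, then scores new text by averaging those ratios. This is a deliberately simplistic stand-in, not how Gmail actually works, and the function names (`tokenize`, `train`, `spam_score`) and the sample messages are my own assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(labelled):
    """Return, for each word, the share of messages containing it that were spam.

    `labelled` is an iterable of (text, is_spam) pairs.
    """
    spam_counts, all_counts = Counter(), Counter()
    for text, is_spam in labelled:
        for word in set(tokenize(text)):   # count each word once per message
            all_counts[word] += 1
            if is_spam:
                spam_counts[word] += 1
    return {word: spam_counts[word] / all_counts[word] for word in all_counts}


def spam_score(text, ratios, unknown=0.5):
    """Average the spam ratios of the words in `text`; unseen words count as neutral."""
    words = set(tokenize(text))
    if not words:
        return unknown
    return sum(ratios.get(word, unknown) for word in words) / len(words)


examples = [
    ("cheap loans approved today", True),
    ("cheap insurance quote", True),
    ("lunch today at noon", False),
]
ratios = train(examples)
print(spam_score("cheap loans", ratios))
print(spam_score("lunch at noon", ratios))
```

With only three training messages the scores are crude, but the principle is the one described above: words that keep turning up in reported spam pull the score of anything that uses them towards spam, even when the sender is legitimate.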
So what about websites? Wouldn’t this knowledge of identifying and classifying spam be shared with the search and web spam teams? Do you think Matt Cutts has access to Gmail’s spam detection data? I am totally convinced he does, and that he can use the same data in this new Google Penguin update!
No doubt, Gmail is one of Google’s better-documented services. You can read and understand quite a bit about its spam filters and how they work. I totally recommend this to everyone, as it gives us a better idea of how Google identifies spam content!