Panda, facts and myths
The algorithm changes of April 11, 2011 surprised webmasters and generated many theories about what Google actually wants to penalize.
Facts
Google wants to remove pages with poor content from the visible part of the index
Poor content means a text or image that actually has few independent backlinks.
A site is penalized as a whole if some of its pages are considered poor content
This has been confirmed officially. The penalty can be mild or severe depending on the proportion of low-content pages.
Thus pages that were at the top of the results may lose several positions and appear behind links to pages of lesser interest, or even beside the point, simply because the site is penalized.
The penalty is almost irreversible
It is not individual pages that are penalized; a negative score is assigned to the whole site. Once its traffic is lost, the site cannot recover, because recovery depends on the number of backlinks, and with less traffic it gets even fewer backlinks.
To recover, add new pages with rich content to improve the score of the site, and also improve the poor-content pages that have no backlinks. Objectively, "poor content" is what gets a site pandalized. (See the article on the Panda algorithm.)
Things are different when the site has duplicate content, because that triggers a different penalty.
Google knows that it unfairly penalizes sites
Google recommends: "combine shallow pages to make a more useful content." Nobody combines spam to make useful content, so the advice is directed at genuine webmasters.
Panda has one purpose: to fight spam. To prevent spammers from progressing by trial and error, it penalizes the entire site, which makes it difficult to know which part is targeted, and it does not lift the penalty when the content considered poor is deleted.
By penalizing an entire site for part of its content, Google also knows that it may penalize quality content.
A site can be penalized when its content is copied
A side effect is that sites whose content is often copied have been affected by this change, since Google is often unable to distinguish the original from the copy. This has shocked webmasters.
Normally, the algorithm should penalize sites that mirror content from other sites. But it often confuses the original and the copy, and the former is taken for the latter.
This happened even to popular sites like cultofmac.com.
A site can be penalized for an earlier reason
A site may have lost much of its traffic at the Panda update for a cause that has nothing to do with content quality, as confirmed by Matt Cutts.
The site had already received a negative signal, for example for having placed links to a link farm or used text link ads, and had received a negative score with no effect on traffic. The effect appeared on April 11, when this score was combined with other unfavorable signals.
Panda is not a change in the algorithm but a different program
It is a program that is run at regular intervals and considers sites on a different basis, trying especially to determine the usefulness of the pages and their interest to the user.
But Google said at SMX in June 2014 that it uses more than 500 algorithms to rank pages, so the concept of a general algorithm loses its meaning.
Google filed two patents related to Panda
Patent 8,190,537, filed October 31, 2008, describes how, starting from the characteristics of a set of pages, one can recognize these features in other pages and rank them with the first set. This is what Panda does: by finding the characteristics of sites without useful content in a new site, it deduces that the new site has no useful content.
If the pages of a site are similar to those of sites without any meaningful content, the site will be penalized even if its content is useful.
It goes without saying that if all the pages of a site are built on the same template, the site is more likely to be affected by Panda.
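The idea of recognizing the traits of known low-value sites in a new site can be illustrated with a minimal sketch. It is not the method of the patent: the feature names, the functions `low_value_overlap` and `looks_low_value`, and the 0.5 threshold are all hypothetical, chosen only to show how a useful site that shares a template with poor sites could end up flagged.

```python
# Hypothetical illustration only: feature names, function names and the
# threshold are assumptions, not taken from the patent or from Google.

def low_value_overlap(site_features: set[str], low_value_features: set[str]) -> float:
    """Return the share of a site's features that also appear in the
    profile learned from sites without useful content."""
    if not site_features:
        return 0.0
    return len(site_features & low_value_features) / len(site_features)


def looks_low_value(site_features: set[str], low_value_features: set[str],
                    threshold: float = 0.5) -> bool:
    """Flag the site when its overlap with the profile reaches the threshold."""
    return low_value_overlap(site_features, low_value_features) >= threshold


# Profile built (by assumption) from sites already judged to lack useful content.
LOW_VALUE_PROFILE = {"single_template", "short_text", "few_inbound_links"}

# A site with useful content that shares the same template and short pages.
useful_site = {"single_template", "short_text", "original_research", "many_inbound_links"}

print(low_value_overlap(useful_site, LOW_VALUE_PROFILE))  # 2 shared features out of 4
print(looks_low_value(useful_site, LOW_VALUE_PROFILE))
```

In this toy example the useful site is flagged only because of the features it shares with the profile, which is the risk described above.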
Patent 8,682,892, granted March 25, 2014, by the same engineer behind the Panda algorithm, describes how a global score is defined for a group of resources and applied to the pages of the group, modifying the score given by the main algorithm.
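A group-level score that modifies page scores can be sketched as follows. This is only an illustration under stated assumptions: the `Page` class, the `apply_group_score` function, the sample scores, and the choice of a simple multiplication are all hypothetical and not taken from the patent text.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    base_score: float  # hypothetical score coming from the main ranking algorithm


def apply_group_score(pages: list[Page], group_factor: float) -> dict[str, float]:
    """Apply one factor, computed for the whole group (the site), to every page.

    Assumption: the modification is a plain multiplication. The point is only
    that the same site-wide value changes the score of each page in the group.
    """
    return {page.url: page.base_score * group_factor for page in pages}


site_pages = [Page("example.com/a", 8.0), Page("example.com/b", 2.0)]

# A factor below 1 lowers every page of the site, including the strong one.
adjusted = apply_group_score(site_pages, group_factor=0.5)
print(adjusted)
```

With a site-wide factor, a page that ranked well on its own merits drops together with the weak pages of the same site.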
Panda has reduced spam but not improved results
Panda has effectively removed part of the spam, but only part, because we still see the first results pages occupied by commercial sites that do not necessarily correspond to what is wanted. For example, the query "free software" often leads to "paid software with free trial."
In general, results were not improved, and it is still difficult to find what we are looking for when it does not correspond to just a few keywords.
Many small sites have been penalized even though they offer unique content. This was partially repaired with the May 20, 2014 update.
Myths
Myth: Panda evaluates the quality of a site based on its content
This myth has been deliberately maintained by Google: Panda is supposedly an algorithm that penalizes pages with poor or shallow content. This implies that the engine scans the pages to assess their content, which is totally false.
In fact, Panda compares the number of completely independent links to the pages of a site with the number of queries displaying these pages, and defines a ratio. The lower the ratio, the more the site is penalized. (This is described in patent 8,682,892.)
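The ratio can be illustrated with a short sketch. The function names, the sample data, and the rule used to decide which links count as independent (distinct external source sites, each counted once) are assumptions made for this example; the actual counting rules are not given in this article.

```python
def count_independent_links(links: list[tuple[str, str]], site: str) -> int:
    """Count the distinct external sites that link to `site`.

    Assumption: links from the site to itself are ignored, and several
    links from the same source site count only once.
    """
    return len({source for source, target in links if target == site and source != site})


def link_query_ratio(independent_links: int, reference_queries: int) -> float:
    """Ratio of independent links to the queries that display the site's pages."""
    if reference_queries <= 0:
        raise ValueError("the ratio is undefined without queries")
    return independent_links / reference_queries


# (source site, target site) pairs, invented for the example.
links = [
    ("a.example", "mysite.example"),
    ("a.example", "mysite.example"),       # same source, counted once
    ("b.example", "mysite.example"),
    ("mysite.example", "mysite.example"),  # internal link, ignored
]

independent = count_independent_links(links, "mysite.example")
ratio = link_query_ratio(independent, reference_queries=8)
print(independent, ratio)
```

A site shown for many queries but linked from few independent sources gets a low ratio, which under this reading means a heavier penalty.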
It is a fact that the method favors sites that have more genuine backlinks, and to have them you need original and interesting content. But this indirect effect may have various causes: one can get a lot of links to banal content through promotion. It is popularity rather than quality that is taken into account.
Myth: A group of Google employees defined the quality of a site
The Panda algorithm analyzes a site based on criteria defined by a group of employees who were presented with a set of sites and decided which were of good quality and which were not.
Sites are then penalized when they deviate from this preset standard, regardless of other criteria.
This study led to a formula that establishes a vague correlation between a score and quality, but nothing in the algorithm is intended to analyze the quality of pages.
Myth: A site is penalized because it displays too much advertising
The number of ads on a page is not taken into account by the search engine's algorithm. Besides, Google's AdSense service allows 6 units to be displayed on one page.
Google never penalizes a site because it displays too many ads. Matt Cutts confirmed at PubCon 2011 that ads are not a direct ranking factor in Panda.
A page may be penalized because it has too little content (perhaps a single sentence) and is filled almost exclusively with advertising. There is a difference between 3 ads that cover 90% of a page, above the fold or not, and 3 ads that cover 10% of a page.
Myth: Panda was made against a few content farms
Several needs have been combined in this new algorithm. Perhaps the firm was annoyed by the arrogance of companies like Demand Media, which we saw too often in the media (but which are no longer heard from since Panda), and by the joke about the yacht named Adsense, but the update has a more general and long-term purpose.
It has affected 14% of sites in English, which means millions.
It is likely that content farms served as a sample, since several versions of Panda were launched until they had all been caught.
The future of search engines is not compatible with pages without original content. One can expect Google to reduce the number of information sources, or to substitute itself for them.
Myth: A Gmail account can penalize a site
If you publish a newsletter via Gmail and a significant number of subscribers do not open the messages when they are received, or report them as spam, the site of the newsletter receives a penalty.
But this is vehemently denied by Matt Cutts. A request for reconsideration is not limited to what the webmaster reported: it may cancel a penalty for something different!
Cult of Mac, once pandalized because its content was copied, got a whitelist entry.