Fighting fraud with fractions: New algorithm weeds out fake product reviews automatically


How often do you check the comments and star rating a product has received before purchasing it?

If you’re like most of us, the answer is nearly always. On the Web, positive or negative product reviews are often the deal breaker, or deal maker, for potential customers, which makes faking them big business. To combat the growing market for fraudulent user reviews, researchers at the University of Illinois at Chicago, with backing from Google, have developed an algorithm, dubbed GSRank, for weeding out fraudulent reviews.

Reviews from novice fraudsters are easy to catch: the reviewer creates a new account specifically to vilify or praise a single product. More sophisticated operations use multiple accounts working in tandem to review multiple products. Ironically, it is this collaborative effort that allows the algorithm to identify the fraudulent reviewers.

Looking at groups of reviews turns out to be the key. When a competitor sets out to smear a rival, the fraudulent reviews will likely come from a cluster of accounts working together, called a “spammer group.” The same grouping patterns appear in fraudulent “good” reviews meant to bolster the profile and desirability of a company’s own product.

Drilling down further, the researchers found eight signals for false reviews, which serve as the foundation for the algorithm that ultimately flags the fraudulent reviews. They are:

1. Group Time Window – A spam group will tend to publish its reviews within a short window of time.

2. Group Deviation – Reviews that deviate significantly from the others on a product may have a higher probability of being fake. 

3. Group Content Similarity – Call it lazy or call it efficient: spammers copy reviews from within their own spam group. After all, a single review doesn’t pay much, and extra time spent writing one is time that could be earning money elsewhere.

4. Group Member Content Similarity – Like Group Content Similarity, a single reviewer may save time by reusing, with slight edits, a review for a similar product.

5. Group Early Time Frame – The earliest reviews will tend to have the biggest effect on swaying the decisions of potential customers. Accordingly, companies are inclined to hire spammers early in the product’s existence, so fraudulent reviews will typically be among the first to review the product or service.

6. Group Size Ratio – The size of a spam group with respect to the total number of reviewers will also be a sign of spammers.

7. Group Size – The larger a spam group, the more damaging its reviews. It is also less plausible that the coordination is coincidental: independent, genuine reviewers are unlikely to post at the same time, so the bigger the group posting in unison, the stronger the signal that its reviews were orchestrated.

8. Group Support Count – A group of potentially fraudulent accounts is far more likely to have reviewed the same products than a group of genuine accounts.
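To make the idea concrete, here is a minimal sketch of how three of these signals (time window, content similarity, and early time frame) could be computed and combined. The function names, the 0-to-1 scaling, the thresholds, and the simple averaging are all illustrative assumptions; the paper’s actual GSRank formulas are not reproduced here.

```python
from difflib import SequenceMatcher

# Hypothetical signal functions, each scaled to [0, 1], higher = more
# suspicious. Post times are represented as integer day indices, with
# day 0 being the product's launch. All constants are assumptions.

def group_time_window(post_days, max_window=30):
    """Signal 1: a group posting within a short span looks coordinated."""
    span = max(post_days) - min(post_days)
    return max(0.0, 1.0 - span / max_window)

def group_content_similarity(texts):
    """Signal 3: average pairwise similarity of the group's review texts."""
    pairs = [(a, b) for i, a in enumerate(texts) for b in texts[i + 1:]]
    if not pairs:
        return 0.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

def group_early_time_frame(post_days, max_lead=180):
    """Signal 5: reviews posted soon after launch are more suspicious."""
    return max(0.0, 1.0 - min(post_days) / max_lead)

def suspicion_score(post_days, texts):
    """Naive combination: the plain average of the individual signals."""
    signals = [
        group_time_window(post_days),
        group_content_similarity(texts),
        group_early_time_frame(post_days),
    ]
    return sum(signals) / len(signals)
```

A group that posts near-duplicate reviews within a few days of launch scores close to 1 under this sketch, while scattered, dissimilar, late reviews score close to 0. A real system would learn the weights and thresholds from labeled data rather than hard-coding them.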

Until now, mining for and manually detecting false reviews has been a laborious and expensive process. With the researchers’ GSRank algorithm, that process can be largely automated, leaving companies that try to game the system with a formidable hurdle.