mFilterIt Blogs

Identifying Click Spam Deterministically

Among the techniques fraudsters use to commit ad-fraud, Click Spam is the most common SIVT (Sophisticated Invalid Traffic) method used to spoof performance. Being the most common technique, it accounts for 40-50% of the marketing dollars lost to ad-fraud.

So how do we tackle Click Spam deterministically?

There are two main tests that are carried out on any campaign to identify Click Spam and its impact.

i) Click-Install Time Series

ii) Outlier Publishers

i) Click-Install Time Series analysis: In this first, basic step, the time from click to install is analysed to understand its pattern over a period of time. In any genuine traffic source, the gap between the click and the install should not be wide. A user clicks on an ad and then installs the app. It is not normal for a user to click on a campaign and install the app only after a considerable gap.

By contrast, in a bogus traffic source, the installs show an abnormal distribution, which implies that users are installing apps long after they click on a campaign or an advertisement.

At scale, this is not plausible. One may argue that a user could have seen the campaign on the go and decided to install the app later, in spare time. Or a user may discover an app while browsing for something else during the day and decide to install it later that evening. Yes, all these scenarios are real and can result in an abnormal distribution in a time series analysis. But they cannot happen in large volumes. These are unique, isolated behaviours that cannot be generalised to the masses.
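The time series check described above can be sketched in a few lines of Python. This is only an illustration, not mFilterIt's implementation: the one-hour gap (max_gap), the 30% cut-off (max_late_share), the function names and the sample records are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def late_install_share(installs, max_gap=timedelta(hours=1)):
    """Return {publisher: share of installs whose click-to-install gap exceeds max_gap}.

    `installs` is an iterable of (publisher, click_time, install_time) tuples.
    The one-hour default for max_gap is an assumed value for illustration.
    """
    totals = defaultdict(int)
    late = defaultdict(int)
    for publisher, click_time, install_time in installs:
        totals[publisher] += 1
        if install_time - click_time > max_gap:
            late[publisher] += 1
    return {p: late[p] / totals[p] for p in totals}


def flag_suspicious(shares, max_late_share=0.3):
    """Return publishers whose share of late installs is above max_late_share (assumed cut-off)."""
    return sorted(p for p, share in shares.items() if share > max_late_share)


# Hypothetical sample data: pub_a has one isolated late install, pub_b has mostly late installs.
base = datetime(2024, 1, 1, 12, 0, 0)
sample_installs = [
    ("pub_a", base, base + timedelta(minutes=2)),
    ("pub_a", base, base + timedelta(minutes=5)),
    ("pub_a", base, base + timedelta(hours=6)),
    ("pub_a", base, base + timedelta(minutes=1)),
    ("pub_b", base, base + timedelta(hours=9)),
    ("pub_b", base, base + timedelta(hours=14)),
    ("pub_b", base, base + timedelta(minutes=3)),
    ("pub_b", base, base + timedelta(hours=20)),
]

shares = late_install_share(sample_installs)
suspicious = flag_suspicious(shares)
print(shares)
print(suspicious)
```

In this sample, a single late install does not push pub_a over the cut-off, which mirrors the point that isolated behaviours are real but rare. Only pub_b, where late installs dominate, is flagged for further checks.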

ii) Outlier Publishers: Data can tell almost everything. However, the Click to Install Time analysis alone cannot deterministically distinguish genuine installs from fake ones. Other factors must also be taken into account before Click Spam sources are established. For this, it is essential to identify the outlier publishers.

A baseline analysis is done by studying the click rates of the different publishers running a campaign. Logically, the app targets similar users who show more or less the same behaviour. This means the publishers should also see the same kind of behaviour on their campaigns. A baseline analysis helps to understand the expected genuine clicks / installs on a campaign. Historical data analysis is also useful in establishing a baseline.

Once the baseline is established, the click rates achieved by the various publishers are plotted against it. Publishers are not expected to fall exactly on the baseline. Hence, a range of tolerance is defined using a proprietary algorithm that factors in several parameters. If a publisher falls within this range, it is still delivering valid traffic. However, if a publisher shows performance far beyond this range, it is detected as an outlier, which is definitely resorting to click spam to spoof its performance. No publisher has a magic wand to achieve results substantially different from those of other publishers.
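The baseline-and-tolerance idea above can be illustrated with a short Python sketch. The actual tolerance range comes from a proprietary algorithm that is not described here, so this example swaps in a simple, well-known stand-in: the median as the baseline and a band of three median absolute deviations around it. The metric (clicks per install), the multiplier and the sample figures are all assumptions.

```python
from statistics import median


def find_outlier_publishers(clicks_per_install, tolerance=3.0):
    """Flag publishers whose metric falls outside a tolerance band around the baseline.

    Stand-in method (an assumption, not the proprietary algorithm):
    baseline = median of all publishers, band = baseline +/- tolerance * MAD.
    If every publisher has the same value, MAD is 0 and nothing is flagged.
    """
    values = list(clicks_per_install.values())
    baseline = median(values)
    # Median absolute deviation: a spread measure that one extreme publisher cannot distort much.
    mad = median([abs(v - baseline) for v in values])
    low = baseline - tolerance * mad
    high = baseline + tolerance * mad
    outliers = sorted(
        p for p, v in clicks_per_install.items() if v < low or v > high
    )
    return baseline, (low, high), outliers


# Hypothetical clicks-per-install figures for five publishers on one campaign.
sample_rates = {
    "pub_a": 95,
    "pub_b": 110,
    "pub_c": 102,
    "pub_d": 98,
    "pub_e": 2400,
}

baseline, band, outliers = find_outlier_publishers(sample_rates)
print(baseline, band, outliers)
```

The median and MAD are used here because a single spamming publisher would drag a plain average and standard deviation towards itself and could hide inside the widened band. A production baseline would also draw on historical data, as noted above.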

Conclusion: Campaign analysis helps to determine the click spam fraud rate and its impact unambiguously. Taken together, these two tests identify the sources delivering invalid traffic, which is a direct dollar loss for the advertiser. Only by combining the Click to Install Time analysis with the identification of Outlier Publishers can mFilterIt deterministically pinpoint the sources that resort to Click Spam to fake performance and trick advertisers into paying for non-performance.

Let’s engage in a detailed conversation on the Click Spam ad-fraud technique and how it’s impacting brands that are bleeding marketing dollars. Connect with me by writing to