Automatic detection of click fraud in online advertisements
Abstract
Advances in Internet technology, together with its growing accessibility and availability, have accelerated the growth in the number of Internet users over the last decade. This has made online advertising a popular venue for many companies to market their products and services. Today, online advertising is one of the most important sources of revenue for many large enterprises. In online advertising, an advertiser pays a broker (e.g., Google, Yahoo), which normally operates a search engine, to post its advertisement on any appropriate publisher site. The publisher earns revenue from the broker for each click on an advertisement posted on its site, while the advertiser is charged for that click. Thus, an excessive number of clicks can quickly exhaust a rival company's advertising budget and drive it out of the competition for advertising space. At the same time, each click adds revenue for the publisher. These incentives motivate click fraud: the malicious creation of fraudulent clicks, with no real interest in the products or services being advertised, in order to increase revenue or drive away competitors. Identifying click fraud is a difficult problem because of the dynamic nature of click behavior, some of which is generated by humans and some by automated software called bots. Previous work has attempted to identify click fraud using various techniques, but it tends to be limited by the types of data used, the way the data are processed, or assumptions that do not always hold. This thesis presents an approach to automatically detecting click fraud in online advertising. The approach uses a mathematical theory of evidence to estimate the likelihood that a click is fraudulent or genuine, using web log data of users' activities on the advertiser's website.
One advantage of the proposed approach is that the likelihood can be computed for each incoming click as it arrives. The belief is therefore computed online, which fits well with the dynamic behavior of users. The thesis describes the approach and evaluates its validity using two real-world case studies. We believe the approach is general in that it can be applied to any scenario.
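
To make the idea of combining evidence per click concrete, the following is a minimal sketch, not the thesis's implementation. The abstract does not name the theory it uses; the sketch assumes it is Dempster-Shafer theory and applies Dempster's rule of combination over the two hypotheses "fraud" and "genuine". The evidence sources (`short_visit`, `repeated_ip`) and all mass values are hypothetical and chosen only for illustration.

```python
from itertools import product

# Frame of discernment: a click is either fraudulent or genuine.
FRAUD = frozenset({"fraud"})
GENUINE = frozenset({"genuine"})
EITHER = FRAUD | GENUINE  # mass left uncommitted (ignorance)


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += wa * wb
    if 1.0 - conflict < 1e-12:
        raise ValueError("totally conflicting evidence cannot be combined")
    # Renormalise so the remaining masses sum to 1.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


def belief(m, hypothesis):
    """Belief in a hypothesis: total mass of all subsets of it."""
    return sum(w for s, w in m.items() if s <= hypothesis)


# Hypothetical evidence extracted from web log data for one click.
short_visit = {FRAUD: 0.6, EITHER: 0.4}
repeated_ip = {FRAUD: 0.5, GENUINE: 0.2, EITHER: 0.3}

# Online update: start from total ignorance and fold in each
# piece of evidence as it becomes available.
running = {EITHER: 1.0}
for evidence in (short_visit, repeated_ip):
    running = combine(running, evidence)

fraud_belief = belief(running, FRAUD)
genuine_belief = belief(running, GENUINE)
```

Because the rule is applied one piece of evidence at a time, the running mass function can be updated whenever a new click or log entry arrives, which is one way to obtain the per-click, online computation described above.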