A fraud detection tool in E-auctions

Due to the rapid growth of online auctions, fraudsters have taken advantage of these platforms to bid in their own auctions in order to raise prices (a practice called shilling). Innocent bidders are forced to pay higher prices than they would otherwise have paid. This has created the need to design and implement a shill detection algorithm. To address the shilling problem, we designed a shill detection algorithm integrated with an online auction. The algorithm was tested on the internet and proved effective, and the short detection time indicated that it can work in real time on e-auctions with a large user base. This method can be used as a technique to eliminate shilling.


INTRODUCTION
The popularity of online auctioning has grown meteorically since the late 90s. This attraction is due to their convenience, low cost and ability to reach large (even worldwide) audiences (Read et al., 2006). However, despite the advantages, there are many problems inherent in auctioning online. Shill bidding is the hardest type of fraud to detect because any user can easily register in an auction system under a false identity to bid on his own selling or buying items, or multiple users can form a group to bid on each other's items under the regulations of online auctions (Read et al., 2006).
A shill is a person who pretends to be a legitimate buyer and feigns enthusiasm for an auctioned item by bidding up the auction price. The role of a shill is typically played by an associate of the seller. In some cases, it can also be played by the seller himself, who poses as a legitimate buyer under a fake online user ID. The ultimate purpose of employing shills is to trick legitimate buyers into paying more than they would if there were no auction fraud (Dong et al., 2009).
While shilling is recognized as a problem (Bhargava et al., 2005), established means of defence against shills did not work in live auctions and did not focus on some important shill bidding behaviours (Chau et al., 2009; Dong et al., 2009), for example, multiple consecutive bids by the same user and bidding under different identities. The advent of online auctions such as eBay, Amazon and ubid has made shill bidding much more exploitable. This is because it is relatively simple for a seller to register under many aliases and operate in rings with impunity (Read et al., 2006). Furthermore, as bidders are not physically present, it becomes much easier for a shill to anonymously influence the bidding process. This study provides answers to the following business question: "How best can we design an algorithm that effectively detects and deters shill bidding with a high level of accuracy?" Hence we examine shill behavior in the online setting and present an algorithm to detect the presence of shill bidders in English auctions. The algorithm examines bidding information across several auctions and produces a score indicating the likelihood that a bidder is engaging in shill behavior. The algorithm is able to prune the search space when estimating bidders' likelihood of being shills. This has significant practical and legal implications for commercial online auctions (such as eBay) where shilling is considered a major threat.

RELATED WORK
Trust management in online auction systems
Xu (2008) presented a Multi-Agent Trust Management (ATM) framework for online auctions. The shill inference procedure was embedded in the security agent of ATM. Xu (2008) introduced a formal model checking approach to detect shilling behaviors, especially competitive shilling behaviors (Cheng, 2007). Wood (2003) statistically analyzed data from rare coin auctions on eBay and empirically tested the questionable bidding behaviors that are attributable to shill bidding. Read (2009) designed an algorithm to detect collusive shill bidding, where multiple shill bidders shill in a group. However, the two problems of duplicate-identity shill bidding and consecutive multiple bidding are not addressed. Moreover, the algorithm does not work in live auctions. Chau et al. (2009) applied data mining and trust propagation techniques to detect fraudulent users in online auction systems.
Generally, these techniques suffer from two drawbacks. Data mining related approaches need to deal with a large amount of historical data; thus they may have limited value in detecting shill bidding in a time-efficient manner. Pattern matching based and model-checking based approaches do not regularly update prior knowledge in the presence of new evidence. Therefore they may frequently generate false positive results. This paper proposes an approach that detects suspicious shilling behavior efficiently and makes the results more accurate for online auctions by updating the training set in the presence of new evidence.

Dempster-Shafer theory
Information related to decision making is often uncertain and incomplete. Hence it is of vital importance to find a feasible way to make decisions under this uncertainty. D-S theory (Shafer, 1976), a probabilistic reasoning technique, was designed to deal with uncertainty and incompleteness of available information. Dong et al. (2009) proposed a formal approach to verifying shill bidders using D-S theory (Shafer, 1976).
The verification approach utilizes additional evidence, such as various bidding histories and statistics regarding bidder and seller interactions, to verify whether an online bidder is a shill. The belief that a bidder is a shill is calculated using D-S theory, which allows the verifier to reason under uncertainty. If the belief that a bidder is a shill exceeds a certain threshold, the bidder is marked as a shill bidder.
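As an illustration of how two pieces of evidence can be fused under D-S theory, the following minimal sketch applies Dempster's rule of combination to the two-hypothesis frame {shill, normal}. The mass values, the dictionary keys and the function name `combine` are hypothetical and chosen only for illustration; they are not taken from Dong et al. (2009).

```python
def combine(m1, m2):
    """Dempster's rule of combination on the frame {shill, normal}.

    Each argument maps 'shill', 'normal' and 'either' (the whole frame,
    i.e. uncommitted belief) to a mass; the masses of one source sum to 1.
    """
    # Conflict: one source supports 'shill' while the other supports 'normal'.
    k = m1["shill"] * m2["normal"] + m1["normal"] * m2["shill"]
    if k >= 1.0:
        raise ValueError("sources are in total conflict")
    norm = 1.0 - k
    shill = (m1["shill"] * m2["shill"]
             + m1["shill"] * m2["either"]
             + m1["either"] * m2["shill"]) / norm
    normal = (m1["normal"] * m2["normal"]
              + m1["normal"] * m2["either"]
              + m1["either"] * m2["normal"]) / norm
    either = m1["either"] * m2["either"] / norm
    return {"shill": shill, "normal": normal, "either": either}


# Hypothetical evidence from a bidding history and from seller interactions.
bidding_history = {"shill": 0.6, "normal": 0.1, "either": 0.3}
seller_interaction = {"shill": 0.5, "normal": 0.2, "either": 0.3}
combined = combine(bidding_history, seller_interaction)
belief_shill = combined["shill"]  # belief of the singleton {shill}
```

In this sketch the combined belief in "shill" would then be compared with a verifier-chosen threshold, as described above.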
These techniques, however, suffer from being time consuming in their investigation of bidders. Since most bidders do not behave suspiciously, a verifier that processes every bidder will find that most of its execution time is spent on investigating normal bidders. Our approach uses a score to decide whether a bidder is sent through a verification process. As such, this work is complementary to other research efforts that precisely verify shill bidders using additional evidence.

Multi-state Bayesian network
This is a probabilistic graphical model that can be used to capture uncertain knowledge in a natural and efficient way. Goel (2010) used a multi-state Bayesian network to verify detected shill suspects. Similar to the D-S theory-based approach, Bayesian networks are capable of reasoning under uncertainty and can be used to calculate the probability of a bidder being a shill. This technique also suffers from being time consuming in its investigation of bidders.

NetProbe
This uses belief propagation over Markov Random Fields to classify users in online auctions as honest, fraud and accomplices (Shashank, 2007). However, NetProbe misclassifies nodes in cases where it flips an honest user to an accomplice or where fraudulent users might easily exploit its assumptions to camouflage themselves. The NetProbe algorithm works under the assumption that fraudsters are connected to accomplices with high probability (0.9) and that they connect to other fraudsters or honest people with a very low probability (each having a probability of 0.05). Also, accomplices are connected to accomplices with a very low probability (0.1) (Shashank, 2007). Hence the NetProbe algorithm essentially assumes a bipartite graph where fraudsters are disconnected from other fraudsters and honest people, while accomplices are disconnected from accomplices. Unfortunately these probabilities of connection are not

METHODOLOGY

Strategy
To carry out this investigation, an auction site was designed as a supplement on which users can create accounts, post auctions and place bids. A database was designed to store auction data in discretized format. This enabled us to collect and analyze data for each bidder in real time. Data collected from each auction was scrutinized to determine whether each user's bidding behavior was suspicious or normal. Suspicious bidders are then sent to the verifier for further and final analysis.

Attributes for detecting shill bidding
The investigation identified eleven attributes related to shill bidding, which we used for classification. These attributes are divided into three categories: auction, stage and user attributes. Table 1 shows the attributes in each category:

Elapsed time before first bid (ETFB)
ETFB is the difference in time between the start of the auction and the user's first bid. A large ETFB indicates that the user joined the auction late, while a small value may indicate the user's prior knowledge of the auction. Therefore a bidder with a small ETFB is suspicious.

Bidder feedback rating (BFR)
BFR is useful in describing a bidder's experience level and established trustworthiness (Dong et al., 2010). However, there is a high chance that a shill may collude with other shills to fabricate their ratings; therefore it is not considered a primary factor for describing the trustworthiness of the user. BFR is considered in the thorough verification stages.

Remaining time after last bid (RTLB)
This is the difference in time between the user's last bid and the auction closing time. Shills always try to avoid winning the auction, so they do not participate late in the final stage of the auction. Therefore a higher RTLB may mean that the user is avoiding winning the auction and is hence associated with shilling, whereas a smaller RTLB shows the bidder's willingness to win the auction.
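The two timing attributes ETFB and RTLB can be sketched as simple timestamp differences. The function name `etfb_rtlb` and the sample timestamps below are hypothetical; the sketch assumes that each bid carries a timestamp and that both attributes are measured in seconds.

```python
from datetime import datetime


def etfb_rtlb(auction_start, auction_end, user_bid_times):
    """Return (ETFB, RTLB) in seconds for one bidder in one auction."""
    if not user_bid_times:
        return None, None  # the bidder placed no bids in this auction
    first_bid = min(user_bid_times)
    last_bid = max(user_bid_times)
    etfb = (first_bid - auction_start).total_seconds()
    rtlb = (auction_end - last_bid).total_seconds()
    return etfb, rtlb


# Hypothetical one-hour auction with two bids by the same user.
start = datetime(2024, 1, 1, 12, 0, 0)
end = datetime(2024, 1, 1, 13, 0, 0)
bids = [datetime(2024, 1, 1, 12, 2, 0), datetime(2024, 1, 1, 12, 30, 0)]
print(etfb_rtlb(start, end, bids))  # (120.0, 1800.0)
```

Here the first bid comes two minutes after the start (small ETFB) and the last bid comes thirty minutes before closing (large RTLB), the combination that the text treats as suspicious.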

Affinity for sellers (AS)
Shills usually have a close affinity for a particular seller. A normal bidder may place bids in different sellers' auctions, while a shill tends to participate in a great number of auctions conducted by a particular seller who may be collaborating with the shill. The degree of abnormality of a bidder's bid activity is quantified by the percentage of a seller's auctions in which the bidder participates (Dong et al., 2010). A high AS may mean that the seller and the buyer know each other outside the auction. Therefore, a high AS is more suspicious than a low AS.
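A minimal sketch of AS follows, assuming it is the fraction of a given seller's auctions in which the bidder placed at least one bid. The choice of denominator, the function name `affinity_for_seller` and the auction identifiers are assumptions made for illustration; the source only states that AS is a percentage of participation.

```python
def affinity_for_seller(bidder_auctions, seller_auctions):
    """Fraction of the seller's auctions in which the bidder took part."""
    seller_auctions = set(seller_auctions)
    if not seller_auctions:
        return 0.0  # the seller has posted no auctions
    shared = set(bidder_auctions) & seller_auctions
    return len(shared) / len(seller_auctions)


# Hypothetical IDs: the bidder joined 3 of the seller's 4 auctions.
print(affinity_for_seller(["a1", "a2", "a3", "b7"], ["a1", "a2", "a3", "a4"]))  # 0.75
```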

Average bid increment (ABI)
ABI refers to the average amount by which a bidder outbids the current high bidder during a certain auction stage. For example, if the current high bid is $30.00 and a bidder places a new bid of $40.00, the bidder's increment is $10.00. Although a very high value may be due to a bidder's significant interest in the item, this is unlikely for auctioned items that are in high supply. A very high ABI in the early and middle stages of an auction is highly suspicious, as it indicates the bidder's interest in increasing the price of the item. A high ABI during the final stage indicates the user's interest in the item and willingness to win the auction. ABI of a particular stage is calculated as in (Ford, 2013):

$$\mathrm{ABI} = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - Y_i\right) \qquad (1)$$

where $X_i$ is the user's new bid, $Y_i$ is the previous high bid that $X_i$ outbids and $n$ is the total number of bids placed by the user in this stage.
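The ABI computation can be sketched as follows, assuming ABI is the plain mean of the increments Xi − Yi over the user's n bids in a stage. The function name `average_bid_increment` and the sample bids are hypothetical.

```python
def average_bid_increment(bids):
    """bids: list of (new_bid, previous_bid) pairs for one user in one stage."""
    if not bids:
        return 0.0  # no bids in this stage
    return sum(new - previous for new, previous in bids) / len(bids)


# Two bids with increments of $10 and $5 give an ABI of $7.50.
print(average_bid_increment([(40.0, 30.0), (55.0, 50.0)]))  # 7.5
```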

Average increment difference (AID)
AID is the average difference between a user's successive bidding increments. For example, if a bidder's previous bid was $10 and the bidder places a bid of $15, the user's bidding increment is $5. AID takes the average of the computed differences. A substantial positive AID in the early or middle stage could indicate efforts to raise the price of the auction after seeing initial bidder interest. A negative AID in the early or middle stage, combined with a number of bids (NB) placed close together, may indicate that a suspicious bidder does not want to scare off the currently active bidders and is possibly participating in bid unmasking. It is calculated as in (Ford, 2013), where Xi is the user's new high bid, Yi is the previous bid of Xi, X0 − Y0 = 0, and n is the total NB placed by the user in this stage. We divide the sum by n − 1 because there are n − 1 changes of bidding increment for n bids. Note that if n equals 1, AID is set to 0, since a change in bidding increment requires at least two bids placed by the user.

Average time between user bids (ATUB)
ATUB refers to the average time that elapses between two bids placed by the same bidder. A short time between bids indicates that the bidder is actively participating in the auction by placing bids as soon as he is outbid. On the other hand, a long time between bids implies that the bidder is not participating heavily in the auction and is cautious before placing a new bid. Because the attribute is calculated as the inverse of this average time, a large ATUB value in the early and middle stages of the auction shows the user's intent to increase the price of the auction or the use of a proxy bidding system. However, a high ATUB during the final stage of the auction shows the bidder's willingness to win the auction. A high ATUB in the early and middle stages combined with a low ATUB in the final stage is suspicious. It is calculated as the inverse of the average time between the user's bids (Ford, 2013):

$$\mathrm{ATUB} = \left(\frac{1}{n-1}\sum_{i=2}^{n}\left(T_i - T_{i-1}\right)\right)^{-1} \qquad (3)$$

where $T_i$ is the time of the user's $i$-th bid and $n$ is the total number of bids placed by the user in that stage. If n equals 0 or 1, ATUB is set to 0 because the calculation of ATUB requires at least two bids placed by the user. ATUB is used to identify aggressive shill bidders.
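A minimal sketch of the ATUB attribute is given below, assuming it is the inverse of the mean gap between a user's consecutive bids within a stage, with bid times expressed in seconds. The function name `atub` and the handling of identical timestamps are assumptions made for illustration.

```python
def atub(bid_times):
    """Inverse of the average time between one user's bids in a stage.

    bid_times: the user's bid timestamps in seconds.
    """
    n = len(bid_times)
    if n < 2:
        return 0.0  # at least two bids are needed
    times = sorted(bid_times)
    # The n - 1 consecutive gaps sum to (last bid - first bid).
    average_gap = (times[-1] - times[0]) / (n - 1)
    if average_gap == 0:
        return float("inf")  # assumption: identical timestamps count as maximally aggressive
    return 1.0 / average_gap


# Bids at 0 s, 10 s and 30 s: average gap of 15 s, so ATUB = 1/15.
rate = atub([0, 10, 30])
```

Under this assumption a larger value corresponds to more rapid bidding, which is why a high value in the early and middle stages is treated as a sign of aggressive behavior.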

Average outbid time (AOT)
AOT is the average time that elapses between another user placing the previous high bid and the user placing a new high bid (Ford, 2013). For example, if a bidder placed a bid 20 seconds after another bidder placed a bid, the outbid time would be 20 seconds. AOT is calculated as in (Ford, 2013):

$$\mathrm{AOT} = \frac{1}{n}\sum_{i=1}^{n}\left(T_i - U_i\right) \qquad (4)$$

where $T_i$ is the time of the user's bid, $U_i$ is the time of the previous high bid and $n$ is the total NB placed by the user in this stage. A small AOT indicates the user's interest in the auction and possible participation in a bid fight if n is large enough, whereas a large value of AOT typically indicates the user's passing interest in the auction or that the bidder is evaluating the status of the auction before placing a bid. A very small AOT during the early and middle stages of the auction is suspicious. However, a small AOT in the final stage shows the user's interest in the item and willingness to win the auction.
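The AOT computation can be sketched as follows, assuming AOT is the mean of Ti − Ui over the user's n bids in a stage, with times in seconds. The function name `average_outbid_time` and the sample values are hypothetical.

```python
def average_outbid_time(pairs):
    """pairs: list of (user_bid_time, previous_high_bid_time) in seconds."""
    if not pairs:
        return 0.0  # no bids in this stage
    return sum(t - u for t, u in pairs) / len(pairs)


# The user outbid others after 20 s and after 10 s: AOT = 15 s.
print(average_outbid_time([(20, 0), (100, 90)]))  # 15.0
```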

Number of bids (NB)
NB is the number of bids placed by the user in a particular stage within an auction. A high NB in the early stage of the auction shows willingness to raise the price quickly, whereas a high NB in the final stage implies willingness to win the auction. A high NB in the middle stage of the auction might also be suspicious, since the bidder might be attempting to uncover the true valuation of other bidders. Shills usually place a large number of bids in the early and middle stages of the auction in order to increase prices. To avoid winning the auction, shills place few or no bids in the final stage of the auction. Therefore, a high NB in the early and middle stages combined with few or no bids in the final stage is suspicious.
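Counting NB per stage requires stage boundaries. The sketch below assumes that the early stage covers the first 25% of the auction time, the middle stage the next 50% and the final stage the last 25%; only the 50% share of the middle stage is stated in this paper, so the other two fractions, the function name `bids_per_stage` and the sample times are hypothetical.

```python
def bids_per_stage(start, end, bid_times, early_frac=0.25, middle_frac=0.50):
    """Count one user's bids in the early, middle and final stages.

    start, end and bid_times are times in seconds; the stage fractions
    are assumptions and can be changed by the caller.
    """
    duration = end - start
    early_end = start + early_frac * duration
    middle_end = early_end + middle_frac * duration
    counts = {"early": 0, "middle": 0, "final": 0}
    for t in bid_times:
        if t < early_end:
            counts["early"] += 1
        elif t < middle_end:
            counts["middle"] += 1
        else:
            counts["final"] += 1
    return counts


# Many early and middle bids, none in the final stage: the suspicious pattern.
print(bids_per_stage(0, 1000, [10, 100, 200, 300, 600, 700]))
# {'early': 3, 'middle': 3, 'final': 0}
```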

Algorithm design and analysis
The algorithm makes use of the attributes in Table 1.

Early stage
At the end of the early stage, all stage attributes are examined and each attribute is given a normalized value between 0 and 1. From those results a shill score increment is calculated and taken as the shill score at the end of the early stage. This stage contributes 20% of the shill score because the seller may delay participating in the auction in order to see the level of competition. For that reason the stage contributes a smaller value to the shill score.

Middle stage
At the end of this stage all stage attributes are analyzed again, but this time only with respect to the middle stage time. A shill score increment is calculated again and the shill score is incremented. If the shill score is greater than or equal to 0.45, the bidder is detected as a shill and the user account is locked and/or an email notification is sent to the user's email address. This stage has the greatest contribution to the shill score because it takes about 50% of the auction time. This is the stage where bidders have little chance of winning the auction as compared to the final stage; hence shills participate aggressively in the middle stage.

Final stage
At this level the stage attributes are analyzed once more and a shill score increment is calculated. The bidder's shill score is incremented. This stage assumes a default contribution of 10% towards the shill score because shills usually do not participate, or place only a few bids, in this stage since they try to avoid winning the auction.

Verification stage
Here all auction attributes and the results of social network analysis are examined, and the final shill score is calculated. Social network analysis (SNA) is the mapping and measuring of relationships and flows between people, groups, organizations, computers, URLs, and other connected information/knowledge entities (Krebs, 2002). As part of our system, we are going to provide a chat service for the users. Users will be able to send and receive private messages. Data from the chat sessions will be analyzed to determine the relationships between users. Social network analysis is conducted at this level by analyzing the level of communication between the user and the seller within the auction. Moreover, Affinity for Seller is also an important factor at this stage; however, we also treat as suspicious those newcomers who have never participated in any auction before. This stage contributes 20% of the shill score since we try to analyze other attributes that are not directly linked to the user's bidding behavior.
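The staged scoring described above can be sketched as follows. The early (20%), final (10%) and verification (20%) contributions and the 0.45 threshold come from the text; the 50% weight for the middle stage is inferred as the remainder so that the four contributions sum to 1, and averaging the normalized attribute values within a stage is an assumption, since the exact increment formula is not given. All names are hypothetical.

```python
STAGE_WEIGHTS = {
    "early": 0.20,
    "middle": 0.50,        # assumption: remainder so that the weights sum to 1
    "final": 0.10,
    "verification": 0.20,
}
SHILL_THRESHOLD = 0.45


def stage_increment(stage, normalized_attributes):
    """Weighted mean of a stage's normalized attribute values (each in [0, 1])."""
    if not normalized_attributes:
        return 0.0
    mean = sum(normalized_attributes) / len(normalized_attributes)
    return STAGE_WEIGHTS[stage] * mean


def run_stages(stage_attributes):
    """Accumulate the shill score and report the first stage that flags the bidder."""
    score = 0.0
    flagged_at = None
    for stage in ("early", "middle", "final", "verification"):
        score += stage_increment(stage, stage_attributes.get(stage, []))
        # The threshold check starts at the end of the middle stage.
        if flagged_at is None and stage != "early" and score >= SHILL_THRESHOLD:
            flagged_at = stage
    return score, flagged_at


# Hypothetical normalized attribute values for an aggressive bidder.
score, flagged_at = run_stages({
    "early": [0.8, 0.6],
    "middle": [0.9, 0.7],
    "final": [0.0],
    "verification": [0.5],
})
```

With these hypothetical values the bidder reaches a score of about 0.54 at the end of the middle stage and would be flagged there, before the auction closes.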

Shill score
A rating between 0 and 1 that indicates the likelihood that a bidder has engaged in typical shill behavior, based on his/her actions in current and past auctions. Auction results are stored in the site's database. Shill reports can be viewed by the administrator. The administrator decides whether to do nothing, lock shill accounts only, send an email notification only, or do both. Reports can be sent on shill detection, every hour, every six hours, after twelve hours, or after twenty-four hours (Table 2).

Development tools
(1). WordPress: an open source website development platform used to develop the online auction into which the algorithm is integrated. WordPress has many plug-ins that make it easy to add functionality, and it offers better security than other platforms such as Joomla. WordPress is written in PHP and supports a MySQL database.
(2). AngularJS: commonly referred to as Angular, an open-source web application framework, maintained by Google and the community, that assists with creating single-page applications, which consist of one HTML page with CSS and JavaScript on the client side. Its goal is to simplify both development and testing of web applications by providing client-side model-view-controller (MVC) capability as well as providing structure for the entire development process, from design through testing.
(3). PHP: a server-side scripting language and a powerful tool for making dynamic and interactive Web pages quickly. PHP is a widely used, free and efficient alternative to competitors such as Microsoft's ASP.
(4). MySQL: an open source relational database management system used for building websites and other web-based applications. According to Oracle, it is the world's most popular open source database. It enables the cost-effective delivery of reliable, high performance and scalable web-based and embedded applications.

Testing
The algorithm integrated into our auction site was tested against the major project objectives. This testing was aimed at ascertaining whether the algorithm met the initial objectives of the project and thus whether the problems currently faced by online auctions were addressed.

Objective 1
To design an algorithm that embraces the strengths of the techniques in place while eliminating their weaknesses and loopholes (Figure 1).

Objective 2
To design an auction site that implements our algorithm so as to offer users maximum possible protection from shills (Figure 2).

RESULTS
The functionality of the algorithm was tested with a pool of 20 students who took part in the bidding activity on our auction site (CUT Auction, hosted on a free hosting site), while the algorithm ran integrated in the auction, detecting any shills. Of the two auctions posted, the Lexus 570 and Mercedes Benz auctions, two users were flagged as shills in the Mercedes Benz auction and one user was flagged as a shill in the Lexus 570 auction, as shown below:

Mercedes Benz auction
The user with ID 28 (tafadzwadondo339@gmail.com) and the user with ID 35 (kamedzatawanda@gmail.com) were flagged as shills, as shown by the Middle_stage attribute scores for eight bidders and the Final_stage attribute scores of the auction in Figures 3 and 4.

Lexus 570 auction
The user with ID 27 and email marutajacob@gmail.com was flagged as a shill in the final stage of the auction, as shown in Figure 5. The Early_stage, Middle_stage and Final_stage attribute scores are shown in Figures 6, 7 and 8, respectively.

Evaluation
Evaluation of the algorithm was based mainly on the specified functional requirements. The researchers evaluated each of the functional requirements to determine whether it had been met. The following are the main functional requirements which were evaluated:
(i) Real-time functionality: The algorithm operated in real time while auctions were progressing, and logs were recorded in the background.
(ii) High degree of preventive measures against detected shills: The algorithm locked out users who were detected as shills, and email notifications were sent to the particular user and also to the admin. The user with ID 27 and email marutajacob@gmail.com was locked out immediately from the auction house in the Lexus LX570 auction. The user with ID 28 (tafadzwadondo339@gmail.com) and the user with ID 35 (kamedzatawanda@gmail.com) were locked out immediately in the Mercedes Benz auction.
Below is an example of an e-mail that the admin received after the user with ID 27 and e-mail marutajacob@gmail.com was detected as a shill and logged out (Figure 9).

How good is our algorithm?
Size: The size of an algorithm is the measure of its complexity. In our algorithm the size is n, the number of auctions.
size = n (number of auctions)
Order: The order of an algorithm is the measure of its efficiency as a function of the size. The algorithm does not have any nested for-loops, which qualifies it as linear order: O(n).
Efficiency: The efficiency of an algorithm is a measure of its runtime, which is proportional to its number of operations. E is directly proportional to n, with constant of proportionality k:

E = kn
During the testing of the algorithm, 2 auctions were run for 15 minutes, during which three users were detected as shills. We would like to calculate the efficiency for a larger number of auctions, let's say 10 auctions: E = kn

CONCLUSION AND RECOMMENDATIONS
This paper presented an algorithm to detect fraud in e-auctions. Although many auction websites have taken some action to avoid shill bidding, many shill bidding cases still take place from time to time. Finding a more efficient approach to discourage shilling is therefore of great value.
To eliminate this shilling problem, the researchers designed a shill detection algorithm integrated with an online auction. The e-auction fraud detection algorithm proved to be effective, as its ability to identify shill bidders was verified on a simulated auction. When tested on the internet, the short time of shill detection indicated that the algorithm can work in real time on e-auctions with a large user base.
At present, suitable tests have been done to assess the algorithm's functionality, though this does not mean that a flawless algorithm has been developed, since testing always leaves some errors undiscovered. The researchers therefore recommend future improvements in the way in which bidding history is analyzed and in the accuracy of distinguishing fraudulent transactions from legitimate ones:
(i) User friendliness: the tool should have a simple user interface.
(ii) Real-time: the algorithm should analyze bidding behavior and detect shills while the auction is running.
(iii) Flexibility: the algorithm should adjust and update to changing bidding behaviors.

Table 2. Contribution of each stage towards the shill score.