A new algorithm for knowledge discovery from data sets using cross-entropy measurement

Ömer AKGÖBEK

doi:10.5897/SRE11.971

Scientific Research and Essays

Abbreviation: Sci. Res. Essays
Language: English
ISSN: 1992-2248
DOI: 10.5897/SRE
Start Year: 2006
Published Articles: 2768

Ömer AKGÖBEK

Department of Industrial Engineering, Engineering Faculty, Zirve University, 27260, Gaziantep, Turkey.
Email: [email protected]

Article Number - 419D73F39898
Vol.6(20), pp. 4301-4311 , September 2011
https://doi.org/10.5897/SRE11.971

Accepted: 16 August 2011
Published: 19 September 2011

Copyright © 2024 Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0.

Abstract

This study proposes a new method for selecting attributes in rule-generating algorithms for data mining. The measure most commonly used for attribute selection is entropy, which is defined as a measure of uncertainty: the higher the uncertainty in a system, the higher its entropy. Entropy is usually used to measure the uncertainty of attributes in data mining algorithms such as C4.5, CN2 and CART, whereas cross-entropy is not used frequently. Therefore, a new algorithm named REX-1C was derived from the entropy-based REX-1 algorithm in order to test the effect of cross-entropy on learning, as measured by accuracy and number of rules. To test the algorithm, twenty real-world data sets of different specifications and sizes, all commonly used in the machine learning field, were chosen. Using those data sets, the effects of norms on the accuracy of the algorithm and on the number of rules it produces were calculated, and the results were compared with those of the Rules-3 Plus, Rules-6, REX-1 and C5.0 algorithms. The REX-1C algorithm was observed to produce better results than Rules-3 Plus, Rules-6, REX-1 and C5.0 with respect to accuracy.

Key words: Data mining, entropy, cross-entropy, classification, rule extraction.
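
The distinction the abstract draws between entropy and cross-entropy can be illustrated with a minimal Python sketch. It uses only the standard textbook definitions (Shannon entropy of a label set, and cross-entropy between two discrete distributions). The abstract does not give the exact scoring formula used by REX-1C, so the function names (`entropy`, `cross_entropy`, `class_distribution`) and the toy labels below are assumptions made for illustration, not the author's implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def class_distribution(labels, classes):
    """Relative frequency of each class, in the order given by `classes`."""
    n = len(labels)
    counts = Counter(labels)
    return [counts[c] / n for c in classes]


def cross_entropy(p, q, eps=1e-12):
    """Cross-entropy H(p, q) in bits; q is clipped at eps to avoid log(0)."""
    return -sum(pi * math.log2(max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical toy example: class labels overall and within one attribute value.
classes = ["yes", "no"]
all_labels = ["yes", "yes", "no", "no"]
subset_labels = ["yes", "yes", "yes", "no"]  # examples sharing one attribute value

p = class_distribution(all_labels, classes)      # overall class distribution
q = class_distribution(subset_labels, classes)   # distribution within the subset

print("entropy of all labels:", entropy(all_labels))
print("cross-entropy H(p, q):", cross_entropy(p, q))
```

Cross-entropy H(p, q) is never smaller than the entropy of p, and the two are equal only when the distributions coincide, so it also reflects how far one class distribution departs from another. How REX-1C turns such a quantity into an attribute ranking is described in the full paper, not in this abstract.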
