The Impact of Small Disjuncts on Classifier Learning [chapter]

Gary M. Weiss
2009 Annals of Information Systems  
Many classifier induction systems express the induced classifier in terms of a disjunctive description. Small disjuncts are those disjuncts that classify few training examples.  ...  This analysis provides many insights into why some data sets are difficult to learn from and also provides a better understanding of classifier learning in general.  ...  these data sets to assess the impact that small disjuncts have on learning.  ... 
doi:10.1007/978-1-4419-1280-0_9 fatcat:3uooy4k4pjawxl4ud25cs7eey4

Learning with Rare Cases and Small Disjuncts [chapter]

Gary M. Weiss
1995 Machine Learning Proceedings 1995  
Acknowledgements I would like to thank Andrea Danyluk and Rob Holte for their comments on an earlier version of this paper, and Foster Provost, Brian Davison, and Rosalie DiSimone-Weiss for comments on  ...  the current version.  ...  Finally, this paper will assess the impact that error prone rare cases and small disjuncts have on learning (i.e., on error rate).  ... 
doi:10.1016/b978-1-55860-377-6.50075-x dblp:conf/icml/Weiss95 fatcat:c6edwguklverhetoobz24yvt3i

Foundations of Imbalanced Learning [chapter]

Gary M. Weiss
2013 Imbalanced Learning  
This chapter begins by describing what is meant by imbalanced data, and by showing the effects of such data on learning.  ...  Because this learning task is quite challenging, there has been a tremendous amount of research on this topic over the past fifteen years.  ...  Figure 2.5: The impact of absolute rarity on classifier performance. Having a small amount of training data will generally have a much larger impact on the classification of the minority-class (i.e.  ...
doi:10.1002/9781118646106.ch2 fatcat:opqe7dy2onaadp2ckacz6bdaxq

Mining with rarity

Gary M. Weiss
2004 SIGKDD Explorations  
These descriptions utilize examples from existing research, so that this article provides a good survey of the literature on rarity in data mining.  ...  Rare objects are often of great interest and great value. Until recently, however, rarity has not received much attention in the context of data mining.  ...  on the impact of small disjuncts and class distribution on data mining.  ... 
doi:10.1145/1007730.1007734 fatcat:sb2tk62wrffifcirx75kw22etq

Learning with Class Skews and Small Disjuncts [chapter]

Ronaldo C. Prati, Gustavo E. A. P. A. Batista, Maria Carolina Monard
2004 Lecture Notes in Computer Science  
One of the main objectives of a Machine Learning (ML) system is to induce a classifier that minimizes classification errors.  ...  In this sense, this work analyzes two important issues that might influence the performance of ML systems: class imbalance and error-prone small disjuncts.  ...  We wish to thank the anonymous reviewers for their helpful comments. This research was partially supported by the Brazilian Research Councils CAPES and FAPESP.  ...
doi:10.1007/978-3-540-28645-5_30 fatcat:g5quowqhnzczbclglwhjgz44mm

Enhanced Classification to Counter the Problem of Cluster Disjuncts

Syed Ziaur Rahman, Dr. G. Samuel Vara Prasad Raju
2014 International Journal of Computer Trends and Technology  
This paper presents a rigorous yet practical model dubbed the Cluster Disjunct Minority Oversampling Technique (CDMOTE) for learning from skewed training data.  ...  The empirical study suggests that CDMOTE is effective in addressing the class imbalance problem.  ...  Moreover, since classifiers attempt to learn both majority and minority concepts, the problem of small disjuncts is not restricted to the minority concept.  ...
doi:10.14445/22312803/ijctt-v18p148 fatcat:nvzqkn54z5fa5hmzyeswjefvi4

An Evolutionary Algorithm for Automated Discovery of Small-Disjunct Rules

Basheer M. Al-Maqaleh, Mohammed A. Al-Dohbai, Hamid Shahbazkia
2012 International Journal of Computer Applications  
The proposed algorithm is validated on several datasets of the UCI data set repository and the experimental results are presented to demonstrate the effectiveness of the proposed scheme for automated small-disjunct  ...  In the context of data mining, small disjuncts are rules covering a small number of examples. Due to their nature, small disjuncts are error prone.  ...  Weiss and Hirsh present a quantitative measure for evaluating the effect of small disjuncts on learning [18].  ...
doi:10.5120/5547-7615 fatcat:zz4ecdvslbgedef6y7qh2gv3dy

A genetic-algorithm for discovering small-disjunct rules in data mining

Deborah R. Carvalho, Alex A. Freitas
2002 Applied Soft Computing  
At first glance, this is not a serious problem, since the impact on predictive accuracy should be small.  ...  However, although each small disjunct covers few examples, the set of all small disjuncts can cover a large number of examples.  ...  The authors reported experiments with a number of data sets to assess the impact of small disjuncts on learning, especially when factors such as training set size, pruning strategy, and noise level are  ... 
doi:10.1016/s1568-4946(02)00031-5 fatcat:av7ifipllng7jbbt7ppdfo7vvu


Nitesh V. Chawla, Nathalie Japkowicz, Aleksander Kolcz
2004 SIGKDD Explorations  
ACKNOWLEDGEMENTS We thank the reviewers for their useful and timely comments on the papers submitted to this Issue.  ...  We would also like to thank the participants and attendees of the previous workshops for the enlightening presentations and discussions.  ...  The (often) negative impact of class imbalance is compounded by the problem of small disjuncts, particularly in small and complex data sets.  ... 
doi:10.1145/1007730.1007733 fatcat:tdpfkg6vgbgqrpjtclkhrz5rne

Evaluating Six Candidate Solutions for the Small-Disjunct Problem and Choosing the Best Solution via Meta-Learning

Deborah R. Carvalho, Alex A. Freitas
2005 Artificial Intelligence Review  
This paper offers two main contributions to the research on small disjuncts. First, it investigates 6 candidate solutions (algorithms) for the problem of small disjuncts.  ...  A set of classification rules can be considered as a disjunction of rules, where each rule is a disjunct. A small disjunct is a rule covering a small number of examples.  ...  Wesley Romao for having prepared the CNPq data sets for data mining purposes, allowing us to use those data sets in our experiments.  ... 
doi:10.1007/s10462-005-1586-7 fatcat:2vx4upzavzd5pbp42o3n4zrr2i

A hybrid decision tree/genetic algorithm method for data mining

Deborah R. Carvalho, Alex A. Freitas
2004 Information Sciences  
The central idea of this hybrid method involves the concept of small disjuncts in data mining, as follows.  ...  problem of small disjuncts.  ...  The authors reported experiments with a number of data sets to assess the impact of small disjuncts on learning, especially, when factors such as training set size, pruning strategy, and noise level are  ... 
doi:10.1016/j.ins.2003.03.013 fatcat:p46zw62zg5frdpskh4mzu6poya

A hybrid decision tree/genetic algorithm method for data mining

2004 Information Sciences  
The central idea of this hybrid method involves the concept of small disjuncts in data mining, as follows.  ...  problem of small disjuncts.  ...  The authors reported experiments with a number of data sets to assess the impact of small disjuncts on learning, especially, when factors such as training set size, pruning strategy, and noise level are  ... 
doi:10.1016/s0020-0255(03)00414-6 fatcat:tocgzy4swjfvpg3c7gmje4bz74

Overlapping, Rare Examples and Class Decomposition in Learning Classifiers from Imbalanced Data [chapter]

Jerzy Stefanowski
2013 Smart Innovation, Systems and Technologies  
Class imbalance constitutes a difficulty for most algorithms learning classifiers as they are biased toward the majority classes.  ...  The novel observation is showing the impact of rare examples from the minority class located inside the majority class.  ...  Small disjuncts are those parts of the learned classifier which cover too small a number of examples [20, 62].  ...
doi:10.1007/978-3-642-28699-5_11 fatcat:42bihx3adndylpj742o2pwkhg4

Addressing the Classification with Imbalanced Data: Open Problems and New Challenges on Class Distribution [chapter]

A. Fernández, S. García, F. Herrera
2011 Lecture Notes in Computer Science  
help to follow new paths that can lead to the improvement of current models, namely size of the dataset, small disjuncts, the overlapping between the classes and the data fracture between training and  ...  Acknowledgment This work had been supported by the Spanish Ministry of Science and Technology under Project TIN2008-06681-C06-01.  ...
doi:10.1007/978-3-642-21219-2_1 fatcat:ni4ri4aaavf4rbwjtmvxqary4u

Bayes Imbalance Impact Index: A Measure of Class Imbalanced Dataset for Classification Problem [article]

Yang Lu, Yiu-ming Cheung, Yuan Yan Tang
2019 arXiv   pre-print
In fact, other data factors, such as small disjuncts, noises and overlapping, also play the roles in tandem with imbalance ratio, which makes the problem difficult.  ...  In this paper, we focus on Bayes optimal classifier and study the influence of class imbalance from a theoretical perspective.  ...  loss from small disjuncts.  ... 
arXiv:1901.10173v1 fatcat:qpnrtlbt4zg6dlmrhohos3q67y