Discretization of Continuous-Valued Attributes and Instance-Based Learning
Ting, Kai Ming
Basser Department of Computer Science
University of Sydney, NSW 2006, Australia
E-mail: [email protected]
Technical Report No. 491
Recent work on the discretization of continuous-valued attributes in decision tree learning has produced some positive results. This paper adopts the idea of discretizing continuous-valued attributes and applies it to instance-based learning (Aha, 1990; Aha, Kibler & Albert, 1991).
Our experiments have shown that instance-based learning (IBL) usually performs well in continuous-valued attribute domains but poorly in nominal attribute domains. Cost and Salzberg (1993) devised the modified value-difference metric (MVDM), which raises the performance of IBL in nominal attribute domains.
This paper explores a way in which continuous-valued attributes and nominal attributes can be treated uniformly in IBL. An algorithm that combines the discretization of continuous-valued attributes with IB1 (Aha, Kibler & Albert, 1991) using the modified value-difference metric is introduced. The empirical results show that the proposed algorithm, IB1-MVDM*, achieves a substantial improvement over C4.5 (Quinlan, 1993), IB1 and IB1-MVDM in most of the domains tested. A performance comparison is also made with a naive Bayesian learner (Cestnik, 1990).
Keywords: Instance-based learning, Bayesian learning.
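To make the abstract's central notion concrete, the following is a minimal sketch (not part of the report) of the modified value-difference metric for a single nominal attribute, in the class-conditional form described by Cost and Salzberg (1993): delta(v1, v2) = sum over classes c of |P(c|v1) - P(c|v2)|^k. The function name, data layout, and default exponent k=2 are illustrative assumptions, not taken from the report.

```python
from collections import defaultdict

def mvdm(examples, attr_index, v1, v2, k=2):
    """Value difference between two nominal values of one attribute.

    examples: list of (feature_tuple, class_label) pairs.
    Implements delta(v1, v2) = sum_c |P(c|v1) - P(c|v2)|**k,
    the class-conditional form of the modified value-difference metric.
    The exponent k=2 is one common choice, assumed here for illustration.
    """
    counts = defaultdict(lambda: defaultdict(int))  # value -> class -> count
    totals = defaultdict(int)                       # value -> total count
    classes = set()
    for features, label in examples:
        v = features[attr_index]
        counts[v][label] += 1
        totals[v] += 1
        classes.add(label)
    return sum(
        abs(counts[v1][c] / totals[v1] - counts[v2][c] / totals[v2]) ** k
        for c in classes
    )
```

Two attribute values are close under this metric when they induce similar class distributions, which is what lets a discretized continuous attribute be treated exactly like a nominal one inside IB1.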