
Discretization of Continuous-Valued Attributes and Instance-Based Learning

Ting, Kai Ming

Basser Department of Computer Science

University of Sydney, NSW 2006, Australia

E-mail: [email protected]

Technical Report No. 491

October, 1994

Abstract

Recent work on discretization of continuous-valued attributes in learning decision trees has produced some positive results. This paper adopts the idea of discretization of continuous-valued attributes and applies it to instance-based learning (Aha, 1990; Aha, Kibler & Albert, 1991).

Our experiments have shown that instance-based learning (IBL) usually performs well in continuous-valued attribute domains and poorly in nominal attribute domains. Cost and Salzberg (1993) have devised the modified value-difference metric (MVDM) that raises the performance of IBL in nominal attribute domains.

This paper explores a way in which continuous-valued attributes and nominal attributes can be treated cohesively in IBL. An algorithm that combines the discretization of continuous-valued attributes with IB1 (Aha, Kibler & Albert, 1991) using the modified value-difference metric is introduced. The empirical results show that the proposed algorithm, IB1-MVDM*, achieves a substantial improvement over C4.5 (Quinlan, 1993), IB1 and IB1-MVDM in most of the domains tested. A performance comparison is also made with a naive Bayesian learner (Cestnik, 1990).

Keywords: Instance-based learning, Bayesian learning.
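
To make the combination described in the abstract concrete, the following is a minimal sketch of a 1-nearest-neighbour classifier that first discretizes continuous attributes and then compares the resulting discrete values with a value-difference metric based on class-conditional probabilities. It is not the report's IB1-MVDM* implementation. The equal-width binning, the exponent k = 1, the omission of instance weights, and all function names (`equal_width_bins`, `class_conditionals`, `value_difference`, `fit`, `predict`) are assumptions made for illustration only; the abstract does not specify the discretization method.

```python
from collections import Counter, defaultdict


def equal_width_bins(values, n_bins):
    """Return a function mapping a number to a bin index in [0, n_bins - 1].

    Equal-width binning is an assumption of this sketch, not the report's method.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant attributes

    def to_bin(x):
        idx = int((x - lo) / width)
        return max(0, min(n_bins - 1, idx))  # clamp values outside the training range

    return to_bin


def class_conditionals(column, labels):
    """Estimate P(class | value) for each observed value of one discrete attribute."""
    counts = defaultdict(Counter)
    for value, label in zip(column, labels):
        counts[value][label] += 1
    return {
        value: {label: n / sum(cnt.values()) for label, n in cnt.items()}
        for value, cnt in counts.items()
    }


def value_difference(probs, classes, v1, v2, k=1):
    """Sum over classes of |P(c | v1) - P(c | v2)|^k (unseen values get probability 0)."""
    p1, p2 = probs.get(v1, {}), probs.get(v2, {})
    return sum(abs(p1.get(c, 0.0) - p2.get(c, 0.0)) ** k for c in classes)


def fit(X, y, n_bins=3):
    """Discretize every attribute and store the per-attribute probability tables."""
    n_attrs = len(X[0])
    binners = [equal_width_bins([row[j] for row in X], n_bins) for j in range(n_attrs)]
    Xd = [[binners[j](row[j]) for j in range(n_attrs)] for row in X]
    tables = [class_conditionals([row[j] for row in Xd], y) for j in range(n_attrs)]
    return {"binners": binners, "Xd": Xd, "y": list(y),
            "tables": tables, "classes": sorted(set(y))}


def predict(model, x):
    """Return the label of the stored instance closest to x under the metric above."""
    xd = [to_bin(v) for to_bin, v in zip(model["binners"], x)]

    def dist(row):
        return sum(
            value_difference(table, model["classes"], a, b)
            for table, a, b in zip(model["tables"], xd, row)
        )

    best = min(range(len(model["Xd"])), key=lambda i: dist(model["Xd"][i]))
    return model["y"][best]
```

In this sketch a nominal attribute would simply skip the binning step and go straight to `class_conditionals`, which is what allows both attribute types to share a single distance function.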