University of Houston
Department of Computer Science
In partial fulfillment of the Requirements for the Degree of
Master of Science
Abstract
The goal of data set editing techniques in instance-based learning is to remove objects from a training set in order to obtain a faster and more accurate classifier. In this thesis, we implement several popular data set editing techniques, including Wilson editing, citation editing, multi-edit, and supervised clustering editing, and evaluate their performance on a benchmark consisting of UCI and artificial data sets. All investigated data set editing techniques, with the exception of multi-edit, perform well on the data sets. The experimental results show that data set editing techniques, in general, improve the accuracy of instance-based classifiers. In addition, we develop a Gabriel graph-based methodology for creating data set signatures, which identify regions of high class density in a data set.
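To make the idea of data set editing concrete, the following is a minimal sketch of Wilson editing as it is commonly described: an instance is dropped from the training set when its class disagrees with the majority class of its k nearest neighbors. This is an illustration only and not the implementation evaluated in the thesis; the function name `wilson_edit`, the choice of Euclidean distance, the default `k=3`, and the toy data are all assumptions.

```python
import math
from collections import Counter


def wilson_edit(points, labels, k=3):
    """Return the indices of the instances kept by Wilson editing.

    Hypothetical helper: an instance is kept only if the majority class
    among its k nearest neighbors (excluding itself) matches its own class.
    """
    kept = []
    for i, p in enumerate(points):
        # Euclidean distance to every other instance; keep the k closest.
        neighbors = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )[:k]
        votes = Counter(labels[j] for _, j in neighbors)
        majority_label = votes.most_common(1)[0][0]
        if majority_label == labels[i]:
            kept.append(i)
    return kept


# Toy data (assumed): two clusters, with one "b" instance placed inside
# the "a" cluster so that it is misclassified by its neighbors.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.05, 0.05),
          (1.0, 1.0), (1.1, 1.0), (1.0, 1.1)]
labels = ["a", "a", "a", "b", "b", "b", "b"]

kept = wilson_edit(points, labels, k=3)
print(kept)  # [0, 1, 2, 4, 5, 6]
```

In this toy example the instance at index 3 is removed because all three of its nearest neighbors belong to class "a". A nearest-neighbor classifier built on the remaining instances stores fewer objects and has a smoother decision boundary.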
Date: Monday, November 29, 2005
Time: 11:30 AM
Place: 550-PGH
Faculty, students, and the general public are invited.
Thesis Advisor: Dr. Christoph Eick